ABSTRACT
In root-nodule symbiosis, rhizobial invasion and nodule organogenesis are host controlled. In most legumes, rhizobia enter through infection threads and the nodule primordium in the cortex is induced from a distance. But in dalbergioid legumes like Arachis hypogaea, rhizobia directly invade cortical cells through epidermal cracks to generate the primordia. Herein we report the transcriptional dynamics with the progress of symbiosis in A. hypogaea by profiling the transcriptome at 1dpi: invasion; 4dpi: nodule primordia; 8dpi: spread of infection in nodule-like structure; 12dpi: immature nodules containing rod-shaped rhizobia; and 21dpi: mature nodules with spherical symbiosomes. Differentially expressed genes show clear transcriptional shifts at these stages. Expression of putative orthologues of symbiotic genes in the ‘crack-entry’ legume A. hypogaea was compared with their expression in model legumes where rhizobia invade through infection threads. The notable contrasting features were (i) absence of early induction of NIN and NSP2, (ii) insignificant expression of VPY and (iii) significantly high expression of ERF1, bHLH476, EIN2 and divergent PR-1 genes that produce CAPE peptides. Additionally, homologues of RPG, SymCRK and DNF2 were absent in the A. hypogaea genome, and for FLOT4, ROP6, RR9, NOOT, and SEN1 the symbiotic orthologues were not detectable. A molecular framework that may guide symbiosis in A. hypogaea is proposed.
INTRODUCTION
Nitrogen-fixing root-nodule symbiosis (RNS) allows plants to house bacterial diazotrophs in an intracellular manner (Kistner and Parniske, 2002). RNS occurs in two major forms: legume–rhizobia symbiosis (Fabaceae) and actinorhizal symbiosis (in members of the orders Fagales, Rosales and Cucurbitales) (Pawlowski and Bisseling, 1996). The Leguminosae (Fabaceae) is the third largest family of flowering plants and is agriculturally and economically important, being second only to the Poaceae (e.g. cereals). This economic importance of the Leguminosae is mainly due to RNS, which allows the plants to grow well and produce protein-rich seeds in soils without nitrogen fertilizer.
The establishment of RNS involves rhizobial invasion in the root epidermis and nodule organogenesis in the root cortical cells. The most common invasion strategy is through root hair curling and infection thread (IT) formation where the nodule primordia are induced from a distance (Sprent and James, 2007). Invasion through ITs is adopted mostly by temperate legumes, e.g. Vicia sp., Trifolium sp. and Pisum sp. Model legumes like Lotus japonicus and Medicago truncatula also undertake IT-mediated rhizobial invasion (Geurts and Bisseling, 2002; Oldroyd and Downie, 2004, 2006). The alternate mode of rhizobial invasion is known as ‘crack-entry’, where the rhizobia enter through natural cracks at the lateral root base in an intercellular manner. This is a characteristic feature of some subtropical legumes (e.g. Arachis sp., Aeschynomene sp., Stylosanthes sp.) belonging to the dalbergioids/genistoids and accounts for approximately 25% of all legume genera (Gage, 2004; Giraud et al., 2007). In these legumes, rhizobia directly access the cortical cells for development of their nodule primordia and the infected cells repeatedly divide to develop the mature nodule (Boogerd and Rossum, 1997; Fabre et al., 2015).
Investigations on model legumes have unravelled the molecular basis of RNS. The host responses are initiated by Nod-factor (NF) receptors LjNFR1/MtLYK3 and LjNFR5/MtNFP (Madsen et al., 2003; Radutoiu et al., 2003; Arrighi, 2006; Smit et al., 2007). Another NF-induced receptor, LjEPR3, was shown to monitor rhizobial exopolysaccharide (EPS) in L. japonicus, indicating a two-stage mechanism involving sequential receptor-mediated recognition of NF and EPS signals to ensure host-symbiont compatibility (Kawaharada et al., 2015). Downstream of the NFRs is the ‘SYM pathway’ consisting of the receptor kinase LjSYMRK/MtDMI2 (Endre et al., 2002; Stracke et al., 2002), the predicted ion-channel proteins LjCASTOR and LjPOLLUX/MtDMI1 (Ané et al., 2004; Imaizumi-Anraku et al., 2005), the nucleoporins LjNUP85 and LjNUP133 (Kanamori et al., 2006; Saito et al., 2007), the Ca2+/calmodulin-dependent protein kinase LjCCaMK/MtDMI3 (Lévy et al., 2004; Tirichine et al., 2006), and the transcription factor LjCYCLOPS/MtIPD3 (Messinese et al., 2007; Yano et al., 2008). Nodulation-specific transcription factors (TFs), such as MtNSP1/LjNSP1, MtNSP2/LjNSP2, MtERF1 and MtNIN/LjNIN, function downstream of the ‘SYM pathway’ and are involved in transcriptional reprogramming for initiation of RNS (Schauser et al., 1999a; Kaló et al., 2005; Smit et al., 2005; Middleton et al., 2007). Very limited information is available for crack-entry legumes from the Dalbergioid/Genistoid clade, which are basal in their divergence within the Papilionoideae, even though they include important crop legumes such as Lupinus angustifolius and Arachis hypogaea. Transcriptome analysis in legumes has been a valuable resource for understanding symbiosis-related genes in M. truncatula, L. japonicus, Glycine max, and Cicer arietinum. An earlier report has listed several differentially expressed genes (DEGs) at an early stage of symbiosis in A. hypogaea (Peng et al., 2017).
The conservation among DEGs identified in such studies implies common genetic mechanisms of RNS across legume species. Herein we report the transcriptome dynamics with the onset and advancement of symbiosis in A. hypogaea using uninfected roots (UI) as a reference. The transcription profile of the putative orthologues of symbiotic genes in the crack-entry legume A. hypogaea is compared and contrasted with the corresponding expression profiles in M. truncatula and L. japonicus, which undertake root hair mediated symbiosis.
RESULTS
Progress of symbiosis in Arachis hypogaea
Within three weeks after infection with Bradyrhizobium sp. SEMIA 6144, A. hypogaea roots developed spherical functional nodules. We followed the progress of symbiosis in A. hypogaea for 21 days to identify the distinct stages of development by ultrastructure analysis. There are rosettes of root hairs at the junction of taproot and lateral root that are reported to be important for bacterial invasion in A. hypogaea (Boogerd and Rossum, 1997). Within 1 day post infection (1dpi), rhizobia were found adhering to these root hairs (Fig. 1A-C). Within 4dpi, bump-like primordial structures were noted at the lateral-root bases (Fig. 1D). The longitudinal sections of these primordia revealed one or more centrally located, defined pockets of rhizobia-infected cells that were surrounded by uninfected cells (Fig. 1E). These pockets of rhizobia-infected cells were distinct in having reduced calcofluor-binding ability, indicating that they are thin-walled. The intracellularised rhizobia within the infection pocket were undifferentiated and rod-shaped (Fig. 1F). The infection pockets observed at 4dpi act as infection zone (IZ) founder cells, and it is their uniform division and differentiation that gives rise to the distinct aeschynomenoid type IZ in mature nodules. No uninfected primordium was observed in any case, which is in accordance with the proposition that infection precedes development of aeschynomenoid nodules (Fabre et al., 2015). By 8dpi there was a visible nodule-like structure at the lateral-root base (Fig. 1G). Ultrastructure analysis revealed that by 8dpi, the compactness of the primordial structure with defined pockets of infected cells was lost and the IZ started growing by division of the infected cells (Fig. 1H-I). By 12dpi there were white spherical nodules (Fig. 1J).
At this stage the tissue organization turned aeschynomenoid where there were no uninfected cells in the infection zone (IZ) and the endocytosed rhizobia remained undifferentiated and rod-shaped (Fig. 1K-L). At 21dpi the nodules were mature and functional where the rhizobia differentiated within the plant derived peribacteroid membranes to develop spherical symbiosomes (Fig. 1M-O).
Transcriptome analysis with the progress of symbiosis in Arachis hypogaea
Ultrastructural analysis revealed 5 distinct stages during the progress of symbiosis in A. hypogaea: 1dpi: recognition and invasion; 4dpi: primordia formation; 8dpi: nodule-like structure; 12dpi: immature nodules with rod-shaped rhizobia; and 21dpi: mature nodules with spherical symbiosomes. To probe the expression of genes associated with the progress of symbiosis, RNA was extracted from these stages along with UI roots. RNA-seq was done in triplicate for these six stages using Illumina single-end sequencing technology (Illumina HiSeq 2000 SR50). The genomic data from Arachis duranensis (AA) and Arachis ipaensis (BB), the two wild diploid parents of A. hypogaea, were used to assess the quality and coverage of the assembled transcriptomes. A total of 1,429,876,614 raw reads of 50bp (~71.5Gb) were generated with an average of 88,029,386 reads per library. This was more than 600 times the total size of transcript sequences (109.0 Mb) of A. hypogaea for both AA and BB genomes and gave an average coverage of 36 times per library. The proportion of clean reads among the total acquired reads was more than 91.34% (Table 1). The filtered reads were simultaneously mapped to the AA and BB genomes, where the overall accepted mapping rate per library ranged from 80.15% to 89.98%, with an average mapping rate of 86.42% with A. duranensis (AA) and 86.65% with A. ipaensis (BB). For both the AA and BB genomes, about 66% of reads aligned unambiguously to a gene exon, whereas the remaining 33% of reads aligned outside exons.
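The sequencing depth figures above can be checked with a short calculation. The following is a minimal sketch that uses only the numbers quoted in the text; the library count of 18 is an assumption derived from six stages sequenced in triplicate, and the function name is illustrative.

```python
# Figures quoted in the text.
TOTAL_READS = 1_429_876_614      # raw single-end reads
READ_LENGTH_BP = 50              # SR50 chemistry
TRANSCRIPT_SIZE_BP = 109.0e6     # total transcript size, AA + BB
# Assumption: 6 stages (UI + 5 infected) x 3 replicates.
N_LIBRARIES = 6 * 3


def sequencing_depth(total_reads, read_length_bp, target_size_bp, n_libraries):
    """Return (total bases, overall fold coverage, mean fold coverage per library)."""
    total_bases = total_reads * read_length_bp
    fold = total_bases / target_size_bp
    return total_bases, fold, fold / n_libraries


total_bases, fold, per_library = sequencing_depth(
    TOTAL_READS, READ_LENGTH_BP, TRANSCRIPT_SIZE_BP, N_LIBRARIES
)
print(f"{total_bases / 1e9:.1f} Gb, {fold:.0f}x overall, {per_library:.0f}x per library")
```

Under these assumptions the yield is about 71.5 Gb, which corresponds to more than 600-fold coverage of the transcript sequences overall and about 36-fold per library.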
The expression level of each assembled transcript sequence was measured as FPKM (fragments per kilobase of transcript per million mapped reads). The DEGs in the 5 different stages of symbiosis were evaluated by the significance of differences in their expression with respect to UI roots using false discovery rate (FDR) < 0.05, P-value < 0.05 and fold change |log2 ratio| ≥ 1 (Supplementary Table 1). A comparison between upregulated and downregulated DEGs at different stages is shown in a Venn diagram (Fig. 2A). A total of 2745 genes were upregulated (↑1296:AA; ↑1449:BB) and a total of 20415 genes were downregulated (↓9709:AA; ↓10706:BB) during symbiosis, of which 59 genes (33:AA; 26:BB) were upregulated and 2095 genes (1056:AA; 1039:BB) were downregulated in all the 5 stages of symbiosis. From the Venn diagram we identified those genes that were first upregulated or downregulated at a particular stage, though their subsequent regulation could be different. The number of such genes upregulated or downregulated at each stage from the AA or BB genome is shown in Supplementary Table 2. Differentially expressed genes show clear transcriptional shifts at these stages and the diverse expression patterns of these genes are indicated in a heatmap (Fig. 2B). The major expression profiles are shown in expanded heatmaps and line graphs (Supplementary fig. 1). Hierarchical clustering as well as PCA (Supplementary fig. 1C) of the transcriptome indicated 3 distinct expression waves. Cluster 1 consists of 1dpi-4dpi transcripts, where rhizobial invasion and primordia formation occur. Cluster 2 consists of 8dpi-12dpi transcripts, where the primordia structurally develop into a nodule, and cluster 3 consists of the 21dpi transcripts, where the nodule matures to its functional form (Fig. 2B; Supplementary fig. 1).
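The FPKM measure and the DEG thresholds described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in this study: the function names are hypothetical, and the pseudocount added before taking the log2 ratio is an assumption introduced only to avoid division by zero.

```python
import math


def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def log2_ratio(fpkm_infected, fpkm_uninfected, pseudocount=1.0):
    """log2 fold change relative to uninfected roots (pseudocount is an assumption)."""
    return math.log2((fpkm_infected + pseudocount) / (fpkm_uninfected + pseudocount))


def classify_deg(log2fc, p_value, fdr, min_abs_log2fc=1.0, alpha=0.05):
    """Apply the thresholds from the text: FDR < 0.05, P < 0.05, |log2 ratio| >= 1."""
    if p_value >= alpha or fdr >= alpha or abs(log2fc) < min_abs_log2fc:
        return "not DE"
    return "up" if log2fc > 0 else "down"


# Hypothetical gene: 500 fragments on a 2 kb transcript in a 10 M fragment library.
example_fpkm = fpkm(500, 2000, 10_000_000)
example_call = classify_deg(log2_ratio(7.0, 1.0), p_value=0.001, fdr=0.01)
print(example_fpkm, example_call)
```

A gene is called differentially expressed only when all three criteria hold, so a large fold change with a weak FDR is still reported as "not DE".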
Functional analysis of DEGs
GO and KEGG terms that are significantly enriched in our DEGs are indicated in Supplementary fig. 2. Among the 1248 enriched GO terms there was a major representation of defense response genes: 470 and 31 such defense-related GO terms were enriched in downregulated and upregulated DEGs respectively (Supplementary Table 1). Accordingly, KEGG analysis of plant-pathogen interaction pathways showed that most genes involved in pattern-triggered immunity (PTI) were notably downregulated (Fig. 3A-B; Supplementary Table 3). The FLS2-mediated MAPK pathway however remained active along with a subset of CNGCs and genes encoding Rboh proteins. A subset of genes involved in effector-triggered immunity (ETI) also remained active during symbiosis, for example the genes encoding R proteins like RPM1, RPS2, RPS5 and Pti1 kinase, and pathway regulators like SGT1, HSP90 and EDS1. Intriguingly, there was a significant upregulation of genes encoding PR-1 proteins, which are members of the Cysteine-rich secretory proteins, Antigen 5, and Pathogenesis-related 1 proteins (CAP) superfamily (Breen et al., 2017). The PR-1 proteins upregulated during symbiosis clustered away from the PR-1 proteins that were reported to be upregulated in defense responses, indicating that the symbiosis-associated PR-1 proteins are divergent in nature (Fig. 3C). Two PR-1 proteins clustered with the defense-responsive PRs, and these PR-1 genes were not upregulated during symbiosis, further confirming that the symbiotic PR-1s are distinct. PR-1 proteins harbour an embedded defence signalling peptide (CAP-derived peptide or CAPE), where CNYxPxGNxxxxxPY is considered a functional motif that marks the cleavage site of these bioactive peptides (Breen et al., 2017). The cleavage site is conserved in both classes of PR-1 proteins, suggesting that CAPE peptides could also be generated from the divergent PR-1 proteins synthesized during symbiosis (Fig. 3C-D, Supplementary Table 7). Since genes encoding CAP proteins are marker genes for the salicylic acid (SA) signaling pathway and systemic acquired resistance, we also checked the SA/JA pathways to further understand the symbiont-responsive signaling in A. hypogaea. As shown in Fig. 3A, the JA pathway was completely downregulated but SA-responsive genes like TGA1 and NPR1 were upregulated. Thus symbiotic PR-1 gene expression could be explained by the activation of SA-mediated signaling. It needs to be mentioned that our analysis could not locate, among the DEGs, genes encoding NODULE SPECIFIC CYSTEINE-RICH (NCR) peptides, which occur in legumes belonging to the inverted repeat-lacking clade (IRLC) (e.g. M. truncatula, Pisum sp., and Trifolium sp.) and were recently demonstrated in Aeschynomene sp. as well (Van de Velde et al., 2010; Czernic et al., 2015).
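The CAPE motif CNYxPxGNxxxxxPY mentioned above, where x stands for any residue, can be located in a protein sequence with a regular expression. The sketch below is illustrative only: the function name and the test sequence are hypothetical and do not correspond to any A. hypogaea PR-1 protein.

```python
import re

# CNYxPxGNxxxxxPY, with x = any single residue.
CAPE_MOTIF = re.compile(r"CNY.P.GN.{5}PY")


def find_cape_motifs(protein_sequence):
    """Return (0-based start, matched motif) for each non-overlapping occurrence."""
    return [(m.start(), m.group()) for m in CAPE_MOTIF.finditer(protein_sequence.upper())]


# Hypothetical sequence used only to exercise the pattern.
hypothetical_pr1 = "MGFLTTLCNYAPSGNFVGQKPY"
print(find_cape_motifs(hypothetical_pr1))
```

Applied to a set of PR-1 sequences, a search of this kind would flag which proteins retain the motif in both the defense-associated and the symbiosis-associated classes.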
Several genes were reported to be expressed in nodulating roots in a comparison of the transcriptome profiles of nonnodulating and nodulating lines of A. hypogaea (Peng et al., 2017). The list includes known symbiotic genes like NIN, NF-YA, Myb and CLE13 and other genes encoding a receptor kinase, a soluble kinase, an F-box protein, transcription factors of the SHI family and a lectin (Supplementary fig. 3; Supplementary Table 4). All these genes were represented in our upregulated transcriptome, which revalidates the importance of their expression during the onset of symbiosis in A. hypogaea.
Expression profiles of putative orthologues of symbiotic genes
Our final objective was to understand the expression in A. hypogaea of the putative orthologues of symbiotic genes that are characterized in the model legumes M. truncatula and L. japonicus. A total of 71 genes were chosen and classified on the basis of their primary association (Fig. 4; Supplementary Table 5). A BLAST search on the A. ipaensis and A. duranensis genomes identified 68 (63 annotated) out of 71 genes, for which the putative orthology was checked by reciprocal BLAST and sequence alignment. No orthologous gene for MtRPG, MtSymCRK and MtDNF2 could be detected in either of these two parental genomes or in our transcriptome. For 63 out of 68 genes the symbiotic orthologue could be identified, where the A. hypogaea sequences clustered with other legumes in the corresponding gene trees (Supplementary fig. 4). But in most cases the A. hypogaea genes were placed at the point of divergence of legumes from nonlegumes, which is similar to what has been reported for AhSYMRK, AhCCaMK and AhHK1 in the respective distance trees (Sinharoy and DasGupta, 2009; Saha et al., 2014; Kundu and DasGupta, 2017b). The separation between the A. hypogaea genes and the other legume genes correlates with rhizobial colonization by crack-entry versus ITs. Exceptions were genes like MtFLOT4, LjROP6, MtRR9, MtNOOT and LjSEN1, where the protein sequences from A. hypogaea were divergent and clustered with nonlegumes. However, the expression of all these divergent genes was found to be significantly high during symbiosis, indicating that they might have a role in A. hypogaea nodulation. In several cases we noted genomic bias in expression; for example, genes like LjNF-YC, MtFLOT2, MtFLOT4, LjROP6, MtbHLH476 and MtRR4 had AA-biased expression whereas expression of MtDELLA, MtERF1, LjCHC1 and LjASTRAY was BB-biased (Fig. 4). For comparison of expression of the different symbiotic genes we used the microarray data derived from the M. truncatula gene expression atlas (MtGEA) (http://mtgea.noble.org/v2/) (Benedito et al., 2008) and the L. japonicus gene expression atlas (LjGEA) (https://ljgea.noble.org/v2/) (Verdier et al., 2013). If a symbiotic gene was characterised in one of these model legumes, reciprocal BLAST was done to identify the orthologue in the other (Supplementary Table 5). Both absolute and relative expression values (log2 fold) were analysed so that highly and constitutively expressed genes are not ignored (Fig. 4; Supplementary fig. 5).
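The reciprocal BLAST check used for putative orthology can be illustrated with a reciprocal best hit test. This is a minimal sketch that assumes BLAST results have already been parsed into lists of (subject, bit score) pairs; the gene identifiers and scores are hypothetical, and the study additionally relied on sequence alignment and gene trees, which this sketch does not cover.

```python
def best_hit(hits):
    """Return the subject with the highest bit score, or None if there are no hits."""
    return max(hits, key=lambda hit: hit[1])[0] if hits else None


def reciprocal_best_hits(forward, reverse):
    """Keep query/subject pairs that are each other's best hit in both directions.

    forward: {model legume gene: [(Arachis gene, bit score), ...]}
    reverse: {Arachis gene: [(model legume gene, bit score), ...]}
    """
    pairs = {}
    for query, hits in forward.items():
        subject = best_hit(hits)
        if subject is not None and best_hit(reverse.get(subject, [])) == query:
            pairs[query] = subject
    return pairs


# Hypothetical parsed BLAST output, for illustration only.
forward_hits = {
    "MtNIN": [("Ah_gene_1", 850.0), ("Ah_gene_2", 400.0)],
    "MtVPY": [("Ah_gene_3", 620.0)],
}
reverse_hits = {
    "Ah_gene_1": [("MtNIN", 845.0)],
    "Ah_gene_3": [("MtOTHER", 700.0), ("MtVPY", 610.0)],
}
print(reciprocal_best_hits(forward_hits, reverse_hits))
```

In this toy input only the first pair is retained, because the best reverse hit of the second candidate is a different gene.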
In the recognition module, expression of genes encoding the LCO-binding LYR3 (Fliegmann et al., 2013) and the EPS-binding EPR3 (Kawaharada et al., 2015) was significantly higher in A. hypogaea than that of the classical NF receptors (Fig. 4A). In the model legumes, by contrast, the classical NF receptors LjNFR1/MtLYK3 and LjNFR5/MtNFP have higher expression than these receptors. In the SYM pathway and early signaling, most members had constitutive expression in all the 3 legumes irrespective of their mode of bacterial colonisation (Fig. 4B). The exception was the gene encoding the orthologue of the cyclic nucleotide-gated channel MtCNGC (Charpentier et al., 2016), which was significantly upregulated in A. hypogaea. Most of the interactors of NFRs and SYMRK were also constitutively expressed (Fig. 4C). Genes encoding the ubiquitin ligase SIE3 (Yuan et al., 2012) (SYMRK interactor) and a UBQ superfamily protein CIP73 (Kang et al., 2011) (CCaMK interactor) were constitutively expressed in A. hypogaea. Genes encoding the E3 ubiquitin ligase MtPUB1 (Mbengue et al., 2010) (NFR1 interactor) and a symbiotic remorin MtSYMREM (Lefebvre et al., 2010) (SYMRK and NFR1 interactor) were upregulated in all 3 legumes, highlighting their importance in nodulation. Among the TFs, upregulation of NIN (Singh et al., 2014) (target of CYCLOPS) was noted in the second transcriptional wave at 8dpi, and expression of NSP2 was only detectable in mature nodules of A. hypogaea (Fig. 4D). In model legumes, expression of NIN and NSP2 is upregulated on bacterial invasion (Schauser et al., 1999b; Kaló et al., 2005). Unlike in model legumes, expression of MtERF1 as compared to other TFs was very high in A. hypogaea.
The expression pattern of MtNSP1, LjNFYA/LjNFYC (McDowell et al., 2013) (target of NIN), LjERN1 (Cerri et al., 2017) (target of CYCLOPS), MtDELLA (Jin et al., 2016) (bridge between IPD3/CYCLOPS and NSP2), LjIPN2 (Kang et al., 2014) (NSP2 interactor) and LjSIN1 (Battaglia et al., 2014) (NF-YC interactor) was similar in A. hypogaea and in model legumes, suggesting that the basic transcriptional network could be conserved between these legumes. In the infection module, MtRPG is responsible for rhizobium-directed polar growth of ITs, and the gene encoding its orthologue was not detected in A. hypogaea (Arrighi et al., 2008). MtVPY (ankyrin repeat) is important for infection progression in model legumes (Murray et al., 2011a) and it had significantly low expression in A. hypogaea. On the other hand, expression of factors like LjARPC1 (Hossain et al., 2012) and LjCERBERUS (Yano et al., 2009) that are important for the progress of infection was significantly higher in A. hypogaea (Fig. 4E). Other indicated factors that are required for bacterial invasion, like LjNAP1 (Yokota et al., 2009), LjPIR1 (Yokota et al., 2009), LjnsRING (Shimomura et al., 2006), MtFLOT4 (Haney and Long, 2010) and MtFLOT2 (Haney and Long, 2010), were induced in all 3 legumes, suggesting analogous roles. In the nodule organogenesis module, expression of the cytokinin receptor MtCRE1/LjHK1/AhHK1 (Gonzalez-Rizzo et al., 2006b; Murray et al., 2007; Kundu and DasGupta, 2017a) and the cytokinin-inducible Type-A RRs like MtRR1 (Ariel et al., 2012), MtRR4 (Gonzalez-Rizzo et al., 2006b; Op den Camp et al., 2011), LjRR5 (Murray et al., 2007) and MtRR9 (Op den Camp et al., 2011) was high in all 3 legumes (Fig. 4F). The expression of the cytokinin-inducible TF MtbHLH476 (Ariel et al., 2012) was however significantly high in A. hypogaea (Fig. 4F).
All other factors that are required for the establishment and maintenance of nodule meristems, like MtNIP/LATD (Yendrek et al., 2010), MtWOX5 (Osipova et al., 2012), MtNOOT (Couzigou et al., 2012) and MtENOD40 (Crespi et al., 1994), have comparable expression patterns between A. hypogaea and model legumes (Fig. 4F). In the differentiation module, MtDNF2 and MtSymCRK, both reported to suppress defense responses during nodulation, were not detected in the A. hypogaea genome (Bourcy et al., 2013; Berrabah et al., 2014). All other factors that are required for bacteroid differentiation, like LjSUNERGOS1 (Yoon et al., 2014), LjVAG1 (Suzaki et al., 2014), MtDNF1 (Wang et al., 2010), MtRSD (Sinharoy et al., 2013), LjSEN1 (Hakoyama et al., 2012), LjSST1 (Krusell, 2005) and LjFEN1 (Hakoyama et al., 2009), have similar expression in all 3 legumes and may have conserved function (Fig. 4G). Among the nodule number regulators, expression of MtEIN2 (Varma Penmetsa et al., 2008) (sickle) was distinct in A. hypogaea (Fig. 4H). It plays a key role in a range of plant–microbe interactions and had significantly high expression in A. hypogaea. All other regulators like MtSUNN (Elise et al., 2005), LjKLAVIER (Miyazawa et al., 2010), MtEFD (Vernie et al., 2008a), LjASTRAY (Nishimura et al., 2002) and MtRDN (Kassaw et al., 2017) have comparable expression patterns in all 3 legumes (Fig. 4H). Quantitative reverse transcription-polymerase chain reaction (qRT-PCR) was done for 11 symbiotic genes to verify the reliability of the RNA-seq data (Supplementary Fig. 6). For a few time points the fold changes from qRT-PCR and DEG analysis did not match exactly, but the results were mostly consistent.
PCA of symbiotic gene expression in Arachis hypogaea and model legumes
PCA was done to check whether there was a signature in the pattern of expression of symbiotic genes in the crack-entry legume A. hypogaea that contrasts with the model legumes where rhizobial entry is IT mediated. Fig. 5 is a projection of differential expression of symbiotic genes from A. hypogaea, M. truncatula and L. japonicus onto the first two principal components. Altogether, expression of around 87% of the genes was found to be aligned along dimension 1 (dim1) and dimension 2 (dim2) in the analysis. Symbiotic genes that show minimal change in expression, and are likely to be regulated at the post-transcriptional level, are clustered near the origin. Only for select genes were there contrasting trends in differential expression between A. hypogaea and both the model legumes together, placing them in opposing or adjacent quadrants. These contrasts were interpreted as significant for A. hypogaea symbiosis (Supplementary fig. 7). For example, among the early signaling and SYM pathway genes, AhCNGC was distinctly placed away from its counterparts in model legumes. Among early TFs, NIN, NSP2, and SIN1 were distinct. In the infection module, AhVPY and AhCERBERUS were distinct and placed in opposing quadrants with respect to model legumes. Among the interacting proteins, PUB1 scores in both dimensions in model legumes whereas in A. hypogaea it clusters near the origin. In the organogenesis module the cytokinin-inducible RR1 and ENOD40, and in the nodule differentiation module SST1, have a contrasting trend in expression. Among the nodule number regulators, expression of EFD was distinct. These factors highlighted by PCA appear to be differentially adapted in A. hypogaea symbiosis.
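The quadrant comparison described above can be made explicit with a small helper. This is a sketch under stated assumptions: it takes the first two principal component scores as already computed, the tolerance that defines "near the origin" is a hypothetical parameter, and the coordinates in the example are invented for illustration.

```python
def quadrant(pc1, pc2, tol=0.0):
    """Return 1-4 for the quadrant of (pc1, pc2), or 0 if within tol of the origin."""
    if abs(pc1) <= tol and abs(pc2) <= tol:
        return 0
    if pc1 >= 0 and pc2 >= 0:
        return 1
    if pc1 < 0 and pc2 >= 0:
        return 2
    if pc1 < 0 and pc2 < 0:
        return 3
    return 4


def quadrant_relation(q_a, q_b):
    """Describe how two quadrants relate: same, adjacent, opposing, or near origin."""
    if q_a == 0 or q_b == 0:
        return "near origin"
    difference = abs(q_a - q_b)
    if difference == 0:
        return "same"
    return "opposing" if difference == 2 else "adjacent"


# Hypothetical scores for one gene in A. hypogaea and in a model legume.
arachis_q = quadrant(1.2, 0.8, tol=0.1)
model_q = quadrant(-0.9, -1.1, tol=0.1)
print(quadrant_relation(arachis_q, model_q))
```

A gene would be flagged as contrasting only when its A. hypogaea placement differs in the same way from both model legumes, which requires running the comparison against each model legume separately.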
DISCUSSION
This is the first systematic effort towards transcriptome profiling with the progress of symbiosis in the crack-entry legume A. hypogaea. Three major transcriptional programs appear to govern the process. The first program is for rhizobial recognition and generation of nodule primordia by 1-4dpi, the second program is for structural development of nodules by 8-12dpi and the third program is for functional maturation of nodules at 21dpi (Fig. 1-2). The comparison of expression of putative orthologues of symbiotically important genes in A. hypogaea with model legumes highlighted the genes that are important or disposable for its crack-entry mediated root nodule symbiosis (Fig. 4-6).
The most significant observation in the A. hypogaea symbiotic transcriptome was the overexpression of a group of genes encoding a divergent form of cysteine-rich PR-1 proteins during the structural and functional development of nodules from 8dpi onwards (Fig. 3). PR-1 proteins are ubiquitous across plant species and are among the most abundantly produced proteins in plants in response to pathogen attack. They are used as markers for salicylic acid-mediated disease resistance in plants (Breen et al., 2017). Although differential expression of defense response genes belonging to GO:0006952 (defense related) and the PR-1/PR-10 protein families has previously been reported for M. truncatula RNS (Jardinaud et al., 2016), this is the first report where a divergent group of PR-1 proteins is shown to be associated with nodule development (Fig. 3). PR-1 proteins harbor a caveolin binding motif (CBM) that binds sterol and an embedded Pro-rich C-terminal peptide (CAPE) that is involved in plant immune signaling (Breen et al., 2017). All the symbiotic PR-1s in A. hypogaea have both these conserved features, but whether CAPE peptides are actually derived from PR-1 proteins during symbiosis remains to be understood. It is relevant to mention here that the NCR family of peptides is very highly expressed during nodulation in M. truncatula (Van de Velde et al., 2010). These peptides evolved from defensin ancestors and until recently were assumed to be specific to legume species belonging to the IRLC clade, where they are responsible for bacterial endoreduplication (Mergaert et al., 2006). Recently, divergent forms of NCR peptides were reported to be essential for the shape change associated with bacterial endoreduplication in the Nod-factor independent crack-entry legume A. evenia (Czernic et al., 2015). Intriguingly, NCRs were absent in the crack-entry legume A. hypogaea where, similar to A. evenia, rhizobia change from a rod to a spherical shape, but unlike Aeschynomene sp. the symbiosis in A. hypogaea is NF-dependent (Chandler et al., 1982; Ibáñez and Fabra, 2011). Thus, it is imperative to investigate whether the antimicrobial CAPE peptides were recruited as symbiosis effectors in A. hypogaea in place of NCRs.
Based on the comparative expression analysis of symbiotic genes, we propose a simple molecular framework highlighting those genes in A. hypogaea that are either conserved or divergent from the model legumes, be it in sequence or in expression pattern (Fig. 6). The high expression of the LCO-binding receptor LYR3 as compared to the classical NF receptors indicated that NF signalling could be mediated through this receptor in A. hypogaea (Fig. 4). Intriguingly, NFR1 and LYR3 were not reported in A. evenia, which is an NF-independent crack-entry legume as opposed to A. hypogaea, which is NF-dependent (Ibáñez and Fabra, 2011; Fabre et al., 2015; Chaintreuil et al., 2016). The expression pattern of genes belonging to the SYM pathway and early signalling in A. hypogaea was found to be similar to model legumes, the only exception being AhCNGC. Significant upregulation of AhCNGC suggests its possible importance in mediating symbiotic calcium oscillations in the SYM pathway of crack-entry legumes.
Several observations indicated a change in the expression pattern of symbiotic genes in A. hypogaea in the absence of epidermal IT formation. For example, TFs like AhNIN and AhNSP2 are only expressed at the later stages of symbiosis in A. hypogaea, indicating that unlike in model legumes these TFs may not have an early role in bacterial entry (Fig. 4). However, the cortical roles for these TFs could be conserved between IT and crack-entry legumes. Factors like MtVPY and MtRPG have a role in the polar growth process of ITs in model legumes (Arrighi et al., 2008; Murray et al., 2011b). That is consistent with the absence of RPG and the insignificant expression of VPY in the A. hypogaea transcriptome (Fig. 4). The contrasting expression pattern of LjCERBERUS in A. hypogaea indicated its divergent function during rhizobial invasion through epidermal cracks. In NF-dependent symbiosis, membrane raft proteins like MtFLOT2 and MtFLOT4 are important for IT initiation and elongation (Haney and Long, 2009). While orthologues of both these FLOTs are absent in the NF-independent crack-entry legume A. evenia (Chaintreuil et al., 2016), substantial expression was detected in the A. hypogaea transcriptome, suggesting that they are recruited for other purposes (Fig. 3). The orthologue of the EPS-binding receptor LjEPR3 is absent in A. evenia but upregulated during symbiosis in A. hypogaea, suggesting that it has functions other than regulating IT progression.
Apart from NCRs, there are other features that contrast the process of differentiation in A. hypogaea and A. evenia. For example, neither the topoisomerases LjSUNERGOS and LjVAG1 nor the homocitrate synthase LjFEN1 is detectable in A. evenia (Chaintreuil et al., 2016). On the other hand, MtDNF2, a phospholipase C, and MtSymCRK, a non-RD receptor kinase, which are required for suppressing defense responses during bacteroid differentiation, are absent in both the legumes belonging to the dalbergioid clade (Fig. 6), thus indicating that these genes are disposable for crack-entry mediated root nodule symbiosis.
Comparative analysis of DEGs between A. hypogaea and model legumes highlighted the predominance of cytokinin and ethylene signaling during A. hypogaea nodulation. The two-component cytokinin receptor HK1 has a central role in nodule organogenesis of both A. hypogaea and model legumes (Gonzalez-Rizzo et al., 2006a; Murray et al., 2007; Kundu and DasGupta, 2017b). Although PCA indicated that AhHK1, LjHK1 (LHK1) and MtCRE1 have similar expression patterns, their downstream effectors showed an altered pattern of expression during A. hypogaea symbiosis (Fig. 5). The type-B RR MtRR1, which is a cytokinin-responsive transcription factor responsible for modulating downstream factors like MtNSP1 and MtbHLH476, was found to have a distinct expression pattern in A. hypogaea in comparison to its model legume counterparts (Fig. 5). In accordance, AhbHLH476 was found to be very highly expressed during A. hypogaea nodulation (Fig. 4). Another cytokinin-responsive factor, AhENOD40, was also found to be distinctly placed in a different quadrant in PCA (Fig. 5). The distinct role of cytokinin signaling during A. hypogaea nodulation is in accordance with the previous report where silencing of AhHK1 resulted in delayed nodulation associated with defects in nodule differentiation (Kundu and DasGupta, 2017b).
During nodulation, ethylene-responsive transcription factors play a decisive role by controlling cell division and differentiation (Asamizu et al., 2008; Vernie et al., 2008b). A previous report on A. hypogaea transcriptomics highlighted the upregulation of several AP2-domain containing ethylene-responsive TFs during nodulation (Peng et al., 2017). Similarly, our transcriptomic analysis also indicated significantly high expression of the symbiotic orthologue of ERF1 (Fig. 4). In L. japonicus, LjERF1 is a positive regulator of nodulation and downregulates the expression of the defense gene LjPR-10 during symbiosis (Asamizu et al., 2008). Intriguingly, in A. hypogaea high expression of ERF1 is associated with high expression of PR-1s, indicating that the ethylene signalling network is differently recruited during A. hypogaea symbiosis (Fig. 5). Consistent with this proposition, the expression of EIN2, the master regulator of ethylene signalling, was significantly high in A. hypogaea, and the pattern of expression of EFD, a negative regulator of nodulation, was distinctly different from model legumes (Fig. 4-5). The differential role of ethylene signalling during crack-entry nodulation supports the proposition that ethylene signalling inhibits intracellular infection via infection threads while promoting intercellular infection via crack-entry (Vernie et al., 2008b).
In summary, the transcriptional dynamics with the progress of symbiosis in A. hypogaea highlighted the factors that are disposable or essential for the inception and progress of symbiosis in a crack entry legume.
MATERIALS AND METHODS
Plant Materials and Sample Preparation
Total infected roots and nodules from five developmental stages of A. hypogaea, together with uninfected roots, were used in this study (UI, 1DPI, 4DPI, 8DPI, 12DPI and 21DPI). A. hypogaea JL24 seeds (from ICRISAT, India) were surface sterilized and soaked in sterile water for germination. Germinated seeds were then transferred to pots containing sterile vermiculite and soilrite and kept in a growth room at 25°C for 7 days before inoculation with Bradyrhizobium sp. SEMIA 6144 (from Adriana Fabra, Universidad Nacional de Rio Cuarto, Cordoba, Argentina) grown in liquid Yeast-Mannitol broth supplemented with 100mM CaCl2 at 28°C (A600 = 0.5–0.7). Samples were harvested, cleaned and frozen in liquid nitrogen. Frozen samples were stored at −80°C until RNA isolation.
Phenotypic analysis and microscopy
Images of whole-mount nodulated roots were captured using a Leica stereo fluorescence microscope M205FA equipped with a Leica DFC310FX digital camera (Leica Microsystems). Detached nodules were embedded in Shandon cryomatrix (Thermo Scientific) and sliced into 30-µm-thick sections with a rotary cryomicrotome CM1850 (Leica Microsystems). For confocal microscopy, sample preparation was done according to Haynes et al. (2004). Sections were stained with Calcofluor (Life Technologies), Propidium Iodide (Life Technologies) and Syto9 (Life Technologies). Images were acquired with a Leica TCS SP5 II AOBS confocal laser scanning microscope (Leica Microsystems). For confocal and scanning electron microscopy, sample preparation was done according to Kundu and DasGupta (2017b). All digital micrographs were processed using Adobe Photoshop CS5.
Isolation of total RNA
A total of 100 mg of frozen plant root was ground in liquid nitrogen, and total RNA was isolated using Trizol reagent (Invitrogen, USA). RNA degradation and contamination were assessed on 1% agarose gels. RNA concentration was then measured using a NanoDrop spectrophotometer (Thermo Scientific). Additionally, RNA integrity was assessed using the Bioanalyzer 2100 system (Agilent Technologies, Santa Clara, CA, USA). Only samples with RNA integrity number (RIN) values above 8 were used for library construction.
Library construction and Sequencing
Eighteen RNA libraries were prepared using an Illumina TruSeq stranded mRNA sample preparation kit by the MGX-Montpellier GenomiX core facility (MGX), France (https://www.mgx.cnrs.fr/). The protocol first requires the selection of polyadenylated RNAs on oligo-dT magnetic beads. Selected RNAs are chemically fragmented and the first-strand cDNA is synthesized in the presence of actinomycin D. Second-strand cDNA synthesis incorporates dUTP in place of dTTP, which quenches the second strand during amplification. Adenylation of the 3' ends is used to prevent fragments from ligating to one another during the adapter ligation process. Quantitative and qualitative validation of the libraries was performed by qPCR on a Roche LightCycler 480, and cluster generation and primary hybridization were performed in the cBot with an Illumina cluster generation kit. The sample libraries were sequenced on an Illumina HiSeq 2000 using the sequencing-by-synthesis (SBS) technique at MGX, France, and 50bp single-end reads were generated for each library (Fuller, 1995).
Illumina Reads Mapping and Assembly
Quality control and assessment of raw Illumina reads in FASTQ format were done with FastQC (version 0.11.5) to obtain per-base quality, GC content and sequence length distribution. Clean reads were obtained by removing low-quality reads, adapters and poly-N-containing reads using Trimmomatic v0.36 (Bolger et al., 2014). Clean reads were simultaneously aligned to the reference genomes of the two wild diploid ancestors of peanut, A. duranensis (AA) and A. ipaensis (BB), using TopHat2 version 2.0.13, a fast splice junction mapper for RNA-Seq reads (Trapnell et al., 2010; Bertioli et al., 2015). TopHat2 aligns RNA-Seq reads using the ultra-high-throughput short-read aligner Bowtie2 (version 2.2.3) and then analyzes the mapping results to identify splice junctions between exons (Langmead et al., 2009). The alignment files were combined and passed to Trinity for genome-guided assembly (Grabherr et al., 2011). The reference-based assembly was compared with the corresponding transcript files from the annotated reference genomes using BLAT (Kent, 2002). An e-value cutoff of ‘1e−05’ was used to determine a hit. The annotated hits were analysed further in this study. Genome annotation files in generic feature format (GFF) were downloaded from PeanutBase (https://peanutbase.org/download) (Dash et al., 2016). The expression level of each annotated transcript was estimated with StringTie v1.3.3, which takes a sorted sequence alignment map (SAM) or binary (BAM) file for each sample along with the genome annotation files (Pertea et al., 2015). The resulting gene transfer format (GTF) files, normalized gene-locus expression levels as fragments per kilobase of transcript per million mapped reads (FPKM) and transcripts per million (TPM), and count files for each sample were further analyzed for fold change in gene expression levels.
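For readers unfamiliar with the two normalized units reported above, the following minimal Python sketch illustrates the textbook definitions of FPKM and TPM from per-gene fragment counts and transcript lengths. It is not the StringTie implementation, which estimates abundance from read coverage of assembled transcripts; the function names and the toy numbers are illustrative assumptions only.

```python
def fpkm(counts, lengths_bp):
    """Fragments per kilobase of transcript per million mapped fragments.

    counts     : fragments assigned to each gene (hypothetical input)
    lengths_bp : transcript length of each gene in base pairs
    """
    total = sum(counts)  # library size in fragments
    return [c * 1e9 / (length * total) for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    rate_sum = sum(rates)
    return [r * 1e6 / rate_sum for r in rates]


if __name__ == "__main__":
    # Toy example (assumed values): three genes with equal coverage per base.
    toy_counts = [10, 20, 30]
    toy_lengths = [1000, 2000, 3000]
    print(fpkm(toy_counts, toy_lengths))
    print(tpm(toy_counts, toy_lengths))
```

Because TPM values always sum to one million within a sample, they are often easier to compare across libraries than FPKM values, whose sum depends on the length distribution of the expressed transcripts.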
Identification of DEGs and functional Gene Ontology and KEGG pathway analyses of the DEGs
Before statistical analysis, genes with fewer than 2 values lower than one count per million (cpm) were filtered out. The edgeR package (version 3.6.7) was used to identify the differentially expressed genes (Robinson et al., 2010). Data were normalized using the "Trimmed mean of M-values (TMM)" method. Genes with an adjusted p-value less than 5% (according to the FDR method using Benjamini-Hochberg correction) and |log2 (fold change)| > 1 were called differentially expressed. Venn diagrams were generated using InteractiVenn (http://www.interactivenn.net/) (Heberle et al., 2015), and a hierarchical heatmap was generated from the values in the Venn diagram using TM4 MeV (http://mev.tm4.org and http://www.tigr.org/software/tm4/mev.html) (Howe et al., 2011) (Supplementary Table 2). Detailed functional annotation and explanations of DEGs were extracted from the Gene Ontology database (http://www.geneontology.org/) (Ashburner et al., 2000), and GO functional classification analysis was done using WEGO (http://wego.genomics.org.cn/cgi-bin/wego/index.pl) (Ye et al., 2006). The GO terms for DEGs in the genome annotation were also retrieved from the ‘GFF’ file downloaded from the PeanutBase website (http://peanutbase.org). To identify important and enriched pathways involving the DEGs, the DEGs were assigned to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways using the KAAS web server (http://www.genome.jp/kaas-bin/kaas_main) (Kanehisa and Goto, 2000) against the A. duranensis and A. ipaensis gene datasets. Enriched KO and GO terms were obtained with a custom Python script that uses a hypergeometric test with a Bonferroni-corrected P-value < 0.05.
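The enrichment script itself is not reproduced here. As a rough illustration of the approach described, the sketch below applies a one-sided hypergeometric test to each term and multiplies the resulting P-value by the number of terms tested (Bonferroni correction). The function names, the input layout (a dictionary from term to gene set) and the choice of background are assumptions made for the example, not details taken from the original script.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K carry the term."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def enriched_terms(term_to_genes, degs, background, alpha=0.05):
    """Return {term: Bonferroni-adjusted P} for terms with adjusted P < alpha.

    term_to_genes : dict mapping a GO/KO term to the set of annotated genes
    degs          : set of differentially expressed genes
    background    : set of all genes considered (assumed: all annotated genes)
    """
    degs = degs & background
    N, n = len(background), len(degs)
    n_tests = len(term_to_genes)  # Bonferroni factor
    hits = {}
    for term, genes in term_to_genes.items():
        genes = genes & background
        K = len(genes)          # genes carrying the term in the background
        k = len(genes & degs)   # DEGs carrying the term
        p_adj = min(1.0, hypergeom_upper_tail(k, N, K, n) * n_tests)
        if p_adj < alpha:
            hits[term] = p_adj
    return hits
```

Bonferroni correction is conservative when many overlapping terms are tested, so a term that fails this threshold is not necessarily unaffected.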
Identification of symbiotic orthologous genes in A. hypogaea
Candidate symbiotic genes were identified in A. hypogaea, L. japonicus and M. truncatula using BLASTN searches with reported nucleotide sequences of genes from L. japonicus and M. truncatula. Homologous genes were searched in A. duranensis and A. ipaensis in PeanutBase (http://peanutbase.org); the M. truncatula Mt4.0v1 genome was searched in the M. truncatula gene expression atlas (MtGEA) (http://mtgea.noble.org/v2/), and the L. japonicus v3.0 genome was searched in the L. japonicus gene expression atlas (LjGEA) (https://ljgea.noble.org/v2/). Initial searches were conducted with E-value = e−5. The results were manually validated for the presence of an orthologous gene in an open reading frame and searched for orthologues using BLASTP. Orthology of the genes was validated by generating a neighbor-joining phylogenetic tree in MEGA 6.0 using the amino acid sequences obtained from BLASTP (Tamura et al., 2013).
qRT-PCR validation
Total RNA (500 ng) was reverse-transcribed using SuperScript III RT (Life Technologies) and oligo (dT). RNA quantity from each sample in each biological replicate was standardized prior to first-strand cDNA synthesis. qRT-PCR was performed using Power SYBR Green PCR Master Mix (Applied Biosystems) with primers designed using OligoAnalyzer 3.1 (Integrated DNA Technologies; https://www.idtdna.com/calc/analyzer) (Supplementary Table S6). Calculations were done using the ΔΔ cycle threshold method with AhActin as the endogenous control. The reactions were run on an Applied Biosystems 7500 Fast HT platform using the following protocol: 1 cycle at 50°C for 2 min, 1 cycle at 95°C for 5 min, and 40 cycles at 95°C for 30 sec, 54°C for 30 sec and 72°C for 30 sec, followed by melt curve analysis with 1 cycle at 95°C for 1 min, 55°C for 30 sec and 95°C for 30 sec. A negative control without cDNA template was checked for each primer combination. Results were expressed as means ± standard error (SE) of the number of experiments.
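As an illustration of the ΔΔ cycle threshold calculation, the short Python sketch below computes relative expression from Ct values and summarizes replicates as mean and standard error. It assumes a primer efficiency of 100% (a doubling per cycle), which is the usual simplification of this method; the function names and Ct values are hypothetical and are not data from this study.

```python
from math import sqrt
from statistics import mean, stdev


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change by the delta-delta-Ct method (assumes 100% efficiency).

    The reference gene plays the role of the endogenous control (e.g. actin);
    the control condition would be, for example, uninfected roots.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_sample - delta_control)


def mean_and_se(values):
    """Mean and standard error of the mean for replicate fold changes."""
    return mean(values), stdev(values) / sqrt(len(values))


if __name__ == "__main__":
    # Hypothetical Ct values for three biological replicates.
    folds = [
        relative_expression(22.0, 18.0, 25.0, 18.0),
        relative_expression(22.4, 18.1, 25.2, 18.0),
        relative_expression(21.8, 17.9, 24.9, 18.1),
    ]
    print(mean_and_se(folds))
```

When primer efficiencies differ substantially from 100%, an efficiency-corrected model would be needed instead of the fixed base of 2 used here.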
Data Availability
The raw FASTQ files for the 18 libraries were deposited in the Gene Expression Omnibus (GEO) of NCBI under accession number GSE98997.
Authors’ Contributions
Project planning: A.K. and M.D.G.; Sample preparation: K.K. and A.K.; Microscopy of symbiosis: A.K.; Preparation of RNA: K.K.; Production of Illumina libraries, sequencing and transcriptome assembly: E.D. and D.S.; Analysis of transcriptome: K.K. and A.Z.; Analysis of symbiotic transcriptome: K.K. and A.K.; Critical analysis of data: P.C. and F.C.; Writing of the manuscript: A.K., K.K. and M.D.G. All authors approved the manuscript.
Acknowledgement
This work was funded by grants from the Govt. of India: IFCPAR/CEFIPRA (IFC/5103-4/2014/543); DBT-CEIB (Centre of Excellence and Innovation in Biotechnology, BT/01/CEIB/09/VI/10); DBT-IPLS (BT/PR14552/INF/22/123/2010); fellowship to K.K. and A.Z.R. (IFCPAR/CEFIPRA: IFC/5103-4/2014/543); fellowship to A.K. (Council of Scientific and Industrial Research, CSIR-09/028[0756]/2009–EMR–I).