Abstract
Nucleotide-binding site leucine-rich repeat resistance genes (NLRs) allow plants to detect microbial effectors. We hypothesized that NLR expression patterns would reflect organ-specific differences in effector challenge and tested this by carrying out a meta-analysis of expression data for 1,235 NLRs from 9 plant species. We found stable NLR root/shoot expression ratios within species, suggesting organ-specific hardwiring of NLR expression patterns in anticipation of distinct challenges. Most monocot and dicot plant species preferentially expressed NLRs in roots. In contrast, Brassicaceae species, including oilseed rape and the model plant Arabidopsis thaliana, were unique in showing NLR expression skewed towards the shoot across multiple phylogenetically distinct groups of NLRs. The Brassicaceae NLR expression shift coincides with loss of the endomycorrhization pathway, which enables intracellular root infection by symbionts. We propose that its loss offers two likely explanations for the unusual Brassicaceae NLR expression pattern: loss of NLR-guarded symbiotic components and elimination of constraints on general root defences associated with exempting symbionts from targeting. This hypothesis is consistent with the existence of Brassicaceae-specific receptors for conserved microbial molecules and suggests that Brassicaceae species are rich sources of unique antimicrobial root defences.
Introduction
The sessile nature of vascular plants has spurred development of mechanisms for coping with biotic and abiotic stresses and for optimizing uptake of inorganic compounds under low nutrient availability. In response to these challenges, plant roots and shoots have evolved specialized functions above and below ground, where they have also adapted to interact with the distinct microbial communities of the phyllo- or rhizosphere. These diverse plant-microbe interactions range from symbiosis over parasitism to pathogenic infection (Bulgarelli et al. 2013; Fatima et al. 2015; Vandenkoornhuyse et al. 2015).
Reflecting the different characteristics of plant roots and shoots, distinct host-microbe combinations have been used to unravel the molecular components required for trans-species interaction and communication. In plant shoots, the focus has almost exclusively been on pathogenic interactions, where work in the model plant Arabidopsis thaliana (Arabidopsis) from the Brassicaceae family has provided great insight into plant immunity (Jones and Dangl 2006; Nishimura et al. 2010). Passive defences, such as the waxy cuticle on epidermal cells, cell walls and preformed anti-microbial chemicals, form the first barriers for microbes and are often sufficient for deterring would-be pathogens (Thordal-Christensen 2003). Microbes that successfully evade these obstacles encounter a large repertoire of resistance (R) proteins in the form of trans-membrane receptor-like proteins and receptor-like kinases on the surface of plant cells, which recognize conserved microbe-associated molecular patterns (MAMPs). Upon activation, these pattern-recognition receptors (PRRs) trigger complex intracellular signalling cascades, such as phytohormone perturbations, accumulation of ions, mitogen-activated protein kinase activation and production of reactive oxygen species, ultimately leading to transcriptional and translational changes that promote the production of defence compounds (Pel et al. 2012; Muthamilarasan et al. 2013).
To escape this MAMP-triggered immunity (MTI), microbes have evolved effectors that are injected into the plant cell cytoplasm using specialized secretion systems that penetrate the plant cell membrane. Upon translocation, these effectors target components of the defence machinery, suppressing immune signalling and gene expression through degradation, allosteric or covalent modification of host molecules, thus adapting the local environment to be more suitable for microbial growth and improving the chances of successful tissue colonization (Jones and Dangl 2006; Xin et al. 2013; Le Fevre et al. 2015). In response, plant cells employ a family of intracellular R proteins that recognize effectors either by direct interaction, or indirectly through detection of modifications made to host proteins (Khan et al. 2015). Effector-triggered immunity (ETI) activation by an intracellular R protein leads to a stronger immune response than that of MTI and is often associated with localized cell death to limit the spread of biotrophic pathogens (Jones and Dangl 2006; Hofius et al. 2007).
The majority of intracellular R proteins share a similar structure with an amino-terminal signalling domain, followed by a highly conserved nucleotide binding domain (NBD) and a carboxy-terminal leucine-rich repeat (LRR) domain of variable length (van der Biezen et al. 1998; Takken et al. 2012). This class of R proteins is referred to as nucleotide-binding site leucine-rich repeat (NLR) proteins. The NBD domain class is shared by Apaf1, plant R proteins and CED4 (NB-ARC) and is highly conserved among all NLR proteins. It acts as a molecular switch, and cycles between active ATP-bound and inactive ADP-bound states depending on the activity of the LRR domain. The LRR domain is believed to be directly involved in protein-protein interactions with microbial effectors or host proteins and to function by auto-suppressing the NBD domain of the NLR (Jones and Jones 1997; Takken et al. 2006; Marquenet et al. 2007; Lukasik et al. 2009; Takken et al. 2012). The amino-terminal signalling domain is generally divided into two separate classes based on homology to either the signalling domain of Toll/Interleukin-1 Receptors (TIR) or the presence of a coiled-coil (CC) domain. These two distinct signalling components share common downstream signalling pathways; however, both classes have also been observed to activate separate downstream components (Aarts et al. 1998; Falk et al. 1999; Meyers et al. 1999; Pan et al. 2000; Takken et al. 2006; Hofius et al. 2009). While both CC and TIR type NLRs (CNLs and TNLs, respectively) are widely distributed in dicots, canonical TNLs appear to be absent in monocots (Meyers et al. 1999; Pan et al. 2000; Meyers et al. 2002; Tarr et al. 2009).
In addition, variations of the signalling domain-NBD-LRR (NLR) structure can be found in most plant species, with NBD-containing proteins lacking either the amino-terminal signalling domain or the carboxy-terminal LRR domain, or having juxtaposed non-canonical domains, extending their flexibility as signalling components or effector decoys for host proteins (Bonardi et al. 2012; Kroj et al. 2016).
Whilst many NLRs play important roles in Arabidopsis shoot immunity, little is known about how Arabidopsis roots mount immune responses against microbes, or what role NLRs play. However, the PRR FLAGELLIN-SENSITIVE2 is fully functional in roots and activates similar downstream MAP-kinase cascades in both root and shoot (Millet et al. 2010). There are reported differences between roots and shoots for the phytohormone salicylic acid, which is considered a requirement for basal defence in leaves against biotrophic pathogens, but does not appear to be as important in root immune responses (Jones and Dangl 2006; Millet et al. 2010).
Unlike the work on Arabidopsis pathogen responses, studies of root-microbe interactions have focused on endosymbiosis. Up to 90% of all terrestrial plants are believed to associate with arbuscular mycorrhizal (AM) fungi to enhance their acquisition of phosphorus and other nutrients. Plant associations with nitrogen-fixing bacteria contained within nodules are restricted to around 10 families, including the agriculturally important Fabaceae (legume) family (Doyle 1998; Gualtieri et al. 2000; Parniske 2008). Arabidopsis belongs to the Brassicaceae family, which is one of the few plant families that has lost the capacity for root endosymbiosis with mycorrhizal fungi that is ancestral to the Angiospermae (flowering plants) (Gualtieri et al. 2000; Smith et al. 2010; Delaux et al. 2014). Two model plants from the legume family, Lotus japonicus (Lotus) and Medicago truncatula (Medicago), have been extensively studied for unravelling the genetic pathways required for root nodulation through their symbiotic association with gram-negative soil bacteria collectively referred to as rhizobia (Barker et al. 1990; Handberg and Stougaard 1992). This work has led to the discovery of nodulation factor (NF), a key signal molecule secreted by rhizobia, and several host receptors that perceive and transduce the signal through regulatory components to modulate downstream transcriptional regulation and coordinate nodule organogenesis and infection of these by nitrogen-fixing rhizobia (Long 1989; Schauser et al. 1999; Limpens et al. 2003; E. B. Madsen et al. 2003; Radutoiu et al. 2003; Lévy et al. 2004; Kalo et al. 2005; Smit et al. 2005; Tirichine et al. 2006; Kouchi et al. 2010; Madsen et al. 2010). Similar to NF produced by rhizobia, AM fungi secrete Myc factors to activate symbiotic signalling in the host.
Despite their distinct phenotypic characteristics, AM and nodulation pathways share conserved genetic components, likely owing to their common evolutionary origin (Oldroyd and Downie 2006; Parniske 2008; Banba et al. 2008; Singh and Parniske 2012; Guillotin, Couzigou, and Combier 2016).
Despite the history of focusing on pathogenic plant-microbe interactions in plant shoots and on symbiotic interactions in roots, both organs are prone to pathogen infection and would presumably be protected by NLR proteins present in cells subject to effector challenge. Currently, little is known about the expression characteristics of NLRs and, unless they are ubiquitously expressed across all plant organs, NLR gene expression patterns could provide indications about differences in pathogen effector pressures between plant tissues and across plant species. Here we present a meta-analysis of NLR gene expression data, including plant species with and without the capacity for mycorrhizal and/or root nodule symbiosis. The analysis revealed stable root to shoot NLR gene expression ratios within species, with all of the endomycorrhizal plant species examined predominantly expressing NLRs in roots. In contrast, large differences were found between species, with the Brassicaceae family displaying an aberrant shoot-skewed expression, which suggested an unusual mode of plant-microbe interaction for this plant family.
Results
NLR gene expression varies between tissues in a species-specific manner
Individual plant organs have evolved to function in specific environments, where they interact with distinct microbiota (Vandenkoornhuyse et al. 2015). To investigate if NLR expression patterns reflected these tissue differences we identified all putative NLRs in Lotus and Arabidopsis, where expression atlas data was available for multiple tissues (Schmid et al. 2005; Høgslund et al. 2009; Verdier et al. 2013) (Supplemental table 1-2 and Supplemental file 1). We then examined the available expression data and identified genes predominantly expressed in reproductive, shoot, root or root nodule tissues. NLR expression in Lotus shoot and nodule tissues did not show significant differences compared to overall gene expression, but reproductive tissues showed strong depletion of NLR expression and Lotus roots displayed a significant enrichment of NLR expression (Figure 1A-B). For Arabidopsis, reproductive tissues also showed a significant depletion of expressed NLR genes, but Arabidopsis roots did not show enriched NLR gene expression. Instead, Arabidopsis shoots displayed a significant enrichment of NLR gene expression (Figure 1C-D).
To investigate if the contrasting root/shoot NLR gene expression ratios were general for the two species, we examined additional data sets. For Arabidopsis, we quantified NLR root/shoot expression ratios based on two recent RNA-seq experiments including both root and shoot samples in the same experimental series (van Veen et al. 2016; Liu et al. 2016). Both RNA-seq data sets showed a clear shoot skew for Arabidopsis NLRs relative to the average expression ratio for all genes (Figure 1E), and the NLR expression ratios were strongly correlated across array and RNA-seq experiments (Figure 1F). Since no equivalent data sets were available for Lotus, we carried out an RNA-seq experiment including mock and rhizobium inoculated root and shoot samples. For Lotus, the RNA-seq data was also consistent with the array data in showing a pronounced root skewed NLR expression (Figure 1G-H). Since bacterial inoculation could potentially influence NLR root/shoot expression ratios, we compared Lotus inoculated and uninoculated samples, but found no significant differences in the root/shoot NLR expression ratios for either the array or the RNA-seq experiment (Supplemental figure 1). NLR root/shoot expression ratios thus showed clear differences between Lotus and Arabidopsis, and these differences were consistent across independent experiments carried out using either array or RNA-seq methodology for transcript quantification, indicating that regulation of NLR gene expression varied between organs in a species-specific manner.
The Brassicaceae family shows aberrant shoot-skewed NLR gene expression
To determine which of these contrasting patterns of NLR gene expression was predominant among flowering plants, we analysed additional species for which root and shoot tissues had been subjected to global expression profiling in the same experiment. These included three legume species (Medicago, Glycine max, Lupinus albus), two Brassicaceae family members (Brassica rapa ssp. pekinensis, Brassica napus) and two monocots (Zea mays, Oryza sativa) (Figure 1I and Supplemental figure 2). We calculated root/shoot expression ratios for whole transcriptomes, including only samples where root and shoot tissues had been analysed in the same experimental series (Supplemental tables 1-2). We identified a total of 2,167 NLR genes across the selected species, and expression data was available for 1,235 out of the 2,167 NLRs (Supplemental table 3). Like Lotus, the three other dicot legumes and the two monocots displayed NLR gene expression skewed towards the root when compared to the overall gene expression pattern (Figure 1I, Supplemental figure 2 and Supplemental table 4). In comparison, the three Brassicaceae species stood out by displaying shoot-skewed NLR gene expression (Figure 1I and Supplemental table 4). Comparisons within either the legume, Brassicaceae or monocot groups did not show any statistically significant differences. However, when we compared between species groups, many comparisons showed significant differences, with the differences between Brassicaceae versus both legumes and monocots highly significant (Figure 1I and Supplemental table 5). Among the flowering plants investigated, shoot-skewed expression of NLR genes was a feature exclusive to the dicot Brassicaceae family, while the remaining monocots and dicot species all displayed root-skewed expression.
The Brassicaceae expression shift is seen across multiple NLR clades
We asked whether the Brassicaceae expression shift could have been caused by the loss of a specialized set of phylogenetically related NLRs evolved specifically to guard the root endosymbiotic machinery or other root specific pathways. To test this hypothesis, we categorized all identified NLRs by aligning their NBDs and constructing a phylogenetic tree based on 2,033 sequences (Figure 2A). In addition to the previously mentioned species, we included the carnivorous and submerged aquatic bladderwort Utricularia gibba from the Asterids clade, which lacks a true root (Ibarra-Laclette et al. 2013). The phylogenetic analysis allowed us to identify five well-supported major NLR clades (Figure 2B, Supplemental file 2, and Supplemental table 7). We also categorized the NLRs based on the presence of TIR, CC or CCR amino terminal signalling domains (Xiao et al. 2001; Meyers et al. 2003; Shao et al. 2016) and compared these results to our phylogenetic analysis (Supplemental figure 3 and Supplemental table 6). Hereafter, we refer to NLRs containing TIR, CC and CCR domains as TNLs, CNLs and RNLs, respectively. NLRs containing none of the three described domains are referred to as XNLs. Clade 1 was highly enriched in TNLs (708/806), CNLs dominated clade 2 (307/510) and clade 4 (326/385), clade 5 was enriched for RNLs (62/87), and clade 3 contained mainly XNLs (211/245) (Supplemental table 7). The clear correlation between domain structure and the NBD-based phylogeny indicated that the NBD sequences contained sufficient information for inferring the evolutionary history of the plant NLR family, as previously suggested (Pan et al. 2000).
In accordance with previous studies, we did not observe any sequences from monocots in the TNL-enriched clade 1, but among all sequences analysed we did find 7 monocot NLRs that had an identifiable TIR-like domain, which has previously been observed to be juxtaposed irregularly compared to the normal TIR domain (Meyers et al. 2002; Caplan et al. 2013). We did not recover any TIR or TIR-related domain containing NLR sequences from U. gibba either, despite it being a dicot (Pan et al. 2000; Fluhr 2001; Tarr et al. 2009; Ibarra-Laclette et al. 2013). In fact, U. gibba sequences were only found in clades 2 and 4 (Figure 2B and Supplemental table 7).
We then plotted NLR root/shoot ratios for the five NLR clades. Across data from all species, we observed highly significant root skews for the CNL-enriched clade 2 and for the RNL-enriched clade 5 (Figure 2C and Supplemental table 8). When examining the Brassicaceae, legume and monocot species groups separately, we found significant shoot skews for Brassicaceae clades 2, 3 and 4, and significant root skews for monocot clades 2 and 4 and for all legume NLR clades. The mean Brassicaceae root/shoot expression ratios deviated significantly from those of legumes for clades 1-4, and from monocots for clades 3 and 4 (Figure 2D and Supplemental table 8). In contrast, we did not find significant deviations between Brassicaceae and legumes for the RNL-enriched clade 5, where both species groups showed root-skewed expression. Since we observed significant Brassicaceae deviations for multiple NLR clades, a monophyletic group of NLRs was not responsible for the Brassicaceae expression shift. However, there were differences between the NLR clades in the severity of the shift, with the smallest effect seen for the TNL-enriched clade 1.
Comparing the species tree (Figure 2A) to the NBD-based NLR tree (Figure 2B), we noted that clades 1 and 5 in the NBD tree contained mainly dicot members, whereas clades 3 and 4 comprised monocot and dicot members from all species, in line with the species tree. In contrast, clade 2 from the NBD tree was depleted in dicot Brassicaceae members, while both legume and monocot members were well-represented, indicating a family-specific depletion of a major NLR clade in the Brassicaceae family (Figure 2B and Supplemental table 7).
NLR Clade 2 depletion is not generally associated with loss of mycorrhization
Although the NLR clade 2 depletion observed in the Brassicaceae family (Supplemental table 7) could not explain the Brassicaceae expression shift, it remained possible that NLR clade 2 would generally be depleted across non-mycorrhizal plants, pointing to a potentially specialized function in guarding the endomycorrhizal signalling machinery. To test this hypothesis, we identified and extracted NLR protein sequences from 8 additional non-mycorrhizal plant species and constructed a new phylogenetic tree containing a total of 2,448 NLR sequences (Figure 3A-B, Supplemental table 9, and Supplemental file 3). We found that 120 out of the 415 new NLR sequences were present in clade 2, leading us to reject our hypothesis that this clade had evolved specifically for guarding root endosymbiotic components (Figure 3C and Supplemental table 9). After including three additional Brassicaceae species, we still observed a pronounced family-specific Brassicaceae depletion in clade 2, as we only found 20 out of 544 Brassicaceae NLRs belonging to this clade (Supplemental table 9). In conclusion, NLR clade 2 depletion is likely Brassicaceae family specific and is not generally associated with loss of the endomycorrhizal pathway.
Discussion
Cytoplasmic NLRs make up the last line of defence against potentially pathogenic microbes that have evaded physical barriers and membrane-localized PRRs to successfully deliver effectors into plant cells. The stable root/shoot NLR expression ratios observed here are consistent with a defence system in which NLR expression patterns are hardwired to match organ-specific effector challenges, in anticipation of microbial challenge, similar to that observed for the plant circadian cycle (Ingle 2011; Wang et al. 2011). Indeed, we also found that rhizobium inoculation of the nodulating legume Lotus did not alter the overall pattern of NLR expression, further underlining the stability within species of NLR root/shoot expression ratios. It was striking that we found an overall root-skew in NLR expression in the majority of plant species. This suggested that roots generally experience a higher level of effector pressure than shoots, despite the fact that NLR function has mainly been characterized in the context of shoot-pathogen interactions (Erb et al. 2009; Nishimura and Dangl 2010). It might not be surprising given the complexity of soil microbial communities, but our data does underline the need for establishing new root pathosystems and for understanding the role of NLRs in root-microbe interactions.
Plants from the Brassicaceae family made up a very conspicuous group of outliers that displayed shoot- rather than root-skewed NLR expression. The Brassicaceae are also outliers in the sense that they have lost the capacity for root endomycorrhization, which remains functional in 80-90% of land plants (Parniske 2008; Delaux et al. 2014). This symbiotic interaction between plant roots and arbuscular mycorrhizal fungi has existed for around 400 million years, coinciding with the appearance of terrestrial plants, and parts of the mycorrhization signalling machinery have been recruited in the ∼110 million year old symbiotic interaction between plants and nitrogen-fixing rhizobia (Parniske 2000; Deguchi et al. 2007). NLRs are also found in early land plant species, such as bryophytes and lycophytes (Xue et al. 2012; Yue et al. 2012; Jacob et al. 2013; Tanigaki et al. 2014), meaning that endomycorrhizal signalling has co-evolved with NLRs through hundreds of millions of years.
It is conceivable that a specialized set of phylogenetically related NLRs could have evolved specifically to guard the root endosymbiotic machinery or other root specific pathways, and that the Brassicaceae NLR expression shift might be caused by the loss of such a group of NLRs. Here, we tested this hypothesis by grouping NLRs according to the sequence homology of their NBD domains, identifying five major clades. While the CNL-enriched NLR clade 2 was strongly depleted in the Brassicaceae, it was well-represented in other non-mycorrhizal plants. In addition, we observed a Brassicaceae shoot skew for all NLR clades, with the smallest shift observed for TNLs, which are absent in the endomycorrhizal monocots rice and maize, and therefore cannot be generally required for protecting the endomycorrhizal signalling machinery. The shift in Brassicaceae NLR expression could thus not be attributed to the loss of a single NLR clade, and our data did not support the existence of a specific group of phylogenetically distinct NLRs guarding the root endosymbiotic machinery.
The general expression shift towards the shoot across four major NLR clades suggests a reduced anticipation of effector challenge to root cells relative to shoot cells in the Brassicaceae. We envisage two scenarios, which are not mutually exclusive, that could account for the shift. First, our data is consistent with a model where NLRs were randomly recruited from an expanding NLR complement, regardless of phylogenetic origin, for guarding root specific components. When the guarded pathways became defunct in the Brassicaceae family, it gradually lost the associated root-expressed NLRs across the different NLR clades, leading to the overall shoot skew in NLR expression. Second, rather than passively reducing the effector challenge level to roots by loss of a potentially exposed pathway, the Brassicaceae could have developed family-specific active measures that efficiently deter putative soil pathogens before they have a chance to deploy their effectors, reducing the requirement for NLR protection. One possibility is that the Brassicaceae maintain high levels of antimicrobial glucosinolates in the root apoplasm, and there are indications that roots have higher constitutive glucosinolate levels than shoots (Van Dam, Tytgat, and Kirkegaard 2009). Another is that the Brassicaceae have evolved a unique set of highly efficient pattern recognition receptors that quickly eliminate putative root pathogens. For instance, the EF-Tu and lipopolysaccharide PRRs are thought to be Brassicaceae-specific (Kunze et al. 2004; Ranf et al. 2015).
The root endosymbiosis signalling pathway allows intracellular accommodation of symbiotic mycorrhizal fungi and rhizobia (Madsen et al. 2010; Oldroyd 2013). This could impose severe constraints on the general defence mechanisms employed in roots of plant species that rely on symbiotic interactions for nutrient acquisition, compelling these symbiotic species to depend to a greater extent on NLR effector recognition in roots. We propose that the loss of root endomycorrhizal signalling in the Brassicaceae family offers the most parsimonious explanation for the Brassicaceae NLR expression shift. Its loss would both have removed a potentially heavily NLR-guarded pathway and eliminated constraints impeding development of more effective general root defence systems. This hypothesis is consistent with both scenarios described above, agrees with the discovery of apparently Brassicaceae-specific PRRs (Kunze et al. 2004; Ranf et al. 2015), and suggests that Brassicaceae, and perhaps other non-mycorrhizal plants, may be rich sources of unique PRRs and antimicrobial root metabolites.
Materials and methods
Identification of putative NLR genes
To allow identification of putative NLR genes, protein sequences were downloaded as indicated (Supplemental table 1). Annotation versions were chosen for compatibility with the available microarray or RNA-seq data to allow subsequent expression analysis. This is why the latest versions were not used in all cases. NLR genes were then identified in a three-step procedure. First, candidate genes were selected using HMMER 3.1b1 (Eddy 2011) based on the NB-ARC PFAM protein domain PF00931. Second, the candidate list was filtered by performing a search for conserved protein domains using CDD (Marchler-Bauer et al. 2011), requiring that the selected putative NLR genes contain, in addition to the NB-ARC domain, either LRR, TIR, PLN00113, PLN03194, or PLN03210 domains. Third, all NLR gene sequences were manually curated to identify and remove false positives. The total number of identified NLR genes in each of the 18 species is shown in Supplemental table 3, with sequences available in Supplemental file 4.
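The domain requirement in the second step can be illustrated with a minimal Python sketch. It assumes the CDD hits have already been parsed into a mapping from protein ID to domain names; the function names, the example IDs and the exact domain labels are hypothetical, and real CDD output may report LRR or TIR hits under other names or accessions.

```python
# Illustrative domain labels; real CDD output may use different names or accessions.
NB_ARC = "NB-ARC"
PARTNER_DOMAINS = {"LRR", "TIR", "PLN00113", "PLN03194", "PLN03210"}


def is_putative_nlr(domains):
    """True if a protein carries NB-ARC plus at least one partner domain."""
    domains = set(domains)
    return NB_ARC in domains and bool(domains & PARTNER_DOMAINS)


def filter_candidates(domain_hits):
    """Return the sorted protein IDs that pass the domain filter.

    domain_hits maps a protein ID to an iterable of domain names (hypothetical input format).
    """
    return sorted(pid for pid, doms in domain_hits.items() if is_putative_nlr(doms))


# Made-up example records, not taken from the study.
example_hits = {
    "geneA": ["NB-ARC", "LRR"],
    "geneB": ["NB-ARC"],
    "geneC": ["TIR", "LRR"],
    "geneD": ["NB-ARC", "PLN03210"],
}
print(filter_candidates(example_hits))
```

In this sketch a protein with only an NB-ARC hit, or with partner domains but no NB-ARC hit, is excluded; the manual curation of the third step is not modelled.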
Lotus RNA-seq
L. japonicus ecotype Gifu (Handberg and Stougaard 1992) seeds were surface sterilized, germinated and grown in conditions as described previously (Kawaharada et al. 2015). Three biological replicates per sample were analyzed with each consisting of 10 seedlings grown on 1/4 B&D plates for 10 days before inoculation of the roots with 750 μL of an M. loti R7A suspension (OD600 = 0.02) or water. Three days post-inoculation roots and shoots were separated and total RNA was isolated using a NucleoSpin® RNA Plant kit (Macherey-Nagel) according to the manufacturer’s instructions. RNA quality was assessed on an Agilent 2100 Bioanalyser and samples were sent to GATC Biotech (http://gatc-biotech.com/) for library preparation and sequencing. Sequencing data have been deposited at the NCBI Short Read Archive with BioProject ID PRJNA384655 and are available for analysis on Lotus Base (Mun et al. 2016).
Analysis of NLR gene expression data
For tissue-specific gene expression enrichment analysis (Figure 1A-D), we classified genes as being enriched in a specific tissue group if the average expression level in that group was higher than the average of all other tissue groups, and at least two times higher than that of at least one other tissue group.
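The classification rule can be written out as a short Python sketch. It rests on one interpretive assumption: "higher than the average of all other tissue groups" is read here as higher than the average of each other group individually, so a gene is assigned to at most one group. The function name, the input format and the example values are hypothetical.

```python
from statistics import mean


def enriched_group(expr_by_group, fold=2.0):
    """Return the tissue group a gene is enriched in, or None.

    expr_by_group maps a tissue group name to a list of expression values
    on a linear scale (hypothetical input format).
    """
    means = {group: mean(values) for group, values in expr_by_group.items()}
    for group, level in means.items():
        others = [m for g, m in means.items() if g != group]
        if not others:
            continue
        # Assumption: the group average must exceed every other group average...
        higher_than_all = all(level > m for m in others)
        # ...and be at least `fold` times the average of at least one other group.
        fold_over_one = any(level >= fold * m for m in others)
        if higher_than_all and fold_over_one:
            return group
    return None


# Made-up example: group averages are root 9, shoot 4, nodule 8.
example_gene = {"root": [8, 10], "shoot": [3, 5], "nodule": [7, 9]}
print(enriched_group(example_gene))
```

If the criterion is instead meant as the pooled average of all other groups, the `all(...)` test would be replaced by a single comparison against that pooled value.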
In order to evaluate root/shoot expression ratios, available expression data was downloaded as indicated in Supplemental table 2. Sample IDs along with expression values are available in Supplemental file 1. For Lotus and B. rapa, probes were reassigned to the updated annotation using BLAST to match probe and cDNA sequences (e-value cut-off 0.001), assigning only the best matching probe to a gene. For Lotus, Medicago and soybean, samples representing identical or closely related plant accessions were used in the analysis. For rice and maize, data from a number of different accessions were used, but only data where both root and shoot samples had been assayed within the same experiment were used to ensure the comparability of samples from the two tissues. For B. napus, the analysis was based on raw RNA-seq reads. RNA-seq data files were downloaded from the NCBI short read archive (https://www.ncbi.nlm.nih.gov/sra) and reads from each library were assembled using Trinity (--full_cleanup) (Haas et al. 2013) followed by clustering using cd-hit-est v.4.6.6 (-M 16000 -T 8) (Fu et al. 2012). Next, the longest open reading frames were identified for each transcript and the corresponding protein sequences were used for identification of NLRs as described. Reads were mapped back to the gene set output from cd-hit-est using STAR, with the parameters --runMode genomeGenerate --genomeChrBinNbits 14 for index generation and standard options for mapping (Dobin et al. 2013). Finally, reads mapping to multiple locations were filtered out, followed by summarizing read counts per gene for each sample. For all species, expression data from the genes with the 15% lowest expression levels were filtered out, and the log2 NLR root/shoot expression ratios were normalized by subtracting the mean value for all genes. Expression ratios were plotted using ggplot2 in R version 3.1.2.
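The final filtering and normalization step can be sketched in Python as follows. Several details are assumptions rather than statements from the methods: genes are ranked by the average of their root and shoot values, the mean ratio is computed over the genes retained after filtering, and a pseudocount is offered only as an optional guard against zero values. The function name and input format are hypothetical.

```python
import math
from statistics import mean


def normalized_log2_ratios(expr, low_fraction=0.15, pseudocount=0.0):
    """Return mean-centred log2 root/shoot ratios for the retained genes.

    expr maps a gene ID to a (root, shoot) pair of expression values
    (hypothetical input format). Values must be positive unless a
    pseudocount is supplied.
    """
    # Assumption: "lowest expression" is judged on the root/shoot average.
    ranked = sorted(expr, key=lambda g: (expr[g][0] + expr[g][1]) / 2)
    n_drop = int(len(ranked) * low_fraction)
    kept = ranked[n_drop:]
    raw = {
        g: math.log2((expr[g][0] + pseudocount) / (expr[g][1] + pseudocount))
        for g in kept
    }
    # Centre on the mean ratio of all retained genes, so that 0 means
    # "same root/shoot balance as the average gene".
    centre = mean(list(raw.values()))
    return {g: ratio - centre for g, ratio in raw.items()}


def subset(ratios, gene_ids):
    """Pick the normalized ratios for a gene subset, for example the NLRs."""
    return {g: ratios[g] for g in gene_ids if g in ratios}
```

After this centring, a positive value for an NLR indicates expression that is more root-skewed than the average gene, and a negative value indicates a shoot skew.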
The significance of differences in mean expression ratios between all genes and NLR genes was evaluated using Student’s t-test (Supplemental table 4). Next, the significance of interspecies differences in root/shoot expression ratios was evaluated using one-way ANOVA followed by Tukey’s multiple comparison test as implemented in GraphPad Prism 6 (Supplemental table 5). Differences in the average root/shoot expression by NLR gene clade or domain, based on the phylogenetic tree shown in Figure 2B, were evaluated using one-way ANOVA followed by Tukey’s multiple comparison test, or Student’s t-test, as implemented in GraphPad Prism 6 (Supplemental tables 6 and 8).
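For readers who wish to reproduce the first comparison outside GraphPad Prism, the two-sample Student's t statistic can be computed as in the Python sketch below. This is an illustration under the standard equal-variance formulation, not the software used in the study; the function name is hypothetical, and converting the statistic to a p-value requires a t-distribution from a statistics library, which is not shown.

```python
import math
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Each sample needs at least two values.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance assumes both groups share a common variance.
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df
```

Here `sample_a` would hold the normalized log2 root/shoot ratios of the NLR genes and `sample_b` those of all genes for one species.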
Construction of NLR protein phylogeny
Sequences of the NB-ARC domains of identified R genes were extracted using a Python script, based on domains as identified by the CDD search, and aligned using Clustal Omega v1.2.3 (Sievers et al. 2011). Sequences were then filtered for low coverage positions (50% cut-off) and sequences lacking more than 50% of the aligned NB-ARC domain were removed. Phylogenetic trees were constructed in IQ-Tree v.1.5.2 and evaluated using the ultrafast bootstrap approximation approach (UFBoot) implemented in the software package (Minh et al. 2013; L.-T. Nguyen et al. 2015). The resulting tree was colored by species using colorTree v1.1 and visualized using Dendroscope v3.5.7 (Chen et al. 2009; Huson et al. 2012). See Supplemental files 2 and 3 for NB-ARC domain alignments of the trees described in Figures 2B and 3B respectively, along with bootstrap analysis. See Supplemental file 4 for sequences for all NLRs used to construct the phylogenetic trees, and Supplemental file 5 for a general overview of all NLRs used in the study.
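The two alignment filters can be sketched in Python as below. The order of operations (columns first, then sequences) and the handling of values exactly at the 50% threshold are assumptions, as is the input format of a mapping from sequence name to gapped sequence; the function name is hypothetical and this is not the script used in the study.

```python
def filter_alignment(aln, min_col_coverage=0.5, max_seq_gap=0.5, gap="-"):
    """Trim low-coverage columns, then drop sequences that are mostly gaps.

    aln maps a sequence name to an aligned sequence; all sequences must
    have equal length (hypothetical input format).
    """
    if not aln:
        return {}
    names = list(aln)
    length = len(aln[names[0]])
    # Keep columns in which at least min_col_coverage of sequences have a residue.
    keep_cols = [
        i for i in range(length)
        if sum(aln[n][i] != gap for n in names) / len(names) >= min_col_coverage
    ]
    trimmed = {n: "".join(aln[n][i] for i in keep_cols) for n in names}
    # Remove sequences lacking more than max_seq_gap of the retained columns.
    return {
        n: seq for n, seq in trimmed.items()
        if seq and seq.count(gap) / len(seq) <= max_seq_gap
    }


# Made-up toy alignment, not taken from the study.
toy = {"s1": "MK-LV", "s2": "MK-LV", "s3": "M---V", "s4": "----V"}
print(filter_alignment(toy))
```

In the toy alignment the all-gap third column is trimmed and the sequence `s4`, which retains a residue in only one of four remaining columns, is removed.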
Author contributions
DM, VG, AB, TM, WB, and SUA analysed data. SK carried out the Lotus RNA-seq experiment. SUA designed and supervised the study. DM and SUA wrote the manuscript.
Acknowledgements
This work was supported by the Danish National Research Foundation grant no. DNRF79. The authors wish to acknowledge all research groups contributing expression data used in our meta-analysis.
Footnotes
Abbreviations: Arabidopsis, AM, CC, CNL, ETI, Lotus, LRR, MAMP, Medicago, NB-ARC, NBD, NF, NLR, PRRs, RNL, TIR, TNL, XNL.