Abstract
Background Understanding how complex traits evolve through adaptive changes in gene regulation remains a major challenge in evolutionary biology. Over the last ~50 million years, Earth has experienced climate cooling and ancestrally tropical plants have adapted to expanding temperate environments. The grass subfamily Pooideae dominates the grass flora of the temperate regions, but the role of cold-response gene regulation in the transition from tropical to temperate climates remains unexplored.
Results To establish if molecular responses to cold are conserved throughout the phylogeny, we assembled the transcriptomes of five Pooideae species spanning early to later diverging lineages, and compared short- and long-term cold responsive genes using 8633 high confidence ortholog groups with resolved gene tree topologies. We found that a majority of cold responsive genes were specific to one or two lineages, an observation that we deem incompatible with a cold adapted Pooideae ancestor. However, all five species shared short-term cold response in a small set of general stress genes as well as the ability to down-regulate the photosynthetic machinery during cold temperatures.
Conclusions Our observations indicate that the different Pooideae lineages have assembled cold response programs in parallel by taking advantage of a common potential for cold adaptation.
Background
Adaptation to a changing climate is essential for the long-term evolutionary success of plant lineages. During the last ~50 million years of climate cooling (Fig. 1c), several plant species adapted to temperate regions. A key step in this transition was the integration of novel temperate climate cues, such as seasonal fluctuations in temperature, in the regulatory network controlling cold stress responses. Here we used the temperate grass subfamily Pooideae as a model system for studying gene regulatory evolution of cold stress.
The temperate grass flora is dominated by members of the subfamily Pooideae [1], and the most extreme cold environments are inhabited by Pooideae species. The ancestors of this group were, however, most likely adapted to tropical or subtropical climates [2, 3]. Many Pooideae species experience cold winters (Fig. 1b) and although a recent study inferred adaptation to cooler environments at the base of the Pooideae phylogeny [4], it is still not known whether the Pooideae’s most recent common ancestor (MRCA) already was adapted to cold stress, or if adaptation to cold evolved independently in the Pooideae lineages.
Pooideae is a large subfamily comprising 4200 species [5], amongst them economically important species such as wheat and barley. Given the commercial importance of this group, various aspects of adaptation to temperate climate such as flowering time, cold acclimation, and frost and chilling tolerance have been studied (reviewed by [6–13]). These studies are, however, confined to a handful of species in the species rich, monophyletic clade “core Pooideae” [14] and recently also to its sister clade, containing the model grass Brachypodium distachyon [15–17]. It is thus unknown how adaptation to temperate climate evolved in earlier diverging Pooideae lineages to promote the success of this subfamily in temperate regions.
Environmental stress is assumed to be a strong evolutionary force, and the colonization of temperate biomes by Pooideae was likely accompanied by adaptation to cold conditions. A MRCA already adapted to cold (the ancestral hypothesis) offers a plausible basis for the ecological success of the Pooideae subfamily in the northern temperate regions [1]. However, paleoclimatic reconstructions infer a generally warm climate, and a very limited abundance of temperate environments, during the time of Pooideae emergence, around 50 million years ago (Mya) [18–22]. Indeed, it was not before ca. 33.5 Mya, during the Eocene-Oligocene (E-O) transition, that the global climate suddenly began to cool [23, 24] (Fig. 1c). Climate cooling at the E-O transition coincided with the emergence of many temperate plant lineages [25] and may have been an important selection pressure for improved cold tolerance in Pooideae [26, 27]. If the E-O cooling event was the major evolutionary driving force for cold adaptation in Pooideae grasses, this would lend support to lineage specific evolution of cold adaptation (the lineage specific hypothesis), as all major Pooideae lineages had already emerged by the time of the E-O transition [2, 28] (Fig. 1a).
A restricted number of plant lineages successfully transitioned into the temperate region, emphasizing the difficulties in evolving the coordinated set of physiological changes needed to withstand low temperatures [29]. During prolonged freezing, plants need to maintain the integrity of cell membranes to avoid osmotic stress [30]. Cold and freezing tolerance is associated with the ability to cold acclimate, which is achieved through a period of extended, non-freezing cold triggered by the gradually decreasing temperature and day length in the autumn. During cold acclimation, a suite of physiological changes governed by diverse molecular pathways results in an increase in the sugar content of cells, a change in the lipid composition of membranes and the synthesis of anti-freeze proteins [13, 31]. Also, low non-freezing temperatures may affect plant cells by decreasing metabolic turnover rates, inhibiting the photosynthetic machinery and decreasing the stability of biomolecules (e.g. lipid membranes) [10, 12]. Several studies have used transcriptomics to compare cold stress responses; however, they focused on closely related taxa or varieties within model species [17, 32–36]. As such, these studies were not able to investigate evolutionary mechanisms underlying the adaptation of entire clades to cold climates.
Here, we used de novo comparative transcriptomics across the Pooideae phylogeny to study the evolution of cold adaptation in Pooideae. Specifically, we aimed to establish whether molecular responses to cold are conserved in the Pooideae subfamily or are the result of lineage specific evolution. The transcriptomes of three non-model species (Nardus stricta, Stipa lagascae and Melica nutans), which belong to early diverging lineages, were compared to the transcriptomes of the model grass Brachypodium distachyon and the core Pooideae species Hordeum vulgare (barley). A total of 8633 high confidence ortholog groups with resolved gene tree topologies were used to identify cold-response genes. We found that only a small number of genes were cold responsive in all the investigated Pooideae species, which suggested that their common ancestor possessed only a few, possibly key, preliminary cold response mechanisms, and that cold responses evolved primarily independently in different Pooideae lineages.
Results
To investigate the evolution of cold response in Pooideae, we sampled leaf material in five species before and after subjecting them to a drop in temperature and shorter days (Fig. 1d). RNA-sequencing (RNA-Seq) was used to reveal the short and long term cold response of transcripts, and the conservation of these responses was analyzed in the context of ortholog groups.
De novo transcriptome assembly identified 8633 high confidence ortholog groups
The transcriptome of each species was assembled de novo resulting in 146k-282k contigs, of which 68k-118k were identified as containing coding sequences (CDS, Table S1). Ortholog groups (OGs) were inferred by using the protein sequences from the five de novo assemblies, as well as the reference genomes of L. perenne, H. vulgare, B. distachyon, Oryza sativa, Sorghum bicolor and Zea mays. The five assembled Pooideae species were represented with at least one transcript in 24k-33k OGs (Table S1).
Gene trees were generated for each OG and a set of high confidence OGs (HCOGs) was identified by filtering based on the topology of the gene trees (see Methods). This resulted in 8633 high confidence ortholog groups (HCOGs) containing transcripts from at least three of the five studied species (Table S1, Table S2).
De novo assembly followed by ortholog detection resulted in higher numbers of monophyletic species-specific paralogs than the number of paralogs in the reference genomes of H. vulgare and B. distachyon. This apparent overestimation of paralogs was most likely the result of the de novo procedure assembling alleles or alternative transcript isoforms into separate contigs. We also observed some cases where the number of paralogs was underestimated compared to the references, which may be due to low expression of these paralogs or the assembler collapsing paralogs into single contigs. Since the de novo assembly procedure did not reliably assemble paralogs, we chose to represent each species in each HCOG by a single read-count value equal to the sum of the expression of all assembled paralogs. By additionally setting counts for missing orthologs to zero, we created a single cross species expression matrix with HCOGs as rows and samples as columns (Table S3).
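The construction of the cross species expression matrix can be illustrated with a minimal Python sketch. The input layout (`hcog_members`, `counts`, `samples`) and all identifiers are hypothetical and chosen for illustration only; the sketch is not the pipeline used in this study.

```python
def build_hcog_matrix(hcog_members, counts, samples):
    """Sum paralog read counts per species within each HCOG.

    hcog_members: {hcog_id: {species: [contig_id, ...]}}  (hypothetical layout)
    counts:       {sample_id: {contig_id: read_count}}
    samples:      list of (sample_id, species) column labels
    Returns {hcog_id: [count per sample, in the order of `samples`]}.
    """
    matrix = {}
    for hcog_id, members in hcog_members.items():
        row = []
        for sample_id, species in samples:
            # A species without an ortholog in this HCOG has no contigs,
            # so the sum below is zero (missing orthologs set to zero).
            contigs = members.get(species, [])
            row.append(sum(counts[sample_id].get(c, 0) for c in contigs))
        matrix[hcog_id] = row
    return matrix


# Toy example with two HCOGs and one sample per species.
hcog_members = {
    "HCOG1": {"Hv": ["hv_a", "hv_b"], "Bd": ["bd_a"]},
    "HCOG2": {"Hv": ["hv_c"]},  # no B. distachyon ortholog
}
counts = {
    "Hv_D0": {"hv_a": 10, "hv_b": 5, "hv_c": 7},
    "Bd_D0": {"bd_a": 3},
}
samples = [("Hv_D0", "Hv"), ("Bd_D0", "Bd")]
matrix = build_hcog_matrix(hcog_members, counts, samples)
```

In this toy example the two H. vulgare paralogs in HCOG1 are summed into one value, and the missing B. distachyon ortholog in HCOG2 contributes a zero.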
A dated species tree of the Pooideae
Dated gene trees were generated using prior knowledge about the divergence times of Oryza-Pooideae [37] and Brachypodium-Hordeum [28]. Based on 3914 gene trees with exactly one sequence from each of the five Pooideae and rice (see Methods), a dated species tree was estimated using the mean divergence times of the gene trees (Fig. 1a). In the most common gene tree topology, S. lagascae and M. nutans formed a monophyletic clade, but topologies where either S. lagascae or M. nutans diverged first were also common (Fig. S1). Due to this uncertainty regarding the topology, the S. lagascae and M. nutans branches were collapsed to a polytomy in the consensus species tree.
Expression clustering indicated a common global response to cold
To investigate broad scale expression patterns in cold response, we clustered all samples (including replicates) after scaling the expression values of each gene to remove differences in absolute expression between species (see Methods, Fig. 2a). This clustering revealed the differential effects of the treatments and resulted in a tree with replicates, and then time points, clustering together. An exception was time points W4 and W9, which tended to cluster together and by species, indicating that responses after 4 and 9 weeks were very similar. The fact that time points mostly clustered together before species indicated a common response to cold across species. We also observed a clear effect of the diurnal rhythm, with time points sampled in the morning (W0, W4 and W9) forming one cluster and time points sampled in the afternoon (D0 and D1) forming another.
Cold responsive genes were primarily species specific
We next examined similarities in short- and long-term cold response between species by analysing changes in gene expression from before cold treatment to eight hours and 4-9 weeks after cold treatment (Fig. 1d). For all species pairs, there was a low, but statistically significant, correlation between the expression fold changes of orthologs in HCOGs (Fig. 2b). A similar pattern was observed when investigating the number of orthologs classified as differentially expressed in pairs of species (FDR adjusted p-value < 0.05 and fold change > 2, Table S4, see Methods): these numbers were low compared to the number of differentially expressed genes (DEGs) in individual species, but higher than expected by chance (Fisher’s exact test p < 0.05, Fig. 2c). Finally, the number of orthologs with differential expression in more than two species was very low (Fig. 2d), with only 83 DEGs common to all five species. Taken together, these observations suggest that cold response in Pooideae is primarily lineage specific, with low but significant similarities between pairs of species both with respect to fold change and differential expression. Notably, neither the similarities in differential expression nor the fold change correlations reflected the phylogenetic relationship between the species; that is, the cold responses of closely related species were not more similar than those of distantly related species.
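The pairwise overlap test can be sketched as a one-sided Fisher's exact test, which is equivalent to the upper tail of a hypergeometric distribution. The function name, the argument names and the choice of a one-sided test are assumptions made for illustration; the exact test configuration used in the study is not specified here.

```python
from math import comb


def overlap_pvalue(n_total, n_deg_a, n_deg_b, n_shared):
    """One-sided Fisher's exact test for the overlap of two DEG sets.

    n_total:  number of ortholog groups tested in both species
    n_deg_a:  number of DEGs in species A
    n_deg_b:  number of DEGs in species B
    n_shared: number of ortholog groups that are DEGs in both species
    Returns P(overlap >= n_shared) if the two DEG sets were independent.
    """
    max_overlap = min(n_deg_a, n_deg_b)
    # Number of ways to choose species B's DEG set with exactly k shared DEGs,
    # summed over all overlaps at least as large as the observed one.
    tail = sum(
        comb(n_deg_a, k) * comb(n_total - n_deg_a, n_deg_b - k)
        for k in range(n_shared, max_overlap + 1)
    )
    return tail / comb(n_total, n_deg_b)


# Toy example: 10 ortholog groups, 5 DEGs in each species, all 5 shared.
p_complete_overlap = overlap_pvalue(10, 5, 5, 5)
```

A small p-value indicates that two species share more differentially expressed orthologs than expected by chance, even when the shared set is small relative to the DEGs of each species.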
Shared cold response genes included known abiotic stress genes
Sixteen genes shared the same cold response (short- or long-term) in the same direction (up or down) in all five Pooideae species, thus representing a response to cold that might have been conserved throughout the evolution of Pooideae (Table 1). Several of these shared cold responsive genes belonged to families known to be involved in cold stress or other abiotic stress responses in other plant species. The most common type of response was short-term up-regulation, indicating that the stress response, as opposed to the long-term acclimation response, is potentially more conserved.
Identified cold response genes confirmed previous findings
We compared the cold response genes from our data to a compilation of H. vulgare genes shown to be responsive to low temperature in several previous microarray studies, subsequently referred to as the Greenup genes (table S10 in [38]). We could map 33 of these 55 genes to unique OGs, of which 11 were HCOGs. We observed significant similarity in cold response between the 33 Greenup genes and the short-term cold response observed in our data for all five species (Fig. 3; p < 0.05, see Methods). However, this similarity was noticeably larger in H. vulgare than in the other four species. This comparison showed that our transcriptome data were consistent with previous findings in H. vulgare, and that cold response genes identified in H. vulgare exhibit some cold response in other Pooideae.
Photosynthesis was down-regulated under cold stress
To identify biological processes that evolved regulation during different stages of Pooideae evolution, we targeted gene sets that were exclusively differentially expressed in all species within a clade in the phylogenetic tree (i.e. branch specific DEGs), and tested these for enrichment of Gene Ontology (GO) biological process annotations (Fig. 4a). For the genes that were differentially expressed in all our species (Pooideae base [PB] in Fig. 4b), we found enrichments for annotations related to response to abiotic stimulus, photosynthesis and metabolism. Dividing the branch specific DEGs into up- or down-regulated genes revealed up-regulation of signal transduction (two pseudo response regulators and diacylglycerol kinase 2 (DGK2)) and abiotic stimulus (Gigantea, LEA-14, DnaJ and DGK2), and down-regulation of photosynthesis and metabolism. For the genes that were exclusively differentially expressed in all species except N. stricta (early split [ES] in Fig. 4b), down-regulated genes were again enriched for GO annotations related to metabolism and photosynthesis.
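The definition of branch specific DEGs can be expressed as simple set operations. The sketch below is illustrative only: the species abbreviations and gene identifiers are hypothetical, and it ignores complications such as ortholog groups that lack a sequence from some species.

```python
def branch_specific_degs(degs_by_species, clade):
    """Return ortholog groups that are DEGs in every species of `clade`
    and in no species outside it."""
    inside = [degs_by_species[s] for s in clade]
    outside = [d for s, d in degs_by_species.items() if s not in clade]
    shared = set.intersection(*inside)
    return shared.difference(*outside)


# Toy DEG sets per species (hypothetical ortholog group identifiers).
degs_by_species = {
    "Ns": {"g1", "g2"},
    "Sl": {"g1", "g3", "g4"},
    "Mn": {"g1", "g3"},
    "Bd": {"g1", "g3", "g5"},
    "Hv": {"g1", "g3", "g5"},
}

# Pooideae base (PB): differentially expressed in all five species.
pb = branch_specific_degs(degs_by_species, ["Ns", "Sl", "Mn", "Bd", "Hv"])
# Early split (ES): differentially expressed in all species except N. stricta.
es = branch_specific_degs(degs_by_species, ["Sl", "Mn", "Bd", "Hv"])
```

Each resulting gene set would then be tested separately for GO enrichment, optionally after splitting it into up- and down-regulated genes.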
Cold response genes were associated with positive selection on amino acid content
For each HCOG, we tested for positive selection in coding sequences at each of the internal branches of the species tree. The tests were only performed on the branches where the gene tree topology was compatible with the species tree topology (see Methods). 16-18% of the HCOGs showed significant signs of positive selection (P < 0.05) depending on the branch (Fig. 4b). Next, we tested for overrepresentation of positive selection among the branch specific DEGs. There was a tendency that gain of cold response was associated with positive selection at the early split (ES) and late split (LS) branches (P = 0.077 and P = 0.072, respectively) (Fig. 4b).
Discussion
The ecological success of the Pooideae subfamily in the northern temperate regions must have critically relied on adaptation to colder temperatures. However, it is unclear how this adaptation evolved within Pooideae. To test whether molecular responses to cold are conserved in the Pooideae subfamily, we applied RNA-seq to identify short- and long-term cold responsive genes in five Pooideae species ranging from early diverging lineages to core Pooideae species. Since three of the species lacked reference genomes, we employed a de novo assembly pipeline to reconstruct the transcriptomes and showed that this pipeline could recover a set of H. vulgare genes previously identified as cold responsive (Fig. 3). In order to compare the five transcriptomes, we compiled a set of 8633 high confidence ortholog groups with resolved gene tree topologies. Gene expression clustering based on these ortholog groups arranged samples according to replicates, then time points and finally species, indicating that cold response was the primary signal in the data and confirming the soundness of the approach (Fig. 2a).
Lineage specific adaptations to cold climates
A substantial portion of the individual Pooideae transcriptomes responded to cold (1000-3000 genes); however, only a small number of genes responded to cold in all the investigated species (83 genes, Fig. 2d). Even fewer genes responded similarly to cold in all species (e.g. short-term up-regulation, 16 genes, Table 1), and these shared cold response genes primarily included general abiotic stress genes that do not represent the full range of molecular pathways constituting a fully operational cold response program. We also observed low correlations in expression fold changes between species, a result that was independent of our ability to correctly classify genes as differentially expressed. All these results were based on high confidence ortholog groups that excluded complex families with duplication events shared by two or more species. Since many of the previously described H. vulgare cold responsive genes belonged to such complex families, we could have underestimated the number of shared cold responsive genes. However, we specifically investigated the regulation of these previously described genes using all ortholog groups, and again found that few genes displayed shared cold response across all species (Fig. 3), thus supporting our conclusion that cold response in Pooideae is largely species specific. Taken together, our findings indicated that the most recent common ancestor of the Pooideae possessed no, or only a limited, response to cold, and, consequently, that our data appear more consistent with the lineage specific hypothesis of Pooideae cold adaptation than with the ancestral hypothesis.
The drastic cold stress during the E-O transition was likely an important cause for the evolution of cold adaptation in Pooideae. Previous studies have shown that many temperate plant lineages emerged during the E-O transition [25] and that the expansion of well-known cold responsive gene families in Pooideae coincided with this transition [15, 26]. From the dated phylogeny (Fig. 1a) as well as from earlier studies of the Pooideae phylogeny [2, 28], it was clear that all major Pooideae lineages, including the core Pooideae, had emerged by the late Eocene. Hence, the five lineages studied here experienced the E-O transition as individual lineages (Fig. 1c). Furthermore, we found that closely related species did not share a higher fraction of cold responsive genes than more distantly related species (no phylogenetic pattern, Fig. 2b-d). The observation that the five Pooideae lineages emerged during a relatively warm period before the E-O transition, and the finding that these species harbored high numbers of species specific cold responsive genes with no phylogenetic pattern, together suggested that most of the cold response in Pooideae lineages evolved in parallel during the last 40 million years. During this period, temperatures were constantly decreasing and dramatic cooling events took place, such as the E-O transition and the current Quaternary Ice Age.
Our results suggested that the Pooideae lineages evolved cold response in parallel using, to a large degree, unrelated genes. This implies that different genes can be co-opted into the functional cold response of the Pooideae. It is worth noting, however, that although we observed many species specific cold response genes, all species pairs displayed a statistically significant correlation in cold response across all HCOGs (Fig. 2b) and a statistically significant overlap in cold responsive genes (Fig. 2c). This may reflect that some genes code for proteins with biochemical functions more suited to be recruited for cold response than others [39, 40], and that different species thus have ended up co-opting orthologous genes into their cold response program more often than expected by chance.
An adaptive potential in the Pooideae ancestor
Multiple independent origins of cold adaptation raise the question of whether connecting traits exist in the evolutionary history of the Pooideae that can explain why the Pooideae lineages were able to shift to the temperate biome. The transcripts that were cold responsive across all five species (Table 1) represented genes that might have gained cold responsiveness in the Pooideae most recent common ancestor and contributed to increasing the potential of Pooideae lineages to adapt to a cold temperate climate. Several of these conserved genes were known to be involved in abiotic stresses in other plants such as drought or other osmotic stress, which share some physiological effects experienced during freezing. Co-option of such genes into a cold-responsive pathway might have been the key to acquiring cold tolerance. In fact, other studies have implied that drought tolerance might have facilitated the shift to the temperate biome [26, 41, 42]. Interestingly, most of the conserved genes were short-term cold responsive (Table 1) and this observation strengthened the hypothesis that existing stress genes might have been the first to be co-opted into the cold response program. Also, three of the conserved cold responsive genes (GIGANTEA, PRR95 and AtPRR3-like) were associated with the circadian clock, which is known to be affected by cold [43–45]. This might suggest that clock genes have had an important function in the Pooideae cold adaptation, for example by acting as a signal for initiating the cold defense. More generally, transcripts involved in photosynthesis and response to abiotic stimuli were significantly enriched among the genes with cold response in all species (Fig. 4a). An expanded stress responsiveness towards cold stress and the ability to down-regulate the photosynthetic machinery during cold temperatures to prevent photoinhibition might have existed in the early evolution of Pooideae.
In conclusion, the conserved stress response genes discussed here may have represented a fitness advantage for the Pooideae ancestor in the newly emerging environment with incidents of mild frost, allowing time to evolve the more complex physiological adaptations required to endure the temperate climate with strong seasonality and cold winters that emerged following the E-O transition [23]. Consistent with this, Schubert et al. (unpublished) showed that the fructan synthesis and ice recrystallization inhibition protein gene families known to be involved in cold acclimation in core Pooideae species [10] evolved around the E-O split, whereas earlier diverging Pooideae species also show the capacity to cold acclimate.
Evolution of coding and regulatory sequences
The molecular mechanisms behind adaptive evolution are still poorly understood, although it is now well established that novel gene regulation plays a crucial role [46]. The evolution of gene regulation proceeds by altering non-coding regulatory sequences in the genome, such as (cis-) regulatory elements [47], and has the potential to evolve faster than protein sequence and function. The high number of species specific cold response genes observed in this study is thus most consistent with the recruitment of genes with existing cold tolerance functions by means of regulatory evolution. However, previous studies have also pointed to the evolution of coding sequences [27] as underlying the acquisition of cold tolerance in Pooideae. To investigate possible coding evolution, we tested for the enrichment of positive selection among branch specific cold responsive genes (Fig. 4b). Although not statistically significant, there was a tendency for positive selection in genes gaining cold response in a period of gradual cooling preceding the E-O event. Thus, we saw indications that both coding and regulatory evolution played a role in cold adaptation in Pooideae, and that these processes may have interacted. Finally, gene family expansion has previously been implicated in cold adaptation in Pooideae [15, 26]. As previously discussed, the conservative filtering of ortholog groups employed in this study removed complex gene families containing duplication events shared by two or more species. Interestingly, out of the 33 previously described H. vulgare cold responsive genes (Fig. 3), as many as 22 were not included in the high confidence ortholog groups, the main reason being that they belonged to gene families with duplications. This observation is thus consistent with duplication events being a relatively common feature of cold adaptation.
Although de novo assembly of transcriptomes from short-read RNA-Seq data is a powerful tool that has vastly expanded the number of target species for conducting transcriptomic analysis, the approach has limited power to distinguish highly similar transcripts such as paralogs. Further insight into the role of duplication events in Pooideae cold adaptation would therefore benefit immensely from additional reference genomes.
Conclusion
Here we investigated the cold response of five Pooideae species, ranging from early diverging lineages to core Pooideae species, to elucidate the evolution of adaptation to cold temperate regions. We primarily observed species specific cold response that seems to have evolved chiefly after the B. distachyon lineage and the core Pooideae diverged, possibly initiated by the drastic temperature drop during the E-O transition. However, we also see signs of conserved response that potentially represents a shared potential for cold adaptation that explains the success of Pooideae in temperate regions. This included several general stress genes with conserved short-term response to cold as well as the conserved ability to down-regulate the photosynthetic machinery during cold temperatures. Taken together, our observations are consistent with a scenario where many of the biochemical functions needed for cold response were present in the Pooideae common ancestor, and where different Pooideae lineages have assembled, in parallel, different overlapping subsets of these genes into fully functional cold response programs through the relatively rapid process of regulatory evolution.
Methods
Plant material, sampling and sequencing
To address our hypothesis, we selected five species to cover the phylogenetic spread of Pooideae. The selected species also represent major, species rich lineages or clades in the Pooideae subfamily, or belong to very early diverging lineages [5]. Seeds were collected either in nature: Nardus stricta (collected in Romania, [46.69098, 22.58302], July 2012) and Melica nutans (collected in Germany, [50.70708, 11.23838], June 2012); or acquired from germplasm collections: Stipa lagascae (PI 250751, U.S. National Plant Germplasm System (U.S.-NPGS) via Germplasm Resources Information Network [GRIN]), Brachypodium distachyon (line ‘Bd1-1’, W6 46201, U.S.-NPGS via GRIN) and Hordeum vulgare (line ‘Igri’, provided by Prof. Åsmund Bjørnstad, Department of Plant Sciences, Norwegian University of Life Sciences, Norway). Seeds were germinated and initially grown in a greenhouse at a neutral day length (12 hours of light), 17°C and a minimum artificial light intensity of 150 μmol/m2s. Because the seedlings of the phylogenetically diverse species grew at different rates, the sampling was based on developmental stages rather than time. Plants were grown until three to four leaves had emerged for M. nutans, S. lagascae, B. distachyon and H. vulgare, or six to seven leaves for N. stricta (which is a cushion forming grass that produces many small leaves compared to its overall plant size). Depending on the species, this process took one (H. vulgare), three (B. distachyon and S. lagascae), six (M. nutans) or eight (N. stricta) weeks from the time of sowing. Subsequently, plants were randomized and distributed to two cold chambers with short day (8 hours of light), 6°C and a light intensity of 50 μmol/m2s. 
Leaf material for RNA isolation was collected i) in the afternoon (8 hours of light) before cold treatment (D0) and 8 hours after cold treatment (D1) and ii) in the morning (at lights on) before cold treatment (W0), 4 weeks after cold treatment (W4) and 9 weeks after cold treatment (W9) (Fig. 1d). Flash frozen leaves were individually homogenized using a TissueLyser (Qiagen Retsch) and total RNA was isolated (from each leaf) using RNeasy Plant Mini Kit (Qiagen) following the manufacturer’s instructions. The purity and integrity of total RNA extracts was determined using a NanoDrop 8000 UV-Vis Spectrophotometer (Thermo Scientific) and 2100 Bioanalyzer (Agilent), respectively. For each time point, RNA extracts from five leaves sampled from five different plants were pooled and sequenced as a single sample. In addition, replicates from single individual leaves were sequenced for selected timepoints (see Table S1 and “Differential expression” below). Two time points lacked expression values: W9 in B. distachyon (RNA integrity was insufficient for RNA sequencing) and W0 in S. lagascae (insufficient supply of plant material). Samples were sent to the Norwegian Sequencing Centre, where strand-specific cDNA libraries were prepared and sequenced (paired-end) on an Illumina HiSeq 2000 system. The raw reads are available in the ArrayExpress database (www.ebi.ac.uk/arrayexpress) under accession number E-MTAB-5300.
Transcriptome assembly and ortholog inference
Using Trimmomatic v0.32 [48], all reads were trimmed to a length of 120 bp, Illumina TruSeq adapters were removed, low quality bases were trimmed using a sliding window of 40 bp and an average quality cut-off of 15, and reads below a minimum length of 36 bp were discarded. Read quality was checked using FastQC v0.11.2. For each species, transcripts were assembled de novo with Trinity v2.0.6 [49] (strand specific option, otherwise default parameters) using reads from all samples. Coding sequences (CDS) were identified using TransDecoder rel16JAN2014 [50]. Where Trinity reported multiple isoforms, only the longest CDS was retained. Ortholog groups (OGs) were constructed from the five de novo transcriptomes and public reference transcriptomes of H. vulgare (barley_HighConf_genes_MIPS_23Mar12), B. distachyon (brachypodium v1.2), O. sativa (rap2), Z. mays (ZmB73_5a_WGS), S. bicolor (sorghum 1.4) and L. perenne (GenBank TSA accession GAYX01000000) using OrthoMCL v2.0.9 [51]. All reference sequences except L. perenne were downloaded from http://pgsb.helmholtz-muenchen.de/plant/plantsdb.jsp. A summary of the results is provided in Table S1.
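The effect of the read trimming parameters can be illustrated with a simplified Python sketch. This is not Trimmomatic's implementation: adapter removal is omitted, the handling of reads shorter than the window is an assumption, and the function names are hypothetical. The sketch only shows how cropping to 120 bp, a 40 bp sliding window with a mean quality cut-off of 15, and a minimum length of 36 bp interact.

```python
def sliding_window_keep(quals, window=40, min_mean_q=15):
    """Return how many leading bases to keep: the read is cut at the start
    of the first window whose mean quality falls below `min_mean_q`."""
    n = len(quals)
    if n < window:
        # Assumption: a read shorter than the window is treated as one window.
        return n if n and sum(quals) / n >= min_mean_q else 0
    for start in range(n - window + 1):
        if sum(quals[start:start + window]) / window < min_mean_q:
            return start
    return n


def process_read(seq, quals, crop=120, window=40, min_mean_q=15, min_len=36):
    """Crop, quality-trim and length-filter one read.
    Returns (seq, quals) or None if the read is discarded."""
    seq, quals = seq[:crop], quals[:crop]
    keep = sliding_window_keep(quals, window, min_mean_q)
    if keep < min_len:
        return None
    return seq[:keep], quals[:keep]


# Toy read: 60 high quality bases followed by 60 low quality bases.
trimmed = process_read("A" * 120, [30] * 60 + [2] * 60)
```

With a wide window such as 40 bp, a read is cut some distance before the point where every base is of low quality, because the window mean drops below the cut-off as soon as enough low quality bases enter the window.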
High confidence ortholog groups
To compare gene expression across Pooideae, we identified ortholog groups containing one gene from each species that all descended from a single gene in the Pooideae ancestor. As the ortholog groups (OGs) inferred using OrthoMCL sometimes cluster more distantly related homologs as well as include both paraphyletic and monophyletic paralogs, we further refined the OGs by phylogenetic analysis. Several approaches to phylogenetic refinement have been proposed previously (see e.g. [52]). Here we first aligned protein sequences within each OG using mafft v7.130 [53] and converted them to codon alignments using pal2nal v14 [54]. Gene trees were then constructed from the codon alignments using Phangorn v1.99.14 [55] (maximum likelihood GTR+I+G). Trees with apparent duplication events before the most recent common ancestor of the included species were split into several trees. This was accomplished by identifying in-group (Pooideae) and out-group (Z. mays, S. bicolor and O. sativa) clades in each tree, and then splitting the trees so that each resulting sub-tree contained a single out-group and a single in-group clade. Finally, we only retained the trees where all species in the tree formed one clade each (i.e. only monophyletic paralogs), B. distachyon and H. vulgare formed a clade and at least three of the five studied species were included. These trees constituted the high confidence ortholog groups (HCOGs).
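The final topology filter can be sketched in Python under several simplifying assumptions: gene trees are rooted and represented as nested tuples, leaves are named `species|gene` (a hypothetical format), de novo and reference sequences of a species are not distinguished, and both B. distachyon and H. vulgare must be present, which may be stricter than the filter actually applied.

```python
STUDIED = {"Ns", "Sl", "Mn", "Bd", "Hv"}  # hypothetical species abbreviations


def leaves(node):
    """List all leaf names below a node (a leaf is a string)."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def clades(node):
    """Yield the leaf set of every subtree, including single leaves."""
    yield frozenset(leaves(node))
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_high_confidence(tree):
    clade_sets = set(clades(tree))
    by_species = {}
    for leaf in leaves(tree):
        by_species.setdefault(leaf.split("|")[0], set()).add(leaf)
    # Every species must form a single clade (only monophyletic paralogs).
    if any(frozenset(m) not in clade_sets for m in by_species.values()):
        return False
    # B. distachyon and H. vulgare must together form a clade.
    if "Bd" not in by_species or "Hv" not in by_species:
        return False
    if frozenset(by_species["Bd"] | by_species["Hv"]) not in clade_sets:
        return False
    # At least three of the five studied species must be included.
    return len(STUDIED.intersection(by_species)) >= 3


good = ("Os|1", ("Ns|1", (("Sl|1", "Mn|1"), (("Bd|1", "Bd|2"), "Hv|1"))))
paraphyletic = ("Os|1", ("Ns|1", (("Sl|1", "Bd|2"), ("Bd|1", "Hv|1"))))
```

Here `good` passes the filter because the two B. distachyon paralogs are sister to each other, whereas `paraphyletic` is rejected because they are separated by sequences from other species.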
Species tree
Ortholog groups with a single ortholog from each of the five de novo Pooideae species and O. sativa (after splitting the trees, see “High confidence ortholog groups”) were used to infer dated gene trees. To this end, BEAST v1.7.5 [56] was run with an HKY + Γ nucleotide substitution model using an uncorrelated lognormal relaxed clock model. A Yule process (birth only) was used as prior for the tree and the monophyly of the Pooideae was constrained. Prior estimates for the Oryza-Pooideae (53 Mya [SD 3.6 My], [37]) and Brachypodium-Hordeum (44.4 Mya [SD 3.53 My], [28]) divergence times were used to define normally distributed age priors for the respective nodes in the topology. MCMC analyses were run for 10 million generations and parameters were sampled every 10,000 generations. For each gene tree analysis, the first 10 percent of the estimated trees were discarded and the remaining trees were summarized to a maximum clade credibility (MCC) tree using TreeAnnotator v1.7.5. The topology of the species tree was equal to the most common topology among the 3914 MCC trees, with internal node ages set equal to the mean of the corresponding node age distributions of the MCC gene trees.
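The final summarization step can be sketched as below. Each gene tree is represented as a mapping from a clade (the set of species below an internal node) to that node's age, so that the set of clades defines the topology. This representation, and the choice to average a node's age over every gene tree that contains the node, are assumptions made for the sketch.

```python
from collections import Counter
from statistics import mean


def summarize_species_tree(gene_trees):
    """Pick the most common topology and average the node ages.

    gene_trees: list of dicts mapping clade (frozenset of species names)
    to node age in million years (assumed representation).
    """
    topologies = Counter(frozenset(tree) for tree in gene_trees)
    best_topology, n_support = topologies.most_common(1)[0]
    ages = {
        clade: mean(tree[clade] for tree in gene_trees if clade in tree)
        for clade in best_topology
    }
    return best_topology, n_support, ages


# Three toy gene trees over the hypothetical species A, B and C.
toy_trees = [
    {frozenset("AB"): 40.0, frozenset("ABC"): 50.0},
    {frozenset("AB"): 44.0, frozenset("ABC"): 54.0},
    {frozenset("BC"): 30.0, frozenset("ABC"): 52.0},
]
topology, n_support, ages = summarize_species_tree(toy_trees)
```

In the toy data, two of the three trees group A with B, so that topology is selected, and the root age is averaged over all three trees.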
Differential expression
Reads were mapped to the de novo transcriptomes using bowtie v1.1.2 [57], and read counts were calculated with RSEM v1.2.9 [58]. In HCOGs, read counts of paralogs were summed (analogous to so-called monophyly masking [59]) and missing orthologs were assumed not to be expressed (i.e. read counts equal to zero). To identify conserved and diverged cold response across species, we probed each HCOG for differentially expressed genes (DEGs). Specifically, DEGs were identified using DESeq2 v1.6.3 [60] with a model that combined the species factor and the timepoint factor (with timepoints W4/9 as a single level). Pooled samples provided robust estimates of the mean expression in each time point. To also obtain robust estimates of the variance, the model assumed common variance across all timepoints and species within each HCOG, thus taking advantage of both biological replicates available for individual time points within species and the replication provided by analysing several species. For each species, we tested the expression difference between D0 and D1 (short-term response) and the difference between W0 and W4/9 (long-term response) (Fig. 1d). B. distachyon lacked the W9 samples and long-term response was therefore based on W4 only. S. lagascae lacked the W0 sample and long-term response was therefore calculated based on D0. Because the afternoon sample (D0) replaced the missing morning sample (W0) for this species, the observed diurnal effect (Fig. 2a) might have made the estimates of the long-term cold response in S. lagascae less reliable. Genes with a false discovery rate (FDR) adjusted p-value < 0.05 and a fold change > 2 were classified as differentially expressed.
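The count aggregation and the classification thresholds can be sketched in a few lines. This does not reproduce the DESeq2 model: the Benjamini-Hochberg procedure is written out here only as one common way to obtain FDR-adjusted p-values, and the reading of "fold change > 2" as a two-fold change in either direction is an assumption. All function names are hypothetical.

```python
import math


def hcog_counts(paralog_counts, species):
    """Sum paralog read counts per species; missing orthologs count as zero."""
    return {sp: sum(paralog_counts.get(sp, [])) for sp in species}


def benjamini_hochberg(pvalues):
    """Return FDR-adjusted p-values (step-up procedure)."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def is_deg(padj, log2_fold_change, alpha=0.05, min_fold=2.0):
    """Apply the two thresholds; the fold change may be up or down (assumed)."""
    return padj < alpha and abs(log2_fold_change) > math.log2(min_fold)


counts = hcog_counts({"Hv": [10, 5], "Bd": [7]}, ["Hv", "Bd", "Mn"])
padj = benjamini_hochberg([0.001, 0.04, 0.03, 0.5])
```

With these toy p-values, only the first gene passes the FDR threshold, and it is classified as a DEG only if its absolute log2 fold change also exceeds 1.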
Sample clustering
Sample clustering was based on read counts normalized using the variance-stabilizing transformation (VST) implemented in DESeq2 (these VST-values are essentially log transformed). HCOGs that lacked orthologs from any of the five species, or that contained orthologs with low expression (VST < 3), were removed, resulting in 4981 HCOGs used for the clustering. To highlight the effect of the cold treatment over the effect of absolute expression differences between species, the expression values were normalised per gene and species: First, one expression value was obtained per timepoint per gene by taking the mean of the replicates. Then, these expression values were centered by subtracting the mean expression across all timepoints. Distances between all pairs of samples were calculated as the sum of absolute expression differences between orthologs in the 4981 HCOGs (i.e. Manhattan distance). The tree was generated using neighbor joining [61].
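The normalisation and distance calculation described above can be sketched as follows, assuming that expression values for one species are held as per-gene lists keyed by timepoint. The data layout and function names are illustrative only, and the neighbor joining step is not shown.

```python
def replicate_means(replicates):
    """Average replicate samples gene by gene."""
    return [sum(values) / len(values) for values in zip(*replicates)]


def center_across_timepoints(by_timepoint):
    """Subtract each gene's mean over all timepoints within one species."""
    timepoints = list(by_timepoint)
    n_genes = len(by_timepoint[timepoints[0]])
    gene_means = [
        sum(by_timepoint[t][g] for t in timepoints) / len(timepoints)
        for g in range(n_genes)
    ]
    return {
        t: [by_timepoint[t][g] - gene_means[g] for g in range(n_genes)]
        for t in timepoints
    }


def manhattan(a, b):
    """Sum of absolute differences between two samples over shared orthologs."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Two toy species, two orthologs, two timepoints (values on the VST scale).
species1 = center_across_timepoints({
    "D0": replicate_means([[4.0, 8.0], [6.0, 8.0]]),
    "D1": [7.0, 6.0],
})
species2 = center_across_timepoints({"D0": [10.0, 3.0], "D1": [12.0, 3.0]})
d_same_time = manhattan(species1["D1"], species2["D1"])
d_diff_time = manhattan(species1["D0"], species2["D1"])
```

After centering, the large difference in absolute expression between the two toy species no longer contributes to the distance, so samples from the same timepoint end up closer than samples from different timepoints.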
Comparison with known cold responsive genes
A set of H. vulgare genes independently identified as cold responsive was acquired from supplementary table S10 in [38]. These genes were found to be responsive to cold in three independent experiments with Plexdb accessions BB64 [62], BB81 (no publication) and BB94 [38]. The probesets of the Affymetrix Barley1 GeneChip microarray used in these studies were aligned (blastx) against all protein sequences in our OGs. Each probe was assigned to the OG with the best match in the H. vulgare reference. If several probes were assigned to the same OG, only the probe with the best hit was retained. Correspondingly, if a probe matched several paralogs within the same OG, only the best match was retained. DESeq2 was used to identify short-term response DEGs for all transcripts in all OGs (i.e. this analysis was not restricted to the HCOGs), and these were compared to DEGs from [38]. The statistical significance of the overlap between our results and those reported in [38] was assessed for each species by counting the number of genes that had the same response (up- or down-regulated DEGs) and comparing that to a null distribution. The null distribution was obtained from equivalent counts in 100 000 trials where genes were randomly selected from all expressed genes (mean read count > 10) with an ortholog in H. vulgare.
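A resampling test of this kind can be sketched as below. The details are assumptions: each trial draws as many background genes as there are reference genes, pairs them with the reference directions, and counts matches with our DEG calls, and the empirical p-value uses the common add-one correction. The example runs 1000 trials rather than 100 000 to keep it fast.

```python
import random


def same_response_count(genes, directions, our_degs):
    """Count genes whose DEG direction in our data matches the given direction."""
    return sum(1 for g, d in zip(genes, directions) if our_degs.get(g) == d)


def permutation_pvalue(ref_genes, ref_directions, our_degs, background,
                       trials=1000, seed=1):
    """Empirical p-value for the observed number of same-direction DEGs."""
    rng = random.Random(seed)
    observed = same_response_count(ref_genes, ref_directions, our_degs)
    at_least = 0
    for _ in range(trials):
        drawn = rng.sample(background, len(ref_genes))
        if same_response_count(drawn, ref_directions, our_degs) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (trials + 1)


# Toy data: 200 expressed genes, 4 of which are DEGs in our data.
background = ["g%d" % i for i in range(200)]
our_degs = {"g0": "up", "g1": "up", "g2": "down", "g3": "up"}
ref_genes = ["g0", "g1", "g2", "g50"]
ref_directions = ["up", "up", "down", "up"]
observed, pvalue = permutation_pvalue(ref_genes, ref_directions, our_degs, background)
```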
Gene ontology enrichment tests
Gene Ontology (GO) annotations for B. distachyon were downloaded from Ensembl Plants Biomart and assigned to the HCOGs. The topGO v2.18.0 R package [63] was used to calculate statistically significant enrichments (Fisher’s exact test, p < 0.05) of GO biological process annotations restricted to GO plant slim in each set of branch-specific DEGs using all annotated HCOGs as the background. Branch-specific DEGs were those genes that were exclusively differentially expressed in all species within a clade in the phylogenetic tree.
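The definition of branch-specific DEGs amounts to a set operation, sketched here under the assumption that DEG calls are stored as one set of HCOG identifiers per species. Species labels and identifiers in the example are hypothetical, and the direction of regulation is ignored.

```python
def branch_specific_degs(deg_by_species, clade):
    """HCOGs that are DEGs in every species of the clade and in no other species."""
    inside = [deg_by_species[sp] for sp in clade]
    outside = [deg_by_species[sp] for sp in deg_by_species if sp not in clade]
    shared = set.intersection(*inside)   # DEG in all clade members
    excluded = set().union(*outside)     # DEG in any species outside the clade
    return shared - excluded


deg_by_species = {
    "A": {"og1", "og2", "og3"},
    "B": {"og1", "og2"},
    "C": {"og2", "og4"},
}
specific = branch_specific_degs(deg_by_species, {"A", "B"})
```

In the toy data, og2 is shared by A and B but is also a DEG in C, so only og1 is specific to the branch leading to A and B.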
Positive selection tests
Each of the HCOGs was tested for positive selection using the branch-site model in codeml, which is part of PAML v4.7 [64]. We only tested branches for positive selection in HCOGs meeting the following criteria: (i) The tested branch had to be an internal branch also in the gene tree (i.e. there were at least two species below the branch). (ii) The species below and above the tested branch in the gene tree had to be the same as in the species tree or a subset thereof. (iii) The first species to split off under the branch had to be the same as in the species tree (for the early split, either S. lagascae or M. nutans was allowed). We then used the hypergeometric test to identify statistically significant overrepresentation of positive selection among branch-specific DEGs (see “Gene ontology enrichment tests”) at the Pooideae base (PB), the early split (ES) and the late split (LS) branches.
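The overrepresentation test can be written out as an upper-tail hypergeometric probability. The mapping of quantities is an assumption made for this sketch: N is the number of tested HCOGs for a branch, K the number of those with a signal of positive selection, n the number of branch-specific DEGs among them, and k the number of branch-specific DEGs that also show positive selection.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N items, K marked, n drawn)."""
    total = comb(N, n)
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Toy numbers: 10 tested HCOGs, 4 under positive selection,
# 3 branch-specific DEGs, 2 of which are under positive selection.
p = hypergeom_upper_tail(2, 10, 4, 3)
```

For the toy numbers, the tail probability is (36 + 4) / 120 = 1/3, so two selected genes among three DEGs would not be a significant overrepresentation.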
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Availability of data and material
The raw reads, the assembled transcripts and the raw read counts are available in the ArrayExpress database (www.ebi.ac.uk/arrayexpress) under accession number E-MTAB-5300.
Competing interests
The authors declare that they have no competing interests.
Funding
The research was funded by grants from the Nansen Foundation to SF and the TVERRforsk program at the Norwegian University of Life Sciences (NMBU) to SF, TRH and SRS. This work was part of the PhD projects of MS and LG funded by NMBU.
Authors’ contributions
All authors designed the experiment. M.S. performed the growth experiments, sampled and prepared RNA for sequencing, helped design the data analysis pipeline, contributed to the positive selection analysis and performed the phylogenetic analyses. L.G. developed, implemented and conducted the data analysis. All authors interpreted the results. M.S. and L.G. wrote the manuscript with input from S.F., T.R.H., and S.R.S.
Acknowledgements
We thank Åsmund Bjørnstad and USDA-NPGS GRIN for providing seeds of H. vulgare, and B. distachyon and S. lagascae, respectively. For technical assistance handling plants during growth experiments we thank Øyvind Jørgensen. We are grateful to Erica Leder, Thomas Marcussen, Ursula Brandes and Camilla Lorange Lindberg for helpful comments on earlier versions of this manuscript.