Using ribosome profiling to quantify differences in protein expression: a case study in Saccharomyces cerevisiae oxidative stress conditions

William R. Blevins; Teresa Tavella; Simone G. Moro; Bernat Blasco-Moreno; Adrià Closa-Mosquera; Juana Díez; Lucas B. Carey; M. Mar Albà

doi:10.1101/501478

Abstract

Cells respond to changes in the environment by modifying the concentration of specific proteins. Paradoxically, the cellular response is usually examined by measuring variations in transcript abundance by high throughput RNA sequencing (RNA-Seq), instead of directly measuring protein concentrations. This happens because RNA-Seq-based methods provide better quantitative estimates, and more extensive gene coverage, than proteomics-based ones. However, variations in transcript abundance do not necessarily reflect changes in the corresponding protein abundance. How can we close this gap? Here we explore the use of ribosome profiling (Ribo-Seq) to perform differential gene expression analysis in a relatively well-characterized system, oxidative stress in baker’s yeast. Ribo-Seq is an RNA sequencing method that specifically targets ribosome-protected RNA fragments, and thus is expected to provide a more accurate view of changes at the protein level than classical RNA-Seq. We show that gene quantification by Ribo-Seq is indeed more highly correlated with protein abundance, as measured from mass spectrometry data, than quantification by RNA-Seq. The analysis indicates that, whereas a subset of genes involved in oxidation-reduction processes is detected by both types of data, the majority of the genes that happen to be significant in the RNA-Seq-based analysis are not significant in the Ribo-Seq analysis, suggesting that they do not result in protein level changes. The results illustrate the advantages of Ribo-Seq to make inferences about changes in protein abundance in comparison with RNA-Seq.

Introduction

In recent years high throughput RNA sequencing (RNA-Seq) has become the method of choice for comparing gene expression changes of cells grown under different conditions (Rapaport et al., 2013). The relatively low cost of RNA-Seq, together with the availability of efficient computational methods to process information from millions of sequencing reads, has undoubtedly accelerated our understanding of gene regulation. However, a change in mRNA relative abundance does not always imply a change in the amount of the encoded protein (Schwanhäusser et al., 2011). Filling this gap in understanding is essential to discern the functional changes in the cell upon a given stimulus.

Many studies have shown that mRNA levels only partially explain protein levels in the cell (de Sousa Abreu et al., 2009; Schwanhäusser et al., 2011; Payne, 2015; Ponnala et al., 2014). In yeast, the correlation between mRNA and protein abundance is typically in the range 0.6-0.7 (de Sousa Abreu et al., 2009). In addition, the ratio between protein and mRNA levels may vary across different conditions. For instance, substantial differences in this ratio have been observed during osmotic stress in yeast (Lee et al. 2011) or after the treatment of human cells with epidermal growth factor (Tebaldi et al., 2012). This strongly suggests that measuring changes in mRNA levels may often be insufficient to identify the functional shifts taking place in the cell upon a given stimulus.

Protein quantification is often performed using whole proteome mass spectrometry-based methods (Gerber et al., 2003; Edfors et al., 2016). These methods provide a direct measurement of protein abundance but they also have limitations, especially for the detection of lowly expressed and/or small proteins (Slavoff et al., 2013). An alternative way to estimate protein levels is the sequencing of ribosome-protected mRNA fragments, or ribosome profiling (Ribo-Seq) (Ingolia et al., 2009, 2011; Aspden et al., 2014; Ruiz-Orera et al., 2014). In contrast to RNA-Seq, which measures the total amount of mRNA in the cell, Ribo-Seq only captures those mRNAs that are being actively translated. Although Ribo-Seq measures translation, which is an indirect estimate of protein abundance, it has the advantage over proteomics that virtually any mRNA can be interrogated. In addition, Ribo-Seq reads can be quantified in the same manner as RNA-Seq reads. This implies that we can use the same pipelines as for RNA-Seq to identify differentially expressed genes.

It has been proposed that alterations in the ratio between the relative number of Ribo-Seq and RNA-Seq reads mapping to a given locus, known as the translation efficiency (TE), can be used to identify putative translation activation or repression events (Ingolia, 2016). Numerous recent studies have used ribosome profiling data to study translation regulatory mechanisms (Jungfleisch et al., 2017; Yordanova et al., 2018) or to discover new translated RNA sequences (Michel et al. 2012; Aspden et al. 2014; Ingolia et al. 2014; Ruiz-Orera et al. 2014).

Here we perform differential gene expression analysis using RNA-Seq and Ribo-Seq data during oxidative stress in Saccharomyces cerevisiae, a condition that is known to trigger important regulatory changes both at the transcriptional and translational levels (Shenton et al., 2006; Gerashchenko et al., 2012). We compare the results to proteomics data obtained from the same samples. The results show that the dynamics of total mRNA and translated mRNAs are very distinct, and that most changes in the relative amount of mRNA do not appear to have any consequences at the protein level. The study opens a door for a more generalized use of Ribo-Seq data to measure changes in protein expression across conditions.

Results and Discussion

Quantification of gene expression by Ribo-Seq and RNA-Seq

We extracted ribosome-protected RNA fragments, as well as total polyadenylated RNAs, from Saccharomyces cerevisiae grown in rich medium (normal) and in H₂O₂-induced oxidative stress conditions (stress). We then sequenced ribosome-protected RNAs (Ribo-Seq) as well as complete polyA+ mRNAs (RNA-Seq) using a strand-specific protocol. The Ribo-Seq data corresponded to the translated mRNA fraction (translatome), whereas the RNA-Seq data corresponded to total mRNAs (transcriptome). For comparison we also estimated protein concentrations (proteome) in the two conditions by mass spectrometry (Figure 1).

Figure 1. Experimental design.

Baker’s yeast (S. cerevisiae) was grown in rich medium or oxidative stress conditions. The cultures were used to extract total RNA, ribosome-protected RNA fragments and proteins.

Figure 2. Representative gene expression correlations between RNA sequencing samples.

A. RNA-Seq normal replicate 1 versus Ribo-Seq normal replicate 1. B. RNA-Seq stress replicate 1 versus Ribo-Seq stress replicate 1. C. RNA-Seq normal replicate 1 versus RNA-Seq normal replicate 2. D. Ribo-Seq normal replicate 1 versus Ribo-Seq normal replicate 2. Expression units are CPM in logarithm scale; R: Spearman correlation value. N: normal growth conditions (two replicates N1 and N2); S: stress conditions (two replicates S1 and S2).

After quality control of the sequencing reads we obtained 31-36 million reads for Ribo-Seq and 12-15 million reads for RNA-Seq (Supplementary Table S1). We mapped the reads to the genome and generated a table of gene counts for each of the samples. After filtering out non-expressed genes (see Methods), the table contained data for 5,419 S. cerevisiae genes. Using mass spectrometry (mass spec) we could quantify the protein products of 2,200 genes (see Methods), representing about 40% of the genes quantified by RNA-Seq.

We normalized the RNA-Seq and Ribo-Seq-based table of counts by calculating counts per million (CPM) in logarithmic scale, or log₂CPM (Supplementary Figure S1). The normalized gene expression values showed a very high correlation between biological replicates, with a correlation coefficient larger than 0.99 between all pairs of Ribo-Seq or RNA-Seq replicates (Supplementary Table S2). In contrast, normalized protein abundances between pairs of proteomics replicates showed correlation coefficients between 0.83 and 0.93 (Supplementary Table S3), indicating that quantification by proteomics is less reproducible than quantification by RNA-Seq and Ribo-Seq.
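As a rough illustration of the normalization described above, the following Python sketch converts the raw gene counts of one sample into log₂CPM. The function name, the toy counts and the small prior count added to avoid taking the logarithm of zero are assumptions made for this example; they are not taken from the pipeline used in this study.

```python
import math


def log2_cpm(counts, prior_count=0.5):
    """Return log2 counts-per-million for one sample.

    counts: dict mapping gene name -> raw read count.
    prior_count: small offset (an assumption) so that zero counts
    do not produce log2(0).
    """
    library_size = sum(counts.values())
    return {
        gene: math.log2((n + prior_count) / library_size * 1e6)
        for gene, n in counts.items()
    }


# Hypothetical counts for a three-gene sample.
toy_sample = {"geneA": 500, "geneB": 1500, "geneC": 0}
toy_log2_cpm = log2_cpm(toy_sample)
```

Because each count is divided by the total number of reads in its own sample, values computed this way can be compared between libraries of different depth.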

Importantly, the Ribo-Seq data correlated better with the proteomics data than the RNA-Seq data did; in the first case the correlation was 0.67-0.71 and in the second one 0.46-0.62 (Figure 3). This supports the notion that Ribo-Seq provides a more accurate view of protein expression than RNA-Seq (Ingolia et al., 2009).

Figure 3. Proteomics shows a stronger correlation with Ribo-Seq than with RNA-Seq data.

A. RNA-Seq versus proteomics, normal growth conditions. B. RNA-Seq versus proteomics, oxidative stress. C. Ribo-Seq versus proteomics, normal growth conditions. D. Ribo-Seq versus proteomics, oxidative stress. CPM: counts per million for RNA-Seq and Ribo-Seq data (represented in logarithmic scale, average between replicates). log₂ normalized area: relative abundance for proteomics data (average between replicates). R: Spearman correlation value. Plots and correlations represent the 2,200 genes for which at least 3 unique peptides were detected by LCMSMS.
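The correlation values reported here are Spearman coefficients, that is, Pearson correlations computed on ranks. The sketch below is a minimal pure-Python version, given only to make the statistic concrete; the function names are hypothetical and the software actually used to compute R is not stated in the text.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Since only ranks are used, any monotonic relationship between sequencing-based and proteomics-based abundances yields a high coefficient, even when the two are measured in different units.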

We next clustered the RNA-Seq and Ribo-Seq gene expression values using multidimensional scaling (MDS) (Borg and Groenen, 1997) (Supplementary Figure S2). Remarkably, the Ribo-Seq measurements for the two conditions (normal and stress) were more similar to each other than either was to the condition-matched RNA-Seq measurements, and the same was true of the RNA-Seq-based measurements. Thus, the sequencing approach employed is expected to have a strong impact on the results.

Next, we calculated the fold change (FC) gene expression difference between conditions, taking the average expression values between replicates of the same experimental condition. In agreement with the results obtained with MDS, the log₂FC distribution based on the Ribo-Seq data had a lower variance than the log₂FC distribution using RNA-Seq data (Figure 4). We considered the possibility that this pattern was due to the number of Ribo-Seq reads being 2-3 times larger than the number of RNA-Seq reads (Supplementary Table S1). To test for this, we subsampled the mapped reads so as to have a similar number of reads in all the RNA-Seq and Ribo-Seq samples (Supplementary Tables S4 and S5). We again observed a lower log₂FC variance for Ribo-Seq than for RNA-Seq (Supplementary Figure S3), indicating that the observed variance difference has a biological origin.

Figure 4. Distribution of gene expression fold change (FC) differences in logarithmic scale.

FC was calculated as the ratio between the number of reads in oxidative stress and normal conditions. We took the average number of reads per gene among the replicates. The standard deviation of log₂FC was 0.44 for Ribo-Seq (RP) and 0.57 for RNA-Seq (RNA).
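The fold change calculation summarized in this legend can be sketched in Python as follows. This is an illustration under stated assumptions: the gene names and counts are invented, the input counts are assumed to be already scaled for library size, and the optional pseudocount is not part of the described procedure.

```python
import math
import statistics


def log2_fold_changes(normal, stress, pseudocount=1.0):
    """Return log2(stress / normal) per gene from replicate averages.

    normal, stress: dict mapping gene -> list of per-replicate counts,
    assumed to be already corrected for sequencing depth.
    pseudocount: optional offset (an assumption) guarding against zeros.
    """
    fold_changes = {}
    for gene, normal_reps in normal.items():
        mean_normal = statistics.mean(normal_reps)
        mean_stress = statistics.mean(stress[gene])
        fold_changes[gene] = math.log2(
            (mean_stress + pseudocount) / (mean_normal + pseudocount)
        )
    return fold_changes


# Hypothetical two-replicate data for three genes.
toy_normal = {"g1": [100, 120], "g2": [50, 50], "g3": [10, 14]}
toy_stress = {"g1": [220, 220], "g2": [25, 25], "g3": [12, 12]}
toy_fc = log2_fold_changes(toy_normal, toy_stress, pseudocount=0.0)

# Width of the log2FC distribution, as compared between data types.
toy_sd = statistics.stdev(list(toy_fc.values()))
```

Computing the standard deviation separately for the Ribo-Seq and RNA-Seq distributions gives the kind of comparison reported above (0.44 versus 0.57).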

Differential gene expression analysis

We performed differential gene expression analysis, separately for Ribo-Seq and RNA-Seq data, using multivariable linear regression with the Limma package (Law et al., 2014). Limma provides a list of differentially expressed genes with the corresponding adjusted p-values. We selected genes with an adjusted p-value < 0.05 and an absolute log₂FC larger than one standard deviation of the log₂FC distribution; the latter corresponded to a minimum FC of 1.49 for RNA-Seq data and 1.36 for Ribo-Seq data. We used the standard deviation instead of a fixed value to accommodate the differences in the width of the log₂FC distributions (Figure 4).
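The selection rule above can be sketched in Python as follows. The sketch assumes that per-gene log₂FC values and adjusted p-values have already been obtained from a separate differential expression tool; the function name and the toy values are hypothetical.

```python
import statistics


def select_differential(results, alpha=0.05):
    """Split genes into up- and down-regulated sets.

    results: dict mapping gene -> (log2fc, adjusted_p), assumed to come
    from an upstream differential expression analysis.
    A gene is kept if adjusted_p < alpha and |log2fc| exceeds one
    standard deviation of all log2fc values in `results`.
    """
    sd = statistics.stdev([lfc for lfc, _ in results.values()])
    up = {g for g, (lfc, p) in results.items() if p < alpha and lfc > sd}
    down = {g for g, (lfc, p) in results.items() if p < alpha and lfc < -sd}
    return up, down, sd


# Hypothetical results for six genes.
toy_results = {
    "g1": (1.2, 0.01),
    "g2": (-1.5, 0.001),
    "g3": (0.1, 0.5),
    "g4": (0.9, 0.2),
    "g5": (-0.05, 0.9),
    "g6": (0.05, 0.8),
}
toy_up, toy_down, toy_sd_threshold = select_differential(toy_results)

# The threshold on the linear scale is 2 ** sd.
toy_min_fc = 2 ** toy_sd_threshold
```

Because the threshold is derived from each data set's own log₂FC distribution, a narrower distribution translates into a lower minimum fold change.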

We obtained 817 up-regulated genes during oxidative stress using RNA-Seq data, compared to only 92 with Ribo-Seq data. Thus, the vast majority of the genes identified as up-regulated in stress with RNA-Seq data were not significantly up-regulated when using the Ribo-Seq data to do the same analysis. The number of down-regulated genes was 846 and 519 for RNA-Seq and Ribo-Seq, respectively. Overall, only a small fraction of the differentially expressed genes was common to both approaches (5-10%, see below).

The induction of oxidative stress by hydrogen peroxide (H₂O₂) results in an excess of reactive oxygen species (ROS) in the cell. This is known to activate the expression of several protein families including thioredoxins, hexokinases, and heat shock proteins (Morano et al., 2012). The set of up-regulated genes identified by both RNA-Seq and Ribo-Seq included several members of these families (e.g. HXK2, TDH1, CYC1, HSP10), consistent with transcriptional activation of genes directly involved in stress response.

Attempts to use the same pipeline to identify differentially expressed genes using the proteomics data did not yield significant results. The reproducibility of protein abundance estimates using mass spec data is not as high as the reproducibility of gene expression levels in the case of RNA sequencing data, which decreases the power of differential gene expression analysis using this kind of data (Supplementary Table S3).

Uncoupling between changes at the transcriptome and translatome levels

The correlation between RNA-Seq and Ribo-Seq gene log₂FC values was quite low (0.18), supporting an important disconnect between the two kinds of data (Figure 5). We quantified the number of genes that showed a significant change in the same direction, i.e. homodirectional changes. There were 38 genes that were up-regulated during stress using both RNA-Seq and Ribo-Seq data; this is a small number but still more than double the number expected by chance (15 genes). The number of homodirectional down-regulated genes was 89, compared to 55 expected by chance. In summary, while there was a modest overlap between the stories told by RNA-Seq and Ribo-Seq data (test of proportions p-value < 1.32×10⁻⁵), the majority of the differentially expressed genes were not concordant.

Figure 5. Correlation between gene expression fold changes with RNA-Seq and Ribo-Seq data.

Fold change (FC) gene expression values are shown in logarithmic scale. The X axis corresponds to the RNA-Seq data, or transcriptome, the Y axis to the Ribo-Seq data, or translatome. The number of down-regulated and up-regulated genes is indicated. Coloured dots correspond to differentially expressed genes. In the legend homodirectional means up-regulated, or down-regulated, at the transcriptome and translatome levels; opposite_change is up-regulated at one level and down-regulated at the other one.

Dissecting differential regulation by functional class

To better understand the biological relevance of the above results, we investigated if certain functional classes were significantly enriched among the sets of differentially expressed genes. We used DAVID (Huang et al., 2009) to identify significantly over-represented functional clusters (Figure 6). Only one class, ‘oxidation-reduction process’, was enriched among genes up-regulated during stress both using RNA-Seq and Ribo-Seq data. This is consistent with transcriptional activation of this set of genes upon stress, increasing the signal for both total mRNA and the translated fraction. Three other classes – ‘translation’, ‘ATPase’ and ‘proteasome’ – showed increased mRNA levels during stress, but this was not reflected in an increase in the translated fraction. Thus, it is likely that an important part of these transcripts is stored in a translationally inactive form during stress, for example in P-bodies or stress granules (Zid and O’Shea, 2014; Khong et al., 2017; Luo et al., 2018). In this case, an accumulation of transcripts would be detected by RNA-Seq but not by Ribo-Seq, as translation of the transcripts is impaired.

Interestingly, there were functions that only appeared when we performed differential gene expression analysis with the Ribo-Seq data: ‘cell wall’, ‘mitochondrial intermembrane space’ and ‘catalytic activity’ were enriched among up-regulated genes, whereas ‘cell cycle’ was enriched among down-regulated genes (Figure 6). As these classes are not detected by RNA-Seq, they are candidates to be regulated at the translational level only. An alternative possibility is that the storage of some transcripts in stress granules distorts the RNA-Seq patterns to such a degree that some truly up-regulated genes become undetectable with RNA-Seq; they would only be detected when examining actively translated mRNAs with Ribo-Seq.

Figure 6. Significant gene functional classes among differentially expressed genes.

Shown is a 2-D plot of the enrichment score values, in logarithmic scale, provided by the software DAVID for differentially expressed genes using RNA-Seq (transcriptome) or Ribo-Seq (translatome) data. Significant enrichment scores are associated with a p-val < 0.05. Functional classes associated with positive values are significantly enriched among up-regulated genes, and functional classes with negative values are significantly enriched among down-regulated genes. Non-significant enrichment scores are given a value of 0 in the plot.

Translation inhibition of cell cycle genes

In order to further identify possible translational regulatory events we compared the translational efficiency (TE; Ribo-Seq reads divided by RNA-Seq reads) of the different genes in the two conditions using the program RiboDiff (Zhong et al., 2017). This approach is based on the assumption that the number of Ribo-Seq reads is proportional to the amount of translated protein. We detected 470 genes that showed increased TE, and 714 genes that showed decreased TE, in oxidative stress versus normal growth conditions (adjusted p-value < 0.05; see Methods).

We reasoned that genes whose translation becomes more active during stress should have increased TE values but also be classified as up-regulated when using Ribo-Seq for differential gene expression analysis. We only found 17 genes fulfilling both conditions (3.6% of the genes with increased TE), indicating that activation of translation probably has a relatively small impact on the response to oxidative stress. In the vast majority of cases the increase in TE could be explained by a decrease in RNA-Seq signal during stress (Supplementary Table S6).

By the same token, genes whose translation is repressed during stress are expected to have decreased TE values but also be classified as down-regulated by Ribo-Seq. We found 246 such genes (34.4% of the genes with decreased TE), suggesting that this mechanism may be more prevalent. Among them there were 12 genes from the cell cycle functional category (Supplementary Table S7). The putative translational repression of these genes did not appear to be mediated by increased translation of upstream ORFs (Gerashchenko et al., 2012), as we did not detect any increase in the number of Ribo-Seq reads mapping to 5’UTR regions when compared to coding sequences in stress conditions.
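The reasoning in this section rests on how the TE ratio responds to changes in its numerator and denominator. The following Python sketch computes a simple point estimate of the log₂ change in TE between conditions; it is not the statistical model implemented in RiboDiff, and the read counts are invented for illustration.

```python
import math


def log2_te_change(ribo_normal, rna_normal, ribo_stress, rna_stress):
    """Return log2(TE_stress / TE_normal), with TE = Ribo-Seq / RNA-Seq.

    Inputs are assumed to be depth-normalized read counts for one gene.
    """
    te_normal = ribo_normal / rna_normal
    te_stress = ribo_stress / rna_stress
    return math.log2(te_stress / te_normal)


# Case 1: Ribo-Seq signal unchanged, RNA-Seq signal halved.
# TE doubles although the amount of translated mRNA is the same.
rna_driven = log2_te_change(
    ribo_normal=100, rna_normal=200, ribo_stress=100, rna_stress=100
)

# Case 2: Ribo-Seq signal drops fourfold, RNA-Seq signal unchanged.
# TE falls and the gene would also appear down-regulated by Ribo-Seq.
repressed = log2_te_change(
    ribo_normal=100, rna_normal=100, ribo_stress=25, rna_stress=100
)
```

The first case shows why an increase in TE alone was not taken as evidence of translational activation: the ratio rises whenever the RNA-Seq signal falls, so the Ribo-Seq signal itself also has to change.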

Concluding remarks

The adaptation of organisms to variations in the environmental conditions is associated with the activation or repression of the expression of particular genes. These changes are usually studied at the level of complete mRNA molecules using microarrays or next generation sequencing. However, changes in mRNA concentration do not necessarily reflect changes in their encoded protein products; rather, uncoupling between total and polysomal mRNA levels has been observed in many different conditions (Tebaldi et al., 2012; Shenton et al., 2006).

Ribo-Seq specifically targets ribosome-protected mRNAs, providing a closer view of protein expression than RNA-Seq, which captures total mRNA. Although Ribo-Seq is more labour-intensive than RNA-Seq, the protocols are being simplified and its use is rapidly growing (Reid et al. 2015; Xie et al. 2016; Liu et al. 2018; Michel et al. 2018). Here we have used Ribo-Seq data to perform differential gene expression analysis during oxidative stress, and compared the results to RNA-Seq and to proteomics data.

We have shown that gene expression levels inferred from Ribo-Seq data correlate better with protein abundance than those inferred from RNA-Seq data. Remarkably, many of the genes that are classified as differentially regulated using RNA-Seq do not show a similar effect when the Ribo-Seq data is analyzed, strongly suggesting that, for these genes, no significant changes at the protein level take place. The methodological framework we have developed here can be applied to other conditions and help advance our understanding of gene regulation.

Methods

Biological material

We grew S. cerevisiae (S288C) in 500 ml of rich medium (Tsankov et al., 2010). In order to induce oxidative stress, 30 minutes before harvesting we added diluted H₂O₂ to the medium for a final concentration of 1.5 mM. The cells were harvested in log growth phase (OD600 of ~0.25) via vacuum filtration and frozen with liquid nitrogen.

Ribosome profiling

In order to capture ribosome-protected mRNAs, cycloheximide was added one minute before the cells were harvested. Cycloheximide is commonly used as a protein synthesis inhibitor in order to prevent ribosome run-off and the subsequent loss of ribosome-transcript complexes. One third of each culture was used for ribosome profiling (Ribo-Seq); the rest was reserved for RNA-Seq.

Cells were lysed using the freezer/mill method (SPEX SamplePrep); after preliminary preparations, lysates were treated with RNaseI (Ambion), and subsequently with SUPERaseIn (Ambion). Monosomal fractions were collected; SDS was added to stop any possible RNase activity, then samples were flash-frozen with N₂(l). Digested extracts were loaded in 7%-47% sucrose gradients. RNA was isolated from monosomal fractions using the hot acid phenol method. Ribosome-Protected Fragments (RPFs) were selected by isolating RNA fragments of 28-32 nucleotides (nt) using gel electrophoresis. The preparation of sequencing libraries for Ribo-Seq and RNA-Seq was based on a previously described protocol (Ingolia et al., 2012). Paired-end sequencing reads of size 35 nucleotides (2×35bp) were produced for Ribo-Seq and RNA-Seq on MiSeq and NextSeq platforms, respectively. The data has been deposited at NCBI Bioproject PRJNA435567 (https://www.ncbi.nlm.nih.gov/bioproject/435567).

Processing of the sequencing data

The RNA-Seq data was filtered using Trimmomatic with default parameters (version 0.36)(Bolger et al., 2014). In the Ribo-Seq data we discarded the second read pair as it was redundant and of poorer quality than the first read, and then used Cutadapt (Martin, 2011) to eliminate the adapters and to trim five and four nucleotides at 5’ and 3’ edges, respectively. Ribosomal RNA was depleted from the Ribo-Seq data in silico by removing all reads which mapped to annotated rRNAs. Ribo-Seq reads shorter than 25 nucleotides were not used.

After quality check and read trimming, the reads were aligned against the S. cerevisiae genome (S288C R64-2-1) using Bowtie 2 (Langmead et al., 2009). For annotation we used a previously generated S. cerevisiae transcriptome containing 6,184 annotated coding sequences plus 1,009 non-annotated assembled transcripts (see Supplementary data). SAMtools (Li et al., 2009) was used to filter out unmapped reads.

We counted the number of reads that mapped to each gene with HTSeq-count (Anders et al., 2015). We used the mode ‘intersection-strict’ to generate a table of counts from the data; the procedure removed about 5% of the reads in the case of RNA-Seq, and 8% in the case of Ribo-Seq. Only genes in which the average read count of the two replicates was larger than 10 in all conditions (normal and stress, for RNA-Seq and for Ribo-Seq) were kept. The filtered table of counts contained data for 5,419 genes.
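The expression filter described above can be sketched in Python as follows. The group labels and counts are hypothetical; only the rule itself (average of the replicates larger than 10 in every condition and data type) comes from the text.

```python
def passes_expression_filter(gene_counts, min_mean=10):
    """Return True if the replicate average exceeds min_mean in every group.

    gene_counts: dict mapping a group label (for example "RNA_normal")
    to the list of raw read counts of its replicates for one gene.
    """
    return all(
        sum(replicates) / len(replicates) > min_mean
        for replicates in gene_counts.values()
    )


# Hypothetical genes with two replicates per group.
toy_gene_kept = {
    "RNA_normal": [40, 52],
    "RNA_stress": [35, 41],
    "Ribo_normal": [15, 20],
    "Ribo_stress": [12, 11],
}
toy_gene_dropped = {
    "RNA_normal": [40, 52],
    "RNA_stress": [35, 41],
    "Ribo_normal": [15, 20],
    "Ribo_stress": [12, 8],  # average is exactly 10, not larger than 10
}
```

Requiring the threshold in all four groups means that a gene silent in only one condition is excluded, which keeps the RNA-Seq and Ribo-Seq analyses on the same gene set.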

For subsampling the number of mapped reads we used SAMtools (Li et al., 2009). We used the function ‘samtools view’ with option ‘-s 0.X’, where 0.X is the fraction of reads that we wish to keep.
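One way to choose the fraction passed to the subsampling option is to scale every sample down to the size of the smallest library. The Python sketch below illustrates that calculation; the choice of the smallest library as the target, the sample names and the read numbers are assumptions for the example, not values from Supplementary Tables S4 and S5.

```python
def subsampling_fractions(mapped_reads):
    """Return, per sample, the fraction of reads to keep.

    mapped_reads: dict mapping sample name -> number of mapped reads.
    The target depth is the smallest library (an assumption), so every
    fraction is at most 1.0.
    """
    target = min(mapped_reads.values())
    return {sample: target / n for sample, n in mapped_reads.items()}


# Hypothetical mapped read numbers.
toy_mapped = {
    "RNA_N1": 12_000_000,
    "Ribo_N1": 30_000_000,
    "Ribo_S1": 24_000_000,
}
toy_fractions = subsampling_fractions(toy_mapped)
```

A sample whose fraction equals 1.0 is already at the target depth and does not need to be subsampled.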

Differential gene expression analysis

The table of counts was normalized to log₂ Counts per Million (log₂CPM), in order to account for the different number of total reads in each sample. Before performing differential gene expression analysis, we normalized the data using Trimmed Mean of M-values (TMM) as implemented in the package edgeR (Robinson et al., 2010). Finally, we applied the Limma voom method (Law et al., 2014) to identify differentially expressed genes, separately for RNA-Seq and Ribo-Seq data (adjusted p-value < 0.05 and |log₂FC| > 1 SD(log₂FC)).

We also performed the same kind of analysis for the proteomics data. We used genes which had at least 3 unique peptides and could be quantified in all 6 replicates (1,580 genes); the procedure did not identify any significantly up- or down-regulated genes, using an adjusted p-value < 0.05.

Quantification of protein abundance by mass spectrometry

For our proteomics experiment, we analysed 3 replicates per condition by LCMSMS using a 90-min gradient in the Orbitrap Fusion Lumos. These samples were not treated with cycloheximide. As a quality control measure, BSA controls were digested in parallel and run between samples to avoid carry-over and assess the instrument performance. The peptides were searched against SwissProt Yeast database, using the Mascot v2.5.1 search algorithm. The search was performed with the following parameters: peptide mass tolerance MS1 7 ppm and peptide mass tolerance MS2 0.5 Da; three maximum missed cleavages; trypsin digestion after K or R except KP or KR; dynamic modifications oxidation (M) and acetyl (N-term), static modification carbamidomethyl (C). Protein areas were obtained from the average area of the three most intense unique peptides per protein group. Considering the data from all 6 samples, we detected proteins from 3,336 genes. We limited our quantitative analysis to a subset of 2,200 proteins which had proteomics hits for at least 3 unique peptides; this filter eliminates noise arising from technical challenges of quantifying lowly abundant proteins with LCMSMS.
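The protein area calculation described above can be sketched in Python as follows. Combining the averaging step and the minimum number of unique peptides in a single function is a simplification made for this example, and the peptide areas are invented.

```python
def protein_area(peptide_areas, top_n=3, min_unique_peptides=3):
    """Return the mean area of the top_n most intense unique peptides.

    peptide_areas: list of areas, one per unique peptide of a protein group.
    Returns None when fewer than min_unique_peptides are available,
    mirroring the filter applied before quantitative analysis.
    """
    if len(peptide_areas) < min_unique_peptides:
        return None
    top = sorted(peptide_areas, reverse=True)[:top_n]
    return sum(top) / len(top)


# Hypothetical peptide areas for two protein groups.
toy_quantified = protein_area([10.0, 50.0, 30.0, 20.0])  # mean of 50, 30, 20
toy_skipped = protein_area([5.0, 6.0])  # only two unique peptides
```

Using only the most intense peptides makes the estimate less sensitive to weak peptide signals, which are the hardest to measure reproducibly.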

Analysis of functional clusters

We identified significantly enriched functional clusters in differentially expressed genes using DAVID (Huang et al., 2009). The analysis was done separately for over- and under-expressed genes and for RNA-Seq and Ribo-Seq derived data. Only clusters with enrichment score ≥ 1.5 and adjusted p-val < 0.05 were retained. In each cluster we chose as representative the Gene Ontology (GO) term (Ashburner et al., 2000) with the highest number of genes inside the cluster. Figure 6 integrates the results obtained with the Ribo-Seq and the RNA-Seq data; the log₁₀ fold enrichment of the significant GO terms is plotted.

Analysis of translational efficiency

We searched for genes with significantly increased or decreased translational efficiency (TE)(Ingolia et al., 2009) using the RiboDiff program (Zhong et al., 2017). We selected genes significant at an adjusted p-value < 0.05 and showing log₂(TE_stress/TE_normal) higher than 0.67 or lower than −0.67 (plus or minus the standard deviation of the distribution).

We downloaded S. cerevisiae 5’UTR sequences from the Yeast Genome Database (https://downloads.yeastgenome.org/sequence/S288C_reference/SGD_all_ORFs_5prim_e_UTRs.fsa). We selected 5’UTR sequences longer than 30 nucleotides, removed identical sequences and took the longest 5’UTR per gene when several existed. The resulting annotation file contained the genomic coordinates of the 5’UTRs of 2,424 genes. We recovered 5’UTR sequences for 5 of the 12 cell cycle-related genes that were potentially repressed at the translational level (HTL1, SPC19, CDC26, BNS1, DIB1). In none of these cases did the number of Ribo-Seq reads in the 5’UTR divided by the number of Ribo-Seq reads in the coding sequence increase in oxidative stress with respect to normal growth conditions.

Acknowledgements

We acknowledge the Proteomics Unit of Center for Regulatory Genomics and Universitat Pompeu Fabra for their lab support to isolate proteins from the yeast cultures. We are also grateful to Jorge Ruiz-Orera and Robert Castelo for advice during this project. The work was funded by grants BFU2015-65235-P, BFU2015-68351-P and BFU2016-80039-R, from Ministerio de Economía e Innovación (Spanish Government) - FEDER (EU), and by grant PT17/0009/0014 from Instituto de Salud Carlos III – FEDER. We also received funding from the “Maria de Maeztu” Programme for Units of Excellence in R&D (MDM-2014-0370) and from Agència de Gestió d’Ajuts Universitaris i de Recerca Generalitat de Catalunya (AGAUR), grant numbers 2014SGR1121, 2014SGR0974 and 2017SGR01020, and a predoctoral fellowship (FI) to W.B. We also acknowledge support from the EU Erasmus Programme to T.T.

Footnotes

# Shared first co-authorship

References

Anders, S. et al. (2015) HTSeq--a Python framework to work with high-throughput sequencing data. Bioinformatics, 31, 166–9.
Ashburner, M. et al. (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet., 25, 25–9.
Aspden, J.L. et al. (2014) Extensive translation of small Open Reading Frames revealed by Poly-Ribo-Seq. Elife, 3, e03528.
Bolger, A.M. et al. (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics, 30, 2114–20.
Borg, I. and Groenen, P.J.F. (1997) Modern multidimensional scaling. Springer.
Edfors, F. et al. (2016) Gene-specific correlation of RNA and protein levels in human cells and tissues. Mol. Syst. Biol., 12, 883.
Gerashchenko, M. V. et al. (2012) Genome-wide ribosome profiling reveals complex translational regulation in response to oxidative stress. Proc. Natl. Acad. Sci., 109, 17394–17399.
Gerber, S.A. et al. (2003) Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS. Proc. Natl. Acad. Sci., 100, 6940–6945.
Huang, D.W. et al. (2009) Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc., 4, 44–57.
Ingolia, N.T. et al. (2009) Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science, 324, 218–23.
Ingolia, N.T. (2016) Ribosome Footprint Profiling of Translation throughout the Genome. Cell, 165, 22–33.
Ingolia, N.T. et al. (2011) Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes. Cell, 147, 789–802.
Ingolia, N.T. et al. (2014) Ribosome Profiling Reveals Pervasive Translation Outside of Annotated Protein-Coding Genes. Cell Rep., 8, 1365–79.
Ingolia, N.T. et al. (2012) The ribosome profiling strategy for monitoring translation in vivo by deep sequencing of ribosome-protected mRNA fragments. Nat. Protoc., 7, 1534–50.
Jungfleisch, J. et al. (2017) A novel translational control mechanism involving RNA structures within coding sequences. Genome Res., 27, 95–106.
Khong, A. et al. (2017) The Stress Granule Transcriptome Reveals Principles of mRNA Accumulation in Stress Granules. Mol. Cell, 68, 808–820.e5.
Langmead, B. et al. (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol., 10, R25.
Law, C.W. et al. (2014) voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol., 15, R29.
Lee, M. V. et al. (2011) A dynamic model of proteome changes reveals new roles for transcript alteration in yeast. Mol. Syst. Biol., 7, 514.
Li, H. et al. (2009) The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25, 2078–2079.
Liu, W. et al. (2018) TranslatomeDB: a comprehensive database and cloud-based analysis platform for translatome sequencing data. Nucleic Acids Res., 46, D206–D212.
Luo, Y. et al. (2018) P-Bodies: Composition, Properties, and Functions. Biochemistry, 57, 2424–2431.
Martin, M. (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal, 17.1.
Michel, A.M. et al. (2018) GWIPS-viz: 2018 update. Nucleic Acids Res., 46, D823–D830.
Michel, A.M. et al. (2012) Observation of dually decoded regions of the human genome using ribosome profiling data. Genome Res., 22, 2219–29.
Morano, K.A. et al. (2012) The response to heat shock and oxidative stress in Saccharomyces cerevisiae. Genetics, 190, 1157–95.
Payne, S.H. (2015) The utility of protein and mRNA correlation. Trends Biochem. Sci., 40, 1–3.
Ponnala, L. et al. (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J., 78, 424–440.
Rapaport, F. et al. (2013) Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data. Genome Biol., 14, R95.
Reid, D.W. et al. (2015) Simple and inexpensive ribosome profiling analysis of mRNA translation. Methods, 91, 69–74.
Robinson, M.D. et al. (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics, 26, 139–140.
Ruiz-Orera, J. et al. (2014) Long non-coding RNAs as a source of new peptides. Elife, 3, e03523.
Schwanhäusser, B. et al. (2011) Global quantification of mammalian gene expression control. Nature, 473, 337–342.
Shenton, D. et al. (2006) Global translational responses to oxidative stress impact upon multiple levels of protein synthesis. J. Biol. Chem., 281, 29011–21.
Slavoff, S.A. et al. (2013) Peptidomic discovery of short open reading frame-encoded peptides in human cells. Nat. Chem. Biol., 9, 59–64.
de Sousa Abreu, R. et al. (2009) Global signatures of protein and mRNA expression levels. Mol. Biosyst., 5, 1512–26.
Tebaldi, T. et al. (2012) Widespread uncoupling between transcriptome and translatome variations after a stimulus in mammalian cells. BMC Genomics, 13, 220.
Tsankov, A.M. et al. (2010) The Role of Nucleosome Positioning in the Evolution of Gene Regulation. PLoS Biol., 8, e1000414.
Xie, S.-Q. et al. (2016) RPFdb: a database for genome wide information of translated mRNA generated from ribosome profiling. Nucleic Acids Res., 44, D254–D258.
Yordanova, M.M. et al. (2018) AMD1 mRNA employs ribosome stalling as a mechanism for molecular memory formation. Nature, 553, 356–360.
Zhong, Y. et al. (2017) RiboDiff: detecting changes of mRNA translation efficiency from ribosome footprints. Bioinformatics, 33, 139–141.
Zid, B.M. and O’Shea, E.K. (2014) Promoter sequences direct cytoplasmic localization and translation of mRNAs during starvation in yeast. Nature, 514, 117–121.

Posted December 19, 2018.

Subject Area

Genomics

[1] ↵
Anders, S. et al. (2015) HTSeq--a Python framework to work with high-throughput sequencing data. Bioinformatics, 31, 166–9.
OpenUrl CrossRef PubMed Web of Science

[2] ↵
Ashburner, M. et al. (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet., 25, 25–9.
OpenUrl CrossRef PubMed Web of Science

[3] ↵
Aspden, J.L. et al. (2014) Extensive translation of small Open Reading Frames revealed by Poly-Ribo-Seq. Elife, 3, e03528.
OpenUrl CrossRef PubMed

[4] ↵
Bolger, A.M. et al. (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics, 30, 2114–20.
OpenUrl CrossRef PubMed Web of Science

[5] ↵
Borg, I. and Groenen, P.J.F. (1997) Modern multidimensional scaling Springer.

[6] ↵
Edfors, F. et al. (2016) Gene-specific correlation of RNA and protein levels in human cells and tissues. Mol. Syst. Biol., 12, 883.
OpenUrl Abstract/FREE Full Text

[7] ↵
Gerashchenko, M. V. et al. (2012) Genome-wide ribosome profiling reveals complex translational regulation in response to oxidative stress. Proc. Natl. Acad. Sci., 109, 17394–17399.
OpenUrl Abstract/FREE Full Text

[8] ↵
Gerber, S.A. et al. (2003) Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS. Proc. Natl. Acad. Sci., 100, 6940–6945.
OpenUrl Abstract/FREE Full Text

[9] ↵
Huang, D.W. et al. (2009) Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc., 4, 44–57.
OpenUrl CrossRef PubMed Web of Science

[10] ↵
Ingolia, N.T. et al. (2009) Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science, 324, 218–23.
OpenUrl Abstract/FREE Full Text

[11] ↵
Ingolia, N.T. (2016) Ribosome Footprint Profiling of Translation throughout the Genome. Cell, 165, 22–33.
OpenUrl CrossRef PubMed

[12] ↵
Ingolia, N.T. et al. (2011) Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes. Cell, 147, 789–802.
OpenUrl CrossRef PubMed Web of Science

[13] ↵
Ingolia, N.T. et al. (2014) Ribosome Profiling Reveals Pervasive Translation Outside of Annotated Protein-Coding Genes. Cell Rep., 8, 1365–79.
OpenUrl CrossRef PubMed Web of Science

[14] ↵
Ingolia, N.T. et al. (2012) The ribosome profiling strategy for monitoring translation in vivo by deep sequencing of ribosome-protected mRNA fragments. Nat. Protoc., 7, 1534–50.
OpenUrl CrossRef PubMed

[15] ↵
Jungfleisch, J. et al. (2017) A novel translational control mechanism involving RNA structures within coding sequences. Genome Res., 27, 95–106.
OpenUrl Abstract/FREE Full Text

[16] ↵
Khong, A. et al. (2017) The Stress Granule Transcriptome Reveals Principles of mRNA Accumulation in Stress Granules. Mol. Cell, 68, 808–820.e5.
OpenUrl CrossRef PubMed

[17] ↵
Langmead, B. et al. (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol., 10, R25.
OpenUrl CrossRef PubMed

[18] ↵
Law, C.W. et al. (2014) voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol., 15, R29.
OpenUrl CrossRef PubMed

[19] Lee, M. V. et al. (2014) A dynamic model of proteome changes reveals new roles for transcript alteration in yeast. Mol. Syst. Biol., 7, 514–514.
OpenUrl

[20] ↵
Li, H. et al. (2009) The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25, 2078–2079.
OpenUrl CrossRef PubMed Web of Science

[21] ↵
Liu, W. et al. (2018) TranslatomeDB: a comprehensive database and cloud-based analysis platform for translatome sequencing data. Nucleic Acids Res., 46, D206–D212.
OpenUrl

[22] ↵
Luo, Y. et al. (2018) P-Bodies: Composition, Properties, and Functions. Biochemistry, 57, 2424–2431.
OpenUrl CrossRef

[23] ↵
Martin, M. (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal, 17.1.

[24] ↵
Michel, A.M. et al. (2018) GWIPS-viz: 2018 update. Nucleic Acids Res., 46, D823–D830.
OpenUrl

[25] ↵
Michel, A.M. et al. (2012) Observation of dually decoded regions of the human genome using ribosome profiling data. Genome Res., 22, 2219–29.
OpenUrl Abstract/FREE Full Text

[26] ↵
Morano, K.A. et al. (2012) The response to heat shock and oxidative stress in Saccharomyces cerevisiae. Genetics, 190, 1157–95.
OpenUrl Abstract/FREE Full Text

[27] ↵
Payne, S.H. (2015) The utility of protein and mRNA correlation. Trends Biochem. Sci., 40, 1–3.
OpenUrl CrossRef PubMed

[28] ↵
Ponnala, L. et al. (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J., 78, 424–440.
OpenUrl CrossRef PubMed

[29] ↵
Rapaport, F. et al. (2013) Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data. Genome Biol., 14, R95.
OpenUrl CrossRef PubMed

[30] ↵
Reid, D.W. et al. (2015) Simple and inexpensive ribosome profiling analysis of mRNA translation. Methods, 91, 69–74.
OpenUrl CrossRef

[31] ↵
Robinson, M.D. et al. (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics, 26, 139–140.
OpenUrl CrossRef PubMed Web of Science

[32] ↵
Ruiz-Orera, J. et al. (2014) Long non-coding RNAs as a source of new peptides. Elife, 3, e03523.
OpenUrl CrossRef PubMed

[33] ↵
Schwanhäusser, B. et al. (2011) Global quantification of mammalian gene expression control. Nature, 473, 337–342.
OpenUrl CrossRef PubMed Web of Science

[34] ↵
Shenton, D. et al. (2006) Global translational responses to oxidative stress impact upon multiple levels of protein synthesis. J. Biol. Chem., 281, 29011–21.
OpenUrl Abstract/FREE Full Text

[35] ↵
Slavoff, S.A. et al. (2013) Peptidomic discovery of short open reading frame-encoded peptides in human cells. Nat. Chem. Biol., 9, 59–64.
OpenUrl CrossRef PubMed

[36] ↵
de Sousa Abreu, R. et al. (2009) Global signatures of protein and mRNA expression levels. Mol. Biosyst., 5, 1512–26.
OpenUrl CrossRef PubMed Web of Science

[37] ↵
Tebaldi, T. et al. (2012) Widespread uncoupling between transcriptome and translatome variations after a stimulus in mammalian cells. BMC Genomics, 13, 220.
OpenUrl CrossRef PubMed

[38] ↵
Tsankov, A.M. et al. (2010) The Role of Nucleosome Positioning in the Evolution of Gene Regulation. PLoS Biol., 8, e1000414.
OpenUrl CrossRef PubMed

[39] ↵
Xie, S.-Q. et al. (2016) RPFdb: a database for genome wide information of translated mRNA generated from ribosome profiling. Nucleic Acids Res., 44, D254–D258.
OpenUrl CrossRef PubMed

[40] ↵
Yordanova, M.M. et al. (2018) AMD1 mRNA employs ribosome stalling as a mechanism for molecular memory formation. Nature, 553, 356–360.
OpenUrl CrossRef PubMed

[41] ↵
Zhong, Y. et al. (2017) RiboDiff: detecting changes of mRNA translation efficiency from ribosome footprints. Bioinformatics, 33, 139–141.
OpenUrl CrossRef PubMed

[42] ↵
Zid, B.M. and O’Shea, E.K. (2014) Promoter sequences direct cytoplasmic localization and translation of mRNAs during starvation in yeast. Nature, 514, 117–121.
OpenUrl CrossRef PubMed
