Abstract
Cell survival requires the control of biomolecule concentrations, i.e. these concentrations should approach homeostasis. With informative macromolecules, the particular concentration variation ranges depend on each type: DNA is not buffered, but mRNAs and proteins are homeostatically controlled, which leads to the ribostasis and proteostasis concepts. In recent years, in order to understand the particular features of mRNA ribostasis and proteostasis, we have studied variations in mRNA and protein concentrations in different situations in the model organism S. cerevisiae. Here we extend this study by comparing published data from three other model organisms: E. coli, S. pombe and cultured human cells. We show that mRNA ribostasis is less strict than proteostasis, but still seems tightly controlled. A constant ratio appears between the average decay and dilution rates during cell growth for mRNA, but not for proteins. We postulate that this is due to a trade-off between the cost of synthesis and the response capacity at the transcription level that is not possible at the translation level because the high stability of proteins, compared to that of mRNAs, precludes it. We hypothesize that the middle-place role of mRNA in the Central Dogma of Molecular Biology and its chemical instability make it more suitable than proteins for the fast changes needed for gene regulation.
Introduction
Homeostasis is the state of steady internal conditions maintained by living things. This concept includes controlling the concentrations of cellular molecules and macromolecules. With informative macromolecules, the particular concentration variation ranges depend on the type of molecule. The amount of DNA (genes) does not change during the cell cycle, except in the replication phase. Therefore, it is not directly buffered, although its amount determines both nuclear and cell sizes and the transcription process itself (see ref. 29). The importance of the DNA:cell volume ratio has recently been addressed by A. Amon’s group. These authors showed that optimal cell function requires the maintenance of a narrow range of DNA:cytoplasm ratios and that, when cell size exceeds the optimal ratio, cytoplasmic dilution provokes defects in nucleic acid and protein biosynthesis that cause senescence in both yeasts and human cells (35).
In contrast, mRNAs and proteins have no fixed amount per cell. They are multiple independent molecules whose numbers vary upon demand. This is a fundamental idea behind gene regulation: gene numbers are fixed, but their expression can change. Gene expression follows an obligatory flux called the Central Dogma (12). As the concentrations of individual mRNA and protein species vary, one might think that homeostatic control of mRNAs and proteins is not necessary. However, the total protein concentration in all cells is known to be quite constant (24,34). The total mRNA concentration has not been studied as well, and it is generally assumed to vary within a certain range, i.e. to be homeostatically controlled (5,42). The terms ribostasis and proteostasis are commonly used to refer to RNA (normally meaning mRNA) and protein homeostasis, respectively.
In the last decade, in order to understand the particular features of mRNA ribostasis and proteostasis, we have studied variations in the total mRNA and protein concentrations ([mRNA], [protein]) between different growth conditions in the yeast Saccharomyces cerevisiae. We have also studied the mechanisms that maintain homeostasis by means of transcription and degradation rates, and their cross-talk (5, 15, 17, 30). Another approach to understanding the different roles of mRNAs and proteins, and their various homeostasis features, is to compare total [mRNA] and [protein], and those of their synthesis machineries, among model organisms ranging from bacteria to human cells. Here we review data published by our own group and by other groups on four model organisms for which there is enough information: Escherichia coli, S. cerevisiae, Schizosaccharomyces pombe and human cells in culture. The data cover [mRNA] and [protein], synthesis rates and stabilities (half-lives), and also the synthesis machineries: RNA polymerases (RNA pol) and ribosomes. With these data, we have made comparisons between organisms, and also among all these parameters, to extract the general rules for mRNA and protein homeostasis. A recent paper followed a similar strategy by comparing omics data from four model organisms: E. coli, S. cerevisiae, cultured mouse and human cells (18). That study, however, focused only on the synthesis rates of mRNAs and proteins, which indicated that biological noise (46) was a key element in the selection of regulatory expression strategies for protein-coding genes.
Using all these quantitative data, we conclude that the evolution of the gene expression flux (Central Dogma) and the chemical properties of RNAs and proteins can explain how mRNA became the main point of gene regulation because of its instability and its central position in the flux. This regulatory role of mRNA made transcription the main point for biological noise. mRNA ribostasis is, therefore, not as strict as proteostasis because of the need for mRNA changes and the comparatively low cost of mRNA transcription.
Materials and Methods
Calculating the transcription rates for eukaryotic RNA polymerases
Transcription rates (TR) are usually defined based on rNTP consumption. In S. cerevisiae they were calculated as 60%, 25% and 15% of the total consumption for RNA pol I, II and III, respectively (59). Another way to define the relative importance of each RNA pol is by the rate at which each produces transcripts. In the same yeast, the number of produced rRNAs (ribosomes) has been estimated at 300,000 per cell cycle of 90 min (58) or 2000/min (59), i.e. RNA pol I transcribes between 33 and 55 35S rRNA transcripts/s. As 5S rRNA synthesis should be coordinated with that of 35S rRNA, we assume that its synthesis rate is equivalent (≈45 transcripts/s). tRNAs have been estimated to be produced at 3,000,000 per cell cycle (58), or about 555 transcripts/s. Thus, RNA pol III transcribes about 600 molecules/s (Figure 1). Synthesis of mRNAs by RNA pol II can be calculated from our previous study (41), after correcting for the newest data on mRNA abundances and stabilities (Tables I & II), at about 60 mRNAs/s. Therefore, in terms of transcript number, the contributions of RNA pol I, II and III are 6.5%, 8.5% and 85%, respectively. These percentages can be recalculated independently, from the rNTP consumption values (see above), by taking into account the average sizes of the transcripts. RNA pol III transcribes much shorter molecules (0.13 kb on average for 5S and tRNA primary transcripts) than RNA pol II (average of 1.5 kb per mRNA, ref. 13) and RNA pol I (6.9 kb 35S transcript). This yields the following percentages of transcribed molecules: 6%, 12% and 82% for RNA pol I, II and III, respectively (Figure 1). These figures fit well with the previous calculations of the transcribed copy numbers for the three RNA pol. Finally, as for the number of different transcribed genes, budding yeast RNA pol II transcribes most genes, i.e. >90%, because there are about 6000 protein-encoding genes vs. ≈300 RNA pol III genes, and only one RNA pol I gene (with 100-200 copies).
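The arithmetic behind these percentages can be checked with a short script. The sketch below is only illustrative: it uses the rounded values quoted in this section, all variable and function names are hypothetical, and taking RNA pol I as ≈45 transcripts/s (roughly the midpoint of the 33-55 range) is an assumption made here for the calculation.

```python
# Transcript production in S. cerevisiae, using the values quoted in the text.
CELL_CYCLE_S = 90 * 60  # 90 min cell cycle, in seconds

rrna_35s_high = 300_000 / CELL_CYCLE_S  # estimate from ref. 58, transcripts/s
rrna_35s_low = 2000 / 60                # estimate from ref. 59, transcripts/s
trna_rate = 3_000_000 / CELL_CYCLE_S    # tRNA transcripts/s

# Rounded per-polymerase rates (transcripts/s). Pol I = 45 is an assumed
# midpoint; pol III = tRNAs (~555) plus 5S rRNA (~45).
rate_per_pol = {"I": 45.0, "II": 60.0, "III": 600.0}


def percent_shares(values):
    """Convert a dict of positive numbers into percentages of their sum."""
    total = sum(values.values())
    return {key: 100.0 * value / total for key, value in values.items()}


# Route 1: shares computed directly from transcript numbers.
by_count = percent_shares(rate_per_pol)

# Route 2: shares derived from rNTP consumption divided by transcript length.
ntp_percent = {"I": 60.0, "II": 25.0, "III": 15.0}
length_kb = {"I": 6.9, "II": 1.5, "III": 0.13}
by_ntp = percent_shares({pol: ntp_percent[pol] / length_kb[pol] for pol in ntp_percent})

if __name__ == "__main__":
    print("35S rRNA range (transcripts/s):", round(rrna_35s_low), "-", round(rrna_35s_high))
    print("tRNA (transcripts/s):", round(trna_rate))
    print("shares by transcript count:", {k: round(v, 1) for k, v in by_count.items()})
    print("shares from rNTP use:", {k: round(v, 1) for k, v in by_ntp.items()})
```

With these inputs, both routes place RNA pol III above 80% of all transcripts and RNA pol I and II at roughly 6% and 9-12%, in line with the rounded figures given above.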
Results
This paper reviews published quantitative data on RNAs and proteins in four model organisms to conclude that there are universal rules for the proportions of both molecules and their synthesis machineries, except for the highly variable total [mRNA]. Then, we postulate that the poor stability of mRNAs and the relatively good stability of proteins condition their cellular functions and explain why most gene regulation occurs at the transcriptional level. Finally, we discuss these differences between mRNAs and proteins from an evolutionary point of view.
Why proteostasis is much stricter than mRNA ribostasis
We collected published data on the abundances of protein and mRNA molecules in three free-living microorganisms, the eubacterium E. coli and the two distantly related yeasts S. cerevisiae and S. pombe, and in human cells in culture (mainly HeLa cells for most data). Table I shows that in all organisms, mRNAs and proteins have very different average numbers. Proteins are thousands of times more abundant, but by a factor that varies with the organism: about 10³ in microorganisms and 10⁴ in human cells. The number of each class of molecules scales with cell size. The scaling is quite constant for proteins, which keeps [protein] uniform (1-3 million molecules/fL), as formerly stated by Milo et al (32-34). However, [mRNA] is more variable: about 4-fold higher in E. coli than in yeasts, and 10-fold lower in human cells. Whether this major change reflects a functional property (see below) remains an open question. Despite the high diversity of other ncRNAs in human cells, their low abundance means that they do not compensate for the lower human [mRNA] (39). The large number of protein molecules is probably the reason for the conservation of total [protein] between different living beings, and also between distinct physiological situations (strict proteostasis). Cell mass is composed of protein in a large proportion (about 50% of dry weight in most organisms: 22,33). In contrast, total RNA is much less abundant: 4-10% of the dry weight in budding yeast (22), and 15-20% in E. coli (47,52). Moreover, most RNA is not mRNA, which constitutes only a small fraction (5-10% of the total RNA in all organisms: 26,39,52,59). Thus, the fraction of cell mass that is mRNA is so tiny (1% at the most) that mRNA ribostasis cannot condition the structural features of cells, unlike proteostasis.
The abundance of proteins and mRNAs should, on the other hand, condition the abundance of their synthetic machineries (RNA polymerases and ribosomes). Ribosomes are much more abundant than RNA pol (5-25×, Table I), although this ratio is much lower than the ratio between the amounts of proteins and mRNAs (>1000×). This difference indicates that synthesis rates are not the only factor that explains the actual difference in quantities: the differential stabilities of mRNAs and proteins also matter (see below). On the other hand, ribosomes are similarly concentrated in eukaryotes (about 4000/fL), but are 5-fold more concentrated in E. coli. This can be caused by the higher concentration of protein molecules, by the shorter generation time of this bacterium and/or by the coupling that takes place between transcription and translation.
When putting together the quantitative data for molecules and their synthesis rates in the Central Dogma, several interesting features appear. Figure 1 provides a summary of the data for the budding yeast. The data known for other eukaryotes are quite similar (in relative terms), so we consider that this figure can act as a general profile for a eukaryotic cell. A single DNA molecule (haploid genome) produces RNAs that are in the steady state within the range from 10⁴ to 10⁶ total molecules per cell (less than 10⁵ for mRNAs; see Table I). Proteins are far more numerous (10⁷ to 10⁸), which conditions the cost of translation. By using published data, we calculate that the total transcription rate (TR) is about 700 RNAs/second in S. cerevisiae. This synthesis rate is split among three RNA polymerases (see M&M) because transcription is not only devoted to making mRNAs. Other ncRNAs are also transcribed and perform important functions. If we limit ourselves to the most abundant ncRNAs, those that participate in the translation process (tRNAs and rRNAs), then their transcription in eukaryotes is done by two specialized RNA pol: I & III. They are only slightly less abundant than RNA pol II (Table I) in spite of transcribing far fewer genes because they have to produce a 10-fold larger number of transcripts (Fig. 1). In fact, each type of RNA polymerase plays a maximal quantitative role, depending on the analyzed parameter. RNA pol I consumes the most rNTPs, RNA pol III is the biggest producer of transcripts and RNA pol II has the most target genes (see Figure 1 and M&M).
In quantitative terms, protein synthesis is, however, much higher than RNA synthesis: 13,000 vs. 700 molecules/s (56). The obvious consequence is that translation is much more costly than total transcription, and even more so if only mRNA transcription by RNA pol II is considered. The energy cost of total translation in S. cerevisiae has been calculated to be about 10-20x higher than that of mRNA transcription (57). This author used old values for median mRNA stability (23 min) and protein stability (30 h). When the most recent and reliable data are used (see Table II), translation should be even more costly (30x or more) than mRNA transcription. Thus, energy cost is a key parameter that should condition the strategies of cells for [mRNA] and [protein] homeostasis.
Finally, another question that arises from the quantitative data in Table I concerns the relative numbers of the three components in the translation process. The ribosomes/mRNAs ratio tends to be similar in microorganisms, at about 4-10 ribosomes/mRNA, which is not far from the in vivo experimental ribosome densities measured in S. cerevisiae and E. coli (6-7 ribosomes/mRNA, ref. 1, 6). This is true even when [ribosome] decreases because of a different growth temperature (5), which suggests that the ratio is mechanistically constrained. In fact, it has been shown in both E. coli and S. cerevisiae that, when grown at their fastest growth rate (GR), i.e. their shortest generation time (GT), ribosomes are saturated with mRNAs (11). It is likely that ribosomes are almost saturated with mRNAs in all microorganisms when growing at their fastest rate. Thus, if we multiply [ribosome] x GT, an almost constant value is obtained (Table II). This is explained by protein production being limited by the availability of free ribosomes (49), and by the GT being conditioned by protein production, as proteins represent a high percentage of cell mass (see above). It has been shown in both E. coli and S. cerevisiae that [ribosome] depends on the GT within a range of fast growth conditions. In E. coli, the percentage of protein devoted to ribosomes (ribosome/dry mass) increases almost proportionally with the GR between GTs of 60 and 24 min (34). In S. cerevisiae grown in different media (57), [ribosome] x GT is also constant for a wide range of GTs (from 240 to 90 min). In both cases this constant parameter is not maintained at longer GTs, where an excess of ribosomes per mRNA seems to occur, similarly to what happens in human cells (54 & Table I). In cultured human cells, [ribosome] x GT is much higher, which suggests different constraints for translation in multicellular organisms. On the other hand, in E. coli and S. cerevisiae, there is experimental evidence that a certain excess of ribosomes appears even at the fastest GR, and that this fraction increases as the GR decreases. It has been suggested that these excess ribosomes are employed when translation demands unexpectedly increase (20,31).
Interestingly, and as mentioned above, [mRNA] is not kept constant across the four compared species (Table I), unlike [protein]. However, and strikingly, [mRNA] is approximately inversely correlated with GT (the product of both is almost a constant) when comparing the four organisms (Table II). This is also seen when comparing a single organism at very different GTs. For instance, it has been shown that [mRNA] decreases after a diauxic shift, when the GT increases (45) and translation decreases. Thus, it can be suggested that, at least for microorganisms, wide variations in [mRNA] are related to the control of the translation rate, as the ribosomes/mRNA ratio is constant and [ribosome] determines the maximum translation rate capacity. This explains why both [ribosome] and [mRNA] change in parallel when budding yeast is grown at different temperatures (5). Under exponential growth and with minor changes in growth rates, [mRNA] in S. cerevisiae remains, however, constant (15).
What the numbers of molecules and their turnover rates can tell us about gene expression
mRNAs and proteins also have very different stabilities in all living cells (Table II). mRNA half-lives scale with the generation time (GT), both between different cells and between different GTs of a single species. The median mRNA half-life is usually about 15-20% of the cell’s GT (Table II). This has already been pointed out by other authors (34,53,61). Given that the GT is much longer than the median mRNA half-life, transcription by RNA pol II is mainly devoted to compensating mRNA degradation, not only in S. cerevisiae (41) but in all organisms. We previously postulated that the approximately constant ratio could be due to the need to maintain an optimal balance between synthesis cost and response capacity (9). Here we postulate that this might be a general rule for all cells growing at their highest growth rate (Figure 2).
Protein half-lives do not, however, scale with the GT. The ratio of median protein half-life to GT varies among organisms (Table II). Protein half-lives in microorganisms are much longer than GTs, whereas in human cells in culture they are similar to the GT. Thus, most translation seems devoted to compensating the dilution caused by growth rather than degradation (10,60), except in cultured human cells, in which the contributions of both are similar (Table II). Thus, paradoxically, although the response capacity of cells to environmental changes is related to proteins because they are the final goal of gene expression, and even when taking into account that the translation cost of a cell is much higher than the mRNA transcription cost (Figure 1), there is no single optimal balance between synthesis cost and response capacity in proteins for all organisms. Apparently, it depends on the particular type of organism (bacteria, yeasts or mammalian cells) and on the particular GT. The cost of making proteins is high, but is not affected much by their half-lives because degradation is only a small part of protein disappearance (especially in free-living microorganisms), contrary to what happens with mRNA (Figure 2). This may ultimately be due to the intrinsically higher chemical stability of proteins, which should have been a key factor during primitive life evolution, when protein-based cells were selected over RNA-based ones. Therefore, the intrinsic (and very convenient for living cells) high stability of proteins has precluded an evolutionary search for an optimal balance between synthesis cost and response capacity. Response capacity in proteins is, therefore, usually based on post-translational modifications. It should be noted that only in cells with a long GT does translation seem to be devoted, in part, to compensating protein degradation.
In non-dividing cells, translation is quantitatively much less important because only protein degradation needs to be compensated, which leads to changes in protein stability that prevent long-lived proteins from accumulating (63). Thus, in non-dividing cells, the global transcription/translation cost balance differs considerably from that of actively growing cells.
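The contrast between mRNA and protein turnover described in this section can be made explicit with a small calculation. The sketch below is a minimal illustration, not the method used to build Table II. It assumes first-order decay, exponential growth and a steady-state concentration, so that synthesis must offset a decay rate ln2/half-life plus a dilution rate ln2/GT. The function names are hypothetical, and the half-life/GT ratio of 10 used for a microbial protein is an arbitrary example value, not a figure from the text.

```python
import math


def removal_rates(half_life, generation_time):
    """Return (decay, dilution) first-order rate constants.

    Both arguments must use the same time unit (e.g. minutes).
    """
    decay = math.log(2) / half_life
    dilution = math.log(2) / generation_time
    return decay, dilution


def degradation_share(half_life_over_gt):
    """Fraction of steady-state synthesis that offsets degradation.

    decay / (decay + dilution) simplifies to 1 / (1 + half_life / GT).
    """
    return 1.0 / (1.0 + half_life_over_gt)


# mRNA: median half-life of about 15-20% of the GT (from the text).
mrna_low = degradation_share(0.20)
mrna_high = degradation_share(0.15)

# Protein in cultured human cells: half-life similar to the GT (from the text).
human_protein = degradation_share(1.0)

# Protein in a microorganism: half-life much longer than the GT.
# The ratio of 10 is a hypothetical example value.
microbial_protein = degradation_share(10.0)

if __name__ == "__main__":
    print(f"mRNA: {mrna_low:.0%} to {mrna_high:.0%} of synthesis offsets decay")
    print(f"human protein: {human_protein:.0%}")
    print(f"microbial protein (assumed ratio 10): {microbial_protein:.0%}")
```

Under these assumptions, about 83-87% of mRNA synthesis compensates degradation, whereas for a protein with a half-life equal to the GT the split between degradation and dilution is even, and for a much longer half-life dilution dominates.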
We conclude that the marked strictness of global proteostasis and the high stability of most proteins, compared with mRNAs, explain why evolution has mainly selected regulatory mechanisms at the mRNA level for most genes.
Biological noise and epigenetics
In a recent paper, Hausser et al (18) compared the abundances and synthesis rates for individual genes in four model organisms, and concluded that biological noise is a key element in the selection of regulatory expression strategies for protein-coding genes. It is known that intrinsic noise is due mainly to transcription, and not to translation (37,46), chiefly through fluctuations in the mRNA levels that arise from the stochastic activation of gene promoters (3). They argue that, because the intrinsic noise for two proteins with identical abundance is higher for the one with lower [mRNA] (i.e. lower TR, because mRNA stability does not strongly influence [mRNA]), cells can choose an expression strategy with high noise (low TR) or low noise (high TR) levels. This is not strictly true because genes with the same [mRNA] can have different TRs due to their varying stabilities, and this is an important expression strategy for a gene (43). In fact, it has been suggested that both the birth and death of mRNAs can affect noise (3). In any case, a higher TR means a higher cost. Although the transcription cost is much lower than the translation cost (Fig. 1), it is important in evolutionary terms because the translation cost is fixed for a given protein with a given abundance to play its biological role. The transcription cost, however, is elective for every gene by striking an equilibrium between precision and economy depending on TR.
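The link between mRNA level and intrinsic noise discussed above can be illustrated numerically. The sketch below is not taken from ref. 18; it assumes the standard two-stage (mRNA then protein) stochastic model of gene expression, in which the squared coefficient of variation of protein number is approximately 1/⟨protein⟩ + (1/⟨mRNA⟩) × τm/(τm + τp), where τm and τp are the effective mRNA and protein lifetimes. The function name and all numerical values are hypothetical.

```python
def protein_noise_cv2(mean_mrna, mean_protein, mrna_lifetime, protein_lifetime):
    """Squared coefficient of variation of protein number.

    Assumes the two-stage model: constant transcription, first-order mRNA
    decay, translation proportional to mRNA number and first-order protein
    removal. Lifetimes must share the same time unit.
    """
    poisson_term = 1.0 / mean_protein
    mrna_term = (1.0 / mean_mrna) * mrna_lifetime / (mrna_lifetime + protein_lifetime)
    return poisson_term + mrna_term


# Two hypothetical genes with the same protein abundance and the same
# lifetimes, differing only in mRNA level (i.e. in TR).
low_tr = protein_noise_cv2(mean_mrna=2, mean_protein=10_000,
                           mrna_lifetime=20, protein_lifetime=100)
high_tr = protein_noise_cv2(mean_mrna=20, mean_protein=10_000,
                            mrna_lifetime=20, protein_lifetime=100)

if __name__ == "__main__":
    print(f"low TR strategy:  CV^2 = {low_tr:.4f}")
    print(f"high TR strategy: CV^2 = {high_tr:.4f}")
```

In this simplified model, the gene expressed from fewer mRNAs is noisier at equal protein abundance, and because mRNA lifetime enters the expression explicitly, two genes with the same [mRNA] but different stabilities also differ in noise.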
This argument adds another advantage for regulating at the mRNA level instead of at the protein level. In fact, as living beings need regulation at the TR level, the appearance of (noisy) cis-trans mechanisms and (low-noise) epigenetic mechanisms at gene promoters could be a consequence of selecting the optimal noise level in each promoter during evolution. Additionally, regulation of noise at the translation level cannot be selected by evolution because the heritability of the cis-trans mechanisms operating in the mRNA molecule is quite low, given the large number and instability of these molecules compared to DNA.
The history of RNA, its place in the middle of the expression flux and its chemical properties should have been the reasons for selecting transcriptional regulation, including epigenetic mechanisms, instead of translational regulation. Therefore, the equilibrium between the noise and the cost that result from the choice of a given TR should have been a crucial factor in selecting an expression strategy for each gene.
A historical and functional perspective of the Central Dogma
In the RNA world, early life used RNA for both genetic information and catalytic ability (Dworkin et al 2003). RNA is more unstable and mutable than DNA and proteins (25). Therefore, the high turnover of RNA has always been a key factor in its function. When proteins appeared in primitive cells, they replaced RNAs in most structural and catalytic functions because of their versatility and stability. After DNA appeared later (14), RNA was positioned as an obligate intermediate in the gene expression flux (Central Dogma), and mRNA became the main point of gene regulation given its instability and central position in the flux. A theoretical question is why mRNA remains in the middle and was not replaced completely with DNA for information storage and with protein for catalytic functions to make life with a simpler Central Dogma (DNA→protein). One possible answer to this question is that cells have taken advantage of the natural instability of mRNA, instead of protecting it as they did for other stable RNAs (rRNA, tRNA, etc.), because it allows for better, more flexible regulatory mechanisms. Although the mRNA amount can be regulated at both the synthesis (TR) and degradation (half-life) levels (43), there is a strong preference for regulation at the synthesis level (23). This also made the TR the main point for biological noise.
mRNA ribostasis is, therefore, not as strict as proteostasis because of the need for mRNA changes and the comparatively low cost of mRNA transcription. Given the high stability of proteins, and given that the cost of protein synthesis is much higher than the cost of mRNA transcription, we conclude that protein turnover is poorly suited as a general way to regulate gene expression.
Declaration of interest statement
The authors report no conflict of interest.
Acknowledgements
We thank Vicent Pelechano for his helpful discussion. This work has been supported by grants from the Spanish Ministry of Economy and Competitiveness, and European Union funds (FEDER) [BFU2016-77728-C3-1-P to S. C.], [BFU2016-77728-C3-3-P and BFU2015-71978-REDT to J.E.P-O], from the Regional Valencian Government [PROMETEO II 2015/006 to J.E.P-O]. The funding for open access charge: [BFU2016-77728-C3-3-P].