Abstract
Ribulose-1,5-Bisphosphate Carboxylase/Oxygenase (Rubisco) is not only the dominant enzyme in the biosphere, responsible for the vast majority of carbon fixation, but also one of the best characterized enzymes. Enhanced Rubisco catalysis is expected to increase crop yields, but a substantially improved enzyme has evaded bioengineers for decades. Based on correlations between Rubisco’s kinetic parameters, it is widely posited that tradeoffs stemming from the catalytic mechanism strictly constrain Rubisco’s maximum catalytic potential. Though compelling, the reasoning that established that view was based on data from only ≈20 organisms. Here we re-examine these tradeoffs with an expanded dataset including data from >200 organisms. We find that most correlations are substantially attenuated, with the inverse relationship between carboxylation kcat and specificity SC/O being a key example. However, the correlation predicted by one tradeoff model is stronger and more significant in our expanded dataset. In this model, increased catalytic efficiency (kcat/KM) for carboxylation requires a similar increase in catalytic efficiency for the competing oxygenation reaction, evidenced here by a strong power-law correlation between those catalytic efficiencies. In contrast to previous work, our results imply that Rubisco evolution is constrained mostly by the physicochemical limits of O2/CO2 recognition, which should reframe efforts to understand and engineer this very central enzyme.
Significance
All plants, algae and cyanobacteria rely on the Calvin-Benson-Bassham (CBB) cycle for growth. Rubisco is the central enzyme of the CBB cycle and the most abundant enzyme on Earth. While it is often claimed that Rubisco is slow, its catalytic rate is just below that of the average enzyme. Yet it is surprising that Rubisco is not faster given its centrality and abundance. Previous analyses of Rubisco kinetic parameters raised doubts that the enzyme can be improved. Here we examine a new compendium of Rubisco kinetic parameters for evidence of proposed constraints on Rubisco catalysis. Only one proposal is strongly supported by the data, which argues for re-evaluation of our understanding of one of the most impactful proteins in Earth’s history.
Introduction
Rubisco is the primary carboxylase of the Calvin-Benson-Bassham (CBB) cycle - the carbon fixation cycle responsible for growth throughout the green lineage and many other autotrophic taxa - and the ultimate source of nearly all carbon atoms entering the biosphere (1). Typically, 20-30% of total soluble protein in C3 plant leaves is Rubisco (2). As Rubisco is so highly expressed and plants are the dominant constituents of planetary biomass (3), it is often said that Rubisco is the most abundant enzyme on Earth (1). As Rubisco is ancient (>2.5 billion years old), abundant, and remains central to contemporary biology, one might expect it to be exceptionally fast. But Rubisco is not fast (4–8). Typical carboxylation kcat values range from 1-10 s⁻¹ and all known Rubiscos are capable of reacting “wastefully” with O2 in a process called oxygenation (Figure 1A-B). Improved Rubisco catalysis is expected to increase crop yields (9), but a substantially improved enzyme has evaded bioengineers for decades (10). The repeated evolution of CO2 concentrating mechanisms, which ensure Rubisco operates near its maximum rate, also raises doubts about whether Rubisco catalysis can be strictly improved (11).
Rubisco is a notoriously complex enzyme that is inhibited by its five-carbon substrate, ribulose 1,5-bisphosphate (RuBP), and requires a covalent modification for activation - carbamylation of an active-site lysine (12, 13). Moreover, Rubisco depends on chaperones for folding, assembly and catalysis (12, 14–16). Once folded and activated, all known Rubiscos catalyze both carboxylation and oxygenation of RuBP through a multistep mechanism (Figure 1A, S1). Both carboxylation and oxygenation of RuBP are energetically favorable, but only carboxylation is considered productive because it incorporates carbon from CO2 into precursors that can generate biomass. Oxygenation is often portrayed as counterproductive as it occupies Rubisco active sites and yields a product (2-phosphoglycolate, 2PG) that is not part of the CBB cycle and must be recycled through metabolically-expensive photorespiration at a loss of carbon (17, 18). Despite the fact that many autotrophs depend on Rubisco carboxylation for growth, all known Rubiscos are relatively slow carboxylases and fail to exclude oxygenation (Figure 1A-B).
Over the decades since its discovery, various nomenclatures have been used to describe the kinetics of Rubisco carboxylation and oxygenation. Here we use kcat,C and kcat,O to denote the turnover numbers (maximum per active site catalytic rates in units of s⁻¹) for carboxylation and oxygenation respectively. KC and KO denote the Michaelis constants (half-saturation concentrations in μM) for carboxylation and oxygenation. The specificity factor SC/O = (kcat,C/KC) / (kcat,O/KO) is a unitless measure of the relative preference for CO2 over O2 (Figure 1A-C).
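The definitions above can be sketched in a few lines of Python. The function names and the numerical values are hypothetical and chosen only for illustration; they are not measurements from the dataset.

```python
def catalytic_efficiency(kcat, K_M):
    """Return kcat / K_M (s^-1 uM^-1 when kcat is in s^-1 and K_M in uM)."""
    return kcat / K_M


def specificity(kcat_C, K_C, kcat_O, K_O):
    """Unitless specificity factor S_C/O = (kcat_C / K_C) / (kcat_O / K_O)."""
    return catalytic_efficiency(kcat_C, K_C) / catalytic_efficiency(kcat_O, K_O)


# Hypothetical parameters: kcat in s^-1, Michaelis constants in uM.
s_example = specificity(kcat_C=3.0, K_C=10.0, kcat_O=1.0, K_O=300.0)
print(round(s_example, 6))  # (3/10) / (1/300) = 90
```

Because the Michaelis constants enter as a ratio, SC/O is unitless as long as KC and KO are expressed in the same concentration units.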
The long-standing observation of correlations between the specificity factor SC/O and other Rubisco kinetic parameters (19–21) is often cited to motivate the notion that tradeoffs inherent to the catalytic mechanism strictly constrain Rubisco’s carboxylation potential (4, 5, 7, 21). Indeed, we expect any physicochemical tradeoffs embedded in the Rubisco catalytic mechanism to manifest as correlations between kinetic parameters (Figure 1C) precisely because Rubisco is so central to autotrophic life and has, therefore, experienced substantial selection pressure (22).
Two tradeoff models have been proposed to explain the observed correlations (4, 5). Although the proposed models are substantively different, both models imply limitations on the concurrent improvement of the maximum carboxylation rate (kcat,C) and specificity (S) of natural Rubiscos (5). While these hypotheses appeal to physical and chemical intuition, they are based on data from only ≈20 organisms. Here we take advantage of the accumulation of new data - more than 200 Rubisco variants have been characterized since 2010 - to examine whether new data evidence the same correlations. We find that most of the previously-reported correlations between Rubisco kinetic parameters are substantially weakened by the addition of new data. However, correlation between the catalytic efficiency for carboxylation (kcat,C/KC) and the catalytic efficiency for oxygenation (kcat,O/KO) remains very strong and statistically significant (5), which may have important implications for efforts to understand and engineer Rubisco.
An extended dataset of Rubisco kinetic parameters
To augment existing data, we collected literature data on ≈250 Rubiscos including representatives of clades and physiologies that had been poorly represented in earlier datasets e.g. diatoms, ferns, CAM plants and anaerobic bacteria (Figure 2A). We collected kinetic parameters associated with carboxylation and oxygenation - S, KC, kcat,C, KO and kcat,O - as well as measurements of the RuBP Michaelis constant (half-maximum RuBP concentration, KRuBP) and experimental uncertainty for all values where available. All data considered were measured at 25°C and near pH 8 to ensure that measured values are comparable (Methods).
The resulting dataset contains Rubisco kinetic parameters from a total of 286 distinct species including 319 distinct SC/O values, 275 kcat,C values, 310 KC values, 198 kcat,O values and 256 KO values (Figure 2B). In 198 cases there was sufficient data to calculate catalytic efficiencies for carboxylation (kcat,C/KC) and oxygenation (kcat,O/KO, Methods). Though the data include measurements of some Form II, III and II/III Rubiscos, they remain highly focused on the Form I Rubiscos found in cyanobacteria, diatoms, algae and higher plants, which make up > 95% of the dataset (Figure 2B). As such, we focus primarily on the kinetic parameters of Form I Rubiscos here.
Rubisco kinetic parameters display very narrow dynamic range, with multiplicative variation (the standard deviation in log scale, denoted σ*) being well below one order-of-magnitude for all parameters (Figure 2C). As compared to other enzymes for which multiple kcat measurements are available, Rubisco displays extremely low variation in kcat,C (σ* = 0.4, Figure S4). Specificity SC/O displays the least variation (σ* = 0.2) of all parameters, though this may be due in part to overrepresentation of C3 plants in the dataset, which occupy a narrow range of SC/O ≈ 80-120. Nonetheless, measurements of SC/O for Form I and Form II enzymes are clearly distinct, with values ranging from ≈ 40-160 for Form Is and 7-15 for Form IIs (Figure 2C, SI).
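As a minimal sketch of the variation measure used here, the snippet below computes a standard deviation in log scale. It assumes base-10 logarithms (so that a value of 1 corresponds to one order of magnitude) and the sample standard deviation; both choices and the function name are assumptions made for illustration rather than details confirmed by the text.

```python
import math
import statistics


def log_scale_sd(values):
    """Sample standard deviation of log10(values).

    Assumes strictly positive inputs; a result of 1.0 would mean the values
    typically spread over about one order of magnitude.
    """
    logs = [math.log10(v) for v in values]
    return statistics.stdev(logs)


# Hypothetical values spanning two decades: logs are 0, 1, 2.
decades = [1.0, 10.0, 100.0]
print(log_scale_sd(decades))

# Rescaling every value (e.g. a change of units) leaves the measure unchanged.
print(log_scale_sd([1000.0 * v for v in decades]))
```

A useful property of this measure is that it is insensitive to units, since multiplying every value by a constant only shifts the logarithms.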
Reanalysis of correlations between kinetic parameters of Form I Rubiscos
As in (4, 5) we performed a correlation analysis to investigate relationships between Rubisco kinetic parameters. SC/O, kcat,C, KC and other kinetic parameters are mathematically related to the microscopic rate constants of the Rubisco mechanism (SI). Given common assumptions, this multi-step mechanism can be mathematically simplified so that the logarithms of measured kinetic parameters are proportional to effective transition state barriers (Figure 1B, SI). Therefore we expect to observe strong power-law (log-linear) correlations between pairs of kinetic parameters when three conditions are met: (I) the energy barriers associated with each parameter trade off against each other - e.g. if lowering the effective barrier to CO2 addition (ΔG1,C) requires the barrier to O2 addition (ΔG1,O) to decrease as well (Figure 1B-C); (II) those tradeoffs affect the rate of net carboxylation by Rubisco by affecting either carboxylation or oxygenation rates appreciably; and (III) the selection pressure imposed on Rubisco evolution was sufficient to reach limits imposed by this tradeoff (as diagrammed in Figure 1D). As Rubisco is the central enzyme of photoautotrophic growth, we assume here that it evolved under strong selection. As such, we interpret weakened correlations as calling into question whether the proposed tradeoffs strongly constrain Rubisco evolution.
We also note that some level of correlation is expected because measured kinetic parameters (e.g. kcat,C and KC) are mathematically interrelated through the microscopic mechanism of Rubisco as it is commonly understood (SI). For example, when we derive expressions for kcat,C and KC from the Rubisco mechanism, they share common factors that could drive correlation even in the absence of any tradeoff (SI). Similarly, SC/O is defined as (kcat,C/KC) / (kcat,O/KO) and might correlate with kcat,C for this reason. Yet because SC/O has physiological importance as it determines the ratio of carboxylation to oxygenation rates (RC/RO) in the limit of low CO2 and O2 (SI), high SC/O might be independently selected for. Because we cannot predict a priori which parameters will correlate, we examine log-scale correlations between all pairs of parameters, as shown in Figure 3.
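The pairwise analysis described above can be sketched as follows: log-transform each parameter, then compute a Pearson correlation for every pair. The table of values is synthetic (KC is constructed as an exact power law of kcat,C) and the function names are hypothetical; handling of missing measurements, which a real dataset would need, is omitted.

```python
import math
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def log_correlations(table):
    """Pearson R between log10 values for every pair of columns in `table`."""
    logs = {name: [math.log10(v) for v in vals] for name, vals in table.items()}
    return {(a, b): pearson_r(logs[a], logs[b]) for a, b in combinations(logs, 2)}


# Synthetic illustration: K_C = 2 * kcat_C**0.5 exactly, S chosen arbitrarily.
kcat_C = [1.0, 2.0, 4.0, 8.0]
table = {
    "kcat_C": kcat_C,
    "K_C": [2.0 * k ** 0.5 for k in kcat_C],
    "S": [100.0, 90.0, 110.0, 80.0],
}
for pair, r in log_correlations(table).items():
    print(pair, round(r, 3))
```

An exact power law appears as R = 1 in log scale regardless of its exponent, which is why the exponent has to be estimated separately from the correlation strength.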
Correlations between kcat,C and SC/O as well as kcat,C and KC were previously highlighted to support the hypothesis that Rubisco evolution is constrained by the enzyme’s catalytic mechanism (4, 5). However, these correlations are substantially attenuated by the addition of new data. In general the set of kinetic parameters do not appear to fall along a one-dimensional curve (Figures 3 and 4). Figure 4A replots the focal correlation of Tcherkez et al. (4) - kcat,C vs. SC/O - and shows that these parameters are much less correlated in the extended dataset (R < 0.6 and extremely sensitive to outliers). Similarly, Figure 4B shows that the focal correlation of Savir et al. (5) - kcat,C vs KC - is weakened, with R ≈ 0.66 as opposed to R ≈ 0.9. By far the strongest correlation is between the catalytic efficiencies for carboxylation and oxygenation, kcat,C/KC and kcat,O/KO (R = 0.93, Figure 3). We discuss possible explanations for this very strong correlation in detail below.
Principal components analysis (PCA) of Rubisco kinetic parameters was previously used to interrogate constraints on Rubisco evolution. It was argued that Rubisco adaptation is constrained to a one-dimensional landscape because the first principal component (PC1) explained > 90% of the variance in Rubisco kinetics. In a one-dimensional landscape model all kinetic parameters are tightly interrelated so that changing one (e.g. kcat,C) forces all others to assume predetermined values (5). However, the extended dataset is not well-approximated as one-dimensional. While the orientation of PC1 is not substantially altered by the addition of tenfold more measurements, it now explains ≈70% instead of >90% of the variance in Rubisco kinetics (5). More than 2 principal components are required to explain >90% of the variation in the extended dataset (SI). We proceed to ask whether the lower correlations in the extended dataset are consistent with the specific tradeoffs hypothesized to constrain Rubisco evolution.
Re-evaluation of Energy Partitioning Models
Two mechanistic tradeoff models were advanced in (4, 5). Savir et al. 2010 (5) cast these proposals in energetic terms by relating the measured catalytic parameters to effective transition state barrier heights (Figure 1B, SI). This energetic interpretation of the first mechanistic model (4, 5) - that increased specificity towards CO2 necessitates a slower maximum carboxylation rate (Figure S1) - is not supported by the extended dataset. This model was previously supported by an inverse relationship between kcat,C and kcat,C/KC. Since kcat,C/KC is exponentially related to the first effective carboxylation barrier (ln(kcat,C/KC) ∝ -ΔG1,C) and kcat,C to the second (ln(kcat,C) ∝ -ΔG2,C), the previously-observed relationship was taken to imply that the effective carboxylation barriers must sum to a constant (ΔG1,C + ΔG2,C = C, Figure 5A, (5)). The extended dataset does not, however, conform to the reported power law (Figure 5B). Moreover, correlation is not improved by restricting focus to C3 plants for which data is abundant and the measured leaf CO2 concentration varies by only 20-30% (Figure 5C, (24, 25)).
Absence of correlation does not necessarily imply the absence of a tradeoff. Rather, if the Rubisco mechanism couples kcat,C and kcat,C/KC, much decreased correlation over the extended dataset (R < 0.3) could result from several factors (SI) including bias in data collection leading to undersampling of faster Rubiscos (e.g. from cyanobacteria) or, alternatively, insufficient selection pressure (as diagrammed in Figure 1D).
The second mechanistic tradeoff model (5) - wherein faster CO2 addition entails faster O2 addition as well - is extremely well-supported by the addition of new data (Figure 6). This model was previously supported by a power-law relationship between kcat,C/KC and kcat,O/KO with an exponent of 0.5 (kcat,O/KO ∝ (kcat,C/KC)^0.5). As kcat,C/KC is exponentially related to the first effective carboxylation barrier (ln(kcat,C/KC) ∝ -ΔG1,C) and kcat,O/KO to the first effective oxygenation barrier (ln(kcat,O/KO) ∝ -ΔG1,O), the power-law relationship was taken to imply that decreasing the barrier to CO2 addition will also decrease the barrier to O2 addition (0.5 ΔG1,C - ΔG1,O = C, Figure 6A).
The extended dataset evidences clear power-law correlation between kcat,C/KC and kcat,O/KO (Figure 6B). While some Form II enzymes appear to be strictly inferior to the Form I enzymes on these axes, there is a clear “front” in the kcat,C/KC vs. kcat,O/KO plot. Most measurements lie along a robust line of positive correlation in a log-log plot (22). Fitting the Form I enzymes gives a remarkably high-confidence (R = 0.93, P < 10⁻¹⁰) power-law relationship with kcat,C/KC ∝ (kcat,O/KO)^1.06 (Figure 6B). The ratio of kcat,C/KC to kcat,O/KO is defined as the SC/O and so an exponent of 1 implies constant specificity. Subdividing the Form I enzymes by host physiology (e.g. C3 plants, C4 plants, cyanobacteria, etc.) reveals that all groups with sufficient data display a strong and statistically-significant power-law relationship between kcat,C/KC and kcat,O/KO (Figure 6C, SI (26)). The power-law exponent differs consistently from the value of 0.5 given by (5). We now find a roughly 1:1 relationship of ΔG1,C - ΔG1,O = C, meaning that a decrease in the CO2 addition barrier is associated with an equal decrease in the barrier to O2 addition. We estimate a 95% confidence interval (CI) of 0.98-1.24 for the exponent of this power law relationship for Form I enzymes, about twice the previously reported value.
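The step from a fitted power-law exponent to a relation between barriers can be written out explicitly. The sketch below assumes an Eyring-like form in which the logarithm of each catalytic efficiency equals the negative barrier divided by RT plus a constant; the 1/RT factor and the symbols b, A and c are notation introduced here for illustration, since the text states only proportionality.

```latex
\begin{align}
\frac{k_{cat,C}}{K_C} &= A \left(\frac{k_{cat,O}}{K_O}\right)^{b}
  \;\Rightarrow\;
  \ln\frac{k_{cat,C}}{K_C} = b\,\ln\frac{k_{cat,O}}{K_O} + \ln A \\
-\frac{\Delta G_{1,C}}{RT} &= -b\,\frac{\Delta G_{1,O}}{RT} + c
  \;\Rightarrow\;
  \Delta G_{1,C} - b\,\Delta G_{1,O} = \text{constant} \\
S_{C/O} &= \frac{k_{cat,C}/K_C}{k_{cat,O}/K_O}
  = A \left(\frac{k_{cat,O}}{K_O}\right)^{b-1}
\end{align}
```

Here c absorbs ln A and the additive constants of the two barrier relations. Setting b = 1 makes SC/O equal to the constant A and gives ΔG1,C - ΔG1,O = constant. Applying the same steps to the earlier form kcat,O/KO ∝ (kcat,C/KC)^0.5 recovers 0.5 ΔG1,C - ΔG1,O = constant.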
Implications for the mechanism of CO2/O2 discrimination by Rubisco
Figure 6 shows that the difference between the first effective barriers to carboxylation and oxygenation is roughly constant, such that they are constrained to vary in proportion (ΔG1,C - ΔG1,O = constant). A roughly 1:1 correlation between effective barriers to CO2 and O2 addition suggests that a single factor controls both. We offer a model based on the known catalytic mechanism of Rubisco that could produce a 1:1 relationship between barriers. In this model, the RuBP-bound Rubisco active site fluctuates between reactive and unreactive states (Figure 7A). The fraction of enzyme in the reactive state is denoted ϕ. In the unreactive state neither oxygenation nor carboxylation is possible. In the reactive state, either gas can react at an intrinsic rate that does not vary across Rubiscos of the same class.
This model can be phrased quantitatively as kcat,C/KC ∝ ϕ exp(-ΔG*1,C/RT) and kcat,O/KO ∝ ϕ exp(-ΔG*1,O/RT), where ΔG*1,C and ΔG*1,O are the intrinsic reactivities of the enediolate of RuBP to CO2 and O2 respectively (SI, Figure S8). ϕ is likely determined by the degree of enolization of RuBP - i.e. ϕ = f (ΔGE) where ΔGE is the free energy change of on-enzyme enolization of RuBP (SI). Given this model, we expect to observe a power-law relationship with exponent 1.0 between kcat,C/KC and kcat,O/KO (SI). As specificity SC/O = (kcat,C/KC) / (kcat,O/KO), SC/O should be roughly constant under this model. Though SC/O varies the least of all measured Rubisco kinetic parameters (Figure 2C), it is not constant. Rather, SC/O varies over 3-4 fold among Form I Rubiscos and more than tenfold over the entire dataset (Figure 2C). However, Rubiscos isolated from hosts belonging to the same physiological grouping - e.g. C3 or C4 plants - do display a characteristic and roughly constant SC/O value independent of kcat,C/KC (Figure 7B).
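A small simulation illustrates why a two-state active site would produce an exponent of 1.0 and a constant SC/O. The sketch assumes that each catalytic efficiency equals ϕ multiplied by a Boltzmann-like factor of the intrinsic barrier, with a shared prefactor set to 1. The barrier values, the range of ϕ and the function names are hypothetical, and the slope is estimated by ordinary least squares in log scale purely for simplicity.

```python
import math
import random

RT = 2.479  # kJ/mol at 25 C (8.314e-3 kJ/(mol K) * 298.15 K)


def two_state_efficiencies(phi, dG_C, dG_O):
    """Catalytic efficiencies when a fraction phi of sites is reactive.

    Assumed form: efficiency = phi * exp(-dG / RT), arbitrary units.
    """
    return phi * math.exp(-dG_C / RT), phi * math.exp(-dG_O / RT)


def loglog_slope(xs, ys):
    """Ordinary least squares slope of log10(y) against log10(x)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    return sxy / sxx


# Hypothetical intrinsic barriers (kJ/mol), shared by all simulated enzymes.
DG_C, DG_O = 50.0, 61.0
rng = random.Random(0)
phis = [rng.uniform(0.01, 1.0) for _ in range(50)]  # only phi varies
pairs = [two_state_efficiencies(p, DG_C, DG_O) for p in phis]
eff_C = [c for c, _ in pairs]
eff_O = [o for _, o in pairs]

exponent = loglog_slope(eff_O, eff_C)
specificities = [c / o for c, o in pairs]
print(round(exponent, 6))                        # slope of the log-log relation
print(min(specificities), max(specificities))    # spread of S across enzymes
```

Because ϕ multiplies both efficiencies, it cancels in their ratio, so the simulated specificity depends only on the difference between the two intrinsic barriers.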
The implication of these data and model is that ϕ varies within the various Rubisco groups, perhaps by varying ΔGE, which does not appear in the expression for SC/O (SI). In contrast, the difference between intrinsic reactivities ΔG*1,O − ΔG*1,C appears to vary between the groupings. This would produce roughly constant SC/O among C3 plants while allowing large variation in SC/O between C3 plants, cyanobacteria and proteobacterial Form I Rubiscos. Characteristic variation in SC/O between groups of Form I Rubiscos might be understood via the conformational proofreading model (27). In this model, intentionally reducing complementarity between enzyme and substrate can lead to increased specificity if the change affects off-target substrates (e.g. O2) even more than it affects on-target ones (e.g. CO2). A full derivation of this model and discussion of its potential implications is given in the SI.
Discussion
We collected and analyzed >200 literature measurements of Rubisco kinetic parameters (Figure 2A). The collection is quite biased, with the readily-purified Rubiscos of land plants making up ≈80% of the data (Figure 2B). Better sampling of Rubisco diversity including more algal, bacterial and archaeal Rubiscos would greatly improve our understanding of the evolution and capacity of this enzyme (28). Despite incomplete coverage, some trends are clear. All Rubisco kinetic parameters display limited dynamic range, with standard deviations in log-scale being less than one order-of-magnitude in all cases (Figure 2C). Rubisco is also noteworthy in displaying much less variability in kcat than any other enzyme for which multiple measurements are available (Figure S4). The highest measured kcat,C at 25°C is 14 s⁻¹ (S. elongatus PCC 7942) and the enzyme with the greatest affinity for CO2 has KC ≈ 3.3 μM at 25°C (G. sulphuraria). Many Rubiscos are quite slow oxygenators with half of measurements having kcat,O < 1 s⁻¹ (Figure 2A). Similarly, many Rubiscos have relatively low affinity for O2 - the median KO is 465 μM. The Rubisco of the diatom Thalassiosira weissflogii, for example, has a KO ≈ 2 mM (29), corresponding to roughly ten times the ambient O2 concentration (30).
Specificity SC/O varies the least of all Rubisco parameters (Figure 2C). Form I Rubiscos are typically much more CO2-specific than their Form II, III and II/III counterparts (Figure 7B, SI). This might be explained by the prevalence of Form II, III and II/III enzymes in bacteria and archaea that fix CO2 in anaerobic conditions, where oxygenation should not appreciably compete with carboxylation. We note, however, that there is substantial variation among measurements of the model Form II Rubisco from R. rubrum. This and the general paucity of data on non-Form I Rubiscos indicates that more measurements are required to evaluate stereotyped differences within and between Form II, III and II/III Rubiscos.
In order to understand the limits of the Rubisco catalytic mechanism, we examined correlations between Rubisco kinetic parameters (Figure 3). Given the wide range of organisms studied, various methods of measurement applied in inconsistent buffer conditions, etc. we focused here on log-scale correlations among Form I Rubiscos for which abundant data is available (Figure 2B). This approach enables us to examine constraints within that group, but does not directly address trends between the Rubisco isoforms (e.g. stereotyped differences between Form I, II, II/III and III) as data is scant for these comparisons. Overall, the addition of new measurements weakens correlations between Rubisco kinetic parameters (Figure 3). The first principal axis of variation in the data (PC1) explains substantially less of the variance in Rubisco kinetics than in previous analyses (~70% as compared to > 90%, SI). Increased variation within the extended dataset manifests as weaker correlations between some pairs of kinetic parameters. A plot of kcat,C against SC/O (Figure 4A) shows that the focal correlation of (4) is not strongly supported by the data. Similarly, a plot of kcat,C vs KC shows that the focal correlation of (5) is substantially weakened (Figure 4B). Our understanding of weakened correlations is that the tradeoff model of (4) does not capture as much of the variation in the Rubisco mechanism as could be previously surmised. Another interpretation, however, is that natural Rubiscos are not “perfectly optimized” - that factors other than Rubisco limit autotrophic growth and natural selection has not pushed Rubiscos to the limits of their catalytic capacity (Figure 1C).
Examining the two mechanistic tradeoff models in (4, 5) we showed that only one is supported by our larger dataset (Figures 5, 6). The chemical intuition underlying both proposals is that the intrinsic difficulty of binding and discriminating between CO2 and O2 requires the enzyme to differentiate between carboxylation and oxygenation transition states. The requirement of TS discrimination is a direct consequence of two common assumptions that are supported by experimental evidence (31). Briefly, it is assumed that addition of either gas is irreversible and that there is no binding site for CO2 or O2 and, thus, no so-called Michaelis complex (4, 5, 31–33). If CO2 bound a specific site on Rubisco before reacting, KC could be modulated by mutation without substantially affecting the kinetics of subsequent reaction steps. In the less likely case that gas addition is substantially reversible, (33, 34) we would expect to find Rubiscos that evolved enhanced selectivity by energy-coupled kinetic proofreading. Energy coupling would enable amplification of selectivity determined by differential CO2 and O2 off-rates (35). The fact that no such Rubiscos have been found suggests that gas addition is irreversible or that the off-rates of CO2 and O2 are incompatible with kinetic proofreading in some other way. For this reason we suggest that the higher specificity of C3 plant and red algal Rubiscos is achieved through a mechanism like conformational proofreading, which does not require energy coupling or reversible gas addition (5, 27).
Tcherkez et al. 2006 suggest that high specificity (i.e. large SC/O) is realized via a late carboxylation TS which is maximally discriminable from the oxygenation TS (4). As a late TS resembles the carboxyketone carboxylation intermediate, specific Rubiscos must tightly bind the carboxyketone, which throttles the subsequent hydration and cleavage steps (Figure S2). The extraordinarily tight binding of the carboxyketone analog CABP to plant Rubisco provides strong support for this model. Savir et al. 2010 articulates a related model, noting that kcat,C and kcat,C/KC are inversely correlated in their dataset (5). Since kcat,C/KC is related to effective barrier to enolization and CO2 addition and kcat,C is related to the effective barrier to hydration and cleavage (Figure 1B), Savir et al. argued that a lower effective barrier to CO2 addition entails a higher barrier for the subsequent steps (i.e. a lower kcat,C, Figure 5A). In both of these descriptions, the initial steps of carboxylation are negatively coupled to the subsequent steps in a manner that produces the correlations. However, those correlations - between SC/O and kcat,C, KC and kcat,C and kcat,C/KC and kcat,C - are attenuated by the addition of new measurements (Figures 3, 4) which calls these proposals into question. Importantly, we do not argue that the chemical logic advanced by (4) is incorrect, but rather that the assembled data do not support such a tradeoff being optimized by the evolution of Form I Rubiscos.
The second tradeoff model posited by (5) is that faster CO2 addition to the Rubisco-RuBP complex necessarily allows faster O2 addition. This was evidenced by a positive power-law correlation between the catalytic efficiencies for carboxylation and oxygenation (kcat,C/KC and kcat,O/KO respectively), which can be understood as a positive coupling of the effective barrier to enolization and gas addition (Figure 6A, SI). We have shown that the extended dataset strongly supports this power-law relation and suggests that there exists a front along which lowering the effective CO2 addition barrier (enabling faster carboxylation) requires a roughly equal reduction in the effective O2 addition barrier (i.e. causing faster oxygenation as well). A power law relation with an exponent of 1.0 can be seen as resulting from an active site that fluctuates between a reactive and unreactive state (Figure 7A). In this model, the average occupancy of the reactive state dictates the rate of CO2 and O2 addition and throttles the subsequent steps of carboxylation and oxygenation equally (Figure 7). This model can be mapped onto the Rubisco mechanism by noting that RuBP must be enolized before CO2 or O2 can react, suggesting that the occupancy of the reactive state (ϕ) is related to the degree of enolization of RuBP (SI, Figure S8). One implication of this model is that SC/O is roughly constant. While SC/O does vary over roughly tenfold across the entire dataset and 3-4 fold across Form I enzymes (Figure 1B), Rubiscos from the same physiological groupings do display roughly constant SC/O values independent of kcat,C/KC (Figure 7B). More measurements of bacterial Form I, II and III enzymes will be crucial to evaluate the generality of this observation.
We note, however, that the power-law relationship between kcat,C/KC and kcat,O/KO may not have an exponent of exactly 1.0. The Form I data are consistent with exponents ranging from 0.98-1.24 and fits to data from C3 and C4 plants give exponents ranging from 0.8-1.0 (Figure 6C). Therefore, in order to better resolve the evolutionary constraints imposed on Rubisco kinetics, we suggest several avenues for future research. First, the kinetics of non-plant Rubiscos should be characterized much more thoroughly. These should include the Form II, III and II/III enzymes of bacteria and archaea as well as Form I enzymes of cyanobacteria and diverse Eukaryotic autotrophs (28). Ideally these enzymes would be sampled from accumulated genomic data in a manner that maximizes sequence and phylogenetic diversity (36) and characterized for their binding (e.g. of RuBP and CABP) and catalytic activity (i.e. measuring kcat,C, KC, kcat,O, KO and SC/O) as a function of temperature and pH as in (37, 38). These data would likely resolve whether different Rubisco isoforms have characteristic differences in their catalytic potential.
Furthermore, it is important to revisit the classic experiments undergirding our understanding of the Rubisco catalytic mechanism, especially those supporting the central assumptions that (a) there is no Michaelis complex for CO2 or O2 and (b) gas addition is irreversible (31, 33, 34). Other carboxylases, for example crotonyl-CoA carboxylase/reductase, accept CO2 as a substrate without any apparent affinity for O2 (39), which leads us to wonder what makes Rubisco unique in this regard (8). One avenue for deeper study of the Rubisco mechanism would be measurement of carbon and oxygen kinetic isotope effects (KIEs) for a wide variety of Rubiscos. Kinetic isotope effects report indirectly on transition state barrier heights (40, 41) and so investigating the relationship between kinetic isotope effects and kinetic parameters will refine our current understanding of the catalytic mechanism (4).
There remains some disagreement about the precise ordering of the carboxylation mechanism (4, 12, 13) and the mechanism of oxygenation is not well understood (26). Chemical reasoning about the mechanisms of Rubisco carboxylation and oxygenation would benefit from progress in structural biology - intermediates and transition state analogues should be used to capture the active site at various points along the reaction trajectory (12, 26, 42–44). If experiments and structural analyses confirm that the above assumptions hold for all Rubiscos, it would greatly limit our capacity to engineer Rubisco and strongly suggest that alternative strategies for improving carbon fixation should be pursued (45, 46). If, however, these assumptions are invalidated, many enzyme engineering strategies would be viable. Such data and analyses will be instrumental in guiding the engineering of carbon fixation for the next decade.
Methods
Data collection and curation
We reviewed the literature to find Rubisco kinetic data measured at 25°C and near pH 8. Ultimately 61 primary literature studies were included, yielding 319 SC/O, 275 kcat,C, 310 KC, and 256 KO values for Rubiscos from 286 distinct organisms (Datasets S1 and S2). We also recorded 51 measurements of the Michaelis constant for RuBP (KRuBP). Experimental error was recorded for all of these values along with the pH, temperature and assumed pKa. Cases where the soluble CO2 concentration was derived in a different manner were specifically noted. Data were filtered as described in SI. kcat,O is usually not measured directly, but is rather inferred as kcat,O = (kcat,C/KC) × KO / (SC/O). When an uncertainty is reported, we assumed that the underlying experimental noise is normally distributed and used 10⁴-fold bootstrapping to estimate 198 kcat,O values and 95% confidence intervals thereof. We used an identical procedure to estimate kcat,C/KC and kcat,O/KO (SI). Datasets S1 and S2 provide all source and inferred data.
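The inference of kcat,O with uncertainty can be sketched as a resampling procedure: draw each measured parameter from a normal distribution defined by its reported mean and error, compute kcat,O for each draw, and summarize the draws. This is an illustrative sketch rather than the repository code. The function name, the use of the median as a point estimate, the percentile indexing and the rejection of non-positive draws are all assumptions, and the input values are hypothetical.

```python
import random
import statistics


def infer_kcat_O(kcat_C, K_C, K_O, S, n_draws=10_000, seed=0):
    """Estimate kcat_O = (kcat_C / K_C) * K_O / S with a 95% interval.

    Each argument is a (mean, standard deviation) pair. Noise is assumed to be
    normally distributed and independent between parameters.
    """
    rng = random.Random(seed)
    draws = []
    while len(draws) < n_draws:
        kc, Kc, Ko, s = (rng.gauss(m, sd) for m, sd in (kcat_C, K_C, K_O, S))
        if min(kc, Kc, Ko, s) <= 0:
            continue  # assumption: discard unphysical non-positive draws
        draws.append(kc * Ko / (Kc * s))
    draws.sort()
    lower = draws[int(0.025 * n_draws)]
    upper = draws[int(0.975 * n_draws) - 1]
    return statistics.median(draws), (lower, upper)


# Hypothetical measurements with 5% errors; the point estimate is about 1 s^-1.
estimate, (ci_low, ci_high) = infer_kcat_O(
    kcat_C=(3.0, 0.15), K_C=(10.0, 0.5), K_O=(300.0, 15.0), S=(90.0, 4.5)
)
print(estimate, ci_low, ci_high)
```

The same resampling loop could return kcat,C/KC or kcat,O/KO instead by changing the quantity computed inside the loop.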
Fitting power laws
In contrast to textbook examples with one independent and one dependent variable, there is experimental error associated with both variables in all correlation calculations here. As such we used total least squares linear regression in log scale to fit power-law relationships between Rubisco parameters. Because R² values of total least squares fits do not convey the explained fraction of Y axis variance, they are challenging to interpret. As such, we report the quality of correlations as Pearson R values. Bootstrapping was used to determine 95% confidence intervals for power-law exponents and prefactors (slopes and intercepts of linear fits in log scale). In each iteration of the bootstrap, data were subsampled to 90% with replacement. Total least squares regression was applied to each subsample to determine a point estimate of R, power-law exponent and prefactor. This procedure was repeated 10⁴ times to determine a 95% confidence interval on the above parameters. Python source code is online at github.com/flamholz/rubisco.
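The fitting procedure can be sketched with the closed-form solution for a two-variable total least squares (orthogonal) fit, applied to log-transformed data and wrapped in a bootstrap. This is a minimal illustration under stated assumptions, not the code in the repository: the function names are hypothetical, errors are treated as equal in both log-scale variables, and resamples with fewer than two distinct points are skipped.

```python
import math
import random


def tls_fit(xs, ys):
    """Total least squares line y = slope * x + intercept (orthogonal fit)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx


def bootstrap_exponent(x, y, n_iter=10_000, frac=0.9, seed=0):
    """95% interval for the power-law exponent of y versus x.

    Each iteration resamples frac * len(x) points with replacement and fits
    log10(y) against log10(x) by total least squares.
    """
    rng = random.Random(seed)
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    k = max(2, int(frac * len(lx)))
    slopes = []
    while len(slopes) < n_iter:
        idx = [rng.randrange(len(lx)) for _ in range(k)]
        if len(set(idx)) < 2:
            continue  # degenerate resample, no line can be fit
        slope, _ = tls_fit([lx[i] for i in idx], [ly[i] for i in idx])
        slopes.append(slope)
    slopes.sort()
    return slopes[int(0.025 * n_iter)], slopes[int(0.975 * n_iter) - 1]


# Synthetic data: y = 3 * x**1.06 with multiplicative noise (hypothetical).
rng = random.Random(1)
x_demo = [10 ** rng.uniform(-1, 1) for _ in range(40)]
y_demo = [3.0 * v ** 1.06 * 10 ** rng.gauss(0, 0.05) for v in x_demo]
print(bootstrap_exponent(x_demo, y_demo, n_iter=1000))
```

Unlike ordinary least squares, this fit treats the two variables symmetrically: swapping x and y yields the reciprocal slope, which is the property that matters when both axes carry measurement error.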
Acknowledgements
We would like to thank Uri Alon, Kapil Amarnath, Doug Banda, Arren Bar-Even, Cissi Blikstad, Jack Desmarais, Woodward Fischer, Vahe Galstyan, Laura Helen Gunn, Itai Halevy, Robert Nichols, Elad Noor, Yonatan Savir, Patrick Shih, Daniel Stolper, Dan Tawfik, Guillaume Tcherkez, Tsvi Tlusty and Renee Wang for helpful conversations and comments on the manuscript.