Summary
Cellular signalling processes can exhibit pronounced cell-to-cell variability in genetically identical cells. This affects how individual cells respond differentially to the same environmental stimulus. However, the origins of cell-to-cell variability in cellular signalling systems remain poorly understood. Here we measure the temporal evolution of phosphorylated MEK and ERK across populations of cells and quantify the levels of population heterogeneity over time using high-throughput image cytometry. We use a statistical modelling framework to show that upstream noise is the dominant factor causing cell-to-cell variability in ERK phosphorylation, rather than stochasticity in the phosphorylation/dephosphorylation of ERK. In particular, the cell-to-cell variability during sustained phosphorylation stems from random fluctuations in the background upstream signalling processes, while during transient phosphorylation, the heterogeneity is primarily due to noise in the intensity of the upstream signal(s). We show that the core MEK/ERK system uses kinetic proof-reading to faithfully and robustly transmit these variable inputs. The MAPK cascade thus propagates cell-to-cell variability at the population level, rather than attenuating or increasing it.
Introduction
The behaviour of eukaryotic cells is determined by the intricate interplay between signalling and regulatory processes. Within a cell each single molecular reaction occurs randomly (stochastically) and the expression levels of molecules can vary considerably in individual cells (Bowsher and Swain, 2012b). These non-genetic differences can and frequently will add up to macroscopically observable phenotypic variation (Balázsi et al., 2011; Spencer et al., 2009; Spiller et al., 2010). Such variability can have organism-wide consequences, especially when small differences in the initial cell populations are amplified among their progeny (Pujadas and Feinberg, 2012; Quaranta and Garbett, 2010). Cancer is the canonical example of a disease caused by a sequence of chance events that may be the result of amplifying physiological background levels of cell-to-cell variability.
Better understanding of the molecular mechanisms behind the initiation, enhancement, attenuation and control of this cellular heterogeneity should help us to address a host of fundamental questions in cell biology and experimental and regenerative medicine. Characterisations of the origins of cell-to-cell variability in biological systems have so far generally related to gene expression (Elowitz et al., 2002; Hilfinger and Paulsson, 2011; Swain et al., 2002). However, cell-to-cell variability can arise without substantial contributions from transcriptional and translational processes and also characterizes signal transduction at the single cell level (Colman-Lerner et al., 2005; Jeschke et al., 2013). It has now become possible to track populations of eukaryotic cells at single cell resolution over time and measure the changes in the abundances of proteins. For example, rich temporal behaviour of p53 (Batchelor et al., 2011; Geva-Zatorsky et al., 2006) and NF-κB (Ashall et al., 2009; Nelson et al., 2004; Paszek et al., 2010) has been characterized in single-cell time-lapse imaging studies. Because these studies tracked only a small number of cells continuously, they have typically lacked the statistical power needed to gauge the causes of cell-to-cell variability, although even this is now becoming possible (Selimkhanov et al., 2014). Alternatively, measurements can be obtained by quantitative flow or image cytometry (Ozaki et al., 2010), where data are obtained at discrete time points but encompass thousands of cells, which allows one to investigate the causes of cell-to-cell variability. In the present study, this latter methodology is applied to mitogen activated protein kinase (MAPK) signalling cascades. MAPK mediated signalling affects cell fate decision making processes (proliferation, differentiation, apoptosis and cell stasis) and cell motility.
The mechanisms of MAPK cascades and their role in cellular information processing have been investigated extensively (Aoki et al., 2011; Kiel and Serrano, 2009; Mody et al., 2009; Piala et al., 2014; Sturm et al., 2010; Takahashi et al., 2010; Voliotis et al., 2014). Our aim is to gauge and characterize sources and effects of variability in MAPK signalling, focusing on the extracellular-signal-regulated kinase (ERK) and its response to external stimulation (see Figure 2A). We use quantitative image cytometry to probe the cellular abundances of active ERK and its cognate kinase MEK in a large number of PC12 cells collected at different times. This is coupled to a detailed Bayesian analysis of mathematical models of the MEK-ERK signalling cascade, where we infer the modes of ERK phosphorylation and dephosphorylation and quantify the temporal effects of different sources of noise on the level of population heterogeneity in activated ERK and MEK.
The molecular causes underlying population heterogeneity are only poorly understood, but two notions have come to dominate the literature: intrinsic and extrinsic causes of cell-to-cell variability (Bowsher and Swain, 2012a; Hilfinger and Paulsson, 2011; Komorowski et al., 2010; Swain et al., 2002; Toni and Tidor, 2013) (see Figure 1A-D). The former refers to the chance events governing the molecular collisions in biochemical reactions. Each reaction occurs at a random time leading to stochastic differences between cells over time. The latter subsumes all those aspects of the system which are not explicitly modelled. This includes the impact of stochastic dynamics in any components upstream and/or downstream of the biological system of interest which may be caused, for example, by the stage of the cell cycle (which will affect cell sizes, transcription activity, and availability of free ribosomes and proteasomes) and the multitude of factors deriving from it. To date the most comprehensive characterisations of the interplay between intrinsic and extrinsic noise in biological systems generally relate to gene expression (Elowitz et al., 2002; Swain et al., 2002), using, for example, dual reporter assays which explicitly separate out extrinsic and intrinsic sources of variability (Hilfinger and Paulsson, 2011). These assays, however, cannot be used in all conditions; it is therefore important to develop alternative approaches that can distinguish between the different noise sources. Here we develop an in silico statistical model selection (Kirk et al., 2013) framework for this purpose and we demonstrate that we can confidently implicate extrinsic noise as the dominant factor giving rise to cell-to-cell variability in MAPK signalling. Further analysis of the MAPK dynamics allows us to highlight and attribute the origins of the extrinsic variability to biological processes upstream. 
In particular, we will show that the cell-to-cell variability in transient phosphorylation is derived from noise in the intensity of the upstream signals in reaction to the applied external stimulus. During sustained phosphorylation, by contrast, variability stems primarily from noise in the background of the upstream signal as well as in the degradation of the kinase. To further substantiate our results we propose (Silk et al., 2014) and implement new experimental interventions which have allowed us to conclusively rule out any non-negligible impact of intrinsic noise. The workflow adopted in this analysis is summarized in Figure 1E.
Results
Quantifying temporal evolution of cell-to-cell variability
We investigate the causes of cellular heterogeneity in vivo during ERK activation by doubly phosphorylated MEK in PC12 cells. This cell-to-cell variability study is based on measurements of the concentration of doubly phosphorylated MEK and ERK at the single cell level obtained by quantitative image cytometry. Cells are plated in medium containing a fixed amount of nerve growth factor (NGF) as the stimulus at time t = 0. Every two minutes cells in one well are fixed in order to quantify the concentration of the two proteins of interest, providing us with a series of cross-sectional snapshots of the joint protein distributions of doubly phosphorylated MEK and ERK (i.e. the sum of free and complex bound forms), see Figure 3A.
The observed distributions of the total amount of doubly phosphorylated MEK and ERK are illustrated in Figure 3B, and Figure 3C shows the evolution of the variance, the coefficient of variation and the Fano factor over time for both proteins. The variance over the cell population of the concentration is of the order of 10⁵ and varies significantly with time. Because both the coefficient of variation and the variance of the amount of doubly phosphorylated ERK vary with time we can rule out the possibility that the variability in the protein concentration measurements has been caused by additive or multiplicative measurement noise, see Figure 1A. In addition, the experimental noise in QIC has been estimated by Uda et al. (see Figure S2 in (Uda et al., 2013)) and is found to be negligible compared to the level of cell-to-cell variability. Any analysis of the origins of cell-to-cell variability requires us to determine the modes of ERK phosphorylation and dephosphorylation. ERK activation involves phosphorylation at both its tyrosine and threonine phosphorylation sites by its cognate kinase MEK (Ferrell and Bhatt, 1997; Ferrell et al., 2014). Previous studies (Toni et al., 2012) have shown that in vivo phosphorylation (as well as dephosphorylation) occurs in two steps where the kinase binds to the protein twice in order to phosphorylate the two sites successively (see Figure 2B). Using a Bayesian model selection (Kirk et al., 2013) approach, we confirm that this distributive mechanism (Ferrell et al., 2014) best captures the observed average behaviour in our data (see Supplemental Information).
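The three dispersion statistics tracked over time above can be computed from each cross-sectional snapshot independently. The following is a minimal sketch in Python, assuming each snapshot is simply a list of per-cell intensities; the function name and the numbers are hypothetical and are not taken from the study. Purely additive measurement noise would leave the variance constant over time, and purely multiplicative noise would leave the coefficient of variation constant, which is why both are tracked.

```python
import statistics

def dispersion_stats(values):
    """Return (variance, coefficient of variation, Fano factor) for one snapshot."""
    mean = statistics.fmean(values)
    var = statistics.variance(values)  # sample variance (n - 1 denominator)
    cv = var ** 0.5 / mean             # standard deviation divided by the mean
    fano = var / mean                  # variance divided by the mean
    return var, cv, fano

# Hypothetical per-cell intensities at two time points (illustration only).
snapshots = {
    0: [100.0, 120.0, 80.0, 110.0, 90.0],
    2: [400.0, 900.0, 250.0, 700.0, 550.0],
}
stats_by_time = {t: dispersion_stats(v) for t, v in snapshots.items()}
```

Comparing the entries of `stats_by_time` across time points shows directly whether the variance, the coefficient of variation, or both change during the response.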
We therefore base our analysis of the origins of cell-to-cell variability on this mechanistic model with 20 model parameters including 12 reaction rates (see Figure 2B Middle and Bottom), 4 parameters describing the impact of the NGF stimulus and upstream signals (see Figure 2B Top) and 4 parameters controlling the initial concentrations of the species involved in the ERK–MEK core system (see Supplemental Information).
Intrinsic noise alone cannot explain the observed variabilities between cells
While it is straightforward to model extrinsic and intrinsic noise, quantifying their relative contributions to real molecular systems has thus far only been possible for systems where two-reporter assays are available (Elowitz et al., 2002; Swain et al., 2002). Here we develop a statistical framework that allows us to obtain quantitative insights into the roles of these two sources of noise for signalling systems where direct measurements are typically not possible.
Extrinsic sources of variability stem from all those elements of the “real system” that are not explicitly modelled; these typically include factors such as inherent differences between the cells in terms of cell-size, stage of cell-cycle, protein concentrations at the start of the experiment, and other biophysical parameters. To capture such effects we allow model parameters to differ between cells (Shahrezaei et al., 2008; Toni and Tidor, 2013): the parameters for each cell are drawn from a log-normal distribution (with “hyper-parameters” (Gelman et al., 2013) for means and variances that will be inferred from the data). The potential sources of extrinsic noise are: differences in the reaction rates between cells, different initial concentrations of ERK and MEK, and differences in the upstream signalling cascades feeding into the MEK dynamics.
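To make the extrinsic noise model concrete, the sketch below draws one parameter set per cell from independent log-normal distributions specified by a mean and a variance. It is an illustration only: the helper names and the hyper-parameter values are hypothetical, and the conversion assumes that the hyper-parameters are the mean and variance of the log-normal variable itself rather than of its logarithm.

```python
import math
import random

def lognormal_params(mean, variance):
    """Convert the mean and variance of a log-normal variable into the
    (mu, sigma) of the underlying normal distribution."""
    sigma2 = math.log(1.0 + variance / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)

def sample_cell_parameters(hyper, n_cells, seed=0):
    """Draw one parameter set per cell; `hyper` maps name -> (mean, variance)."""
    rng = random.Random(seed)
    cells = []
    for _ in range(n_cells):
        cell = {}
        for name, (mean, variance) in hyper.items():
            mu, sigma = lognormal_params(mean, variance)
            cell[name] = rng.lognormvariate(mu, sigma)
        cells.append(cell)
    return cells

# Hypothetical hyper-parameters for illustration only (not inferred values).
hyper = {"k1": (2.0, 1.0), "k2": (0.5, 0.01), "k10": (0.1, 0.0025)}
cells = sample_cell_parameters(hyper, n_cells=20000)
```

Each element of `cells` would then be passed to a deterministic simulation of the model, so that all variability in the simulated population comes from the parameter differences.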
Using the Bayesian framework developed in Experimental procedures and Supplemental Information we analyze the roles of intrinsic and extrinsic noise in the single cell data. The resulting statistical model evidence indicates that the extrinsic noise model best explains the data. The evolution of the obtained distributions for MEK and ERK is shown and compared to the data in Figure 4A: only the extrinsic noise model can explain the observed high levels of cell-to-cell variability.
To substantiate this further (and to explore the parameter space more widely) we use Latin hypercube sampling to generate a set of 10⁶ parameter vectors and systematically analyse the evolution of the molecular concentrations of MEK and ERK for each of these parameters. Only 20 parameter vectors out of the 10⁶ lead to stable solutions for which the obtained variances of doubly phosphorylated ERK and MEK are higher than 10⁵ (at either 6 or 8 minutes after stimulation); but for none of these parameters do we observe a variance of doubly phosphorylated ERK that is anywhere close to the experimental observations (where the variance is ≈ 3 × 10⁵).
Variation in initial conditions is also not sufficient to generate the observed cell-to-cell variability; this is easily seen by sampling different values for the initial concentration of the species involved in the ERK-MEK system according to a log-normal distribution (with mean and variance given by the inferred hyper-parameters for the extrinsic noise case) and simulating the model with intrinsic noise for each of these initial conditions. The total variance, which is the sum of (a) the mean over the different initial conditions of the variance due to the intrinsic noise, and (b) the variance over the different initial conditions of the mean over the intrinsic variability, is shown in Figure 4B. This shows that the variance including variation in initial conditions does not differ appreciably from the variance of intrinsic noise alone.
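The decomposition into terms (a) and (b) is the law of total variance. Writing X for the abundance of a species at a given time and Z for the sampled initial conditions (both symbols are notation introduced here for illustration), it reads:

```latex
\[
\operatorname{Var}(X)
  = \underbrace{\mathbb{E}_{Z}\!\left[\operatorname{Var}(X \mid Z)\right]}_{\text{(a) mean of the intrinsic variance}}
  + \underbrace{\operatorname{Var}_{Z}\!\left(\mathbb{E}[X \mid Z]\right)}_{\text{(b) variance of the conditional mean}}
\]
```

If term (b) is small compared with term (a), varying the initial conditions adds little to the variance produced by intrinsic noise alone.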
In a biological system we expect extrinsic and intrinsic sources of noise: the cells are likely to be different in terms of initial molecular concentrations and stage of cell-cycle, and the biochemical reactions occur at random times (Komorowski et al., 2013). We therefore compare the variances of the observed molecular species under extrinsic noise alone with the total variances under both extrinsic and intrinsic noise. From Figure 4B it is apparent that the contribution of intrinsic noise to the total variation is negligible.
An immediate prediction that follows from the above analysis is that the core MEK-ERK system as described here is a reliable and faithful information processing unit: little noise is introduced here, and different signals are mapped onto distinct outcomes in a predictable manner. As a corollary of this we know that the MAPK system does not introduce cell-to-cell variability into the down-stream cellular pathways.
In order to test this prediction and validate the model further we consider the response of the MEK-ERK system to different stimuli; if cell-to-cell variability is due to MEK and ERK dynamics, then the parameterized model developed above should not be able to describe the dynamics. On the contrary, we find that the extrinsic noise model can explain the response of the MEK-ERK system to stimulation by EGF, Figure 5A, and different NGF stimulus intensities, Figure 5B (see also Supplemental Information for a more extensive analysis). Here we have used the hyper-parameters inferred previously except for those that correspond to the upstream dynamics (which are known to depend on the stimulus strength and temporal pattern, see (Fujita et al., 2010)); these and only these were inferred directly from the EGF and NGF timecourses. The model with extrinsic noise shows good qualitative and quantitative agreement between model predictions and the new data obtained for different NGF stimulus levels. Thus our extrinsic noise MEK-ERK model is capable of predicting the response to stimuli other than those used in the model development.
Fluctuations in the upstream reactions and in the degradation rate of the kinase explain most of the cell-to-cell variability
Our Bayesian analysis allows us to assess directly which parameters differ most between cells. For each parameter we have estimates of the coefficient of variation across cells, and the parameters that contribute most to the observed cell-to-cell variability are those for which the inferred coefficient of variation is consistently and significantly different from zero (see Supplemental Information). We find five strongly contributing factors: three model parameters (k1, k2 and k10) and the two initial conditions that describe the level of background activity present in the cell at the point of stimulation. The pulse height, k1, and the background upstream signal, k10, jointly characterise the impact of the NGF stimulus and the upstream reactions on the evolution of active MEK (see Figure 2B, Top). The degradation rate of active MEK (k2) affects the steady state levels of cell-to-cell variability, and the role of degradation reactions in determining levels of noise (and thus cell-to-cell variability) has been well documented (Komorowski et al., 2013). In Figure 6A we illustrate the predominant role that the upstream parameters have on the extent of cell-to-cell variability in this system.
In Figure 6B we further show that other factors (measurements for cell-size and volume, and Hoechst level, the dye used to quantify nucleic acid levels) make only negligible contributions to observed levels of cell-to-cell variability. The total amounts of doubly phosphorylated ERK and MEK have the highest partial correlation and we can thus rule out cell-cycle etc. as explanations for, or causes of, the temporal variability in the amount of active ERK.
Impact of cell-to-cell variability on Cellular Information Processing
We conclude our analysis by investigating the role that noise plays in mediating the response of MAPK signalling cascades to external stimuli. We analyse the level of cell-to-cell variability in the system’s output (i.e. the total amount of doubly phosphorylated ERK) as a function of how variable the inputs are; the inputs are captured by the transient and sustained upstream intensities, k1 and k10, and their respective variances over the cell population, σk1 and σk10. We simulate system output for given values of σk1 and σk10 and compute the ratio λ(σk1, σk10, t) = std(ppERKt) / std(ppERKt^max), where std(ppERKt) is the standard deviation of the output at time t, and std(ppERKt^max) is the standard deviation of the system’s output at time t if the variance of the input is maximal (the maximal values of σk1 and σk10 are expressed in terms of μk1 and μk10, the means over the cell population for, respectively, k1 and k10). This ratio quantifies the change in the level of cell-to-cell variability in the system’s output as the input noise is decreased.
In the first instance we assume that only the input signal strengths (k1 and k10) vary between cells; all other model parameters are fixed to the inferred posterior mean values. The evolution of λ(σk1, σk10, t) over time when varying the variances σk1 and σk10 is shown in Figure 6C (left column). Before t = 8 minutes, λ(σk1, σk10, t) increases with σk1 whereas σk10 has no impact on λ(σk1, σk10, t). Conversely, after t = 24 minutes, λ(σk1, σk10, t) increases with σk10 but σk1 no longer affects output variability. Thus variability in active ERK abundance across the cell population is initially strongly influenced by the variability in pulse height, and subsequently by the variability in the sustained or background signal.
To investigate the effect of the variability in all model parameters on cellular information processing, we also simulate the system under extrinsic noise (varying all model parameters between cells), and compute once more λ(σk1, σk10, t) for different signal variabilities. It is apparent from Figure 6C (right column) that, under the extrinsic noise model, the level of cell-to-cell variability in the system’s output remains high even when the variability in the system’s input has been decreased considerably (λ ≈ 0.45 when σk1 and σk10 are divided by 20). Therefore, the presence of extrinsic noise weakens the influence of the variability in the upstream signal upon the cell-to-cell variability in the system’s output.
To follow on from this, we compute the mutual information between the total amount of ppMEK and the total amount of ppERK at different time points, simulating the system under extrinsic noise or varying only the parameters that seem to be related to most of the cellular variability (k1, k2 and k10). We observe in Figure 6D (Left) that the presence of extrinsic noise decreases the level of transfer of information between the two species of interest. This difference can be easily explained by comparing the joint distribution of the concentration of ppERK and ppMEK when the system is simulated under the full extrinsic noise model or only varying the ‘driving’ parameters (see Figure 6D Right). Even though varying only the ‘driving’ parameters explains the evolution of the variance and correlation between the two proteins, only the full extrinsic noise model captures the shape of the joint distribution.
Discussion
We have used quantitative image cytometry to elucidate the causes of population heterogeneity in the MAPK signalling cascade and presented a comprehensive analysis of cell-to-cell variability in the activation dynamics of the MEK–ERK system in response to environmental stimuli. Our analysis shows that the in vivo modes of ERK phosphorylation and dephosphorylation are distributive. With a reliable model for the (de–)phosphorylation mechanisms (Toni et al., 2012) in hand, we were then able to dissect the nature of the cell-to-cell variability inherent in the data. Recent MAPK models proposed in the literature (Ferrell et al., 2014; Harrington et al., 2013; Ortega et al., 2006; Sturm et al., 2010; Voliotis et al., 2014) allow for very rich dynamics and a priori it is therefore impossible to make an appeal to the large number of MEK, ERK and other molecules present in the eukaryotic cell, in order to rule out a role for intrinsic noise.
The detailed analysis of these alternative mechanisms gives a clear verdict in favour of extrinsic noise as the dominant factor for the observed cell-to-cell variability in the MEK–ERK system. Few, if any, parameters appear to be tightly constrained across the populations of cells considered here. For some parameters we do, in fact, find strong evidence that they vary quite considerably between cells; but the MEK–ERK core system itself adds little to the observed levels of cell-to-cell variability, and is capable of transmitting upstream information faithfully. Thus differences in reactions upstream from the MEK–ERK core are passed on by the cascade to the downstream machinery. We propose that cells employ temporal selection of different noise sources for their intra-cellular information processing. In particular, we show that the cell-to-cell variability during sustained phosphorylation stems from random fluctuations in the background or base-line upstream signalling processes, while during transient phosphorylation, the cellular heterogeneity in ERK activity is primarily due to noise in the intensity of the upstream signal(s). The stage at which a cell is in its cell cycle is an obvious potential cause for cell-to-cell variability, but here we find that this can explain only a fraction of the overall extent of heterogeneity in the abundance of active ERK.
We found that extrinsic noise in the MAPK system considered here tends to attenuate variability in the up-stream signal prior to it arriving at MEK. The distributively operating MEK–ERK system is furthermore capable of kinetic proof-reading (Hlavacek et al., 2001; Murugan et al., 2012), and the combination of this mechanism with the behaviour observed for the extrinsic noise makes this a very effective filter for noisy upstream signals, especially at the population-level. Given the importance of MAPK systems in different cell-fate decision making processes such robustness to noise is clearly important. But while kinetic proof-reading confers robustness to all cells similarly, the extrinsic variability will mean that some cells may be better poised to process environmental signals subject to noise than others, which would lend robustness at the population-level, similar to bet-hedging behaviour in evolutionary biology (Kussell and Leibler, 2005; Stumpf et al., 2002). In development and tissue homeostasis (Rué and Arias-Martinez, 2015) (and in regenerative medicine) it may be important to find ways to regulate population-level behaviour further, and here other inter- and intra-cellular feedback mechanisms may control cell-to-cell variability further (Michailovici et al., 2014).
The study presented here is based on experiments carried out in PC12 cell lines (Greene and Tischler, 1976), which, unlike in vitro set-ups, provide the cell physiological context. The activity of up-stream and down-stream processes affecting ERK may depend on cell-type; this has, for example, been shown for nuclear shuttling, where even subtle differences between different cell lines can affect e.g. the activity of nuclear ERK (Harrington et al., 2012). Our deliberate focus on the core MEK–ERK dynamics is less prone to such strong cell-type specificity over the time-scales considered, whereas the potential of feedback from either ERK or any of its many down-stream targets onto the MAPK cascade or proteins further upstream should be carefully considered in different cell-types. The additional richness in behaviour that such feedback (Ortega et al., 2006; Sturm et al., 2010) or explicit consideration of nuclear shuttling (Harrington et al., 2013; Mugler et al., 2013) of ERK and MEK can induce warrants further investigation (Ozaki et al., 2010); here, over the time-course considered and in light of the data available, such effects are marginal, but this may change as other or longer stimuli, or more complex temporal stimulation patterns, are considered. At single cell level both feedback and shuttling (the latter especially if it induces multi-stability) are therefore clearly worthy of further investigation; there, however, we may also have to consider differences between cell-lines or cell types (Harrington et al., 2012). It is important to keep in mind that no model will ever be able to contain all the constituent parts of any biological system of any real-world relevance (Babtie et al., 2014). Therefore extrinsic noise (variation due to factors not explicitly included in the model) will always be an issue for modelling molecular and cellular systems.
The present work shows that this need not necessarily limit the usefulness or usability of mechanistic, mathematical models of biological systems. By pinpointing the sources of extrinsic noise, which are typically not obvious a priori, sound statistical modelling is able to provide deeper mechanistic insights and highlight where a model ought to be extended, or whether this is indeed necessary.
Experimental procedures
Experimental data collecting process
The concentrations of molecular species were measured using quantitative image cytometry (QIC) (Ozaki et al., 2010; Saito et al., 2013). PC12 cells were seeded at a density of 10⁴ cells per well in 96-well poly-L-lysine-coated glass-bottomed plates (Thermo Fisher Scientific, Pittsburgh, PA). 24 hours after seeding, the medium was replaced with DMEM containing 25 mM HEPES and 0.1 percent of bovine serum albumin. 18 hours after serum starvation, the stimulus was applied by replacing the starvation medium with a medium containing the stimulant (5, 0.5 or 0.1 ng/mL). Our setup carried out stimulation in an incubator and achieved 1-minute interval stimulation at 37°C under 5% CO2 in saturated air humidity. The cells were then fixed with 4 percent paraformaldehyde for 10 minutes and immunostained. Cells were subjected to QIC analysis with mouse anti-ppERK antibody (Sigma Aldrich M8159) and rabbit anti-pMEK antibody (Cell Signaling Technology 9121). Note that the anti-pMEK antibody detects both singly and doubly phosphorylated MEK.
All images were analyzed with Cell Profiler (Kamentsky et al., 2011). The nuclear region was identified based on Hoechst imaging, and the cellular region was identified based on CellMask stained images going out from the nuclear region. Total cellular signal intensities in nuclear regions and cellular regions were measured for ppERK and pMEK, respectively. We used the area of the cellular region in pixels as the cell size and the intensity of CellMask in the cellular region as a measure of cell volume.
Parameter inference and model evidence
We use a Bayesian approach in order to infer the parameters of the system (see Supplemental Information for a detailed list of the model parameters) and rank the candidate mechanistic models. Bayesian parameter inference is centred around the posterior probability distribution, p(θ|x*), which strikes a compromise between prior knowledge, p(θ), about parameter vectors, θ, and the capacity of a parameter to explain the observed data, x*, measured by the likelihood p(x*|θ), via
\[ p(\theta \mid x^*) = \frac{p(x^* \mid \theta)\, p(\theta)}{\int p(x^* \mid \theta')\, p(\theta')\, \mathrm{d}\theta'} . \]
Here we evaluate the posterior using the sequential Monte Carlo (SMC) sampler proposed by Del Moral et al. (2006), which is easily parallelized. The output of the algorithm is a set of weighted parameter vectors {θ^(i), ω^(i)}, 1 ≤ i ≤ N. Here the parameter vector associated with the highest weight is called the inferred parameter vector. Technical details about our implementation of the SMC sampler algorithm are given in the Supplemental Information.
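The SMC sampler algorithm also enables us to compute the model evidence (Kirk et al., 2013), which is the probability of observing the data x* under the model ℳ (given the alternative models considered),
\[ p(x^* \mid \mathcal{M}) = \int p(x^* \mid \theta, \mathcal{M})\, p(\theta \mid \mathcal{M})\, \mathrm{d}\theta . \]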
The model evidence allows us to rank candidate models in terms of their ability to explain the observed data x*: the best model is the one with the highest model evidence. In addition, the Bayes factor assesses the relative plausibility of two candidate models ℳ1 and ℳ2:
\[ \mathrm{BF}_{1,2} = \frac{p(x^* \mid \mathcal{M}_1)}{p(x^* \mid \mathcal{M}_2)} . \]
Whenever BF1,2 is larger than 30, the evidence in favour of model ℳ1 is considered very strong (Jeffreys, 1961). We use our own implementation of the SMC sampler algorithm in Python as well as an interface to simulate the models in a computationally efficient manner using a GPU accelerated ODE solver (Zhou et al., 2011) and a C++ ODE solver for stiff models (Hindmarsh et al., 2005).
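As an illustration of how model evidences are turned into a ranking and a Bayes factor, the sketch below works with log evidences, which is numerically safer than working with raw evidences. The function names and the numerical values are hypothetical and are not results from this study; the SMC sampler itself is not reproduced here.

```python
import math

def bayes_factor(log_evidence_1, log_evidence_2):
    """Bayes factor BF_{1,2} = p(x*|M1) / p(x*|M2), computed from log evidences."""
    return math.exp(log_evidence_1 - log_evidence_2)

def rank_models(log_evidences):
    """Sort model names from highest to lowest evidence."""
    return sorted(log_evidences, key=log_evidences.get, reverse=True)

# Hypothetical log evidences, for illustration only.
log_evidences = {"extrinsic": -1520.3, "intrinsic": -1610.8}
ranking = rank_models(log_evidences)
bf = bayes_factor(log_evidences["extrinsic"], log_evidences["intrinsic"])
very_strong = bf > 30  # threshold quoted in the text (Jeffreys, 1961)
```

For very large differences in log evidence it is usually preferable to report the difference itself instead of exponentiating it.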
Likelihood functions
At each time point t ∈ 𝕋 = {0, 2, 4, …, 50} the concentrations of pMEK and ppERK are measured in $N_t$ different cells. We denote by $x_t^{(i)}$ and $y_t^{(i)}$ the concentrations of the two proteins in the i-th cell, $1 \le i \le N_t$, and by $\bar{x}_t$ and $\bar{y}_t$ the observed average trajectories. In addition we denote by $x_t(\theta)$ and $y_t(\theta)$ the solution of the system of ODEs given the parameter vector θ at time t.
Assuming an independent Gaussian measurement error for each time point with constant variance v, the likelihood function for the average data measurements is
\[ \prod_{t \in \mathbb{T}} \Phi\big(\bar{x}_t;\, x_t(\theta),\, v\big)\, \Phi\big(\bar{y}_t;\, y_t(\theta),\, v\big), \]
where Φ(·; m, v) is the probability density function of a normal distribution of mean m and variance v. The variance v is inferred simultaneously with the other parameters.
In order to derive the likelihood function in the intrinsic noise model we use the linear noise approximation (LNA). The LNA provides a system of ODEs which describe how the means and the variances of the molecular species vary over time. These equations are produced using the StochSens package (Komorowski et al., 2012). With $m^x_t(\theta)$, $m^y_t(\theta)$, $s^x_t(\theta)$ and $s^y_t(\theta)$ denoting the solutions of the ODEs describing the means and variances for the parameter θ at time t, the likelihood is equal to
\[ \prod_{t \in \mathbb{T}} \prod_{i=1}^{N_t} \Phi\big(x_t^{(i)};\, m^x_t(\theta),\, s^x_t(\theta)\big)\, \Phi\big(y_t^{(i)};\, m^y_t(\theta),\, s^y_t(\theta)\big). \]
Extrinsic noise is modelled by considering that each cell has a different set of parameters. The distribution of each parameter across the cell population is assumed to be log-normal. We assume that these distributions are independent and denote by $\mu_\theta$ and $\sigma^2_\theta$ the vectors of the means and variances of these distributions, respectively. There is no closed-form expression for the probability $p(x^* \mid \mu_\theta, \sigma^2_\theta)$ and we use the so-called Unscented Transform (UT), which, given the first two moments $\mu_\theta$ and $\sigma^2_\theta$ of the distribution in the parameter space, provides an approximation of the evolution of the means and variances of the two species of interest. We denote by $\tilde{x}_t$ and $\tilde{y}_t$ the resulting mean behaviours of the two species at time t, and by $\tilde{v}^x_t$ and $\tilde{v}^y_t$ the associated variances. Assuming that the concentrations of the doubly phosphorylated ERK and MEK proteins are log-normally distributed we obtain that the likelihood is
\[ \prod_{t \in \mathbb{T}} \prod_{i=1}^{N_t} \Psi\big(x_t^{(i)};\, \tilde{x}_t,\, \tilde{v}^x_t\big)\, \Psi\big(y_t^{(i)};\, \tilde{y}_t,\, \tilde{v}^y_t\big). \]
Here Ψ(·; m, v) is the probability density function of a log–normal distribution with mean m and variance v. The Supplemental Information contains additional technical details on the computation and the UT algorithm.
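The idea behind the Unscented Transform can be sketched as follows: a small, deterministic set of sigma points is placed around the parameter mean, each point is pushed through the model, and the weighted outputs give an approximate output mean and variance. The code below is a generic textbook-style version for independent inputs and a scalar output, not the authors' implementation (which is described in the Supplemental Information); the function names, the `kappa` tuning parameter and the toy model are assumptions made for illustration.

```python
import math

def unscented_transform(f, means, variances, kappa=1.0):
    """Approximate the mean and variance of f(theta) for independent inputs.

    Because the inputs are assumed independent, the covariance matrix is
    diagonal and its matrix square root reduces to per-coordinate steps.
    """
    n = len(means)
    scale = n + kappa
    points = [list(means)]          # central sigma point
    weights = [kappa / scale]
    for i in range(n):
        step = math.sqrt(scale * variances[i])
        for sign in (1.0, -1.0):    # one point on each side of the mean
            p = list(means)
            p[i] += sign * step
            points.append(p)
            weights.append(0.5 / scale)
    outputs = [f(p) for p in points]
    mean = sum(w * y for w, y in zip(weights, outputs))
    var = sum(w * (y - mean) ** 2 for w, y in zip(weights, outputs))
    return mean, var

# Toy stand-in for "model output at time t"; the real model is an ODE solution.
def toy_output(theta):
    return 3.0 * theta[0] - 2.0 * theta[1] + 1.0

ut_mean, ut_var = unscented_transform(toy_output, [1.0, 2.0], [0.25, 0.5])
```

For a linear model such as `toy_output` the approximation is exact; for a nonlinear ODE solution it only approximates the first two moments, and sigma points may need to be placed in log-space to keep positive parameters positive.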
Latin Hypercube sampling
We use Latin Hypercube sampling (LHS) (McKay et al., 1979) to generate 10⁶ parameter vectors in a 20-dimensional space using the Matlab function lhsdesign.
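For readers without access to Matlab, the basic construction behind Latin Hypercube sampling can be sketched in a few lines of Python. This is a plain stand-in for illustration and not a reimplementation of lhsdesign; the function names, the log-uniform scaling and the bounds are assumptions, since the parameter ranges used in the study are not stated in this section.

```python
import random

def latin_hypercube(n_samples, n_dims, seed=0):
    """Return n_samples points in [0, 1)^n_dims with exactly one point per
    stratum along every dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of the n_samples equal-width strata.
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(col)  # decouple the strata across dimensions
        columns.append(col)
    return [[columns[d][i] for d in range(n_dims)] for i in range(n_samples)]

def scale_log_uniform(point, lower, upper):
    """Map a unit-cube point onto log-uniform ranges [lower_d, upper_d]."""
    return [lo * (hi / lo) ** u for u, lo, hi in zip(point, lower, upper)]

# Small illustration: 1000 vectors in 20 dimensions (the study used 10^6).
design = latin_hypercube(1000, 20)
# Hypothetical bounds for illustration only.
theta = scale_log_uniform(design[0], [1e-3] * 20, [1e3] * 20)
```

Each row of `design` is one parameter vector in the unit cube; the stratification guarantees that every one-dimensional range is covered evenly even with a modest number of samples.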
Correlation analysis
In addition to the experimental measurements for the total amount of doubly phosphorylated ERK and MEK our assay also obtained measurements for cell-size, cell volume and Hoechst intensity in each cell. We computed the correlation and partial correlations between these 5 measurements using the R package GeneNet (Schäfer et al., 2001).
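The difference between a correlation and a partial correlation can be illustrated with the first-order formula that conditions on a single third variable. This sketch is not the GeneNet procedure used in the study, which conditions on all remaining variables; the function names and the synthetic data are assumptions for illustration.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Synthetic example: two readouts that are related only through a shared
# confounder (a hypothetical stand-in for a quantity such as cell size).
rng = random.Random(1)
size = [rng.gauss(0.0, 1.0) for _ in range(5000)]
a = [s + rng.gauss(0.0, 1.0) for s in size]
b = [s + rng.gauss(0.0, 1.0) for s in size]
r_ab = pearson(a, b)
r_ab_given_size = partial_correlation(a, b, size)
```

In this synthetic case the plain correlation is substantial while the partial correlation is close to zero, which is the signature of an association explained by the confounder. A partial correlation that stays high after conditioning, as reported for ppERK and pMEK, points the other way.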
Mutual information
The mutual information between two species (ppERK and ppMEK) is computed based on measurements of the protein concentrations in single cells at different time points. For each time point, we estimate the mutual information using a kernel density estimate of the joint distribution. We use a Gaussian kernel with a diagonal covariance matrix and marginal bandwidths equal to 1.06 σ N^(−1/5), where σ is the marginal standard deviation of the data and N is the number of data points (Silverman, 1986).
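A minimal sketch of such an estimator is given below. It uses a product Gaussian kernel, treats the rule-of-thumb value as the kernel standard deviation in each dimension (an assumption about how the bandwidth enters), and averages the log density ratio over the observed points (a resubstitution estimate, also an assumption). Function names and the synthetic data are hypothetical.

```python
import math
import random

def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth 1.06 * sd * N^(-1/5) (Silverman, 1986)."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in data) / (n - 1))
    return 1.06 * sd * n ** (-0.2)

def gauss(u, h):
    """Gaussian kernel with standard deviation h, evaluated at offset u."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))

def mutual_information(x, y):
    """Resubstitution estimate (in nats) of I(X;Y) from a product-kernel KDE."""
    n = len(x)
    hx, hy = silverman_bandwidth(x), silverman_bandwidth(y)
    total = 0.0
    for xi, yi in zip(x, y):
        kx = [gauss(xi - xj, hx) for xj in x]
        ky = [gauss(yi - yj, hy) for yj in y]
        f_xy = sum(p * q for p, q in zip(kx, ky)) / n  # joint density
        f_x = sum(kx) / n                              # marginal densities
        f_y = sum(ky) / n
        total += math.log(f_xy / (f_x * f_y))
    return total / n

# Synthetic stand-ins for single-cell measurements at one time point.
rng = random.Random(0)
mek = [rng.gauss(0.0, 1.0) for _ in range(300)]
erk_dependent = [m + rng.gauss(0.0, 0.5) for m in mek]
erk_independent = [rng.gauss(0.0, 1.0) for _ in range(300)]
mi_dep = mutual_information(mek, erk_dependent)
mi_indep = mutual_information(mek, erk_independent)
```

The cost is quadratic in the number of cells, so for thousands of cells per time point a vectorised or grid-based evaluation would be preferable. Kernel smoothing also biases the estimate, so values are best compared between conditions and not read as absolute quantities.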
Author contributions
SF, CPB, SK and MPHS designed the study; TK, TT and SK performed the data collection and initial processing; SF, CPB, PK performed modelling and statistical analysis; SF, CPB, PK, SK and MPHS wrote the paper; all authors approved the final version of the manuscript.
Acknowledgements
PK, TK, SK and MPHS acknowledge financial support from the Human Frontiers Science Programme; SF is funded through an MRC Biocomputing fellowship; CPB is a Wellcome Trust Career Development Fellow; SF, CPB, PK, SK and MPHS were also funded through a JST/BBSRC partnering award. MPHS is a Royal Society Wolfson Research Merit Award holder.