Abstract
Gene expression is an inherently stochastic process that depends on the structure of the biochemical regulatory network in which the gene is embedded. Here we study the interplay between stochastic gene switching and the widespread negative feedback regulatory loop. Using a simple hybrid model, we find that stochasticity in gene switching by itself can induce pulses in the system. Furthermore, we find that our simple model is able to reproduce both exponential and peaked distributions of gene active and inactive times similar to those that have been observed experimentally. Our hybrid modelling approach also allows us to link these patterns to the dynamics of the system for each gene state.
Cells need to respond appropriately to external stimuli, which requires producing the right proteins following different temporal patterns. This is achieved through biochemical networks in which a stimulus triggers a cascade of reactions that eventually leads to the activation of transcription factors, proteins that activate or repress the expression of specific gene sets. Thus, the temporal regulation of gene activity is determined by the structure of the network in which the gene is embedded [1]. A common regulatory structure is the negative feedback loop, in which a transcription factor activates the production of a protein that contributes to its own inhibition. This motif regulates the activity of important transcription factors such as NF-κB [2] and p53 [3] and has been shown to give rise to pulses in the concentration of the proteins of the network (see e.g. [4, 5]), as predicted by mathematical models [6]. The role of these pulsed dynamics is not fully understood, though: theoretical studies suggest that they can give rise to more reliable protein production [7], while experiments show that oscillatory dynamics can determine cell fate [8].
On the other hand, gene expression is an intrinsically noisy process [9], and models in which the gene switches randomly between active and inactive states fit experimental data well [10]. By contrast, some genes show peaked distributions of active/inactive times [11–13] that deviate from the exponential distributions obtained with simple random models. These distributions can be obtained using multi-step models mirroring the multiple steps of gene activation [14], but they could also arise from the interplay between the stochastic gene activity and the structure of the regulatory network in which the gene is embedded. Some insight into this interplay has been gained by showing the emergence of oscillations when a gene is an autorepressor [15], the noise-enhanced persistence of biochemical species [16], and the dephasing of genetic oscillators by stochasticity [17]. A major obstacle in this context is the difficulty of obtaining analytical insight into the nonlinear stochastic systems involved (which often model species with very low copy numbers). For this reason, we are far from having a complete picture of the type of regulation that emerges from such interaction.
In this Letter we describe the dynamics emerging from the interaction of stochastic gene switching and the widespread negative feedback in a simple network using a hybrid modelling approach, in which only the gene activity is modelled as a stochastic process. Our hybrid simulations show that stochastic gene switching is responsible for most of the dynamical variability and induces pulsed dynamics in the system, even though the deterministic model predicts a steady state. We show that even in this simple biochemical network distinct dynamical patterns of gene activity can arise, and that hybrid modelling allows us to gain analytical insight into their origin. We discuss the implications of our results at the end of this Letter.
The model considered here is shown in Fig. 1(a). This model adds a layer of regulation to the one proposed in Ref. [7] and is a simplified version of the biochemical network of NF-κB [18]. It is formed by a gene that can be active (G) or inactive and an activator A that can activate the gene, similarly to NF-κB [18]. When the gene is active, the inhibitor protein I is produced and provides the negative feedback both by contributing to the gene's inactivation and by forming a complex with A that can no longer activate the gene. In what follows we use the same letters both for the names of the biochemical species and for their copy numbers. For the sake of simplicity we consider a single copy of the gene, so that G is either 0 or 1, and we assume that the total amount of A (free and bound to I) is constant and equal to Atot, as for NF-κB [18]. Finally, we assume that the inhibitor undergoes degradation in both its free and complexed forms.
In Fig. 1(b) we show a stochastic trajectory of the system obtained using the Gillespie algorithm [19]. It exhibits pulses of the free activator A and of the other variables, as observed in different networks with negative feedbacks [6]. The parameters used were obtained from Ref. [18] and adjusted to obtain O(10^4) copies of each protein and an inhibitor half-life and activator inter-peak timing of the order of one hour, the typical timescale of these biological oscillators [6]. Pulses are spiky, as in models of oscillations of NF-κB [6], although we have found that in stochastic simulations of slightly more complex biochemical networks (e.g. [18]) a wider variety of pulses arises.
Another traditional modelling approach for biochemical networks is to use ordinary differential equations derived from mass-action kinetics [20], but this is inadequate when species with low copy numbers are present (in our case, G). High copy numbers are also a necessary condition for approximating the dynamics of the system with a Langevin equation, although exact results can only be obtained for linear biochemical networks [21]. For these reasons, hybrid simulations [22], in which some of the reactions are modelled by a Langevin equation [23] while the rest are modelled as stochastic processes, are increasingly popular for this kind of system. This procedure gives simulations that mimic the behaviour of the fully stochastic system [22] while significantly reducing the computation time.
Inspired by these algorithms we study our system through a simplified hybrid model in which the only species modelled stochastically is the gene state G. This type of modelling has already been used to simulate complex models of cell signalling, see e.g. [24–26]. Most importantly, it allows us to isolate and identify precisely the role of stochastic gene activity in the dynamics.
Using our hybrid approach we model the evolution of A and I as:
This nonlinear dynamical system is driven by the stochastic process of gene switching: where the switching rates will depend on time through the variables A and I. Thus, for the gene state G we can write down the chemical master equations:
Numerical simulations of this hybrid model are performed using a deterministic integrator for Eqs. 1 and 2 and switching the value of G between G = 0 and G = 1 following the Gillespie algorithm, as prescribed in Ref. [27]. For a fully deterministic simulation of the model using mass-action kinetics it would be enough to add to Eqs. 1 and 2 the following equation, which at equilibrium leads to a Michaelis-Menten-like equation for G [1].
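The hybrid scheme described above can be illustrated with a minimal Python sketch. It is not the authors' code: the rate laws for A and I, the switching rates (`k_on * A` for activation, `k_off * I` for inactivation), every parameter value and the function name `simulate_hybrid` are hypothetical placeholders chosen only to make the example run. Because the switching rates depend on time through A and I, the sketch integrates the current rate along the deterministic trajectory and switches G when that integral exceeds an exponentially distributed threshold, which is one common way of handling time-dependent rates and is not necessarily the procedure of Ref. [27].

```python
import math
import random


def simulate_hybrid(t_end, dt=0.01, seed=0, a_tot=1.0, k_bind=5.0, k_deg_c=1.0,
                    k_prod=2.0, k_deg_i=1.0, k_on=2.0, k_off=2.0):
    """Hybrid simulation: A and I follow ODEs, the gene state G jumps at random.

    All rate laws and parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    a, i, g, t = a_tot, 0.0, 0, 0.0
    hazard = 0.0                                  # integrated switching rate since last jump
    threshold = -math.log(1.0 - rng.random())     # unit-rate exponential variate
    times, a_traj, g_traj = [t], [a], [g]
    while t < t_end:
        # Deterministic part (forward Euler) for the current gene state.
        da = -k_bind * a * i + k_deg_c * (a_tot - a)    # assumed: complex decay frees A
        di = k_prod * g - k_bind * a * i - k_deg_i * i  # assumed: I is made only if G = 1
        a = min(max(a + dt * da, 0.0), a_tot)
        i = max(i + dt * di, 0.0)
        t += dt
        # Stochastic part: accumulate the time-dependent switching rate.
        rate = k_off * i if g == 1 else k_on * a
        hazard += rate * dt
        if hazard >= threshold:
            g = 1 - g                                   # the gene switches state
            hazard = 0.0
            threshold = -math.log(1.0 - rng.random())
        times.append(t)
        a_traj.append(a)
        g_traj.append(g)
    return times, a_traj, g_traj
```

Calling `simulate_hybrid(50.0)` returns the time grid, the free-activator trajectory and the gene-state trajectory. Freezing `g` in the same equations gives the deterministic flow for each gene state.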
In Figure 2(a) we show the evolution in time of the free activator A obtained for fully stochastic, hybrid and deterministic simulations. We can observe that the pulses obtained for the fully stochastic simulations and the simplified hybrid simulations are very similar. Interestingly, we observe that the deterministic simulations lead to the convergence of the system to a steady state. From this we conclude that stochastic gene switching by itself can induce pulses in the network. This is another example of how stochasticity can induce pulses in contexts where deterministic models predict steady states, as observed in models of population dynamics [28] and excitable systems [29].
Simulations show that stochastic gene activity is responsible for most of the variability of the system. In Fig. 2(b) we show the distributions of A, represented by their probability density ρ, for stochastic and hybrid simulations, which are nearly indistinguishable. In addition, by using a simple peak detection algorithm we can measure the time T between two consecutive peaks and the peak amplitude Apeak. For these calculations we consider only peaks of at least 10% of Atot, the order of magnitude that can be detected in experiments on activator dynamics such as those of NF-κB [5]. The distributions of these quantities are shown in Fig. 2(c) and (d) respectively, and are again nearly indistinguishable: this confirms the crucial role of stochastic gene activity in the dynamic variability of the system and the ability of the hybrid model to mimic the fully stochastic simulations (in drastically shorter computation times).
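As an illustration of the peak analysis described above, the following Python sketch extracts inter-peak times T and amplitudes Apeak from a sampled trajectory. The text only says that a simple peak detection algorithm was used, so the local-maximum rule below and the function names `find_peaks` and `peak_statistics` are assumptions. Only the 10% of Atot threshold comes from the text.

```python
def find_peaks(times, values, min_height):
    """Return (time, value) for each local maximum of at least min_height.

    A sample counts as a peak if it is strictly above its left neighbour and
    not below its right neighbour. Noisy data may need smoothing first.
    """
    peaks = []
    for k in range(1, len(values) - 1):
        if values[k] >= min_height and values[k - 1] < values[k] >= values[k + 1]:
            peaks.append((times[k], values[k]))
    return peaks


def peak_statistics(times, values, a_tot, min_fraction=0.1):
    """Inter-peak times T and amplitudes A_peak for peaks above min_fraction * a_tot."""
    peaks = find_peaks(times, values, min_fraction * a_tot)
    intervals = [t2 - t1 for (t1, _), (t2, _) in zip(peaks, peaks[1:])]
    amplitudes = [v for _, v in peaks]
    return intervals, amplitudes
```

For example, on the toy series `[0.0, 0.5, 0.0, 0.05, 0.0, 0.8, 0.0]` sampled at times 0 to 6 with `a_tot=1.0`, the small bump of height 0.05 is discarded and the two remaining peaks are separated by 4 time units.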
Hybrid modelling also allows us to understand the pulsed dynamics of our network in terms of the nullclines of the system given by Eqs. 1 and 2. There are three such nullclines: the one obtained by setting dA/dt = 0, I = fA(A), and the two obtained by setting dI/dt = 0 for G = 0 and G = 1, denoted I = fI,0(A) and I = fI,1(A) respectively. The nullclines and a trajectory for our hybrid model are depicted in Fig. 3. It is easy to see that, irrespective of the parameter values, I = fA(A) intersects I = fI,0(A) in exactly one point and I = fI,1(A) in exactly one point. Furthermore, an analysis of the direction of the flow for each gene state shows that these two fixed points are necessarily stable. Thus, for this simple biochemical network the pulses can be understood as a series of jumps between the fixed points obtained for G = 0 and for G = 1.
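The fixed-point picture can be checked numerically on a toy system. The sketch below assumes hypothetical rate laws, dA/dt = -k_bind·A·I + k_deg_c·(Atot - A) and dI/dt = k_prod·G - k_bind·A·I - k_deg_i·I, with placeholder parameter values; these are not the Letter's Eqs. 1 and 2. For a frozen gene state it computes the single intersection of the two nullclines and the eigenvalues of the Jacobian there.

```python
import math


def fixed_point(g, a_tot=1.0, k_bind=5.0, k_deg_c=1.0, k_prod=2.0, k_deg_i=1.0):
    """Fixed point (A*, I*) of the assumed ODEs for a frozen gene state g."""
    # Setting dA/dt = 0 and dI/dt = 0 and eliminating I gives q2*A^2 + q1*A + q0 = 0.
    q2 = -k_deg_c * k_bind
    q1 = k_deg_c * (k_bind * a_tot - k_deg_i) - k_prod * g * k_bind
    q0 = k_deg_c * k_deg_i * a_tot
    # q2 < 0 and q0 > 0, so exactly one root is positive; this is that root.
    a_star = (-q1 - math.sqrt(q1 * q1 - 4.0 * q2 * q0)) / (2.0 * q2)
    i_star = k_prod * g / (k_bind * a_star + k_deg_i)
    return a_star, i_star


def eigenvalues(a_star, i_star, k_bind=5.0, k_deg_c=1.0, k_deg_i=1.0):
    """Eigenvalues (fast, slow) of the Jacobian of the assumed ODEs at a fixed point."""
    j11 = -k_bind * i_star - k_deg_c
    j12 = -k_bind * a_star
    j21 = -k_bind * i_star
    j22 = -k_bind * a_star - k_deg_i
    trace = j11 + j22
    det = j11 * j22 - j12 * j21
    root = math.sqrt(trace * trace - 4.0 * det)  # real here because j12 * j21 >= 0
    return (trace - root) / 2.0, (trace + root) / 2.0
```

In this toy system both eigenvalues are real and negative for G = 0 and for G = 1, so each gene state has one stable fixed point, in line with the argument above. The eigenvalue of larger magnitude plays the role of the fast rate λfast.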
Experimental studies in which gene expression is monitored in real time have shown that gene active and inactive times can either be exponentially distributed or be described by peaked Gamma-like distributions [11–13]. These distributions are important signatures of the underlying stochastic process that drives gene activity. To explore the distributions of active and inactive times that our simple model can generate, we study the dynamics of the system in parameter space by randomly varying each of the parameters within one order of magnitude of the values used in our previous simulations. For each set of parameter values, we simulated the dynamics and obtained the histograms of the active and inactive times (ton and toff, see Fig. 1(b)). We grouped them in ten clusters according to their coefficient of variation (CV), i.e. the standard deviation divided by the mean. We found that ton and toff can be distributed following quite distinct patterns: we observed distributions with shapes that range from an exponential-like shape, with a global maximum at zero and high CV, to a Gamma-like shape, with a global maximum close to the mean and low CV (see Fig. 4(a) and (b)). Our simple system is thus able to recapitulate the experimentally observed distributions. In particular, our calculations show that a negative feedback can give rise to peaked distributions and can thus be an alternative to the multi-step process models proposed to explain the distributions observed in experiments [11, 13].
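To see why the CV separates the two kinds of shape, the following Python sketch compares synthetic samples. It is only an illustration: the samples are drawn directly from an exponential and from a Gamma distribution rather than from simulations of the model, and the shape parameter 9 and the sample size are arbitrary choices. An exponential distribution has CV = 1, while a Gamma distribution with shape k has CV = 1/√k.

```python
import random
import statistics


def coefficient_of_variation(samples):
    """Standard deviation divided by the mean."""
    return statistics.pstdev(samples) / statistics.mean(samples)


rng = random.Random(1)
n_samples = 20000

# Exponential-like dwell times: maximum of the density at zero.
exp_times = [rng.expovariate(1.0) for _ in range(n_samples)]

# Gamma-like dwell times with shape 9 and mean 1: density peaked near the mean.
gamma_times = [rng.gammavariate(9.0, 1.0 / 9.0) for _ in range(n_samples)]

cv_exp = coefficient_of_variation(exp_times)      # close to 1
cv_gamma = coefficient_of_variation(gamma_times)  # close to 1/3
```

The same function can be applied to the lists of ton and toff values extracted from a simulated gene-state trajectory.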
Using the hybrid modelling we can investigate the origin of these patterns. For the sake of simplicity we focus on the distribution of the active times (ton). For our system, if at time t = 0 the system is in the state G, the probability of remaining in the state G can be expressed in terms of the conditional probability PG(t|V0) given the initial condition V0 = (A0, I0) and the probability of finding the system at V0 at the initial gene transition as: where the conditional probability is:
Notice that if the overall switching rate is constant (koffI(t, V0) = constant), as in the random switching model [10], an exponential probability distribution for the gene active time will be recovered. Instead, we find that in some cases the probability distribution is non-exponential. In this situation we would expect that the conditional distributions PG(t|V0) that contribute the most to the integral in Eq. 6 should have a relative maximum at tmax ≠ 0. Such a relative maximum would satisfy the following equation,
Solving this equation requires the explicit form of I(t, V0). However we know that and as t → ∞. Hence, Eq. 8 is the equation of the intersection of a monotonically decreasing function (trajectories are always below the nullcline I = fI,1(A) and , see Fig. 3) and a monotonically increasing function koff · (I(t,V0))^2 (because the amount of inhibitor grows when G = 1, see again Fig. 3) at the point t = tmax. Our previous nullcline analysis shows that is a stable fixed point with negative eigenvalues with absolute values λfast > λslow. Considering all this, we can roughly approximate where cfast > 0, cslow > 0 and cfast + cslow = 1. For short times (compared with 1/λslow) the derivative scales with λfast and hence the crossing determined by Eq. 8 will only be possible if λfast is sufficiently large (and koff is sufficiently small). It is easy to show that the same argument leads to an equivalent result for the distribution of the inactive times (toff), where λfast is the corresponding fast eigenvalue at the fixed point with G = 0.
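The condition for a relative maximum discussed above can be sketched as follows. This is a reconstruction under one stated assumption, namely that while the gene is active it switches off with rate koff · I(t, V0), which is consistent with the constant-rate remark and with the increasing function koff · (I(t,V0))^2 mentioned in the text. The symbol p_on for the density of active times is introduced here for illustration only.

```latex
% Assumption: inactivation rate of the active gene is k_off * I(t, V_0).
\begin{align}
  P_G(t \mid V_0) &= \exp\!\left(-k_{\mathrm{off}} \int_0^t I(s, V_0)\,\mathrm{d}s\right), \\
  p_{\mathrm{on}}(t \mid V_0) &= -\frac{\mathrm{d}P_G}{\mathrm{d}t}
      = k_{\mathrm{off}}\, I(t, V_0)\, P_G(t \mid V_0), \\
  \frac{\mathrm{d}p_{\mathrm{on}}}{\mathrm{d}t}
     &= k_{\mathrm{off}} \left[ \frac{\mathrm{d}I}{\mathrm{d}t}
        - k_{\mathrm{off}}\, I^2 \right] P_G(t \mid V_0) = 0
  \;\Longrightarrow\;
  \left.\frac{\mathrm{d}I}{\mathrm{d}t}\right|_{t_{\max}}
     = k_{\mathrm{off}} \bigl(I(t_{\max}, V_0)\bigr)^2 .
\end{align}
```

Under this assumption a constant I gives dI/dt = 0 < koff · I^2 for all t, so the density decreases monotonically from t = 0 and is exponential. A maximum at tmax > 0 requires the inhibitor to grow fast enough at early times for dI/dt to exceed koff · I^2 before the two curves cross.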
We provide a numerical confirmation of the validity of this argument in Figs. 4(c) and (d), where we show that the values of λfast for the corresponding fixed points (for G = 0 and G = 1) are able to discriminate between the two different gene activity patterns, since the larger the eigenvalue is, the more likely it is that toff (and ton) are exponentially distributed. Thus, the use of hybrid modelling allows us to gain insight into the origin of the observed patterns of gene activity.
Pulsed dynamics is widespread in genetic circuits [30]. Our simple model shows that the interplay of negative feedbacks and stochastic gene switching gives rise to pulsed dynamics even if the fully deterministic simulations predict convergence to a steady state. Furthermore, we have found that in spite of the simplicity of the dynamics arising, our network can display different gene activity patterns. The negative feedback plays a key role in this fine temporal control: without it, the dynamics of gene activation would be purely random. Our results imply that in experiments in which the gene activity patterns are found to be peaked [11–13] a negative feedback loop might be at work. We think that the increasing availability of experimental data will make it possible to delineate the contributions to the gene activity dynamics of both the multi-step sequential stochastic process of gene activation and the constraints imposed by the structure of the regulatory biochemical network. As in our work, the use of hybrid modelling can help to provide further analytical insights.
SZ has been partially supported by the Intra-European Fellowships for career development-2011-298447NonLinKB. NM has been fully supported by the Chancellor’s Fellowships granted by the University of Edinburgh.
Footnotes
* zambrano.samuel{at}hsr.it
† nacho.molina{at}ed.ac.uk