RT Journal Article
SR Electronic
T1 Inferring extrinsic noise from single-cell gene expression data using Approximate Bayesian Computation
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 030155
DO 10.1101/030155
A1 Oleg Lenive
A1 Paul DW Kirk
A1 Michael PH Stumpf
YR 2015
UL http://biorxiv.org/content/early/2015/10/29/030155.abstract
AB Background Gene expression is known to be an intrinsically stochastic process which can involve single-digit numbers of mRNA molecules in a cell at any given time. The modelling of such processes calls for the use of exact stochastic simulation methods, most notably the Gillespie algorithm. However, this stochasticity, also termed “intrinsic noise”, does not account for all the variability between genetically identical cells growing in a homogeneous environment. Despite substantial experimental efforts, determining appropriate model parameters continues to be a challenge. Methods based on approximate Bayesian computation can be used to obtain posterior parameter distributions given the observed data. However, such inference procedures require large numbers of simulations of the model and exact stochastic simulation is computationally costly. In this work we focus on the specific case of trying to infer model parameters describing reaction rates and extrinsic noise on the basis of measurements of molecule numbers in individual cells at a given time point. Results To make the problem computationally tractable we develop an exact, model-specific, stochastic simulation algorithm for the commonly used two-state model of gene expression. This algorithm relies on certain assumptions and favourable properties of the model to forgo the simulation of the whole temporal trajectory of protein numbers in the system, instead returning only the number of protein and mRNA molecules present in the system at a specified time point. The computational gain is proportional to the number of protein molecules created in the system and becomes significant for systems involving hundreds or thousands of protein molecules. We employ this algorithm, approximate Bayesian computation, and published gene expression data for Escherichia coli to simultaneously infer the model’s rate parameters and parameters describing extrinsic noise for 86 genes.
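
The abstract combines three ingredients: stochastic simulation of the two-state model, cell-to-cell parameter variation (extrinsic noise), and approximate Bayesian computation. The sketch below illustrates how they fit together in their simplest form. It is not the authors' model-specific algorithm: it uses the generic Gillespie algorithm and simulates the whole trajectory, which is exactly the cost the paper's method avoids. The function names, the log-normal form of the extrinsic noise, the uniform prior on the transcription rate, the fixed rate values, and the choice of mean mRNA count as summary statistic are all assumptions made for illustration.

```python
import math
import random


def simulate_two_state(k_on, k_off, k_m, d_m, k_p, d_p, t_end, rng):
    """Generic Gillespie simulation of a two-state promoter model.

    Returns (mRNA, protein) counts at time t_end, starting from an
    inactive promoter and no molecules.
    """
    t = 0.0
    active, m, p = 0, 0, 0
    while True:
        rates = [
            k_on * (1 - active),  # promoter switches on
            k_off * active,       # promoter switches off
            k_m * active,         # transcription
            d_m * m,              # mRNA degradation
            k_p * m,              # translation
            d_p * p,              # protein degradation
        ]
        total = sum(rates)
        if total == 0.0:
            break  # no reaction can fire any more
        t += rng.expovariate(total)  # waiting time to the next reaction
        if t > t_end:
            break
        # Choose which reaction fires, with probability proportional to its rate.
        reaction = rng.choices(range(6), weights=rates)[0]
        if reaction == 0:
            active = 1
        elif reaction == 1:
            active = 0
        elif reaction == 2:
            m += 1
        elif reaction == 3:
            m -= 1
        elif reaction == 4:
            p += 1
        else:
            p -= 1
    return m, p


def simulate_population(params, noise_sd, n_cells, t_end, rng):
    """Simulate n_cells cells, each with its own perturbed rate parameters.

    Assumption: extrinsic noise is a log-normal factor applied to every rate.
    """
    cells = []
    for _ in range(n_cells):
        cell_params = [v * math.exp(rng.gauss(0.0, noise_sd)) for v in params]
        cells.append(simulate_two_state(*cell_params, t_end, rng))
    return cells


def abc_rejection(observed_mean_mrna, n_draws, epsilon, rng):
    """ABC rejection sampling for the transcription rate k_m only.

    Hypothetical setup: uniform prior on k_m, all other rates held fixed,
    and the mean mRNA count per cell as the only summary statistic.
    """
    accepted = []
    for _ in range(n_draws):
        k_m = rng.uniform(0.1, 5.0)  # draw a candidate from the prior
        params = (0.5, 0.5, k_m, 1.0, 2.0, 0.5)
        cells = simulate_population(params, 0.2, 50, 5.0, rng)
        mean_mrna = sum(m for m, _ in cells) / len(cells)
        # Keep the candidate if the simulated summary is close to the data.
        if abs(mean_mrna - observed_mean_mrna) < epsilon:
            accepted.append(k_m)
    return accepted
```

Calling `abc_rejection(1.0, 200, 0.2, random.Random(0))` returns the accepted `k_m` values, which form a crude approximation to the posterior under these assumptions. Every draw requires a full population of simulations, which is why the cost of each individual simulation matters so much for this kind of inference.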