TY  - JOUR
T1  - Managing Uncertainty in Metabolic Network Structure and Improving Predictions Using EnsembleFBA
JF  - bioRxiv
DO  - 10.1101/077636
SP  - 077636
AU  - Matthew B. Biggs
AU  - Jason A. Papin
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/09/26/077636.abstract
N2  - Genome-scale metabolic network reconstructions (GENREs) are repositories of knowledge about the metabolic processes that occur in an organism. GENREs have been used to discover and interpret metabolic functions, and to engineer novel network structures. A major barrier preventing more widespread use of GENREs, particularly to study non-model organisms, is the extensive time required to produce a high-quality GENRE. Many automated approaches have been developed which reduce this time requirement, but automatically-reconstructed draft GENREs still require curation before useful predictions can be made. We present a novel ensemble approach to the analysis of GENREs which improves the predictive capabilities of draft GENREs and is compatible with many automated reconstruction approaches. We refer to this new approach as Ensemble Flux Balance Analysis (EnsembleFBA). We validate EnsembleFBA by predicting growth and gene essentiality in the model organism Pseudomonas aeruginosa UCBPP-PA14. We demonstrate how EnsembleFBA can be included in a systems biology workflow by predicting essential genes in six Streptococcus species and mapping the essential genes to small molecule ligands from DrugBank. We found that some metabolic subsystems contribute disproportionately to the set of predicted essential reactions in a way that is unique to each Streptococcus species. These species-specific network structures lead to species-specific outcomes from small molecule interactions. Through these analyses of P. aeruginosa and six Streptococci, we show that ensembles increase the quality of predictions without drastically increasing reconstruction time, thus making GENRE approaches more practical for applications which require predictions for many non-model organisms. All of our functions and accompanying example code are available in an open online repository. Author Summary: Metabolism is the driving force behind all biological activity. Genome-scale metabolic network reconstructions (GENREs) are representations of metabolic systems that can be analyzed mathematically to make predictions about how a biochemical system will behave as well as to design biochemical systems with new properties. GENREs have traditionally been reconstructed manually, which can require extensive time and effort. Recent software solutions automate the process (drastically reducing the required effort) but the resulting GENREs are of lower quality and produce less reliable predictions than the manually-curated versions. We present a novel method (“EnsembleFBA”) which overcomes uncertainties involved in automated reconstruction by pooling many different draft GENREs together into an ensemble. We tested EnsembleFBA by predicting the growth and essential genes of the common pathogen Pseudomonas aeruginosa. We found that when predicting growth or essential genes, ensembles of GENREs achieved much better precision or captured many more essential genes than any of the individual GENREs within the ensemble. By improving the predictions that can be made with automatically-generated GENREs, we open the door to studying systems which would otherwise be infeasible.
ER  - 