TY - JOUR
T1 - Unsupervised extraction of functional gene expression signatures in the bacterial pathogen Pseudomonas aeruginosa with eADAGE
JF - bioRxiv
DO - 10.1101/078659
SP - 078659
AU - Jie Tan
AU - Georgia Doing
AU - Kimberley A. Lewis
AU - Courtney E. Price
AU - Kathleen M. Chen
AU - Kyle C. Cady
AU - Barret Perchuk
AU - Michael T. Laub
AU - Deborah A. Hogan
AU - Casey S. Greene
Y1 - 2016/01/01
UR - http://biorxiv.org/content/early/2016/12/02/078659.abstract
N2 - While the large sets of publicly available gene expression data contain substantial information about relationships between mRNA expression profiles and genetic background, environment, and cellular state, cross-experiment comparisons of public data are challenged by technical noise that masks biological signals. We previously showed that one could reveal biological signatures within compendia of expression data using an unsupervised neural network algorithm, called ADAGE, which excels at detecting patterns in noisy datasets. Here, we show that the generation and integration of multiple ADAGE models, resulting in an ensemble ADAGE (eADAGE), better identified biological pathways. For the bacterium Pseudomonas aeruginosa, our analysis found that on the order of 1000 samples were needed to build pathway-level gene expression signatures. The P. aeruginosa gene expression compendium contains experiments performed in 78 different media, and we used eADAGE to identify expression signatures associated with medium type across multiple experiments. We identified a subset of media, including several complex media that were not designed to limit phosphate, in which P. aeruginosa exhibited a phosphate starvation response controlled by PhoB.
Furthermore, while PhoB was expected to activate the phosphate starvation response in low phosphate, our analyses found that PhoB was also active in moderate phosphate concentrations and predicted that this activity required a second stimulus provided by a sensor kinase, KinB. Subsequent experiments validated this prediction, and a screen of a histidine kinase knockout collection confirmed the specificity of KinB's role in the activation of the Pho regulon. Algorithms that extract biological signal from large collections of public gene expression data, such as eADAGE, can highlight opportunities to discover, from public data, mechanisms that are currently unrecognized.
ER -