RT Journal Article
SR Electronic
T1 Unsupervised extraction of functional gene expression signatures in the bacterial pathogen Pseudomonas aeruginosa with eADAGE
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 078659
DO 10.1101/078659
A1 Jie Tan
A1 Georgia Doing
A1 Kimberley A. Lewis
A1 Courtney E. Price
A1 Kathleen M. Chen
A1 Kyle C. Cady
A1 Barret Perchuk
A1 Michael T. Laub
A1 Deborah A. Hogan
A1 Casey S. Greene
YR 2016
UL http://biorxiv.org/content/early/2016/12/02/078659.abstract
AB While the large sets of publicly available gene expression data contain substantial information about relationships between mRNA expression profiles and genetic background, environment, and cellular state, cross-experiment comparisons of public data are challenged by technical noise that masks biological signals. We previously showed that one could reveal biological signatures within compendia of expression data using an unsupervised neural network algorithm, called ADAGE, which excels at detecting patterns in noisy datasets. Here, we show that the generation and integration of multiple ADAGE models, resulting in an ensemble ADAGE (eADAGE), better identified biological pathways. For the bacterium Pseudomonas aeruginosa, our analysis found that on the order of 1000 samples were needed to build pathway-level gene expression signatures. The P. aeruginosa gene expression compendium contains experiments performed in 78 different media, and we used eADAGE to identify expression signatures associated with medium type across multiple experiments. We identified a subset of media, including several complex media that were not designed to limit phosphate, in which P. aeruginosa exhibited a phosphate starvation response controlled by PhoB. 
Furthermore, while it was expected that PhoB activates the phosphate starvation response in low phosphate, our analyses found that PhoB was also active in moderate phosphate concentrations and predicted that this activity required a second stimulus provided by a sensor kinase, KinB. This prediction was validated in subsequent experiments, including a screen of a histidine kinase knockout collection that confirmed the specificity of KinB's role in the activation of the Pho regulon. Algorithms that extract biological signal from large collections of public gene expression data, such as eADAGE, can highlight opportunities to discover, from public data, mechanisms that are currently unrecognized.