RT Journal Article
SR Electronic
T1 Application of Global Transcriptome Data in Gene Ontology Classification and Construction of a Gene Ontology Interaction Network
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 004911
DO 10.1101/004911
A1 Mario Fruzangohar
A1 Esmaeil Ebrahimie
A1 David L. Adelson
YR 2014
UL http://biorxiv.org/content/early/2014/05/08/004911.abstract
AB Background: Gene Ontology (GO) classification of statistically significant over/under expressed genes is a common method for interpreting transcriptomics data as a first step in functional genomic analysis. In this approach, all significant genes contribute equally to the final GO classification regardless of their actual expression levels. However, the original level of gene expression can significantly affect protein production and consequently GO term enrichment. Furthermore, even genes with low expression levels can participate in the final GO enrichment through cumulative effects. GO terms have regulatory relationships, allowing the construction of a directed regulatory network that can be combined with gene expression levels to study biological mechanisms and select important genes for functional studies. Results: In this report, we have used gene expression levels in bacteria to determine GO term enrichments. This approach provided the opportunity to enrich GO terms across the entire transcriptome (instead of a subset of differentially expressed genes) and enabled us to compare transcriptomes across multiple biological conditions. As a case study for whole transcriptome GO analysis, we have shown that during the course of infection of different host tissues by Streptococcus pneumoniae, the GO term protein enrichment proportions for Biological Process and Molecular Function changed significantly, whereas those for Cellular Component did not.
In the second case study, we compared Salmonella Enteritidis transcriptomes between low and high pathogenic strains and showed that GO protein enrichment proportions remained unchanged, in contrast to the first case study. In the second part of this study, we show for the first time a dynamically developed enriched interaction network between Biological Process GO terms for any gene sample. This type of network presents regulatory relationships between GO terms and their genes. Furthermore, the network topology highlights the centrally located genes in the network, which can be used for network-based gene selection. As a case study, GO regulatory networks of Streptococcus pneumoniae and Salmonella Enteritidis were constructed and studied. Conclusions: In both Streptococcus pneumoniae and Salmonella Enteritidis, the pathways related to the GO terms “Environmental Information Processing”, “Signal transduction” and “two-component system” were associated with increasing pathogenicity, breaching host barriers and the generation of new strains. This study demonstrates a comprehensive GO enrichment based on whole transcriptome data, along with a novel method for developing a GO regulatory network that gives an overview of central and marginal GO terms and can contribute to efficient gene selection.