TY  - JOUR
T1  - Comprehensive mapping of mammalian transcriptomes identifies conserved genes associated with different cell differentiation states
JF  - bioRxiv
DO  - 10.1101/022608
SP  - 022608
AU  - Yang Yang
AU  - Yu-Cheng T. Yang
AU  - Jiapei Yuan
AU  - Xiaohua Shen
AU  - Zhi John Lu
AU  - Jingyi Jessica Li
Y1  - 2015/01/01
UR  - http://biorxiv.org/content/early/2015/07/15/022608.abstract
N2  - Cell identity (or cell state) is established via gene expression programs, represented by “associated genes” with dynamic expression across cell identities. Here we integrate RNA-seq data from 40 tissues and cell types from human, chimpanzee, bonobo, and mouse to investigate the conservation and differentiation of cell states. We employ a statistical tool, “Transcriptome Overlap Measure” (TROM), to first identify cell-state-associated genes, both protein-coding and non-coding. Next, we use TROM to comprehensively map the cell states within each species and also between species based on the cell-state-associated genes. The within-species mapping measures which cell states are similar to each other, allowing us to construct a human cell differentiation tree that recovers both known and novel lineage relationships between cell states. Moreover, the between-species mapping summarizes the conservation of cell states across the four species. Based on these results, we identify conserved associated genes for different cell states and annotate their biological functions. Interestingly, we find that neural and testis tissues exhibit distinct evolutionary signatures in which neural tissues are much less enriched in conserved associated genes than testis. In addition, our mapping demonstrates that besides protein-coding genes, long non-coding RNAs serve well as associated genes to indicate cell states. We further infer the biological functions of those non-coding associated genes based on their co-expressed protein-coding associated genes.
Overall, we provide a catalog of conserved and species-specific associated genes that identifies candidates for downstream experimental studies of the roles of these candidates in controlling cell identity. Highlights: Comprehensive transcriptome mapping of cell states across four mammalian species; Both protein-coding genes and long non-coding RNAs serve as good markers of cell identity; Distinct evolutionary signatures of neural and testis tissues; A catalog of conserved associated protein-coding genes and lncRNAs in different mammalian tissues and cell types.
ER  - 