TY - JOUR
T1 - Functional and non-functional classes of peptides produced by long non-coding RNAs
JF - bioRxiv
DO - 10.1101/064915
SP - 064915
AU - Jorge Ruiz-Orera
AU - Pol Verdaguer-Grau
AU - José Luis Villanueva-Cañas
AU - Xavier Messeguer
AU - M. Mar Albà
Y1 - 2016/01/01
UR - http://biorxiv.org/content/early/2016/10/24/064915.abstract
N2 - Cells express thousands of transcripts that show weak coding potential. Known as long non-coding RNAs (lncRNAs), they typically contain short open reading frames (ORFs) having no homology with known proteins. Recent studies have reported that a significant proportion of lncRNAs are translated, challenging the view that they are non-coding. These results are based on the selective sequencing of ribosome-protected fragments, or ribosome profiling. The present study uses ribosome profiling data from eight mouse tissues and cell types, combined with ~330,000 synonymous and non-synonymous single nucleotide variants, to dissect the patterns of purifying selection in proteins translated from lncRNAs. Using the three-nucleotide read periodicity that characterizes actively translated regions, we identify 1,365 translated lncRNAs. About one fourth of them (350 lncRNAs) show conservation in humans; these are likely to produce functional micropeptides, including the recently discovered myoregulin. For other lncRNAs, the ORF codon usage bias distinguishes between two classes. The first has significant coding scores and evidence of purifying selection, consistent with the presence of lineage-specific functional proteins. The second large class, comprising >500 lncRNAs, produces proteins that show no significant purifying selection signatures. We show that the translation of these lncRNAs depends on the transcript expression level and the chance occurrence of ORFs with a favorable codon composition. Some of these lncRNAs may be precursors of novel protein-coding genes, filling a gap in our current understanding of de novo gene birth.
ER -