RT Journal Article
SR Electronic
T1 Functional and non-functional classes of peptides produced by long non-coding RNAs
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 064915
DO 10.1101/064915
A1 Jorge Ruiz-Orera
A1 Pol Verdaguer-Grau
A1 José Luis Villanueva-Cañas
A1 Xavier Messeguer
A1 M. Mar Albà
YR 2016
UL http://biorxiv.org/content/early/2016/10/24/064915.abstract
AB Cells express thousands of transcripts that show weak coding potential. Known as long non-coding RNAs (lncRNAs), they typically contain short open reading frames (ORFs) having no homology with known proteins. Recent studies have reported that a significant proportion of lncRNAs are translated, challenging the view that they are non-coding. These results are based on the selective sequencing of ribosome-protected fragments, or ribosome profiling. The present study uses ribosome profiling data from eight mouse tissues and cell types, combined with ~330,000 synonymous and non-synonymous single nucleotide variants, to dissect the patterns of purifying selection in proteins translated from lncRNAs. Using the three-nucleotide read periodicity that characterizes actively translated regions, we identify 1,365 translated lncRNAs. About one fourth of them (350 lncRNAs) show conservation in humans; these are likely to produce functional micropeptides, including the recently discovered myoregulin. For other lncRNAs, the ORF codon usage bias distinguishes between two classes. The first has significant coding scores and evidence of purifying selection, consistent with the presence of lineage-specific functional proteins. The second large class, comprising >500 lncRNAs, produces proteins that show no significant purifying selection signatures. We show that the translation of these lncRNAs depends on the transcript expression level and the chance occurrence of ORFs with a favorable codon composition. Some of these lncRNAs may be precursors of novel protein-coding genes, filling a gap in our current understanding of de novo gene birth.