PT - JOURNAL ARTICLE
AU - Sebastian D. Mackowiak
AU - Henrik Zauber
AU - Chris Bielow
AU - Denise Thiel
AU - Kamila Kutz
AU - Lorenzo Calviello
AU - Guido Mastrobuoni
AU - Nikolaus Rajewsky
AU - Stefan Kempa
AU - Matthias Selbach
AU - Benedikt Obermayer
TI - Comprehensive identification and characterization of conserved small ORFs in animals
AID - 10.1101/017772
DP - 2015 Jan 01
TA - bioRxiv
PG - 017772
4099 - http://biorxiv.org/content/early/2015/04/09/017772.short
4100 - http://biorxiv.org/content/early/2015/04/09/017772.full
AB - There is increasing evidence that non-annotated short open reading frames (sORFs) can encode functional micropeptides, but computational identification remains challenging. We expand our published method and predict conserved sORFs in human, mouse, zebrafish, fruit fly and the nematode C. elegans. Isolating specific conservation signatures indicative of purifying selection on the encoded amino acid sequence, we identify about 2000 novel sORFs in the untranslated regions of canonical mRNAs or on transcripts annotated as non-coding. Predicted sORFs show stronger conservation signatures than those identified in previous studies and are sometimes conserved over large evolutionary distances. Encoded peptides have little homology to known proteins and are enriched in disordered regions and short interaction motifs. Published ribosome profiling data indicate translation for more than 100 novel sORFs, and mass spectrometry data provide peptidomic evidence for more than 70 novel candidates. We thus provide a catalog of conserved micropeptides for functional validation in vivo.