TY  - JOUR
T1  - Sequence properties underlying gene regulatory enhancers are conserved across mammals
JF  - bioRxiv
DO  - 10.1101/110676
SP  - 110676
AU  - Ling Chen
AU  - Alexandra E. Fish
AU  - John A. Capra
Y1  - 2017/01/01
UR  - http://biorxiv.org/content/early/2017/02/21/110676.abstract
N2  - Gene expression patterns and transcription factor DNA binding preferences are largely conserved across mammals; however, there is substantial turnover in active regulatory enhancers between closely related species. We investigated this seeming contradiction by quantifying the conservation of sequence patterns underlying histone-mark defined enhancers across six diverse mammalian species (human, macaque, mouse, dog, cow, and opossum). In each species, we found that machine-learning classifiers based on short DNA sequence patterns could accurately identify many adult liver and developing limb enhancers. We applied these classifiers across species and found that classifiers trained in different species performed nearly as well as classifiers trained on the target species, indicating that the underlying sequence properties predictive of enhancers are largely conserved. We also observed similar cross-species conservation in classifiers trained on human and mouse enhancers validated in transgenic reporter assays, and these classifiers learned predictive sequence properties similar to the classifiers trained on histone-mark defined enhancers. The sequence patterns most predictive of enhancers in each species matched the binding motifs for a common set of TFs enriched for expression in relevant tissues, which supports the biological relevance of the learned features. These results suggest that, though the genomic regions with enhancer activity change rapidly between species, many of the sequence properties encoding enhancer activity have been maintained across more than 180 million years of mammalian evolution.
ER  - 