RT Journal Article
SR Electronic
T1 The human functional genome defined by genetic diversity
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 082362
DO 10.1101/082362
A1 Julia di Iulio
A1 Istvan Bartha
A1 Emily H.M. Wong
A1 Hung-Chun Yu
A1 Michael Hicks
A1 Naisha Shah
A1 Victor Lavrenko
A1 Ewen F. Kirkness
A1 Martin M. Fabani
A1 Dongchan Yang
A1 Inkyung Jung
A1 William H. Biggs
A1 Bing Ren
A1 J. Craig Venter
A1 Amalio Telenti
YR 2016
UL http://biorxiv.org/content/early/2016/10/21/082362.abstract
AB Large scale efforts to sequence whole human genomes provide extensive data on the non-coding portion of the genome. We used variation information from 11,257 human genomes to describe the spectrum of sequence conservation in the population. We established the genome-wide variability for each nucleotide in the context of the surrounding sequence in order to identify departure from expectation at the population level (context-dependent conservation). We characterized the population diversity for functional elements in the genome and identified the coordination of conserved sequences of distal and cis enhancers, chromatin marks, promoters, coding and intronic regions. The most context-dependent conserved regions of the genome are associated with unique functional annotations and a genomic organization that spreads up to one megabase. Importantly, these regions are enriched by over 100-fold of non-coding pathogenic variants. This analysis of human genetic diversity thus provides a detailed view of sequence conservation, functional constraint and genomic organization of the human genome. Specifically, it identifies highly conserved non-coding sequences that are not captured by analysis of interspecies conservation and are greatly enriched in disease variants.