RT Journal Article
SR Electronic
T1 Analysis of protein-coding genetic variation in 60,706 humans
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 030338
DO 10.1101/030338
A1 Exome Aggregation Consortium
A1 Monkol Lek
A1 Konrad J Karczewski
A1 Eric V Minikel
A1 Kaitlin E Samocha
A1 Eric Banks
A1 Timothy Fennell
A1 Anne H O’Donnell-Luria
A1 James S Ware
A1 Andrew J Hill
A1 Beryl B Cummings
A1 Taru Tukiainen
A1 Daniel P Birnbaum
A1 Jack A Kosmicki
A1 Laramie Duncan
A1 Karol Estrada
A1 Fengmei Zhao
A1 James Zou
A1 Emma Pierce-Hoffman
A1 David N Cooper
A1 Mark DePristo
A1 Ron Do
A1 Jason Flannick
A1 Menachem Fromer
A1 Laura Gauthier
A1 Jackie Goldstein
A1 Namrata Gupta
A1 Daniel Howrigan
A1 Adam Kiezun
A1 Mitja I Kurki
A1 Ami Levy Moonshine
A1 Pradeep Natarajan
A1 Lorena Orozco
A1 Gina M Peloso
A1 Ryan Poplin
A1 Manuel A Rivas
A1 Valentin Ruano-Rubio
A1 Douglas M Ruderfer
A1 Khalid Shakir
A1 Peter D Stenson
A1 Christine Stevens
A1 Brett P Thomas
A1 Grace Tiao
A1 Maria T Tusie-Luna
A1 Ben Weisburd
A1 Hong-Hee Won
A1 Dongmei Yu
A1 David M Altshuler
A1 Diego Ardissino
A1 Michael Boehnke
A1 John Danesh
A1 Roberto Elosua
A1 Jose C Florez
A1 Stacey B Gabriel
A1 Gad Getz
A1 Christina M Hultman
A1 Sekar Kathiresan
A1 Markku Laakso
A1 Steven McCarroll
A1 Mark I McCarthy
A1 Dermot McGovern
A1 Ruth McPherson
A1 Benjamin M Neale
A1 Aarno Palotie
A1 Shaun M Purcell
A1 Danish Saleheen
A1 Jeremiah Scharf
A1 Pamela Sklar
A1 Patrick F Sullivan
A1 Jaakko Tuomilehto
A1 Hugh C Watkins
A1 James G Wilson
A1 Mark J Daly
A1 Daniel G MacArthur
YR 2015
UL http://biorxiv.org/content/early/2015/10/30/030338.abstract
AB Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) sequence data for 60,706 individuals of diverse ethnicities.
The resulting catalogue of human genetic diversity has unprecedented resolution, with an average of one variant every eight bases of coding sequence and the presence of widespread mutational recurrence. The deep catalogue of variation provided by the Exome Aggregation Consortium (ExAC) can be used to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; we identify 3,230 genes with near-complete depletion of truncating variants, 79% of which have no currently established human disease phenotype. Finally, we show that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human “knockout” variants in protein-coding genes.