TY  - JOUR
T1  - The population genetics of human disease: the case of recessive, lethal mutations
JF  - bioRxiv
DO  - 10.1101/091579
SP  - 091579
AU  - Carlos Eduardo G. Amorim
AU  - Ziyue Gao
AU  - Zachary Baker
AU  - José Francisco Diesel
AU  - Yuval B. Simons
AU  - Imran S. Haque
AU  - Joseph Pickrell
AU  - Molly Przeworski
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/12/04/091579.abstract
N2  - Do the frequencies of disease mutations in human populations reflect a simple balance between mutation and purifying selection? What other factors shape the prevalence of disease mutations? To begin to answer these questions, we focused on one of the simplest cases: recessive mutations that alone cause lethal diseases or complete sterility. To this end, we generated a hand-curated set of 417 Mendelian mutations in 32 genes, reported to cause a recessive, lethal Mendelian disease. We then considered analytic models of mutation-selection balance in infinite and finite populations of constant sizes and simulations of purifying selection in a more realistic demographic setting, and tested how well these models fit allele frequencies estimated from 33,370 individuals of European ancestry. In doing so, we distinguished between CpG transitions, which occur at a substantially elevated rate, and other mutation types. Whereas observed frequencies for CpG transitions are close to expectation, the frequencies observed for other mutation types are an order of magnitude higher than expected; this discrepancy is even larger when subtle fitness effects in heterozygotes or lethal compound heterozygotes are taken into account. In principle, higher than expected frequencies of disease mutations could be due to widespread errors in reporting causal variants, compensation by other mutations, or balancing selection. It is unclear why these factors would affect CpG transitions differently from other mutations, however. We argue instead that the unexpectedly high frequency of non-CpGti disease mutations likely reflects an ascertainment bias: of all the mutations that cause recessive lethal diseases, those that by chance have reached higher frequencies are more likely to have been identified in medical studies and thus to have been included in this study. Beyond the specific application, this study highlights the parameters likely to be important in shaping the frequencies of Mendelian disease alleles. Author Summary: What determines the frequencies of disease mutations in human populations? To begin to answer this question, we focus on one of the simplest cases: mutations that cause completely recessive, Mendelian disease. We first review theory about what to expect from mutation and selection in a population of finite size and further generate predictions based on simulations using a realistic demographic scenario of human evolution. For a highly mutable type of mutation, involving transitions at CpG sites, we find that the predictions fit observed frequencies of recessive lethal disease mutations well. For less mutable types, however, predictions tend to underestimate the observed frequency. We discuss possible explanations for the discrepancy and point to a complication that, to our knowledge, is not widely appreciated: that there exists ascertainment bias in disease mutation discovery. Specifically, we suggest that alleles that have been identified to date are likely the ones that by chance have drifted to higher frequencies and are thus more likely to have been mapped. More generally, our study highlights the parameters that influence the frequencies of Mendelian disease alleles, helping to interpret the relevance of variants of unknown significance based on their allele frequencies.
ER  - 