Abstract
De novo mutations occur at substantially different rates depending on genomic location, sequence context, and DNA strand [1–4]. The success of many human genetics techniques, especially when applied to large population sequencing datasets with numerous recurrent mutations [5–7], depends strongly on assumptions about the local mutation rate. Such techniques include estimation of selection intensity [8], inference of demographic history [9], and mapping of rare disease genes [10]. Here, we present Roulette, a genome-wide mutation rate model at base-pair resolution that incorporates known determinants of local mutation rate (http://genetics.bwh.harvard.edu/downloads/Vova/Roulette/). We show that Roulette is more accurate than existing models [1,6] and has sufficient resolution at high-mutation-rate sites to model allele frequencies under recurrent mutation. We use Roulette to refine estimates of population growth within Europe by incorporating the full range of human mutation rates. Analysis of significant deviations from the model's predictions revealed a 10-fold increase in mutation rate in nearly all genes transcribed by Polymerase III, suggesting a new mutagenic mechanism. We also detected an elevated mutation rate within transcription factor binding sites, restricted to sites that are actively utilized in testis and reside in promoters.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
We added further validation of the Roulette model and showed that SNVs in RNU genes are enriched for recurrent mutations.