RT Journal Article
SR Electronic
T1 A framework to interpret short tandem repeat variations in humans
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 092734
DO 10.1101/092734
A1 Melissa Gymrek
A1 Thomas Willems
A1 David Reich
A1 Yaniv Erlich
YR 2016
UL http://biorxiv.org/content/early/2016/12/09/092734.abstract
AB Identifying regions of the genome that are depleted of mutations can reveal potentially deleterious variants. Short tandem repeats (STRs), comprised of repeating motifs of 1-6bp, are among the largest contributors of de novo mutations in humans and are implicated in a variety of human disorders. However, because of the challenges STRs pose to bioinformatics tools, studies of STR mutations have been limited to highly ascertained panels of several dozen loci. Here, we harnessed novel bioinformatics tools and an analytical framework to estimate mutation parameters at each STR in the human genome. We then developed a model of the STR mutation process that allows us to obtain accurate estimates of mutation parameters at each STR by correlating genotypes with local sequence heterozygosity. Finally, we used our method to obtain robust estimates of the impact of local sequence features on mutation parameters and used this to create a framework for measuring constraint at STRs by comparing observed vs. expected mutation rates. Constraint scores identified known pathogenic variants with early onset effects. Our constraint metrics will provide a valuable tool for prioritizing pathogenic STRs in medical genetics studies.