TY - JOUR
T1 - Short template switch events in human evolution cause complex mutation patterns
JF - bioRxiv
DO - 10.1101/038380
SP - 038380
AU - Ari Löytynoja
AU - Nick Goldman
Y1 - 2016/01/01
UR - http://biorxiv.org/content/early/2016/04/15/038380.abstract
N2 - Background: Resequencing efforts are uncovering the extent of genetic variation in humans and provide data to study the evolutionary processes shaping our genome. One recurring puzzle in both intra- and inter-species studies is the high frequency of complex mutations comprising multiple nearby base substitutions or insertion-deletions. We devised a generalized mutation model to study the role of template switch events in the origin of such mutation clusters. Results: Applied to the human genome, our model detects thousands of template switch events during the evolution of human and chimp from their common ancestor, and hundreds of events between two independently sequenced human genomes. While many of these are consistent with the inter-strand template switch mechanism proposed for bacteria, our model also identifies new types of mutations that create short inversions, some flanked by paired inverted repeats. This local template switch process creates numerous complex mutation patterns, including secondary structures, and explains multi-nucleotide mutations and compensatory substitutions without invoking positive selection. Detection of these complex mutations with current resequencing methodologies is difficult and we find many erroneous variant annotations in human reference data. Conclusions: Previously unexplained short template switch events account for a large number of complex mutation patterns in human evolution, without invoking complicated and speculative mechanisms or implausible coincidence. We show that clustered sequence differences are challenging for mapping and variant calling methods. Template switch events such as those we have uncovered may have been neglected as an explanation for complex mutations because of biases in commonly used analyses. Incorporation of our model into analysis pipelines will lead to improved understanding of genome variation and evolution.
ER -