PT - JOURNAL ARTICLE
AU - Andrew P Morgan
AU - J Matthew Holt
AU - Rachel C McMullan
AU - Timothy A Bell
AU - Amelia M-F Clayshulte
AU - John P Didion
AU - Liran Yadgary
AU - David Thybert
AU - Duncan T Odom
AU - Paul Flicek
AU - Leonard McMillan
AU - Fernando Pardo-Manuel de Villena
TI - The evolutionary fates of a large segmental duplication in mouse
AID - 10.1101/043687
DP - 2016 Jan 01
TA - bioRxiv
PG - 043687
4099 - http://biorxiv.org/content/early/2016/04/29/043687.short
4100 - http://biorxiv.org/content/early/2016/04/29/043687.full
AB - Gene duplication and loss are major sources of genetic polymorphism in populations, and are important forces shaping the evolution of genome content and organization. We have reconstructed the origin and history of a 127 kbp segmental duplication, R2d, in the house mouse (Mus musculus). R2d contains a single protein-coding gene, Cwc22. De novo assembly of both the ancestral (R2d1) and the derived (R2d2) copies reveals that they have been subject to non-allelic gene conversion events spanning tens of kilobases. R2d2 is also a hotspot for structural variation: its diploid copy number ranges from zero in the mouse reference genome to more than 80 in wild mice sampled from around the globe. Hemizygosity for high-copy-number alleles of R2d2 is associated in cis with meiotic drive, suppression of meiotic crossovers, and copy-number instability, with a mutation rate in excess of 1 per 100 transmissions in laboratory populations. We identify an additional 57 loci covering 0.8% of the mouse genome with patterns of sequence variation similar to those at R2d1 and R2d2. Our results provide a striking example of allelic diversity generated by duplication and demonstrate the value of de novo assembly in a phylogenetic context for understanding the mutational processes affecting duplicate genes.