PT - JOURNAL ARTICLE
AU - Brent S. Pedersen
AU - Aaron R. Quinlan
TI - Who’s who? Detecting and resolving sample anomalies in human DNA sequencing studies with peddy
AID - 10.1101/074385
DP - 2016 Jan 01
TA - bioRxiv
PG - 074385
4099 - http://biorxiv.org/content/early/2016/09/09/074385.short
4100 - http://biorxiv.org/content/early/2016/09/09/074385.full
AB - The potential for genetic discovery in human DNA sequencing studies is greatly diminished if DNA samples from the cohort are mislabelled, swapped, contaminated, or include unintended individuals. Unfortunately, the potential for such errors is significant since DNA samples are often manipulated by several protocols, labs or scientists in the process of sequencing. We have developed peddy to identify and facilitate the remediation of such errors via interactive visualizations and reports comparing the stated sex, relatedness, and ancestry to what is inferred from each individual’s genotypes. Peddy predicts a sample’s ancestry using a machine learning model trained on individuals of diverse ancestries from the 1000 Genomes Project reference panel. Peddy’s speed, text reports and web interface facilitate both automated and visual detection of sample swaps, poor sequencing quality and other indicators of sample problems that, were they left undetected, would inhibit discovery. Software Availability: https://github.com/brentp/peddy Demonstration (Chrome suggested): http://home.chpc.utah.edu/~u6000771//plots/ceph1463.html