PT - JOURNAL ARTICLE
AU - Fernando Racimo
AU - Gabriel Renaud
AU - Montgomery Slatkin
TI - Joint estimation of contamination, error and demography for nuclear DNA from ancient humans
AID - 10.1101/022285
DP - 2015 Jan 01
TA - bioRxiv
PG - 022285
4099 - http://biorxiv.org/content/early/2015/07/10/022285.short
4100 - http://biorxiv.org/content/early/2015/07/10/022285.full
AB - When sequencing an ancient DNA sample from a hominin fossil, DNA from present-day humans involved in excavation and extraction will be sequenced along with the endogenous material. This type of contamination is problematic for downstream analyses as it will introduce a bias towards the population to which the contaminating individuals belong. Quantifying the extent of contamination is a crucial step as it allows researchers to account for possible biases that may arise in downstream genetic analyses. Here, we present an MCMC algorithm to co-estimate the contamination rate, sequencing error rate and demographic parameters - including drift times and admixture rates - for an ancient nuclear genome obtained from human remains, when the putative contaminating DNA comes from present-day humans. We assume we have a large panel representing the putative contaminating population (e.g. European, East Asian or African). The method is implemented in a C++ program called ‘Demographic Inference with Contamination and Error’ (DICE). The program can also be used to determine the most likely population to which the contaminant DNA belongs. We applied it to simulations and Neanderthal genome data, and we recover accurate estimates of all parameters, even when the average sequencing coverage is low (0.5X) and the per-read contamination rate is high (25%).
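
To illustrate the general idea of co-estimating a contamination rate and a sequencing error rate by MCMC, the following is a minimal toy sketch in Python. It is not the DICE likelihood and not the authors' implementation (DICE is a C++ program). It rests on strong simplifying assumptions that do not come from the abstract: the endogenous genotype is taken to be homozygous ancestral at every site, the derived-allele frequency `w` in the contaminant panel is known exactly, priors are uniform, and demographic parameters (drift times, admixture rates) are ignored entirely. All function names, step sizes and simulation settings are hypothetical.

```python
import math
import random


def derived_read_prob(c, e, w):
    """Probability that one read shows the derived allele (toy model).

    Assumes the endogenous genotype is homozygous ancestral, so a truly
    derived allele can only come from a contaminant read (rate c) drawn
    from a panel with derived-allele frequency w. An error (rate e)
    flips the observed allele.
    """
    q = c * w  # probability the read carries a truly derived allele
    return q * (1.0 - e) + (1.0 - q) * e


def simulate_sites(c, e, n_sites, reads_per_site, rng):
    """Simulate (panel frequency, derived reads, total reads) per site."""
    sites = []
    for _ in range(n_sites):
        w = rng.random()  # hypothetical panel frequency, uniform on [0, 1)
        p = derived_read_prob(c, e, w)
        derived = sum(rng.random() < p for _ in range(reads_per_site))
        sites.append((w, derived, reads_per_site))
    return sites


def log_likelihood(c, e, sites):
    """Binomial log-likelihood of the read counts (constant terms dropped)."""
    total = 0.0
    for w, derived, n in sites:
        p = derived_read_prob(c, e, w)
        total += derived * math.log(p) + (n - derived) * math.log(1.0 - p)
    return total


def metropolis(sites, n_iter, rng, step_c=0.02, step_e=0.005):
    """Random-walk Metropolis sampler with uniform priors on c and e."""
    c, e = 0.5, 0.05  # arbitrary starting point
    ll = log_likelihood(c, e, sites)
    samples = []
    for _ in range(n_iter):
        c_new = c + rng.gauss(0.0, step_c)
        e_new = e + rng.gauss(0.0, step_e)
        # Proposals outside the prior support (0 < c < 1, 0 < e < 0.5)
        # have zero posterior density and are rejected outright.
        if 0.0 < c_new < 1.0 and 0.0 < e_new < 0.5:
            ll_new = log_likelihood(c_new, e_new, sites)
            if rng.random() < math.exp(min(0.0, ll_new - ll)):
                c, e, ll = c_new, e_new, ll_new
        samples.append((c, e))
    return samples


# Simulate data with 25% contamination and 1% error, then sample.
rng = random.Random(1)
sites = simulate_sites(c=0.25, e=0.01, n_sites=500, reads_per_site=10, rng=rng)
samples = metropolis(sites, n_iter=3000, rng=rng)

kept = samples[1000:]  # discard the first 1000 iterations as burn-in
c_hat = sum(s[0] for s in kept) / len(kept)
e_hat = sum(s[1] for s in kept) / len(kept)
print(f"posterior mean contamination ~ {c_hat:.3f}, error ~ {e_hat:.3f}")
```

In this toy setting the two rates are separable because sites where the panel frequency is near zero inform mainly the error rate, while sites where it is near one inform the sum of contamination and error. A fuller model along the lines the abstract describes would also have to treat the endogenous genotype as unknown and tie its distribution to demographic parameters.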