TY - JOUR T1 - dRep: A tool for fast and accurate genome de-replication that enables tracking of microbial genotypes and improved genome recovery from metagenomes JF - bioRxiv DO - 10.1101/108142 SP - 108142 AU - Matthew R. Olm AU - Christopher T. Brown AU - Brandon Brooks AU - Jillian F. Banfield Y1 - 2017/01/01 UR - http://biorxiv.org/content/early/2017/02/13/108142.abstract N2 - The number of microbial genomes sequenced each year is expanding rapidly, in part due to genome-resolved metagenomic studies that routinely recover hundreds of draft-quality genomes. Rapid algorithms have been developed to comprehensively compare large genome sets, but they are not accurate with draft-quality genomes. Here we present dRep, a program that sequentially applies a fast, inaccurate estimation of genome distance and a slow but accurate measure of average nucleotide identity to reduce the computational time for pair-wise genome set comparisons by orders of magnitude. We demonstrate its use in a study where we separately assembled each metagenome from time series datasets. Groups of essentially identical genomes were identified with dRep, and the best genome from each set was selected. This resulted in recovery of significantly more and higher-quality genomes compared to the set recovered using the typical co-assembly method. Documentation is available at http://drep.readthedocs.io/en/master/ and source code is available at https://github.com/MrOlm/drep. ER -