RT Journal Article
SR Electronic
T1 dRep: A tool for fast and accurate genome de-replication that enables tracking of microbial genotypes and improved genome recovery from metagenomes
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 108142
DO 10.1101/108142
A1 Matthew R. Olm
A1 Christopher T. Brown
A1 Brandon Brooks
A1 Jillian F. Banfield
YR 2017
UL http://biorxiv.org/content/early/2017/02/13/108142.abstract
AB The number of microbial genomes sequenced each year is expanding rapidly, in part due to genome-resolved metagenomic studies that routinely recover hundreds of draft-quality genomes. Rapid algorithms have been developed to comprehensively compare large genome sets, but they are not accurate with draft-quality genomes. Here we present dRep, a program that sequentially applies a fast, inaccurate estimation of genome distance and a slow but accurate measure of average nucleotide identity to reduce the computational time for pair-wise genome set comparisons by orders of magnitude. We demonstrate its use in a study where we separately assembled each metagenome from time series datasets. Groups of essentially identical genomes were identified with dRep, and the best genome from each set was selected. This resulted in recovery of significantly more and higher-quality genomes compared to the set recovered using the typical co-assembly method. Documentation is available at http://drep.readthedocs.io/en/master/ and source code is available at https://github.com/MrOlm/drep.
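
The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not dRep's code and does not use its API. Everything in it is an assumption made for illustration: the function names, the cutoffs, the k-mer Jaccard distance standing in for the fast estimate, the `difflib` similarity ratio standing in for average nucleotide identity, and the caller-supplied `score` function standing in for whatever criterion picks the "best" genome. The point is only to show why running the cheap measure first reduces how many expensive comparisons are needed.

```python
from difflib import SequenceMatcher
from itertools import combinations


def kmer_set(seq, k=4):
    """All overlapping k-mers of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def fast_distance(a, b, k=4):
    """Cheap, approximate distance: 1 - Jaccard similarity of k-mer sets.
    (Illustrative stand-in for a fast genome-distance estimate.)"""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    if not union:
        return 1.0
    return 1.0 - len(ka & kb) / len(union)


def slow_identity(a, b):
    """Expensive, more careful similarity in [0, 1].
    (Illustrative stand-in for average nucleotide identity.)"""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def single_linkage(names, linked):
    """Group names into connected components, where linked(a, b) -> bool."""
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if linked(a, b):
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return list(groups.values())


def dereplicate(genomes, score, coarse_cutoff=0.5, identity_cutoff=0.95):
    """Return (representatives, number_of_slow_comparisons).

    genomes: dict mapping name -> sequence string.
    score:   hypothetical callable, name -> quality; highest wins per cluster.
    """
    slow_calls = 0

    def close_enough(a, b):
        return fast_distance(genomes[a], genomes[b]) <= coarse_cutoff

    def same_genotype(a, b):
        nonlocal slow_calls
        slow_calls += 1
        return slow_identity(genomes[a], genomes[b]) >= identity_cutoff

    winners = []
    # Stage 1: coarse clusters from the fast measure over all pairs.
    for primary in single_linkage(sorted(genomes), close_enough):
        # Stage 2: the slow measure runs only inside each coarse cluster.
        for secondary in single_linkage(primary, same_genotype):
            winners.append(max(secondary, key=score))
    return winners, slow_calls
```

With three toy genomes, two identical and one unrelated, an all-against-all comparison would need three slow comparisons; the sketch needs one, because the unrelated genome is separated by the fast stage and never reaches the slow one. On real genome sets the saving depends on how well the coarse cutoff partitions the data.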