RT Journal Article
SR Electronic
T1 Centrifuge: rapid and sensitive classification of metagenomic sequences
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 054965
DO 10.1101/054965
A1 Daehwan Kim
A1 Li Song
A1 Florian P. Breitwieser
A1 Steven L. Salzberg
YR 2016
UL http://biorxiv.org/content/early/2016/05/24/054965.abstract
AB Centrifuge is a novel microbial classification engine that enables rapid, accurate and sensitive labeling of reads and quantification of species on desktop computers. The system uses an indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (4.2 GB for 4,078 bacterial and 200 archaeal genomes) and classifies sequences at very high speed, allowing it to process the millions of reads from a typical high-throughput DNA sequencing run within a few minutes. Together these advances enable timely and accurate analysis of large metagenomics data sets on conventional desktop computers. Because of its space-optimized indexing schemes, Centrifuge also makes it possible to index the entire NCBI non-redundant nucleotide sequence database (a total of 109 billion bases) with an index size of 69 GB, in contrast to k-mer based indexing schemes, which require far more extensive space. Centrifuge is available as free, open-source software from www.ccb.jhu.edu/software/centrifuge