TY  - JOUR
T1  - Using pseudoalignment and base quality to accurately quantify microbial community composition
JF  - bioRxiv
DO  - 10.1101/097949
SP  - 097949
AU  - M. Reppell
AU  - J. Novembre
Y1  - 2017/01/01
UR  - http://biorxiv.org/content/early/2017/01/09/097949.abstract
N2  - Pooled DNA from multiple unknown organisms arises in a variety of contexts, for example microbial samples from ecological or human health research. Determining the composition of pooled samples can be difficult, especially at the scale of modern sequencing data and reference databases. Here we propose the novel pooled DNA classification method Karp. Karp combines the speed and low-memory requirements of k-mer based pseudoalignment with a likelihood framework that uses base quality information to better resolve multiply mapped reads. In this text we apply Karp to the problem of classifying 16S rRNA reads, commonly used in microbiome research. Using simulations, we show Karp is accurate across a variety of read lengths and when samples contain reads originating from organisms absent from the reference. We also assess performance in real 16S data, and show that relative to other widely used classification methods Karp can reveal stronger statistical association signals and should empower future discoveries.
ER  - 