TY  - JOUR
T1  - Improved long read correction for de novo assembly using an FM-index
JF  - bioRxiv
DO  - 10.1101/067272
SP  - 067272
AU  - James M. Holt
AU  - Jeremy R. Wang
AU  - Corbin D. Jones
AU  - Leonard McMillan
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/08/02/067272.abstract
N2  - Long read sequencing is changing the landscape of genomic research, especially de novo assembly. Despite the high error rate inherent to long read technologies, increased read lengths dramatically improve the continuity and accuracy of genome assemblies. However, the cost and throughput of these technologies limit their application to complex genomes. One solution is to decrease the cost and time to assemble novel genomes by leveraging “hybrid” assemblies that use long reads for scaffolding and short reads for accuracy. To this end, we describe a novel application of a multi-string Burrows-Wheeler Transform with auxiliary FM-index to correct errors in long read sequences using a set of complementary short reads. We show that our method efficiently produces significantly higher quality corrected sequence than existing hybrid error-correction methods. We demonstrate the effectiveness of our method compared to state-of-the-art hybrid and long-read-only de novo assembly methods.
ER  - 