TY  - JOUR
T1  - Sensitive protein sequence searching for the analysis of massive data sets
JF  - bioRxiv
DO  - 10.1101/079681
SP  - 079681
AU  - Martin Steinegger
AU  - Johannes Söding
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/10/07/079681.abstract
N2  - Sequencing costs have dropped much faster than Moore’s law in the past decade, and sensitive sequence searching has become the main bottleneck in the analysis of large (meta)genomic datasets. While previous methods sacrificed sensitivity for speed gains, the parallelized, open-source software MMseqs2 overcomes this trade-off: In three-iteration profile searches it reaches 50% higher sensitivity than BLAST at 83-fold speed and the same sensitivity as PSI-BLAST at 270 times its speed. MMseqs2 therefore offers great potential to increase the fraction of annotatable (meta)genomic sequences.
ER  - 