PT - JOURNAL ARTICLE
AU - Martin Steinegger
AU - Johannes Söding
TI - Sensitive protein sequence searching for the analysis of massive data sets
AID - 10.1101/079681
DP - 2016 Jan 01
TA - bioRxiv
PG - 079681
4099 - http://biorxiv.org/content/early/2016/10/07/079681.short
4100 - http://biorxiv.org/content/early/2016/10/07/079681.full
AB - Sequencing costs have dropped much faster than Moore’s law in the past decade, and sensitive sequence searching has become the main bottleneck in the analysis of large (meta)genomic datasets. While previous methods sacrificed sensitivity for speed gains, the parallelized, open-source software MMseqs2 overcomes this trade-off: In three-iteration profile searches it reaches 50% higher sensitivity than BLAST at 83-fold speed and the same sensitivity as PSI-BLAST at 270 times its speed. MMseqs2 therefore offers great potential to increase the fraction of annotatable (meta)genomic sequences.