RT Journal Article
SR Electronic
T1 Sensitive protein sequence searching for the analysis of massive data sets
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 079681
DO 10.1101/079681
A1 Martin Steinegger
A1 Johannes Söding
YR 2016
UL http://biorxiv.org/content/early/2016/10/20/079681.abstract
AB Sequencing costs have dropped much faster than Moore’s law in the past decade, and sensitive sequence searching has become the main bottleneck in the analysis of large (meta)genomic datasets. While previous methods sacrificed sensitivity for speed gains, the parallelized, open-source software MMseqs2 overcomes this trade-off: In three-iteration profile searches it reaches 50% higher sensitivity than BLAST at 83-fold speed and the same sensitivity as PSI-BLAST at 270 times its speed. MMseqs2 therefore offers great potential to increase the fraction of annotatable (meta)genomic sequences.