TY  - JOUR
T1  - ABySS 2.0: Resource-Efficient Assembly of Large Genomes using a Bloom Filter
JF  - bioRxiv
DO  - 10.1101/068338
SP  - 068338
AU  - Shaun D Jackman
AU  - Benjamin P Vandervalk
AU  - Hamid Mohamadi
AU  - Justin Chu
AU  - Sarah Yeo
AU  - S Austin Hammond
AU  - Golnaz Jahesh
AU  - Hamza Khan
AU  - Lauren Coombe
AU  - Rene L Warren
AU  - Inanc Birol
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/08/07/068338.abstract
N2  - The assembly of DNA sequences de novo is fundamental to genomics research. It is the first of many steps towards elucidating and characterizing whole genomes. Downstream applications, including the analysis of genomic variation between species and between or within individuals, critically depend on robustly assembled sequences. In the span of a single decade, the sequence throughput of leading DNA sequencing instruments has increased drastically, and coupled with established and planned large-scale, personalized medicine initiatives to sequence genomes in the thousands and even millions, the development of efficient, scalable and accurate bioinformatics tools for producing high-quality reference draft genomes is timely. With ABySS 1.0, we originally showed that assembling the human genome using short 50 bp sequencing reads was possible by aggregating the half terabyte of compute memory needed over several computers using a standardized message-passing system (MPI). We present here its re-design, which departs from MPI and instead implements algorithms that employ a Bloom filter, a probabilistic data structure, to represent a de Bruijn graph and reduce memory requirements. We present assembly benchmarks of human Genome in a Bottle 250 bp Illumina paired-end and 6 kbp mate-pair libraries from a single individual, yielding an NG50 (NGA50) scaffold contiguity of 3.5 (3.0) Mbp using less than 35 GB of RAM, a modest memory requirement by today’s standards that is often available on a single computer. We also investigate the use of BioNano Genomics and 10x Genomics’ Chromium data to further improve the scaffold contiguity of this assembly to 42 (15) Mbp.
ER  - 