RT Journal Article
SR Electronic
T1 LINKS: Scaffolding genome assemblies with kilobase-long nanopore reads
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 016519
DO 10.1101/016519
A1 René L. Warren
A1 Benjamin P. Vandervalk
A1 Steven J.M. Jones
A1 Inanç Birol
YR 2015
UL http://biorxiv.org/content/early/2015/03/13/016519.abstract
AB Motivation: Owing to the complexity of the assembly problem, we do not yet have complete genome sequences. The difficulty in assembling reads into finished genomes is exacerbated by sequence repeats and the inability of short reads to capture sufficient genomic information to resolve those problematic regions. Established and emerging long read technologies show great promise in this regard, but their current associated higher error rates typically require computational base correction and/or additional bioinformatics preprocessing before they could be of value. We present LINKS, the Long Interval Nucleotide K-mer Scaffolder algorithm, a solution that makes use of the information in error-rich long reads, without the need for read alignment or base correction. We show how the contiguity of an ABySS E. coli K-12 genome assembly could be increased over five-fold by the use of beta-released Oxford Nanopore Ltd. (ONT) long reads and how LINKS leverages long-range information in S. cerevisiae W303 ONT reads to yield an assembly with less than half the errors of competing applications. Re-scaffolding the colossal white spruce assembly draft (PG29, 20 Gbp) and how LINKS scales to larger genomes is also presented. We expect LINKS to have broad utility in harnessing the potential of long reads in connecting high-quality sequences of small and large genome assembly drafts. Availability: http://www.bcgsc.ca/bioinformatics/software/links Contact: rwarren@bcgsc.ca