Abstract
Motivation Quality control (QC) tools are critical in DNA sequencing analysis because they increase the accuracy of sequence alignments and thus the reliability of results. QC of Oxford Nanopore Technologies (ONT) data is currently rudimentary and is generally based on whole-read average quality. As a result, reads that contain regions of high-quality sequence are discarded. Here we propose Prowler, a multi-window approach inspired by algorithms used to QC short-read data. Importantly, we retain phase and read-length information by optionally replacing trimmed sections with Ns.
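The windowed idea described above can be illustrated with a minimal sketch. This is not Prowler's actual implementation: the function name `mask_low_quality_windows`, the parameters `window_size` and `min_mean_q`, the use of non-overlapping windows, and the simple arithmetic mean of Phred scores are all assumptions made for illustration. The sketch shows only how masking failed windows with Ns, rather than deleting them, preserves read length and the positions of the retained bases.

```python
def mask_low_quality_windows(seq, quals, window_size=100, min_mean_q=7.0):
    """Replace each low-quality window of a read with Ns.

    Illustrative only: names, defaults, and the averaging rule are
    assumptions, not Prowler's real interface or algorithm.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality arrays must have equal length")
    if window_size <= 0:
        raise ValueError("window_size must be positive")

    masked = []
    # Walk the read in consecutive, non-overlapping windows.
    for start in range(0, len(seq), window_size):
        chunk = seq[start:start + window_size]
        window_quals = quals[start:start + window_size]
        # Assumption: plain arithmetic mean of the Phred scores in the window.
        mean_q = sum(window_quals) / len(window_quals)
        if mean_q >= min_mean_q:
            masked.append(chunk)             # keep a passing window unchanged
        else:
            masked.append("N" * len(chunk))  # mask a failing window, keep its length
    return "".join(masked)
```

For example, with `window_size=4` and `min_mean_q=7.0`, the read `"ACGTACGTACGT"` with per-base qualities `[20]*4 + [2]*4 + [15]*4` becomes `"ACGTNNNNACGT"`: only the low-quality middle window is masked, and the read keeps its original length.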
Results Prowler was applied to mammalian and bacterial datasets to assess its effects on alignment and assembly, respectively. Compared with alignments of data QCed with NanoFilt, alignments of data QCed with Prowler had lower error rates and more mapped reads. Assemblies of Prowler-QCed data had a lower error rate than assemblies of NanoFilt-QCed data; however, this came at some cost to assembly contiguity.
Availability and implementation Prowler is implemented in Python and is available at https://github.com/ProwlerForNanopore/ProwlerTrimmer
Contact e.ross{at}uq.edu.au
Supplementary information Supplementary data are available at Bioinformatics online.
Competing Interest Statement
The authors have declared no competing interest.