RT Journal Article
SR Electronic
T1 Recycler: an algorithm for detecting plasmids from de novo assembly graphs
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 029926
DO 10.1101/029926
A1 Roye Rozov
A1 Aya Brown Kav
A1 David Bogumil
A1 Naama Shterzer
A1 Eran Halperin
A1 Itzhak Mizrahi
A1 Ron Shamir
YR 2016
UL http://biorxiv.org/content/early/2016/05/09/029926.abstract
AB Plasmids are central contributors to microbial evolution and genome innovation. Recently, they have been found to have important roles in antibiotic resistance and in affecting production of metabolites used in industrial and agricultural applications. However, their characterization through deep sequencing remains challenging, in spite of rapid drops in cost and throughput increases for sequencing. Here, we attempt to ameliorate this situation by introducing a new plasmid-specific assembly algorithm, leveraging assembly graphs provided by a conventional de novo assembler and alignments of paired-end reads to assembled graph nodes. We introduce the first tool for this task, called Recycler, and demonstrate its merits in comparison with extant approaches. We show that Recycler greatly increases the number of true plasmids recovered while remaining highly accurate. On simulated plasmidomes, Recycler recovered 5-14% more true plasmids compared to the best extant method with overall precision of about 90%. We validated these results in silico on real data, as well as in vitro by PCR validation performed on a subset of Recycler’s predictions on different data types. All 12 of Recycler’s outputs on isolate samples matched known plasmids or phages, and had alignments with at least 97% identity over at least 99% of the reported reference sequence lengths. For the two E. coli strains examined, most known plasmid sequences were recovered, while in both cases additional plasmids only known to be present in different hosts were found.
Recycler also generated plasmids in high agreement with known annotation on real plasmidome data. Moreover, in PCR validations performed on 77 sequences, Recycler showed mean accuracy of 89% across all data types – isolate, microbiome, and plasmidome. Recycler is available at http://github.com/Shamir-Lab/Recycler