PT - JOURNAL ARTICLE
AU - Roye Rozov
AU - Aya Brown Kav
AU - David Bogumil
AU - Naama Shterzer
AU - Eran Halperin
AU - Itzhak Mizrahi
AU - Ron Shamir
TI - Recycler: an algorithm for detecting plasmids from de novo assembly graphs
AID - 10.1101/029926
DP - 2016 Jan 01
TA - bioRxiv
PG - 029926
4099 - http://biorxiv.org/content/early/2016/05/09/029926.short
4100 - http://biorxiv.org/content/early/2016/05/09/029926.full
AB - Plasmids are central contributors to microbial evolution and genome innovation. Recently, they have been found to have important roles in antibiotic resistance and in affecting production of metabolites used in industrial and agricultural applications. However, their characterization through deep sequencing remains challenging, in spite of rapid drops in cost and throughput increases for sequencing. Here, we attempt to ameliorate this situation by introducing a new plasmid-specific assembly algorithm, leveraging assembly graphs provided by a conventional de novo assembler and alignments of paired-end reads to assembled graph nodes. We introduce the first tool for this task, called Recycler, and demonstrate its merits in comparison with extant approaches. We show that Recycler greatly increases the number of true plasmids recovered while remaining highly accurate. On simulated plasmidomes, Recycler recovered 5-14% more true plasmids compared to the best extant method with overall precision of about 90%. We validated these results in silico on real data, as well as in vitro by PCR validation performed on a subset of Recycler’s predictions on different data types. All 12 of Recycler’s outputs on isolate samples matched known plasmids or phages, and had alignments having at least 97% identity over at least 99% of the reported reference sequence lengths. For the two E. coli strains examined, most known plasmid sequences were recovered, while in both cases additional plasmids only known to be present in different hosts were found.
Recycler also generated plasmids in high agreement with known annotations on real plasmidome data. Moreover, in PCR validations performed on 77 sequences, Recycler showed a mean accuracy of 89% across all data types: isolate, microbiome, and plasmidome. Recycler is available at http://github.com/Shamir-Lab/Recycler