RT Journal Article
SR Electronic
T1 Scallop enables accurate assembly of transcripts through phasing-preserving graph decomposition
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 123612
DO 10.1101/123612
A1 Mingfu Shao
A1 Carl Kingsford
YR 2017
UL http://biorxiv.org/content/early/2017/04/03/123612.abstract
AB We introduce Scallop, an accurate, reference-based transcript assembler for RNA-seq data. Scallop significantly improves reconstruction of multi-exon and lowly expressed transcripts. On 10 human samples aligned with STAR, Scallop produces (on average) 35.7% and 37.5% more correct multi-exon transcripts than two leading transcript assemblers, StringTie [1] and TransComb [2], respectively. For transcripts expressed at low levels in the same samples, Scallop assembles 65.2% and 50.2% more correct multi-exon transcripts than StringTie and TransComb, respectively. Scallop obtains this improvement through a novel algorithm that we prove preserves all phasing paths from reads (including paired-end reads), while also producing a parsimonious set of transcripts and minimizing coverage deviation.