TY - JOUR
T1 - Scallop enables accurate assembly of transcripts through phasing-preserving graph decomposition
JF - bioRxiv
DO - 10.1101/123612
SP - 123612
AU - Mingfu Shao
AU - Carl Kingsford
Y1 - 2017/01/01
UR - http://biorxiv.org/content/early/2017/04/03/123612.abstract
N2 - We introduce Scallop, an accurate, reference-based transcript assembler for RNA-seq data. Scallop significantly improves reconstruction of multi-exon and lowly expressed transcripts. On 10 human samples aligned with STAR, Scallop produces (on average) 35.7% and 37.5% more correct multi-exon transcripts than two leading transcript assemblers, StringTie [1] and TransComb [2], respectively. For transcripts expressed at low levels in the same samples, Scallop assembles 65.2% and 50.2% more correct multi-exon transcripts than StringTie and TransComb, respectively. Scallop obtains this improvement through a novel algorithm that we prove preserves all phasing paths from reads (including paired-end reads), while also producing a parsimonious set of transcripts and minimizing coverage deviation.
ER -