RT Journal Article
SR Electronic
T1 Accurate detection of de novo and transmitted INDELs within exome-capture data using micro-assembly
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 001370
DO 10.1101/001370
A1 Giuseppe Narzisi
A1 Jason A. O’Rawe
A1 Ivan Iossifov
A1 Yoon-ha Lee
A1 Zihua Wang
A1 Yiyang Wu
A1 Gholson J. Lyon
A1 Michael Wigler
A1 Michael C. Schatz
YR 2013
UL http://biorxiv.org/content/early/2013/12/13/001370.abstract
AB We present a new open-source algorithm, Scalpel, for sensitive and specific discovery of INDELs in exome-capture data. By combining the power of mapping and assembly, Scalpel searches the de Bruijn graph for haplotype-specific sequence paths (contigs) that span each exon. The algorithm reports a single path for homozygous exons, two paths for heterozygous exons, and multiple paths for more exotic variations. A detailed repeat composition analysis coupled with a self-tuning k-mer strategy allows Scalpel to outperform other state-of-the-art approaches for INDEL discovery. We extensively compared Scalpel with a battery of >10000 simulated and >1000 experimentally validated INDELs between 1 and 100bp against two recent algorithms for INDEL discovery: GATK HaplotypeCaller and SOAPindel. We report anomalies for these tools in their ability to detect INDELs, especially in regions containing near-perfect repeats which contribute to high false positive rates. In contrast, Scalpel demonstrates superior specificity while maintaining high sensitivity. We also present a large-scale application of Scalpel for detecting de novo and transmitted INDELs in 593 families with autistic children from the Simons Simplex Collection. Scalpel demonstrates enhanced power to detect long (≥20bp) transmitted events, and strengthens previous reports of enrichment for de novo likely gene-disrupting INDEL mutations in children with autism with many new candidate genes. The source code and documentation for the algorithm are available at http://scalpel.sourceforge.net.