TY  - JOUR
T1  - Evaluation on Detection of Structural Variants by Low-Coverage Long-Read Sequencing
JF  - bioRxiv
DO  - 10.1101/092544
SP  - 092544
AU  - Li Fang
AU  - Jiang Hu
AU  - Depeng Wang
AU  - Kai Wang
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/12/17/092544.abstract
N2  - Structural variants (SVs) in the human genome are implicated in a variety of human diseases. Long-read sequencing (such as that from PacBio) delivers much longer reads than short-read sequencing (such as that from Illumina) and may greatly improve SV detection. However, due to the relatively high cost of long-read sequencing, users often face questions such as what coverage is needed and how to make optimal use of the aligners and SV callers. Here, we evaluated the SV calling performance of three SV calling algorithms (PBHoney-Tails, PBHoney-Spots and Sniffles) under different PacBio coverages on two personal genomes, NA12878 and HX1. Our results showed that, at 10X coverage, 76% to 84% of deletions and 80% to 92% of insertions in the gold standard set can be detected by PBHoney-Spots. Combining PBHoney-Spots and Sniffles greatly increased sensitivity, especially at lower coverages such as 6X. We further evaluated the Mendelian errors on an Ashkenazi Jewish trio dataset with low-coverage whole-genome PacBio sequencing. In addition, to automate SV calling, we developed a computational pipeline called NextSV, which integrates PBHoney and Sniffles and generates the union (high sensitivity) or intersection (high specificity) call sets. Our results provide useful guidelines for SV identification from low-coverage whole-genome PacBio data, and we expect that NextSV will facilitate the analysis of SVs in long-read sequencing data.
ER  - 