RT Journal Article
SR Electronic
T1 Discovery of large genomic inversions using pooled clone sequencing
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 015156
DO 10.1101/015156
A1 Marzieh Eslami Rasekh
A1 Giorgia Chiatante
A1 Mattia Miroballo
A1 Joyce Tang
A1 Mario Ventura
A1 Chris T. Amemiya
A1 Evan E. Eichler
A1 Francesca Antonacci
A1 Can Alkan
YR 2015
UL http://biorxiv.org/content/early/2015/04/22/015156.abstract
AB Motivation: There are many different forms of genomic structural variation, which can be broadly classified into two groups: copy number variation (CNV) and balanced rearrangements. Although many algorithms that aim to characterize CNVs are now available in the literature, discovery of balanced rearrangements (inversions and translocations) remains an open problem. This is mainly because the breakpoints of such events typically lie within segmental duplications and common repeats, which reduce the mappability of short reads. The 1000 Genomes Project spearheaded the development of several methods to identify inversions; however, they are limited to relatively short inversions, and there are currently no available algorithms to discover large inversions using high-throughput sequencing (HTS) technologies. Results: Here we propose to characterize large genomic inversions using a sequencing method (Kitzman et al., 2011) originally developed to improve haplotype phasing. This method, called pooled clone sequencing, merges the advantages of clone-based sequencing approaches with the speed and cost efficiency of HTS technologies. Using data generated with the pooled clone sequencing method, we developed a novel algorithm, dipSeq, to discover large inversions (>500 Kbp). We show the power of dipSeq first on simulated data, and then apply it to the genome of a HapMap individual (NA12878). We were able to accurately discover all previously known and experimentally validated large inversions in the same genome.
We also identified a novel inversion and confirmed it using fluorescent in situ hybridization. Availability: An implementation of the dipSeq algorithm is available at https://github.com/BilkentCompGen/dipseq Contact: calkan{at}cs.bilkent.edu.tr, francesca.antonacci{at}uniba.it