PT - JOURNAL ARTICLE
AU - Birte Kehr
AU - Páll Melsted
TI - chopBAI: BAM index reduction solves I/O bottlenecks in the joint analysis of large sequencing cohorts
AID - 10.1101/030825
DP - 2015 Jan 01
TA - bioRxiv
PG - 030825
4099 - http://biorxiv.org/content/early/2015/11/06/030825.short
4100 - http://biorxiv.org/content/early/2015/11/06/030825.full
AB - Summary: Advances in sequencing capacity have led to the generation of unprecedented amounts of genomic data. Processing this data frequently leads to I/O bottlenecks, e.g., when analyzing a small genomic region across a large number of samples. The largest I/O burden, however, is often imposed not by the amount of data needed for the analysis but by the index files that help retrieve this data. We have developed chopBAI, a program that can chop a BAM index (BAI) file into small pieces. The program outputs a list of BAI files, each indexing a specified genomic interval. The output files are much smaller but remain compatible with existing software tools. We show how preprocessing BAI files with chopBAI can reduce I/O by more than 95% during the analysis of 10 kbp genomic regions, ultimately enabling the joint analysis of more than 10,000 individuals. Availability and Implementation: The software is implemented in C++, GPL-licensed, and available at http://github.com/DecodeGenetics/chopBAI
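
Note: the abstract does not describe how chopBAI selects the parts of an index to keep. As a rough illustration of why an index restricted to a short interval can be much smaller than the full file, the sketch below lists the bins of the standard BAI binning scheme (as defined in the SAM/BAM format specification) that can contain reads overlapping a region. The function name `reg2bins` follows the specification's reference code; the example coordinates and the constant `TOTAL_BINS` are illustrative choices, and the code is not taken from chopBAI. It covers only the binning part of a BAI file, not the linear index.

```python
# Sketch of the BAI binning scheme from the SAM/BAM specification.
# This is NOT chopBAI's code; it only shows how few bins a short region touches.

# Number of regular bins in the scheme: 1 + 8 + 64 + 512 + 4096 + 32768.
TOTAL_BINS = ((1 << 18) - 1) // 7


def reg2bins(beg, end):
    """Return the bins that may hold reads overlapping the zero-based,
    half-open interval [beg, end)."""
    end -= 1  # switch to an inclusive end coordinate
    bins = [0]  # bin 0 spans the whole reference (512 Mbp)
    # Each level has a first bin number (offset) and a bin width of 2**shift bp.
    for offset, shift in ((1, 26), (9, 23), (73, 20), (585, 17), (4681, 14)):
        bins.extend(range(offset + (beg >> shift), offset + (end >> shift) + 1))
    return bins


if __name__ == "__main__":
    # Hypothetical 10 kbp region chosen for illustration.
    region_bins = reg2bins(1_000_000, 1_010_000)
    print(region_bins)  # [0, 1, 9, 73, 592, 4742]
    print(f"{len(region_bins)} of {TOTAL_BINS} bins overlap the region")
```

For a 10 kbp region, only a handful of the 37,449 possible bins per reference sequence are relevant, which is consistent with the large I/O reduction reported in the abstract.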