RT Journal Article
SR Electronic
T1 rehh 2.0: a reimplementation of the R package rehh to detect positive selection from haplotype structure
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 067629
DO 10.1101/067629
A1 Mathieu Gautier
A1 Alexander Klassmann
A1 Renaud Vitalis
YR 2016
UL http://biorxiv.org/content/early/2016/08/03/067629.abstract
AB Identifying genomic regions with unusually high local haplotype homozygosity represents a powerful strategy to characterize candidate genes responding to natural or artificial positive selection. To that end, statistics measuring the extent of haplotype homozygosity within (e.g., EHH, iHS) and between (Rsb or XP-EHH) populations have been proposed in the literature. The rehh package for R was previously developed to facilitate genome-wide scans of selection, based on the analysis of long-range haplotypes. However, its performance was not sufficient to cope with the growing size of available data sets. Here we propose a major upgrade of the rehh package, which includes improved processing of the input files, a faster algorithm to enumerate haplotypes, and multi-threading. As illustrated with the analysis of large human haplotype data sets, these improvements decrease the computation time by more than an order of magnitude. This new version of rehh will thus make it possible to perform iHS-, Rsb- or XP-EHH-based scans on large data sets. The package rehh 2.0 is available from the CRAN repository (http://cran.r-project.org/web/packages/rehh/index.html) together with help files and a detailed manual.