TY  - JOUR
T1  - GPhase: Greedy Approach for Accurate Haplotype Inferencing
JF  - bioRxiv
DO  - 10.1101/073379
SP  - 073379
AU  - Kshitij Tayal
AU  - Naveen Sivadasan
AU  - Rajgopal Srinivasan
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/09/04/073379.abstract
N2  - We consider the computational problem of phasing an individual genotype sample given a collection of known haplotypes in the population. We give a fast and accurate algorithm GPhase for reconstructing haplotype pair consistent with input genotype. It uses the coalescent based mutation model of Stephens and Donnelly (2000). Computing optimal solution under this model is expensive and our algorithm uses a greedy approximation for fast and accurate estimation. Our algorithm is simple, efficient and has linear time and space complexity. Experiments on real datasets revealed improved gene level phasing accuracy for GPhase tool compared to other widely used tools such as SHAPEIT, Beagle, MaCH and Impute2. On simulated data, GPhase tool was able to phase samples each containing more than 1700 markers with high accuracy. GPhase can be used for gene level phasing of individual samples using publicly available haplotype datasets such as HapMap data or 1000 genome data. This finds applications in studies on recessive Mendelian disorders where parent data is lacking. GPhase is freely available for download and use from https://github.com/kshitijtayal/GPhase/.
ER  - 