PT - JOURNAL ARTICLE
AU - Po-Ru Loh
AU - Pier Francesco Palamara
AU - Alkes L Price
TI - Fast and accurate long-range phasing and imputation in a UK Biobank cohort
AID - 10.1101/028282
DP - 2015 Jan 01
TA - bioRxiv
PG - 028282
4099 - http://biorxiv.org/content/early/2015/10/04/028282.short
4100 - http://biorxiv.org/content/early/2015/10/04/028282.full
AB - Recent work has leveraged the unique genealogical structure and extensive genotyping (>30%) of the Icelandic population to perform long-range phasing (LRP), enabling accurate imputation and association analysis of rare variants in target samples typed on genotyping arrays. Here, we develop a fast and accurate LRP method, Eagle, that extends this paradigm to outbred populations by harnessing long (>4cM) identical-by-descent (IBD) tracts shared among distantly related individuals. We applied Eagle to N=150K samples (0.2% of the British population) from the UK Biobank, and we determined that it is 1–2 orders of magnitude faster than existing methods while achieving exquisite phasing accuracy (switch error rate ≈0.3%, corresponding to perfect phase at the scale of >10Mb). Moreover, we observed that Eagle imputed masked genotypes with accuracy R²>0.75 down to a minor allele frequency of 0.1%. Compared to computationally tractable alternatives, Eagle attained large improvements in phasing and imputation accuracy at N=150K and smaller improvements at smaller sample sizes, illustrating the advantages that LRP-based imputation will yield as very large reference panels become available.