RT Journal Article
SR Electronic
T1 Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 015859
DO 10.1101/015859
A1 Bjarni J. Vilhjálmsson
A1 Jian Yang
A1 Hilary Finucane
A1 Alexander Gusev
A1 Sara Lindström
A1 Stephan Ripke
A1 Giulio Genovese
A1 Po-Ru Loh
A1 Gaurav Bhatia
A1 Ron Do
A1 Tristan Hayeck
A1 Hong-Hee Won
A1 Schizophrenia Working Group of the Psychiatric Genomics Consortium
A1 the Discovery, Biology, and Risk of Inherited Variants in Breast Cancer (DRIVE) study
A1 Sekar Kathiresan
A1 Michele Pato
A1 Carlos Pato
A1 Rulla Tamimi
A1 Eli Stahl
A1 Noah Zaitlen
A1 Bogdan Pasaniuc
A1 Mikkel H. Schierup
A1 Philip De Jager
A1 Nikolaos A. Patsopoulos
A1 Steve McCarroll
A1 Mark Daly
A1 Shaun Purcell
A1 Daniel Chasman
A1 Benjamin Neale
A1 Michael Goddard
A1 Peter Visscher
A1 Peter Kraft
A1 Nick Patterson
A1 Alkes L. Price
YR 2015
UL http://biorxiv.org/content/early/2015/03/01/015859.abstract
AB Polygenic risk scores have shown great promise in predicting complex disease risk, and will become more accurate as training sample sizes increase. The standard approach for calculating risk scores involves LD-pruning markers and applying a P-value threshold to association statistics, but this discards information and may reduce predictive accuracy. We introduce a new method, LDpred, which infers the posterior mean causal effect size of each marker using a prior on effect sizes and LD information from an external reference panel. Theory and simulations show that LDpred outperforms the pruning/thresholding approach, particularly at large sample sizes. Accordingly, prediction R² increased from 20.1% to 25.3% in a large schizophrenia data set and from 9.8% to 12.0% in a large multiple sclerosis data set. A similar relative improvement in accuracy was observed for three additional large disease data sets and when predicting in non-European schizophrenia samples. The advantage of LDpred over existing methods will grow as sample sizes increase.