Abstract
Polygenic risk scores derived from genotype data (PRS) and family history of disease (FH) both provide valuable information for predicting disease risk, enhancing prospects for clinical utility. PRS perform poorly when applied to diverse populations, but FH does not suffer from this limitation. Here, we explore methods for combining both types of information (PRS-FH). We analyzed 10 complex diseases from the UK Biobank for which family history (parental and sibling history) was available for most target samples. PRS were trained using all British individuals (N=409K), and target samples consisted of unrelated non-British Europeans (N=42K), South Asians (N=7K), or Africans (N=7K). We evaluated PRS, FH, and PRS-FH using liability-scale R², focusing on three well-powered diseases (type 2 diabetes, hypertension, depression) with R² > 0.05 for PRS and/or FH in each target population. Averaged across these three diseases, PRS attained prediction R² of 5.8%, 4.0%, and 0.53% in non-British Europeans, South Asians, and Africans, respectively, confirming poor cross-population transferability. In contrast, PRS-FH attained average prediction R² of 13%, 12%, and 10%, respectively, representing a large improvement in Europeans and an extremely large improvement in Africans; for each disease and each target population, the improvement was highly statistically significant. PRS-FH methods based on a logistic model and a liability threshold model performed similarly when covariates were not included in predictions (consistent with simulations), but the logistic model outperformed the liability threshold model when covariates were included. In conclusion, including family history greatly improves the accuracy of polygenic risk scores, particularly in diverse populations.
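As an illustration of the evaluation metric, the sketch below converts an observed-scale R² (from regressing 0/1 disease status on a predictor) to the liability scale. It assumes the widely used liability threshold transformation, R²_liab = R²_obs · K²(1−K)² / (z² · P(1−P)), where K is the population prevalence, P is the case fraction in the sample, and z is the standard normal density at the liability threshold. The abstract does not state which estimator was used, so the function name `liability_scale_r2` and its parameters are hypothetical and this should not be read as the authors' exact procedure.

```python
from statistics import NormalDist


def liability_scale_r2(r2_observed, prevalence, case_fraction=None):
    """Convert an observed-scale R2 to the liability scale.

    Assumption (not taken from the abstract): the standard liability
    threshold transformation
        R2_liab = R2_obs * K^2 (1-K)^2 / (z^2 * P (1-P)).
    r2_observed   : R2 from linear regression of 0/1 status on the predictor
    prevalence    : population prevalence K of the disease
    case_fraction : proportion of cases P in the sample (defaults to K,
                    i.e. a population-representative sample)
    """
    if case_fraction is None:
        case_fraction = prevalence
    if not (0.0 < prevalence < 1.0 and 0.0 < case_fraction < 1.0):
        raise ValueError("prevalence and case_fraction must lie in (0, 1)")

    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Liability threshold above which an individual is a case.
    threshold = std_normal.inv_cdf(1.0 - prevalence)
    # Height of the standard normal density at that threshold.
    z = std_normal.pdf(threshold)

    k, p = prevalence, case_fraction
    return r2_observed * (k * (1.0 - k)) ** 2 / (z ** 2 * p * (1.0 - p))


# Hypothetical inputs: observed-scale R2 of 0.03 for a disease with 10% prevalence.
print(liability_scale_r2(0.03, 0.10))
```

When the sample is population-representative (P = K), the factor reduces to K(1−K)/z², which exceeds 1 for any prevalence, so liability-scale R² is always larger than the observed-scale value it is derived from.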
Competing Interest Statement
The authors have declared no competing interest.