Abstract
Background Genome-wide DNA methylation (DNAm) profiling has allowed for the development of molecular predictors for a multitude of traits and diseases. Such predictors may be more accurate than self-reported phenotypes and could have clinical applications. Here, penalised regression models were used to develop DNAm predictors for body mass index (BMI), smoking status, alcohol consumption, and educational attainment in a cohort of 5,100 individuals. In an independent test cohort of 906 individuals, the proportion of phenotypic variance explained in each trait was examined for the DNAm-based and genetic predictors. Receiver operating characteristic curves were generated to investigate the predictive performance of the DNAm-based predictors, using dichotomised phenotypes. The relationship between DNAm scores and all-cause mortality (n = 214 events) was assessed via Cox proportional-hazards models.
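As a rough illustration of how a penalised regression can yield a DNAm score, the sketch below fits a LASSO model by coordinate descent on a toy matrix and then scores a new sample as a weighted sum of CpG values. It assumes centred inputs, no intercept, and a fixed penalty. The function names (`fit_lasso`, `dnam_score`), the toy data, and the penalty value are hypothetical and do not reflect the software or tuning used in the study.

```python
def soft_threshold(value, penalty):
    """Shrink value towards zero by penalty; return 0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def fit_lasso(X, y, penalty, n_sweeps=200):
    """Minimise (1/2n) * ||y - Xw||^2 + penalty * ||w||_1 by coordinate descent.

    X is a list of rows (samples) of CpG values, y the centred phenotype.
    Assumption for this sketch: columns and y are already centred.
    """
    n, p = len(X), len(X[0])
    weights = [0.0] * p
    residual = list(y)  # equals y - Xw while all weights are zero
    col_scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_scale[j] == 0.0:
                continue  # a constant-zero column carries no information
            # Covariance of column j with the residual once column j's
            # current contribution has been added back.
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * weights[j]) for i in range(n)
            ) / n
            new_weight = soft_threshold(rho, penalty) / col_scale[j]
            delta = new_weight - weights[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                weights[j] = new_weight
    return weights


def dnam_score(methylation_row, weights):
    """Weighted sum of CpG values: the per-individual DNAm score."""
    return sum(m * w for m, w in zip(methylation_row, weights))


# Toy training data: the phenotype depends on the first CpG only.
toy_X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
toy_y = [2.0, -2.0, 2.0, -2.0]
toy_weights = fit_lasso(toy_X, toy_y, penalty=0.5)
toy_score = dnam_score([1.0, -1.0], toy_weights)
```

In this toy case the penalty shrinks the weight of the informative CpG and sets the uninformative one to exactly zero, which is the selection behaviour that makes LASSO attractive when there are far more CpG sites than individuals.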
Results The DNAm-based predictors explained different proportions of the phenotypic variance for BMI (12%), smoking (60%), alcohol consumption (12%) and education (3%). The combined genetic and DNAm predictors explained 20% of the variance in BMI, 61% in smoking, 13% in alcohol consumption, and 6% in education. DNAm predictors for smoking, alcohol, and education, but not BMI, predicted mortality in univariate models. The predictors showed moderate discrimination of obesity (AUC = 0.67) and alcohol consumption (AUC = 0.75), and excellent discrimination of current smoking status (AUC = 0.98). Discrimination of college-educated individuals was poorer (AUC = 0.59).
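For readers less familiar with the AUC values quoted here, the following minimal sketch computes the area under the ROC curve as the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as one half. The function name `auc` and the toy scores are hypothetical and serve only as an illustration; they are not data from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise case/control comparison.

    scores: predictor values (e.g. DNAm scores); labels: 1 = case, 0 = control.
    Equivalent to the Mann-Whitney U statistic divided by
    (number of cases * number of controls).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5  # ties count as half a win
    return wins / (len(cases) * len(controls))


# Toy example: three of the four case/control pairs are ranked correctly.
toy_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why a value near 0.98 indicates near-complete separation of the two groups while 0.59 is only slightly better than chance.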
Conclusions DNAm predictors correlate with lifestyle factors that are associated with health and mortality. They may supplement DNAm-based predictors of age to identify the lifestyle profiles of individuals and predict disease risk.
- DNAm: DNA methylation
- BMI: Body mass index
- AUC: Area under the curve
- CpG: Cytosine-phosphate-guanine dinucleotide
- EWAS: Epigenome-wide association study
- GS:SFHS: Generation Scotland: The Scottish Family Health Study
- LBC1936: Lothian Birth Cohort 1936
- LASSO: Least absolute shrinkage and selection operator
- HR: Hazard ratio
- CI: Confidence interval
- STRADL: Stratifying Resilience and Depression Longitudinally