ABSTRACT
Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) is an example of a complex disease of unknown etiology. Multiple studies point to disruptions in immune functioning in ME/CFS patients as well as with specific genetic polymorphisms and alterations of the DNA methylome in lymphocytes. However, the association between DNA methylation and genetic background in relation to the ME/CFS is currently unknown. In this study we explored this association by characterizing the genomic (~4.3 million SNPs) and epigenomic (~480 thousand CpG loci) variability between populations of ME/CFS patients and healthy controls. We found significant associations of methylation states in T-lymphocytes at several CpG loci and regions with ME/CFS phenotype. These methylation anomalies are in close proximity to genes involved with immune function and cellular metabolism. Finally, we found significant correlations of genotypes with methylation phenotypes associated with ME/CFS. The findings from this study highlight the role of epigenetic and genetic interactions in complex diseases, and suggest several genetic and epigenetic elements potentially involved in the mechanisms of disease in ME/CFS.
Introduction
Understanding the biological basis of complex traits and diseases remains one of the biggest challenges in biology and medicine. Chronic Fatigue Syndrome (also known as Myalgic Encephalomyelitis, hereafter referred to as ME/CFS) is an example of a complex, multifactorial disease with symptoms that vary substantially among patients. ME/CFS is a debilitating multisystem disease affecting between 1 and 2 million people in the United States alone1, with an annual economic impact between $17 and $24 billion 2. Yet its biological basis remains largely unknown.
Multiple studies point to disruptions in the immune and neuroendocrine systems in ME/CFS patients 3–14 ME/CFS appears to be associated with specific genetic polymorphisms 15–17, as well as with alterations of the DNA methylome in lymphocytes14,18,19. Understanding the contribution of the genetic background in ME/CFS patients as a predisposing factor for epigenetic abnormalities associated with the disease is a fundamental step to elucidate its causes. This understanding is also key for the development of tools to identify risk factors and potential treatments.
T-cell lymphocytes appear to be a primary cell type underlying immune and neuroendocrine abnormalities observed in ME/CFS patients. Functional impairment in T-cell glucocorticoid receptor and increased dexamethasone sensitivity are characteristic of some ME/CFS patients14, 20. Furthermore, genetic polymorphisms within non-coding regions of T-cell receptor loci15, as well as differential methylation in CD4+ T helper lymphocyte cells (Brenu et al., 2014), have been associated with the disease. The possible interactions between genomic and T-cell epigenomic variation in ME/CFS remain unknown.
In this study, we aimed to explore the association between DNA methylation profiles of T-cells and single nucleotide polymorphisms (SNPs) in ME/CFS patients. We quantified lymphocyte proportions and isolated CD3+ T-cells (including both CD4+ T helper cells and CD8+ T killer cells) via fluorescence activated cell sorting. We characterized the variation in genomic (~4.3 million SNPs) and epigenomic (~480 thousand CpG loci) variability among ME/CFS patients and healthy controls. Using this approach, we: 1) tested the association of genome-wide SNP genotypes with ME/CFS disease status; 2) tested the association of differentially methylated CpG loci and regions in CD3+ T-cells with ME/CFS disease status; 3) performed a methylation quantitative trait analysis to investigate the possible interactions between genetic background and methylation phenotypes of CD3+ T-cells associated with ME/CFS disease status.
Methods
Ethics approval and consent to participate
This study adhered to the human experimentation guidelines as outlined by the Helsinki Declaration of 1975. The collection of and analysis of clinical information and biological samples by the SolveCFS BioBank was ethically approved by the Genetic Alliance ethics review board (IRB # IORG0003358) and the University of Toronto (IRB #27391), which also approved all procedures for obtaining written informed consent from all participants in the study. Two copies of the consent form were signed, with one copy provided to the participants and one copy under secure storage at the SolveCFS Biobank.
Study population
In total, 61 ME/CFS diagnosed patients receiving care at the Bateman Horne Center, Utah (46 females, 15 males) and 48 healthy control (36 females, 12 males) individuals were recruited for this study. Female to male ratios were nearly identical in both cases and controls (3:1). The sex ratio in our population of ME/CFS patients (cases) is consistent with previously reported ratios indicating predominance of this illness in females21–23. Diagnosis of ME/CFS was performed according to the 1994 Fukuda24 and 2003 Canadian25 criteria. To quantify functional impairment, individuals completed the standardized health-related quality of life survey RAND-3626. All individuals met the following criteria: 1) were HIV and Hepatitis-C negative; and 2) had no intake history of immunomodulatory or immunosuppressive medications. The Body Mass Index (BMI) of individuals ranged between 16.4 and 46 , with no significant differences between case and control groups (t-test p=0.5; Wilcoxon p=0.74) (Fig. 1a). Similarly, age, which ranged between 18 and 62 years , was not significantly different between cases and controls (t-test p=0.06; Wilcoxon p=0.07) (Fig. 1b).
Sample processing
Whole blood from each individual was collected at the Bateman Horne Center and shipped overnight to Precision for Medicine, Maryland for Peripheral Blood Mononuclear Cells (PBMC) separation following procedures described in14 PBMCs were separated into two aliquots of approximately 7 million cells each, and shipped in liquid nitrogen to the Centre for Environmental Epigenetics and Development at the University of Toronto.
The first PBMC aliquot per patient was used for single nucleotide polymorphism (SNP) genotyping using the Human Omni 5-4 Array (Illumina Inc.). This array examines Single Nucleotide Polymorphisms at ~4.3 million loci throughout the human genome. Genomic DNA purification was performed with the MasterPure™ Complete DNA and RNA Purification Kit (Epicentre) following the standard protocol for cell samples. DNA quantity and purity was assessed using a NanoDrop 2000c Spectrophotometer (Thermo Scientific). Genotyping with the Omni 5-4 array was performed at the Princess Margaret Genomics Centre in Toronto.
The second PBMC aliquot was used for DNA methylation profiling of T-cells (CD3+) using the Human Methylation 450K Array (Illumina Inc.). This array quantifies methylation at ~480 thousand CpG loci throughout the human methylome. To quantify the relative proportions of cell type in the PBMC sample (i.e. CD4+ T-cells, CD8+ T-cells, CD19+ B-cells, and CD14+ monocytes) and isolate CD3+ T-cells, each sample was stained with fluorescently labelled antibodies and sorted in a FACSAria (BD Biosciences) flow cytometer at the Centre for Flow Cytometry and Microscopy of the Sunnybrook Research Institute in Toronto. Genomic DNA from T-cells was purified following the same procedure described above. Bisulfite conversion of purified DNA and CpG methylation profiling with the 450K Array was performed at the McGill University and Genome Quebec Innovation Centre in Montreal.
Genome-wide association analyses
Analyses of SNP data quality and of association with ME/CFS disease phenotypes were performed with different parameters in the programs GenABEL27 and PLINK28, following standardized protocols29, 30. To minimize the number of false positive and negative associations, we first identified and excluded data from individuals that met one or more of the following criteria: 1) Inferred sex, as determined by the heterozygosity of the X chromosome, was incongruent with the known sex of the individual; 2) Heterozygosity rates or amount of missing data were outliers with respect to the whole population (Fig. 2a); 3) More than 10% of marker data was missing; 4) Relatedness to other samples, as measured by the identity by descent (IBD) statistic, was greater than that of a second- to third-degree relative (IBD>0.1875); 5) Ancestry, as determined by a principal components analysis (PCA) (Fig. 2b) performed with EIGENSOFT31, 32, was substantially different than that of the majority population in our cohort (i.e. European ancestry). Data from 10 individuals were excluded from all analyses: 4 cases (2 females, 2 males) and 6 controls (6 females). In addition to these criteria, we re-analysed the data excluding individuals with health-related quality of life measurements that overlapped between cases and controls. This exclusion of intermediate illness phenotypes was aimed at increasing the power to detect possible associations between disease status and (epi)genotypes by decreasing the heterogeneity in phenotype symptomatology within each group. We utilized the scores of RAND-36 scales (physical functioning, energy/fatigue, emotional well-being, social functioning, pain, and general health) as quantitative measurements of ME/CFS phenotypes because these were significantly different (α = 0.05) between case and control groups, prior to excluding individuals with intermediate phenotypes (Figs. 1c-h). RAND-36 scale scores were summarized into principal components (PC) using the stats package in R. We excluded data from case and control individuals with overlapping values along PC1 (Fig. 3), which explained ~80% of variance in the RAND-36 data. In total, data from 30 individuals were excluded using this approach: 18 cases (12 females, 6 males) and 12 controls (9 females, 3 males). These data were re-analysed in PLINK.
After excluding individuals with sub-optimal data, we identified and excluded data from SNP loci that met one or more of the following criteria following29, 30: 1) Rate of missing genotypes was greater than 3% in PLINK (5% in GenABEL) (Fig. 2c); 2) Rate of missing data was significantly different (p < 0.00001) between cases and controls; 3) Allelic frequencies significantly deviated from Hardy-Weinberg equilibrium (χ2 p < 0.00001 in PLINK, FDR < 0.2 in GenABEL); and 4) Minimum allele frequency was smaller than 1% in PLINK (2% in GenABEL). Out of 4,284,426 genotyped SNP loci, 1,779,031 SNP loci were excluded from PLINK analyses, and 2,142,548 from GenABEL analyses.
We examined possible associations between SNP genotypes and ME/CFS disease phenotypes (case or control) using χ2 testing as well as logistic regression tests that included covariates of age, sex and BMI in both PLINK and GenABEL. To account for the uncertainty in the potential genetic model of inheritance of ME/CFS, we performed multiple tests with different underlying models: Genotypic, dominant, recessive, Cochram-Armitage trend, and allelic for the simple χ2 tests; and genotypic, dominant, recessive and multiplicative for logistic regressions. No inflation of test statistics was observed in any test (λ ranged between 1 and 1.01). To assess the significance of associations we: 1) Adjusted raw p-values for multiple testing following the Bonferroni33, Holm34, and Benjamini and Hochberg false discovery rate (FDR) corrections35; and 2) Calculated corrected (empirical) p-values (family wise) after 10,000 permutations. We generated spatial visualizations of raw p-values for all associations across chromosomes using the program Haploview36. Due to the higher prevalence of ME/CFS in females than males21–23, association tests were also performed in data from females only.
SNP characterisation
SNPs with significant associations were examined using the following reference tools: the NHGRI-EBI catalogue of genome-wide association studies (http://www.ebi.ac.uk/gwas/home;37, the Ensembl genome browser (http://www.ensembl.org/Homosapiens/Info/Index;38, the Single Nucleotide Polymorphisms Annotator SNiPA (http://snipa.helmholtz-muenchen.de/snipa3/;39, the Genotype-Tissue-Expression database GTEx (https://www.gtexportal.org/home/;40, the genome-wide association study of blood plasma proteome database pGWAS (http://metabolomics.helmholtz-muenchen.de/pgwas/;41, and the developing brain methylation quantitative trait loci database (http://epigenetics.essex.ac.uk/mQTL/;42.
Re-analysis of published GWAS data
Data from the ME/CFS genome-wide association study (GWAS) by Schlauch et al. (2016), were acquired from dbGAP (phs001015.v1.p1), and re-analysed following the pipeline described above to identify commonalities. The crlmm R Bioconductor package43, 44 was used for genotyping, and analysed in GenABEL, using the same thresholds as above. 704,464 SNP markers from 66 subjects passed our quality controls (44 females and 22 males, 36 cases and 30 controls). Because of the different arrays used (Illumina Human Omni 5-4 Array in this study vs. Affymetrix Genome-Wide Human SNP Array 6.0 used by Schlauch et al.) we constructed linkage disequilibrium (LD) proxies using LDlink45, with R2 ≥0.8, to make the results comparable.
Gene-set analysis
The program MAGMA46 was used to complete a generalized gene-set analysis of the SNP data. This analyses focuses on genetic associations with phenotype at the level of genes and gene-sets rather than individual SNPs. This strategy augments the power to detect associations with complex traits and diseases, such as ME/CFS. Gene sets were taken from Molecular Signatures Database (MSigDB)47, including hallmark gene sets (hallmark gene sets summarize and represent specific well-defined biological states or processes and display coherent expression;48), canonical pathways (gene-sets from KEGG, BioCarta and Reactome) and GO gene sets (gene-sets that contain genes annotated by the same gene ontology term).
Epigenome-wide association analyses
Analyses of CpG methylation data quality and of association with ME/CFS disease phenotypes were performed in the R package minfi49, following the pipeline suggested by50. All individuals identified for exclusion during the genome-wide association analyses were also excluded from this dataset to increase the power of detection of possible associations. In addition, we attempted to identify data from individuals that could represent potential outliers by the following graphical approaches (as suggested by49): 1) Comparing inferred sex versus known sex; 2) Examining intensity distributions of control CpG probes; 2) Plotting median methylated and unmethylated fluorescence signals; 3) Plotting methylation percentage density distributions; and 4) Summarizing the variability in the methylation data through a principal component analysis. No individuals were identified as outliers (Figs. 2d-f). We discarded data from CpG loci that: 1) Contained known SNPs at the methylation dinucleotide; or 2) Contained missing data. Raw probe florescence intensities were normalized by Subset-Quantile Within Array Normalization51, which takes into account the differences between Infinium type I and II probes. The level of methylation in each CpG locus was measured as beta-values, ranging from 0 to 1, which represent the proportion of methylation. In total, 467,971 CpG loci (out of 485,512) were retained for further analyses.
We examined possible associations between methylation levels at CpG loci and ME/CFS disease phenotypes (case or control) through F-tests of logit-transformed beta-values52 as implemented in the dmpFinder function. To correct for potential confounding effects of multiple methylation array batches (two in this study), as well as covariates of age, sex and BMI, we utilized the empirical Bayes procedure implemented in the R package ComBat53. To assess significance of associations (α = 0.05) we: 1) adjusted raw p-values for multiple testing by performing Benjamini and Hochberg false discovery rate (FDR) corrections; and 2) calculated empirical p-values after 10,000 permutations as described in14. In addition to testing for associations at individual CpG loci, we performed tests of association at differentially methylated genomic regions using the R package bumphunter as described in54. Genomics regions were defined as clusters of CpG loci within a 500bp region. We assessed the significance of associations (α = 0.05) by calculating empirical p-values from null distributions of test statistics after 1,000 bootstrap pseudoreplicates.
Genome-epigenome association analyses
To identify associations between SNP genotypes and methylation levels at significantly differentially methylated CpG loci, we performed a methylation quantitative trait loci (mQTLs) analysis using linear additive regression models (including covariates) in the R package Matrix eQTL55. Only local cis-mQTL were considered, i.e. CpG-SNP pairs that are within 1Mbp of each other. Both CpG loci and SNPs were mapped to the UCSC human genome assembly version hg19 (Genome Reference Consortium GRCh37)56. mQTLs were considered significant when FDR corrected p-values were smaller than α = 0.05.
Enrichment analysis
We carried out gene-set enrichment analyses for genes of interest. The R package clusterProfiler 3.4.457 was used for Gene Ontology Biological Process (GO BP)58, 59 and KEGG pathway60, 61 enrichment analysis, with p and q value cut-offs of ≤ 0.05. Reactome pathway analyses were carried out using the ReactomePA 1.20.2 R package62, with a p-value cut-off of ≤ 0.05. All three packages reference the latest versions of their respective databases. These significantly enriched annotations were visualized with the ‘enrichMap’ function of DOSE 3.2.0 R package63, with specific parameters to aid legibility using different numbers of enriched annotations.
Results & Discussion
Lymphocyte proportions
In contrast to the significant differentiation in health-related quality of life RAND-36 scales between ME/CFS cases and healthy controls (Fig. 1c-h), there were no differences in the relative proportions of cell types in the PBMC lymphocyte samples (Fig. 4). Previous studies investigating potential differences in the relative proportions of general lymphocyte types in ME/CFS patients have produced incongruent results which have ranged from an increased proportion of CD8+ T cells (Klimas et al., 1990) to no significant differences7, 64 These observations suggest that alterations of general lymphocyte type proportions may not be a characteristic feature of ME/CFS. Rather, abnormalities in immune system functioning associated with ME/CFS appear to involve alterations in the activity and abundance of specific sub-populations5,6,65.
Genetic associations with ME/CFS
None of the more than 2 million variable SNP loci targeted in this study had a significant association (α = 0.05) with ME/CFS after p-value corrections with Bonferroni, Holm, Benjamini and Hochberg, or permutation methods when data from both sexes were analysed together. This result was consistent across all the χ2 and logistic regression tests (Figs. 5a-b summarize the results of the simple χ2 genotypic test as a representative example). Because of the known increased prevalence of ME/CFS in females21–23, we performed independent analyses of data from females only. These analyses revealed a significant association (χ2 genotypic test, permutation-corrected p-value = 0.0374, OR = 0.1845, 1/OR = 5.42) of one SNP (rs11712777, chr3:42347678) with the ME/CFS disease phenotype.
These results are contrasting to previous genotype association analyses in ME/CFS populations, which have found statistically significant associations in multiple loci. The earliest study by Smith et al. (2011) evaluated 116,204 SNPs (n=40 CFS, n=40 non-ME/CFS) using the Affymetrix GeneChip Mapping 100K array, and found 65 SNPs associated with ME/CFS (p<0.001). Rajeevan (2015) used the Affymetrix Immune and Inflammation Chip to focus on ~11,000 SNPs located in genes involved in immune and inflammation pathways (n=121 ME/CFS, n=50 non-ME/CFS). Of these, 32 were associated with ME/CFS (p<0.05). Most recently, Schlauch et al. (2016) evaluated 906,600 SNPs with the Affymetrix Genome-Wide SNP Array 6.0 (n=42 ME/CFS, n=38 non-ME/CFS) and found 442 SNPs that were associated with ME/CFS (P<3.3×10-5). The SNP that we found in significant association with ME/CFS in females, rs11712777, was not included in any of these datasets. One SNP in the Schlauch et al. (2016) data, rs1468604, is in linkage disequilibrium (LD) with rs11712777 (r2 = 0.8716; European population). The apparent discrepancy could be explained by the imperfect linkage between the two SNPs, and therefore we recommend rs11712777 as a candidate for direct genotyping in future studies.
There are no other overlaps in the SNPs or genes associated with ME/CFS between this study and previous genetic association studies. This observation may be confounded by a combination of multiple factors, including: 1) Differences in the types of arrays utilized in each study (our study, with the largest genetic coverage to date, evaluated two-orders of magnitude more SNPs than the Rajeevan (2015) study); 2) Differences among cohorts due to the wide heterogeneity of ME/CFS; 3) Reduced statistical power to discriminate the effects of multiple small-effect variants due to relatively small sample sizes; and 4) Interactions with environmental and epigenetic factors. Additional larger-scale genome-wide association studies with overlapping SNP probes and larger sample sizes will further our understanding of the interaction between genetic factors and ME/CFS.
Generalized gene-set analysis of the SNP data in MAGMA did not identify any gene-set significantly enriched in either our data, or the data from Schlauch et al. (2016).
Characterisation of SNP rs11712777
We used a variety of online reference resources to characterize the current knowledge of rs11712777, and how it may influence ME/CFS phenotype (see Methods section). We also examined SNPs in high LD with SNP rs11712777 (R2 ≥0.8; Table 1). The Genotype-Tissue-Expression database (GTEx) indicates that rs11712777, and the genes in LD with it, form an expression quantitative trail loci (eQTL) altering the expression of the CCK (cholecystokinin peptide hormone) gene. CCK has a number of active forms, expressed in a variety of tissues, including the blood, intestine and blood66, and plays a role in appetite, body weight and the immune system66, 67 A rat knockout (KO) of the cholecystokinin B receptor (CCKBR) shows attenuated sickness behaviour68. This sickness behaviour in rats has remarkable similarity to some of the symptoms of ME/CFS69, including fatigue, malaise, hyperalgesia, sleepiness, anhedonia, weight loss and diminished activity69. CCK is also co-localized with sleep-promoting preoptic neurons in the hypothalamus70, which may relate to fatigue and unrefreshing sleep symptoms in ME/CFS. Finally, recent evidence suggests that CCK has a role regulating the differentiation of CD4+ T-cells71, and that CCK-expressing neurons are a critical cellular component of the hypothalamic-pituitary-adrenal axis72. These roles of CCK in components of the immune system are consistent with suggested immune dysregulation in ME/CFS3–8,14 While CCK-associated variant rs11712777 may be a biologically relevant candidate influencing susceptibility to ME/CFS, our findings suggest that it only accounts for a small fraction of the risk (OR = 0.1845). However, it constitutes a relevant target for future research.
In addition to rs11712777, SNP rs17223780 (R2 = 0.8799) binds DNase in CD14+ monocytes (http://www.regulomedb.org/snp/chr3/42363368), indicating a possible regulatory role in the immune system. Another SNP in the vicinity of rs11712777 (D’ = 0.7211, R2 = 0. 0126), rs33449 (chr3:42400801), is associated with increased daytime resting duration (http://www.ebi.ac.uk/gwas/search?query=3:42347678-42372207;73). This is a phenotype that may be related to the fatigue aspect of ME/CFS.
Epigenetic associations with ME/CFS
Of the 467,971 CpG loci analysed, 141 had significant associations with the ME/CFS phenotype (raw p-value < 0.05) and a mean percentage methylation difference between cases and controls greater than 5% when data from both sexes were analysed together (Fig. 5c). None of these differentially methylated loci were significant after FDR corrections, however 133 had significant empirical p-values < 0.05 calculated through permutation analyses (these are referred to as differentially methylated probes -DMPs; Supplementary Table S1). Analyses of methylation data from females alone indicated that 108 CpG loci had significant associations with the ME/CFS phenotype (raw p-value ≤ 0.05) and a mean percentage methylation difference between cases and controls greater than 5%. None of these differentially methylated loci were significant after FDR corrections, however 94 DMPs had significant empirical p-values ≤ 0.05 after permutation analyses (Supplementary Table S2). Out of these 94 DMPs, 29 were common to the DMPs found when analysing the data from both sexes combined.
Approximately half of the DMP were clustered in differentially methylated regions (DMRs). We found 17 DMRs with significant association with the ME/CFS phenotype (p-value < 0.05) when data from both sexes were analysed together (Supplementary Table S3). There were 22 DMRs when only females were considered (Supplementary Table S4). All of these regions were located nearby genes. 5 DMRs were found upstream of genes, 3 in promoters, 3 overlapping the 5’ end and 1 the 3’ end of genes, 10 inside introns, and 3 downstream of genes. DMR length, in terms of number of CpG loci ranged between 2 and 10 (average 2.72). 7 DMRs containing CpG loci identified as DMPs were detected in common between analyses of methylation data from both sexes and from females only (Figure 6).
These results are in contrast with previous findings by our group, which revealed thousands to tens of thousands of differentially methylated CpG loci associated with ME/CFS in PBMCs, using the same 450K array14, 18. It is possible that differences between the targeted cell populations (i.e. PBMCs vs. isolated T-cells) may have contributed to the differences in the number of differentially methylated CpGs. The number of cell types within PBMCs may broaden the spectrum of epigenetic marks and thus increase the number of possible associations with the ME/CFS disease phenotype. Consistent with this idea, Brenu et al (2014) found 120 differentially methylated CpGs associated with ME/CFS in CD4+ T-cells (p<0.001) using the 450K array (n=25 ME/CFS, n=18 non-ME/CFS). This number of differentially methylated CpGs is similar to the 133 DMPs we found in this study, which targeted a broader T-cell population (including CD4+ and CD4-T-cells). However, the only overlap between our study and the study from Brenu et al (2014) corresponded to the HLA-DQB1 (major histocompatibility complex, class II, DQ beta 1) gene. HLA-DQB1 encodes a protein that is part of the DQ heterodimer, a cell surface receptor that is essential in immune signalling. We found two contiguous differentially methylated regions within an intron of this gene (Supplementary Tables S1 and S3, Figure 6). One region was hypermethylated whereas the other was hypomethylated in the ME/CFS group. Interestingly, the gene HLA-DQB1 contained cis-mQTLs significantly associated with these two DMRs (see next section, as well as Supplementary Table S5, and Figure 6). Brenu et al. (2014) found CpG hypermethylation associated with the HLA-DQB1 gene, however the specific location of this association was not reported. Recent studies focusing on CD4+ T-cells of patients affected by immune disorders such as rheumatoid arthritis74 and multiple sclerosis75 have found differential methylation in HLA-DQB1. This result is consistent with a potential immune dysregulation in ME/CFS.
We found 31 genes associated with DMPs in T-cells that were common to this study and a previous study by our group14 These genes, which include PAX8 (paired box 8), and ATP4B (ATPase H+/K+ transporting beta subunit) (Supplementary Table S1), are involved in the regulation of cellular processes and cell signaling. This is in line with recent ME/CFS work that observed differences in cellular metabolism in ME/CFS76–78.
Our results suggest that DNA methylation modifications in T-cells in ME/CFS are associated with the cellular metabolism differences that are observed in the disease and may play a role in the development of these phenotypic differences, however future work is required to understand this relationship.
Genetic and epigenetic interactions associated with ME/CFS
All the DMPs identified according to empirical p-values had significant associations (FDR corrected p-values < 0.05) with SNP genotypes (independent of disease phenotype). In total there were 13,060 significant cis-mQTLs (Supplementary Table S5). Figure 6 shows the strongest SNP-DMPs cis-mQTLs associations (according to correlation coefficient R2) in each of the 7 DMP-containing DMRs that were common in analyses of methylation data from both sexes and from females only.
SPATC1L (spermatogenesis and centriole associated 1 like) and DUSP22 (dual specificity phosphatase 22) were the two genes containing cis-mQTLs with the largest differentially methylated regions: 11 DMPs (7 hypermethylated probes in 5’ UTR and 4 hypomethylated probes in 3’ UTR region) in SPATC1L and 10 hypermethylated probes in the 5’ UTR of DUSP22 (Supplementary Table S3 and Figure 6). While the exact function of SPATC1L is not well understood, it has been previously associated with xenobiotic response and differential methylation in the promoter of this gene is characteristic of certain ethnic groups in human populations79. DUSP22 hypermethylation has also been observed in the 5’ UTR region in T-cells of rheumatoid arthritis patients80. In T-cells, DUSP22 is known to inhibit proliferation and autoimmunity through inactivating Lck and preventing the activation of the T-cell receptor81. However, it remains to be confirmed how hypermethylation in the 5’ UTR region affects the overall activity of DUSP22 in T-cells of ME/CFS patients.
These results suggest that ME/CFS patients have differential methylation patterns in T-cells that are strongly correlated with the underlying genotype. Understanding the mechanisms of these interactions is a promising direction of research in ME/CFS.
Conclusions
We identified over one hundred differentially methylated CpG loci associated with ME/CFS in T lymphocytes. Approximately half of these were clustered in differentially methylated regions of 500bp in size or less. Our data and analyses suggest that there is an indirect role of genotype influencing DNA methylation patterns associated with ME/CFS. We found no substantial large-effect direct associations of specific genotypes with ME/CFS disease phenotype. Larger scale genome wide association studies are necessary to test for potential small-effect associations between genotype and ME/CFS phenotype.
All of the methylation values at differentially methylated loci in T lymphocytes had significant correlations with specific genotypes at neighboring SNPs (within a window of 1 Mbp), indicating that particular genetic backgrounds may influence methylation levels differently in ME/CFS patients than in controls. The genomic elements associated with genetic and epigenetic variants characteristic of ME/CFS patients in this study constitute targets for future research. Understanding the molecular mechanisms of genetic-epigenetic interactions of these targets will be key to develop new treatments for ME/CFS, and can serve as a model to understand the molecular basis of related complex diseases.
Author contributions
P.O.M. designed research; S.H. performed research; S.H. analyzed the data with contributions from W.C.D.V. and D.A.; S.D.V. contributed reagents/materials; S.H. wrote the manuscript with contributions from W.C.D.V., P.O.M., D.A. and S.D.V. All authors read and approved the final manuscript.
Accession Codes
SNP data will be made available through the NCBI dbSNP database, and methylation data through the NCBI GEO database, upon acceptance for publication.
Competing financial interests
The author(s) declare no competing financial interests.
Acknowledgements
Funding for this research was provided by operating funds from the Solve ME/CFS Initiative. S.H. was supported by a CIHR Fellowship – Priority Announcement: Myalgic Encephalomyelitis/CFS/FM award number FRN 141047. We would like to thank Dr. Benjamin Hing for helpful discussions.
References
- 1.↵
- 2.↵
- 3.↵
- 4.
- 5.↵
- 6.↵
- 7.↵
- 8.↵
- 9.
- 10.
- 11.
- 12.
- 13.
- 14.↵
- 15.↵
- 16.
- 17.↵
- 18.↵
- 19.↵
- 20.↵
- 21.↵
- 22.
- 23.↵
- 24.↵
- 25.↵
- 26.↵
- 27.↵
- 28.↵
- 29.↵
- 30.↵
- 31.↵
- 32.↵
- 33.↵
- 34.↵
- 35.↵
- 36.↵
- 37.↵
- 38.↵
- 39.↵
- 40.↵
- 41.↵
- 42.↵
- 43.↵
- 44.↵
- 45.↵
- 46.↵
- 47.↵
- 48.↵
- 49.↵
- 50.↵
- 51.↵
- 52.↵
- 53.↵
- 54.↵
- 55.↵
- 56.↵
- 57.↵
- 58.↵
- 59.↵
- 60.↵
- 61.↵
- 62.↵
- 63.↵
- 64.↵
- 65.↵
- 66.↵
- 67.↵
- 68.↵
- 69.↵
- 70.↵
- 71.↵
- 72.↵
- 73.↵
- 74.↵
- 75.↵
- 76.↵
- 77.
- 78.↵
- 79.↵
- 80.↵
- 81.↵