Abstract
The epigenome deteriorates with age, potentially impacting on ageing-related disease.
Here, we interrogate the DNA methylation state of the genomic loci of human tRNA. Whilst arising from only ~46kb (<0.002% genome), this information transfer machinery is the second most abundant cellular transcript. tRNAs also control metabolic processes known to affect ageing, through core translational and additional regulatory roles.
We identified a genomic enrichment for age-related DNA hypermethylation at tRNA loci. Analysis in 4,350 MeDIP-seq peripheral-blood DNA methylomes (16-82 years), classified 44 and 21 hypermethylating specific tRNAs at study- and genome-wide significance, respectively, contrasting with 0 hypomethylating. Validation and replication (450k array & independent targeted Bisuphite-sequencing) supported thehypermethylation of this functional unit. The strongest consistent signals, also independent of major cell-type change, occur in tRNA-iMet-CAT-1-4 and tRNA-Ser-AGA-2-6.
This study is this first comprehensive evaluation of the genomic DNA methylation state of human tRNA genes and reveals a discreet hypermethylation with advancing age.
Introduction
Ageing is implicated as a risk factor in multiple chronic diseases [1]. Understanding how the ageing process leads to deteriorating biological function is now a major research focus, with hopes to increase the human ‘healthspan’ and ameliorate the extensive physical, social and economic costs of these ageing-related disorders [2]. Epigenetic processes, which influence or can inform us about cell-type specific gene expression, are altered with age and are, furthermore, one of the fundamental hallmarks of this progression [3,4].
DNA methylation (DNAm) is the most common epigenetic modification of DNA and age-associated changes in this mark in mammalian tissues have been recognised for decades [5]. In fact, these alterations in DNAm with age are extensive with thousands of loci across the genome affected. Many represent ‘drift’ arising from the imperfect maintenance of the epigenetic state [6]. However, specific genomic regions show distinct directional changes, with loss of DNA methylation in repetitive or transposable elements [7], as well as gains in certain promoters, including the targets of polycomb repressor complex [8] and bivalent domains [9]. These observations with the advent of high-throughput DNAm arrays also permitted the identification of individual CpG sites that exhibit consistent changes with age, enabling the construction of predictors of chronological age known as epigenetic or DNAm ‘clocks’ [10–13]. Additionally, it was observed that ‘acceleration’ of this DNAm-derived measure is a biomarker of ‘biological’ ageing due to associations with morbidity and mortality (Reviewed in [14] & [15]). In a previous investigation of ageing-related DNAm changes within common disease-associated GWAS regions, we identified hypermethylation of the specific transfer RNA gene, tRNA-iMet-CAT-1-4 [16]. The initiator methionine tRNA possesses certain unique properties, including its capacity to be rate limiting for translation [17], association with the translation initiation factor eIF2 [18], and ability to impact the expression of other tRNA genes [19].
tRNAs are evolutionarily ancient [20] and fundamental in the translation process for all domains of life. This translation machinery and the regulation of protein synthesis are controlled by conserved signalling pathways shown to be modifiable in longevity and ageing interventions [21]. Additionally, beyond their core role in the information flow from DNA to protein sequence, tRNAs can fragment into numerous tRNA-derived small RNAs (tsRNAs) [22] with signalling and regulatory functions [23–26].
When liberally considering the 610 loci annotated as tRNA genes, as well as tRNA pseudo genes, nuclear encoded mitochondrial tRNA genes and possibly some closely related repetitive sequences, these features (gtRNAdb [27]) cover <46 kb (including introns) which represents <0.002% of the human genome [28]. Yet a subset of these genes produce the second most abundant RNA species next to ribosomal RNA [29] and are required for the production of all proteins. tRNA genes are transcribed by RNA polymerase III (polIII) [30] and have internal type II polIII promoters [31]. DNAm is able to repress the expression of tRNA genes experimentally [32] but may also represent co-ordination with the local repressive chromatin state [33]. Transcription is also repressed by the highly conserved polIII specific transcription factor Maf1 [34,35], who’s activity is modulated by the Target of Rapamycin Kinase Complex 1 (TORC1) [36]. TORC1 is a highly conserved hub for signals that modulate ageing [37].
tRNAs as well as tsRNAs are integral to the regulation of protein synthesis and stress response, two processes known to be major modulators of ageing. Metabolic processes are also recognised to modulate the age estimates of DNAm clocks [38]. Partial inhibition of translation increases lifespan in multiple model organisms [39] and PolIII inhibition increases longevity acting downstream of TORC1 [40]. Furthermore, certain tsRNAs circulating in serum can be modulated by ageing and caloric restriction [41].
We directly investigated ageing-related changes in the epigenetic DNA methylation state of the entire tRNA gene set, facilitated by the availability of a large-scale MeDIP-seq dataset. Arrays poorly cover this portion of genome, with even the latest EPIC (850k) arrays only covering <15% of the tRNA genes, with robust probes, and in total only ~4.7% of all the tRNA gene CpGs [42]. tRNA genes sit at the heart not only of the core biological process of translation but at a nexus of signalling networks operating in several different paradigms, from small RNA signalling to large scale chromatin organisation [43]. In summary, tRNA biology, protein synthesis, nutrient sensing, stress response and ageing are all intimately interlinked. In this study, we have identified tRNA gene DNA hypermethylation and independently replicated this newly described ageing-related observation.
Results
DNA Methylation of Specific tRNA Gene Loci Changes with Age
Due to tRNAs critical role in translation and evidence of their modulation in ageing and longevity-related pathways, we interrogated these genes for evidence of ageing-related epigenetic changes. Our discovery set was a large-scale peripheral blood-derived DNA methylome dataset comprising of 4350 samples. This sequencing-based dataset had been generated by Methylated DNA Immunoprecipitation (MeDIP-seq) [44], which relies on the enrichment of methylated fragments of 200-500 bp to give a regional DNAm assessment (500 bp semi-overlapping windows, see Methods). There are 416 high confidence tRNA genes in the human genome, we initially examined an expanded set of 610 tRNA-like features identified by tRNAscan including pseudogenes, possible repetitive elements, and nuclear encoded mitochondrial tRNAs [27,45]. Of these 492 were autosomal and did not reside in blacklisted regions of the genome [46]. Due to the small size of these tRNAs (60-86bp, median 73bp, excluding introns which are present in ~30 tRNAs with sizes from 10-99bp, median 19bp), this fragment-based method enabled a robust examination of the epigenetic state of these highly similar sequences. This was supported by a mappability assessment. The median mappability score density for all tRNA genes was 0.90 for 50mers when considering tRNA genes ±500bp reflecting the regional nature of the MeDIP-seq assay. In contrast the 50mer mappability density is 0.68 for tRNA genes alone, which would be representative of the mappability of reads generated using a technique such as whole-genome bisulfite sequencing (see Figures S1 & S2).
We identified 21 genome-wide significant and 44 study-wide significant results (p < 4.34 × 10−9 and 8.36 × 10−5, respectively, via linear regression (n=4350, see Methods). Study-wide significance was calculated conservatively with the Bonferroni method for all 598 autosomal tRNAs and closely related features. There was a strong directional trend with all results at both significance levels being due to increases in DNA methylation. Age-related changes in cell type proportion are strong in heterogeneous peripheral blood, and include a myeloid skew, loss of naive T cells and increases in senescent cells [47]. A subset of 3 genome-wide and 16 study-wide significant hypermethylation results remained significant even after correcting for potential cell-type changes by including lymphocytes, monocytes, neutrophils and eosinophil cell count data (n=3001, Listed in Figure 1, see Figure S3). Going forward we refer to these cell-type corrected sets of 3 and 16 tRNA genes as the genome-wide and study-wide significant tRNA genes respectively.
Due to the related nature of these twin samples, we also analysed these data in two subsets of n = 1198 & 1206 by selecting one twin from each pair into the separate sets. This analysis also included correction for Blood Cell counts. Even in these smaller sets, 5 and 7 tRNA genes, respectively, reached study-wide significance. In these sets 5/5 and 6/7 of these were present in 16 study-wide significant tRNA genes.
Furthermore, we examined a subset of samples with longitudinal data (n=658 methylomes from 329 individuals, median age difference 7.6 yrs). At the nominal significance threshold (p < 0.05) this yielded a split of 41 hypermethylating tRNA genes and 22 hypomethylating tRNA genes. Of these hypermethylated tRNAs, 2 are in the previously identified genome-wide significant set of 3 (with tRNA-iMet-CAT-1-4 ranked 3rd by p-value) and 9 are in the study-wide significant set of 16. Full results for each age model are provided in Supplementary File S1
tRNA Genes are Enriched for Age Related DNA Hypermethylation
Whilst ageing changes are pervasive throughout the DNA methylome, we identified a strong enrichment for this to occur within the tRNA genes (Fisher’s Exact Test p = 1.05 × 10−27) (see Figure 2 b). This is still significant if the 6 of the study-wide significant 16 tRNAs that have any overlap with polycomb or bivalent regions are excluded (p = 4.66 × 10−15)
CpG density itself is known to have a clear impact on the potential for variability of the DNA methylome as well as ageing-related changes [48,49]. To assess whether this hypermethylation finding was being merely driven by the inherent CpG density of the tRNA genes, we performed a CpG density matched permutation analysis (1,000X, see Methods). This supported the specific nature of these age-related DNAm changes within the functional tRNA genes (Empirical p-value < 1.0 × 10−3, Figure 2 a). As a point of comparison for this genomic functional unit, we also performed the same permutations for the known age-related changes in the promoters of genes that are polycomb group targets [8] and those with a bivalent chromatin state [9]. We were able to reproduce the enrichment of the polycomb group targets and bivalent regions (Empirical p-value < 1.0 × 10−3) in our dataset.
tRNA-iMet-CAT-1-4 (Figure 3a) is located in the largest tRNA gene cluster in the human genome at chr6p22.2-1. This cluster contains 157 tRNA genes spanning the 2.67Mb from tRNA-iMet-CAT-1-2 to tRNA-Leu-AAG-3-1, and also hosts a histone gene microcluster. tRNA-Ile-AAT-4-1 and tRNA-Ser-AGA-2-6 are neighbours and are located on chromosome 17 (Figure 3b). Notably tRNA-Ile-AAT-4-1 and tRNA-Ser-AGA-2-6 have a third close neighbour tRNA-Thr-AGT-1-2 which does not show significant age-related hypermethylation. We observe a similar pattern of sharp peaks of significance closely localised around the other loci in the study-wide significant set. GENCODE 19 places tRNA-Ile-AAT-4-1 in the 3’UTR of a Nonsense-mediated decay transcript of CTC1 (CST Telomere Replication Complex Component 1, CTC1-002, ENST00000449476.2) and not of its primary transcript.
To place these hypermethylating tRNA genes in their genomic context we examined how the extended set of 44 non-blood cell-type corrected study-wide significant tRNAs clustered with other tRNA genes. We clustered the tRNA genes by grouping together all tRNAs within 5Mb of one another and then required that a cluster contain at least 5 tRNA genes with a density of at least 5 tRNA genes per Mb (see Methods). This yielded 12 major tRNA gene clusters containing a total of 353 tRNA genes, 42 of the 44 study-wide significant tRNA genes within these clusters (Figure 4a). The hypermethylating tRNA genes are spread evenly among these clusters proportionately to their size (% ageing tRNA per cluster; non-significant one-way ANOVA).
Age-related tRNA gene set DNA Hypermethylation is even observed in one Newborn versus one Centenarian
We examined an available Whole Genome Bisulphite sequencing (WGBS) dataset from Heyn et al. [50] (see Methods) These data consisted of blood-derived DNA WGBS in one newborn child and one 26 year old, and centenarian (103 years). In their analysis, the centenarian was found to have more hypomethylated CpGs than the neonate across all genomic compartments, including promoters, exonic, intronic, and intergenic regions. However, even in this examination of 3 individuals of 3 different age in the 55% of tRNA that possessed coverage, we observed DNA hypermethylation with age among the study-wide significant hypermethylating tRNA genes. The centenarian was significantly more methylated in this set of tRNAs than the neonate (Wilcoxon rank sum test, 6.14% increase (95% CI -Inf - 4.31), W = 717, p = 6.14 × 10−4, see Figure 4b).
Age-related Changes Independently Replicated with Targeted Bisulfite Sequencing
In order to further robustly support these-ageing related changes, we attempted to replicate these findings ourselves in an independent ageing dataset. Furthermore, we employed a different technology targeted bisulfite (BiS) sequencing to also further validate the MeDIP-seq-derived results. These data provide individual CpG resolution to identify what may be driving the regional DNAm changes observed, and precise quantitation of the magnitude of change.
We performed this targeted BiS-seq in blood-derived DNA from 8 pools of age-matched individuals at 4 time-points (~4, ~28, ~63, ~78 years) from a total of 190 individuals, as detailed in Table 1. This approach permitted us to assay tRNA gene DNA methylation across a large number of individuals whilst requiring a minimal amount of DNA from each (80-100ng), and costing ~1/24th as much as performing sequencing individually. A total of 79 tRNA loci generated reliable results post-QC (see Methods). These tRNAs covered a total of 458 CpGs with a median of 6 CpGs per tRNA (range 1-9). Median Coverage per site across pools, technical replicates and batches was 679 reads (mean 5902).
Firstly, we ran the 8 Pooled samples on the Illumina EPIC (850k) array to confirm that our pooling approach was applicable for DNAm ageing-related evaluation. This showed an R2 = 0.98 between pool mean chronological age and Horvath clock DNAm predicted age [11](see Figure 5c). Therefore, this confirmed the utility of our pooling approach. We also used these array derived data to estimate the major blood cell proportions for each of these pools with the Houseman algorithm [51].
We noted that individual tRNA loci exhibiting age-related changes in DNAm had duplicate or isodecoder (same anticodon but body sequence variation) sequences in the genome, which despite exact or near sequence identity did not show similar changes. tRNA-iMet-CAT-1-4 for instance is 1 of 8 identical copies in the genome and was the only locus that showed significant changes. The results of pairwise differential methylation tests between age groups for the 6 top tRNAs from the MeDIP-seq models are listed in Table 2. tRNA-iMet-CAT-1-4 shows a pairwise increase of 3.7% from age 4 years to age 78 years.
Of the 3 top hits in MeDIP-seq, tRNA-iMet-CAT-1-4 (Figure 5a(iii)) and tRNA-Ser-AGA-2-6 (Figure 5a(viii)) exceeded nominal significance (p-values = 9.35 × 10−4 & 4.28 × 10−2, respectively). However, tRNA-Ile-AAT-4-1 (Figure 5a(xi)) showed a nominal decrease in DNAm with age. tRNA tRNA-Leu-TAG-2-1 from the study-wide significant set also showed nominally significant hypermethylation with age (Figure 5a(xviii)). Also, four of the individual CpGs in tRNA-iMet-CAT-1-4 exhibited nominally significant increases in DNAm with Age (Figure 5b).
Select Duplicates & Isodecoders of Hypermethylating tRNA loci remain unchanged
We targeted a selection of these duplicate and isodecoder loci for bisulfite sequencing in order to confirm that the identified DNAm changes are specific to a given locus and not general to related tRNAs. Examining the tRNA-iMet-CAT-1 family, only the previously identified 1-4 version confirmed significant hypermethylation (not 1-2, 1-3 or 1-5)(Figure 5a(i-iv)). Likewise the tRNA-Ser-AGA-2-6 version was supported compared to 2-1,2-4 and 2-5(Figure 5a(v-viii))). Full Age model results are available in Supplementary File S2.
DNA methylation 450k Array Data Validates the MeDIP-seq Results
Although DNA methylation array poorly covers the tRNA gene set, we wished to attempt to see if this BiS-based but differing and well-established technology was supportive at all of our DNA hypermethylation findings. TwinsUK had available 450k array on 587 individuals, and this platform includes 143 probes, covering 103 tRNAs. All the 3 top tRNAs in the MeDIP-seq results were covered by this data set, and 7 of the 16 study-wide significant set. 9 tRNAs show significant (p < 4.58 × 10−4) increases in DNA methylation with age in models corrected for blood cell counts including all 3 of the 3 tRNAs identified in the MeDIP-seq as genome-wide significant and 5 of the 7 study-wide significant set present on the array (Figure 6). Although it should be noted that 56 of these 143 probes are within the non-robust set of Zhou et al. [42], including 1 of the genome-wide, and 1 of the study-wide results (covering tRNA-Ile-AAT-4-1 & tRNA-Val-AAC-4-1), respectively. Full age model results for all tRNA probes are provided in Supplementary File S3.
Ageing-Related tRNA Loci show increased Enhancer-Related Chromatin Signatures
We further explored the activity of the tRNA genes in public Chromatin segmentation data in blood (Epilogos Blood & T-cells set) [53,54]. This shows proportionally more Enhancer-related (Enh, EnhBiv & EnhG) chromatin states at tRNA genes hypermethylating with age than the stronger Promoter-related (TSS) in other tRNAs. (Figure 7 b). Whereas these characteristics are less frequently predominant in the rest of the tRNAs (Figure 7 b). Age-hypermethylating tRNA are enriched for enhancer chromatin states compared to the rest of the tRNA genes (Fisher’s Exact test p = 0.01).
Age Hypermethylating tRNAs are more methylated in Lymphoid than Myeloid cells
Three tRNA genes remained genome-wide significant and 16 study-wide significant following correction for major cell-type fraction. This is suggestive of either cell-type independent change or, presumably less likely, a very large effect in a minor cell-type fraction. tRNAs have exhibited tissue-specific expression [57–59] and blood cell-type populations change with age. Specifically, there is shift to favour the production of cells in the myeloid lineage [47]. These points lead us to examine tRNA gene DNAm in sorted cell populations. We used a publically available 450k array dataset [55]) that has been used in the construction of cell-type specific DNAm references for cell-type fraction prediction using the Houseman algorithm [51] (see Methods). This consists of data from 6 individuals (aged 38 ± 13.6/yrs) from seven isolated cell populations (CD4+ T cells, CD8+ T cells, CD56+ NK cells, CD19+ B cells, CD14+ monocytes, neutrophils, and eosinophils). We found that tRNA gene DNAm could separate myeloid from lymphoid lineages (Figures 7 a & S4).
Of the eight study-wide significant tRNAs with array coverage, we identified that collectively these eight are significantly more methylated in the lymphoid than the myeloid lineage (1.1% difference, Wilcoxon rank sum test p = 1.50 × 10−6 95% CI 0.7%-∞). Thus, any age related increases in myeloid cell proportion would be expected to dampen rather than exaggerate the age-related hypermethylation signal that we observed. In addition tRNA-Ile-AAT-4-1 and tRNA-Ser-AGA-2-6 have the highest variance in their DNAm of all 129 tRNAs covered in this dataset. This could represent ageing-related changes as these samples range across almost 3 decades. Another possibly may be that these loci as well as hypermethylating are also increasing their variability with age in a similar fashion to those identified by Slieker et al. [60]. In that study they identified that those loci accruing methylomic variability were associated with fundamental ageing mechanisms.
tRNA Genes also Hypermethylate with Age in Solid Tissue
Some tRNA gene expression has been shown to be highly tissue specific [57–59]. It follows that our observations of changes in DNAm with age in blood might be specific to that tissue. We used a mix of 450k and 27k array data from ‘solid tissue normal’ samples made available by TCGA (The Cancer Genome Atlas) and data from foetal tissue [61,62] downloaded from GEO (see Methods). The samples from TCGA range in age from 15-90 (n = 733). Only 43 tRNA genes had adequate data to compare across tissues in this dataset and 115 in the foetal tissue data.
Only 2 of the 3 tRNA genes we identified as genome-wide significant and a further 1 of the study-wide significant tRNA genes are present in the set of 45 tRNA genes in the TCGA data, thus limiting our ability to draw conclusions about the tissue specificity of our results. Solid tissue samples have a strong preponderance for low levels of methylation consistent with the active transcription of many tRNA genes and show slight increasing methylation with age but age accounts for very little of the variance (linear regression slope estimate = 1.52; R2 = 0.0002; p-value 1.34 × 10−3 (Figure S5d). In a pan-tissue analysis we found that 10 tRNA genes showed changes in DNAm with age, 9 of which were hypermethylation (p-value < 1.1 × 10−3). One of these tRNA genes, tRNA-Ser-TGA-2-1 was also present in study-wide significant set of tRNA genes. Figures S6 & S7 illustrate minimal tissue specific differences. Interestingly, however, tRNA-iMet-CAT-1-4 and tRNA-Ser-AGA-2-6 appeared more variable in methylation state than many other tRNAs in the TCGA normal tissue samples (Figure S6) and indeed have the highest variance in DNA methylation across tissues (Figure S5c).
Mice also show age-related tRNA gene DNA hypermethylation
We examined the DNA methylation of the mouse tRNA genes in using data from a reduced representation bisulfite sequencing (RRBS) experiment performed by Petkovich et al. [63]. These data from 152 mice covered 51 tRNA genes and 385 CpGs after QC (see Methods), representing ~11% of mouse tRNA genes. The mice ranged in age from 0.67-35 months.
Three of the 51 tRNAs showed Bonferroni significant DNA methylation changes with age (p-value < 1.08 × 10−4) and all were in the hypermethylation direction. These three are tRNA-Asp-GTC-1-12, tRNA-Ile-AAT-1-4, tRNA-Glu-TTC-1-3 (Figure 8). Full Age model Results are available in Supplementary File S4.
In order to investigate the mouse results further we made use of data from Thornlow et al. [64], which had established tRNA ortholog sets for 29 mammalian species. They identified 197 mouse tRNAs as having direct human orthologs with 44 of these included in the mouse results from Petkovich et al. [63]. However, unfortunately, although 2 of the top 3 tRNAs (tRNA-Ser-AGA-2-6 & tRNA-Ile-AAT-4-1) have mouse orthologs (tRNA-Ser-AGA-2-2 & tRNA-Ile-AAT-1-1), they are not covered in these mouse data. Furthermore, none of the tRNAs showing significant hypermethylation in mice (at nominal p<0.05) had human orthologs in our MeDIP-seq study-wide significant hypermethylating set. Therefore, whilst we cannot directly compare specific tRNA loci due to this lack of coverage, it is interesting that the small number of significant tRNA genes in the mouse data also hypermethylate with age.
Discussion
Our work has identified a previously unknown enrichment for age-related epigenetic changes within the tRNA genes of the human genome. This observation was strongly directional with increasing DNA methylation with age [65]. The MeDIP-seq dataset employed brought advantages in exploring this undefined terrain of the human tRNA genes. Firstly, being genome-wide it provides much increased access, as these regions are poorly covered by current arrays. Secondly, being a fragment-based regional assessment of DNA methylation, the individual but highly similar small tRNA genes can be surrounded by unique sequence.
We determined by genome-wide permutation that this strong hypermethylation signal was specific to the tRNA genes, and not merely driven by the underlying CpG density of these loci. A targeted BiS-seq experiment validated the defined nature of the tRNA change in an independent dataset, with a successful pooling approach, which may also be useful for other ageing-related targeted DNA methylome evaluations. Additionally, we gained support for our results from limited DNA methylation array data.
Whilst the changes in DNA methylation we have observed are relatively small (i.e. 3.7% between 4-year to 78-year pools in tRNA-iMet-CAT-1-4), this is consistent with the effect size seen in the majority of positions differentially methylatyed with age in many other studies [60,66], except for the noted extreme outliers, such as EVOLV2 [67]. Furthermore, effect size cut-offs are observed to remove a large fraction of true age-DMPs [68].
We subsequently explored further what was driving this age-related phenomenon and its possible biological implications. As this result was observed in peripheral blood, we were well aware that we were examining DNA derived from a heterogeneous cell type population [69]. Moreover, that there are well known age-related proportional changes in peripheral blood cell composition [47]. The TwinsUK MeDIP-seq and 450k array DNA methylation data included measured haematological values. Therefore, we adjusted for major cell type effects, such as a myeloid skew, and distinct tRNAs were still significant. Although, a caveat to our study is that this can not exclude changes in minor specific sub-cell fractions types. However, that these age-related effects were strong enough to be observed in both a regional MeDIP-seq assessment and a pooled sequencing approach, implies that they not extremely subtle. We examined age-related tRNA gene DNA methylation changes in the limited subset of mouse tRNA genes covered in publicly available RRBS data (~13%) and were able to identify tRNAs exhibiting DNA hypermethylation with age in this set. This suggests that age-related tRNA gene hypermethylation may not be unique to humans, but at least observed across mammals.
Due to the high number of hypermethylating tRNAs prior to cell-type correction, we were also curious whether the epigenetic state of this small tRNA gene fraction of genome could capture and in fact be a defined fingerprint of cell type. We found that tRNA gene DNA methylation could separate myeloid from lymphoid lineages. There was also some suggestion of more fine-grained blood cell-type signatures in tRNA DNAm, such as the separation of CD19+ B cells from CD4/8+ T cells. Ageing is also known to lead to an increase in senescent cells (e.g. CD8+ CD28-cells). Whether these epigenetic changes in the tRNA genes uniquely represent these cell-types will require technical advances to enable future single cell DNA methylome analysis to accurately assess these regions. If further supported, the epigenetic state of these tRNA loci may aid the taxonomy of cell-type definition.
We observed a predominantly unmethylated state across fetal (Figure S7) and adult tissues (Figure S6) at tRNA gene loci, consistent with the high rate of transcription at many tRNA genes. We suspect that the tRNA genes largely remain unmethylated through development and that the moderate increases in DNAm that we are observing with age at these loci are being driven by changes arising primarily in older individuals. Distinct biological changes have been observed recently in aged individuals [70,71]. Our data are supportive of the tRNA gene set representing a distinct functional unit, which lacks the logarithm change in the young observed strongly in CpGs employed in ‘clocks’ [13]. This would also be consistent with the lack of significant differences in the tRNA loci detected between the neonate and the 26-year-old adult in the Heyn et al. data. This low baseline DNA methylation of the tRNA genes may also explain why we predominantly observe hypermethylation with age. Whether this is driven by mechanisms, such as aberrant DNA methylation targeting of the tRNA loci or specific sub-celltype effects with age, will require further experimental investigation.
This signal within the tRNA families was observed to occur at specific Isodecoders. Isodecoders expand in number with organismal complexity and the high prevalence in mammals has been suggested due to their additional regulatory functionality [72,73]. They also have distinct translational efficiency [74], which can also have consequences in human disease [75]. After correcting for major cell types, we identified 2 tRNA genes tRNA-iMet-CAT-1-4 and tRNA-Ser-AGA-2-6 which had the most consistent hypermethylation across 3 different assays. Regarding the inconsistent tRNA-Ile-AAT-4-1 result, it is covered by a MeDIP-seq 500bp window which exhibited genome-wide significant hypermethylation, but also partially overlaps tRNA-Ser-AGA-2-6 (Figure 3B). Whilst the 450k array probe overlapping tRNA-Ile-AAT-4-1 (cg06382303) appears to replicate this hypermethylation, it is a borderline case for exclusion flagged by Zhou et al. [42] due to non-unique 30bp 3’ subsequence. In the targeted Bisulfite sequencing data, tRNA-Ile-AAT-4-1 exhibited a loss of methylation. Therefore, this may suggest that the hypermethylation signal observed at this locus in the MeDIP-seq data could have been ‘pulled up’ by the neighbouring tRNA-Ser-AGA-2-6 hypermethylation signal. Of the 16 study-wide significant tRNA genes, only these two of these have a shared significant window, furthermore, in the expanded set of 44 only tRNA-Thr-AGT-1-2 could be similarly affected.
Furthermore, there is great complexity to the fragmentation of tRNA [23], with physiological processes such as stress shown to induce fragment production [76]. These resultant tsRNAs can feedback on protein synthesis by regulating ribosome biogenesis [77] and others have diverse regulatory functions such as targeting transposable element transcripts [78]. They are also observed to circulate in the blood in a cell-free fashion, and fragment levels can be modulated by ageing and calorie restriction [41]. The isodecoder specific nature of our findings frame a possible hypothesis for regulatory change with age and future work will be required to unravel this potential. tsRNA abundance has been associated with locus specific tRNA gene expression, in some cases independent of mature tRNA levels [79]. This has important implications for the interpretation of our results given the multi-copy nature of genes like tRNA-iMet-CAT-1-4, as even if expression levels of mature iMet tRNAs are unaffected by changes in one copy’s DNA methylation, these changes could still influence the levels of particular tsRNAs derived from specific tRNA loci.
The location of tRNA-Ser-AGA-2-6 and tRNA-Ile-AAT-4-1 immediately downstream of CTC1 and of tRNA-Ile-AAT-4-1 within the 3’UTR of an alternate isoform of CTC1, which undergoes nonsense-mediated decay, raises the possibility that the gene body epigenetic regulation of CTC1 may impact on the state of these tRNA genes. CTC1’s function in telomere maintenance [80], DNA replication licensing [81], and it’s role in a rare progeroid condition [82] indicate that it has ageing-relevant biology. The possible relationship between the the regulation of CTC1 and that of the tRNA genes downstream of it warrants further study.
Whilst, the expression of the tRNA genes has long been simplified as ‘constitutive’, some observations have indicated that many tRNA genes are expressed in a tissue-specific fashion in diverse organisms [58,59]. Although others have found the majority of isodecoders are transcribed in different cell types [28]. Several transcription factors acting via TFIIIB [83] have a negative (the tumour suppressors p53 [84] and Rb [85]) or positive (the proto-oncogene c-Myc) influence [83]. Regulatory sequence in the flanking or the internal regions of tRNA genes do not explain tRNA expression variation [86]. Whilst DNAm is able to repress the expression of tRNA genes [32], the broader chromatin environment also affects tRNA transcription. Due to the co-ordinated nature of epigenomic modifications, it may also be revealing to evaluate ageing-related histone modification in these tRNA loci. Furthermore, recent data from Gerber et al [87] indicates Pol-II may also have a dynamic regulatory role in tRNA expression.
Changes in the epigenetic state of specific tRNA could be modulating transcription efficiency or even codon availability in the ageing cell. tRNA gene dosage is quite closely matched to amino acid usage frequency in the human exome. However, the transcriptome codon usage frequency and tRNA gene expression have been claimed to vary with the replicative state of cells, separating differentiated from replicating cells [88]. Others have argued that these differences are substantially explained by variation in GC content [89] and that codon usage frequencies are observed to be mostly invariant in the transcriptomes of a wide range of tissues, as well as across developmental time [86]. Although, experimental stress-related states have revealed changes with an over-representation of codons that are translated by rare tRNAs [90].
tRNA sequences themselves are under strong structural (both secondary and tertiary) [72] as well as functional constraint, which leads to an order of magnitude reduction in variation compared the background genomic mutation rate [28]. Despite strong purifying selection maintaining very low variation in tRNA gene bodies, tRNA genes are subject to high levels of transcription-associated mutagenesis (TAM) leading to elevated mutation rates over evolutionary time in their immediate flanking sequences [91]. The possible effects of genetic variation on DNA methylation mean that polymorphic tRNA could be another potential caveat to our work. Although, there is no significant population variation in, for example, tRNA iMet sequences in 1,000 Genomes data. Indeed, there are only 11 new isodecoder sequences with high confidence (tRNAscan scores ≥ 50) at >1% population frequency [28]. There is also some evidence for tRNA copy number variation at specific loci, although this remains under-characterised [92,93]. Another potential cause we considered was whether age-related somatic copy number increases could be occurring in these loci. Population or somatic copy number expansions could lead to increased methylated reads in MeDIP-seq without any epigenetic state change. However, this would not be consistent with the targeted and array BiS conversion methodologies, where the proportion of methylated to unmethylated reads would still be constant.
It is worth noting the parallels with known cancer and ageing epigenetic changes, and that tRNAs are also dysregulated in cancer [94], with proposed utility as prognostic markers [95]. Furthermore, the early replicating state of tRNA loci, potentially associated with high expression [96], may make them prone to hypermethylate, as is observed in early replicating loci in both cancer [97] and senescent cells [98]. Interestingly, tRNA gene loci may also play a role in local as well as large scale genome organisation [43]. tRNA gene clusters act as insulators [99] and have extensive long-range chromatin interactions with other tRNA gene loci [43]. The coordinated transcription of tRNAs at subnuclear foci and the B-box sequence elements bound by TFIIIC and not PolIII may represent an organising principle for 3D-chromatin by providing spatial constraints [100]. Therefore, these tRNA epigenetic changes could contribute to the structural changes that are also observed in ageing [101].
In conclusion, due to the unique challenges that make the tRNA genes difficult to examine it has remained epigenetically under-characterised despite its critical importance for cell function. We directly interrogated the epigenetic DNA methylation state of the functionally important tRNA genes, across the age spectrum in a range of datasets as well as methodologies and identified an enrichment for age-related DNA hypermethylation in the human tRNA genes.
Methods
Participants
Participants in the ‘EpiTwins’ study are adult volunteers from the TwinsUK Register. The participants were aged between 16 and 82 years, with a median of ~55 years (cohort profile [102]). Ethics for the collection of these data were approved by Guy’s & St Thomas’ NHS Foundation Trust Ethics Committee (EC04/015—15-Mar-04) and written informed consent was obtained from all participants.
Participants for our targeted bisulfite sequencing of select tRNA loci were drawn from two studies. Samples from participants aged 4 and 28 years are from the MAVIDOS [103] study and participants aged 63 and 78 years are from the Hertfordshire cohort study [104]. Due to a limited number of available samples, the two 4 year old pools contained DNA from 20 individuals each, with all other pools having 25 contributing individuals. Pool 1, the first 4 year old pool used DNA from all male samples, with all other pools using all female samples. Thus, the total number of participants was 190 (see Table 1). Samples from the 28 year old time point are all from pregnant women at ~11 weeks gestation.
tRNA annotation information
Genomic coordinates of the tRNA genes were downloaded from GtRNAdb [27]. The 2 tRNAs located in chr1_gl000192_random are tRNA-Gly-CCC-8-1 & tRNA-Asn-ATT-1-2 (Supplementary File S5). The 213 probes overlapping tRNA genes were derived from intersecting the tRNA gene annotation data from gtRNAdb with the Illumina 450k array manifest annotation for the hg19 genome build using bedtools v2.17.0 [105]. We excluded 107 tRNAs from blacklisted regions of hg19 [46].
tRNA Gene Clustering
We clustered the tRNA genes by grouping together all tRNAs within 5Mb of one another using the bedtools merge tool v2.17.0 [105]. We used a command of the form: bedtools merge -c 4 -o collapse -d <N> -i hg19-tRNAs.bed where <N> is the binsize. We varied the binsize and selected 5Mb as it is at approximately this size that number of clusters with more than one tRNAs exceeds the number of singleton tRNAs (Figure S9). We added the further requirements that these groupings contain at least 5 tRNA genes with a density of at least 5 tRNA genes per Mb to be considered as clusters.
DNA methylome data
TwinsUK MeDIP-seq methylomes
The Methylated DNA Immunoprecipitation sequencing (MeDIP-seq) data was processed as previously described [16,106]. These processed data are available from the European Genome-phenome Archive (EGA) (https://www.ebi.ac.uk/ega) under study number EGAS00001001910 and dataset EGAD00010000983. The dataset used in this work consists of 4350 whole blood methylomes with age data. 4054 are female and 270 male. 3001 have full blood counts. There are 3652 individuals in this data set. These individuals originate from 1933 unique families. There are 1234 monozygotic (MZ) twin pairs (2468 individuals), and 458 dizygotic (DZ) twin pairs (916 individuals).
MeDIP-seq used a monoclonal anti-5mC antibody to bind denatured fragmented genomic DNA at methylated CpG sites. This antibody-bound fraction of DNA was isolated and sequenced [44]. MeDIP-seq 50-bp single-end sequencing reads were aligned to the hg19/GRCh37 assembly of the human genome and duplicates were removed. MEDIPS (v1.0) was used for the MeDIP-seq specific analysis [107]. This produced reads per million base pairs (RPM) values binned into 500bp windows with a 250bp slide in the BED format, resulting in ~12.8 million windows on the genome. MeDIP-seq data from regions of interest was extracted using Bedtools v2.17.0 [105].
Analysis of DNA methylome data for Significant Ageing-related changes
All analysis was performed in R/3.5.2. Linear models were fitted to age using the MeDIP-seq DNA methylome data, as quantile normalised RPM scores at each 500bp window. Models were fitted with: 1. No covariates; 2. Batch information as a fixed effect; 3. Blood cell-type counts for neutrophils, monocytes, eosinophils, and lymphocytes as fixed effects; and 4. Batch and Blood Cell counts as fixed effects. Model 1 & 2 were fitted on the full set of 4350 as batch information was available for all samples but blood cell count data was only available for a subset of 3001 methylomes. Models 1 & 2 fitted in the n=3001 subset were similar to those fitted in the complete set of 4350. Models 3 & 4 were fitted in the n=3001 subset with full covariate information and sets of significant tRNAs identified at study-wide and genone wide levels in model 4 were used in subsequent analyses. Models were also fitted for two unrelated subsets created by selecting one twin from each pair (Monozygotic or Dizygotic), yeilding sets with n = 1198 & 1206 DNA methylomes. One additional model was fitted for longitudinal analysis, samples were selected by identifying individuals with a DNA methylome at more than one time point and filtering for only those with a minimum of 5 years between samples. This yielded 658 methylomes from 329 individuals with age differences of 5-16.1 yrs, median 7.6 yrs. Models for this set included participant identifier as a fixed effect in addition to blood cell counts and batch information.
Permutation Analysis for Enrichment with Age-related Changes
We performed a permutation analysis to determine whether the CpG distribution of sets of the tRNA genes was the principle driver of the ageing-related changes observed. Windows overlapping tRNAs have a higher proportion of windows with a greater CpG density than their surrounding sequences (see supplementary Figure S8). CpGs residing within moderate CpG density loci are the most dynamic in the genome [48] and CpG dense CpG island regions include specific ageing-related changes [8,9,16]. For comparison we also performed the permuation in the CGI regions from the Polycomb group protein target promoters in Teschendorff et al. [8] and bivalent loci from ENCODE ChromHmm ‘Poised Promoter’ classification in the GM12878 cell-line [54]. A random set of 500bp windows representing an equivalent CpG density distribution of the feature set in question were selected from the genome-wide data. Above a certain CpG density there are insufficient windows to sample without replacement within a permutation. Furthermore, above ~≥ 18% CpG density CpG Islands become consistently hypomethylated [108]. Therefore, all windows with a CpG density of ≥ 18% (45 CpGs per 500bp) were grouped and sampled from the same pool. i.e. a window overlapping a tRNA gene which had a 20% density could be represented in permutation by one with any density ≥ 18%. This permutation was performed 1,000 times to determine an Empirical p value by calculating the number of times the permutation result exceeded the observed number of significant windows in the feature set. , where r is the sum of significant windows in all permutations and N is number of permutations [109].
Neonate and Centenarian Whole Genome Bisulfite Sequencing
DNA methylation calls were downloaded from GEO:GSE31263 and intersected with tRNA genes using bedtools v2.17.0 [105].
Sample pooling and EPIC array
We performed an Illumina Infinium DNA methylation EPIC array ((C) Illumina) and targeted bisulfite sequencing of select tRNA gene loci. Here we used DNA extracted from whole blood and pooled into 8 samples from unrelated individuals at 4 time-points with 2 pools at each time-point. The timepoints were 4, 28, 63, and 78 years. Using the EPIC array we were able to infer the DNAm age using the Horvath DNAm clock [11] and blood cell-type composition of our samples using the Houseman algorithm [51].
Targeted Bisulfite Sequencing
We selected tRNA loci for targeted sequencing in which have had observed changes and DNAm with age and closely related tRNAs in which changes were not observed. Primer design was performed using ‘methPrimer’ [110] (Supplementary File S6). A total of 84 tRNA loci were targeted and 79 subsequently generated reliable results post-QC. The targeted tRNAs covered a total of 723 CpGs with a median of 8 CpGs per tRNA (range 1-13), data passing QC was generated for 458 CpGs, median 6 (range 1-9) per tRNA.
Quality was assessed before and after read trimming using fastqc [111] and multiqc [112] to visualise the results. Targeting primers were trimmed with cutadapt [113] and a custom perl5 script. Quality trimming was performed with trim_galore [114]. Alignment and methylation calling was performed with Bismark (v0.20.0) [115] making use of bowtie2 [116]. The alignment was performed against both the whole hg19 genome and just the tRNA genes +/− 100bp to assess the possible impact of off-target mapping. Mapping to the whole genome did produce purported methylation calls at a larger number of loci than mapping just to the tRNA genes (683,783 vs 45,861 respectively). Introducing a minimum coverage threshold of 25 reads dramatically reduced this and brought the number of sites into line with that in the tRNA gene set (36,065 vs 33,664 respectively) suggesting a small number of ambiguously mapping reads. All subsequent analysis was performed using the alignment to just the tRNA genes with a minimum coverage of 25 reads.
We performed pairwise differential methylation analysis of the tRNA genes at the different time points using RnBeads [52] with limma [117] and a minimum coverage of 25 reads. We also performed linear regression predicting age from DNA methylation at the targeted tRNA sites, permitting us to compare rates of increase with age. For the linear regression, we used only CpG sites with more than 25 reads mapped to the regions of the genome targeted for amplification.
TwinsUK Illumina 450k array methylomes
Illumina Infinium DNA methylation 450k arrays ((C) Illumina) were also performed on TwinsUK participants, in 770 Blood-derived DNA samples which had matched MeDIP-seq data. These data were preprocessed in the form of methylation ‘beta’ values pre-processed as previously described [16,106]. Cell-type correction was performed using cell-count data and the following model: lm(age ∼ beta + eosinophils + lymphocytes + monocytes + neutrophils).
Chromatin Segmentation Data
Epilogos chromatin segmentation data [53] was downloaded for the tRNA gene regions +/− 200bp from https://explore.altius.org/tabix/epilogos/hg19.15.Blood T-cell.KL.gz using the tabix utility. The data used was the ‘Blood & T-cell’ 15 State model based on segmentation of 14 cell-types. This data was manipulated and visualised with R and ggplot2.
Isolated Blood Cell Type Specific Data
Data from 7 cell-type fractions from 6 Male individuals was downloaded from GSE35069 [55] using GEOquery [118]. Five of the 6 top age hypermethylating tRNAs are covered by this array dataset. Heatmaps were created with the ComplexHeatmap R package [56].
Cancer and Tissue Specific Methylation Data
Data was downloaded from the TCGA (The Cancer Genome Atlas) via the GDC (genomic data commons) data portal [119] using the GenomicDataCommons R package. Data from foetal tissue [61,62] was downloaded from GEO (GSE72867, GSE30654). From the TCGA, we selected samples for which DNAm data was available from both the primary site and normal solid tissue, and for which we could infer an approximate age (within one year). We selected those probes overlapping tRNA genes yielding 73,403 data points across 19 tissues with an age range of 15-90yrs (median 63.4) (Supplementary File S7)
Mouse RRBS Analysis
We downloaded methylation calls and coverage information resulting from RRBS performed by Petkovich et al. [63] from GEO using GEOquery [118] GSE80672. These data from 152 mice covered 68 tRNA and 436 CpGs after QC requiring >50 reads per CpG and >10 data points per tRNA. We excluded 5 tRNAs from blacklisted regions of mm10 [46]. After QC there were 58 tRNA genes and 385 CpGs. We performed simple linear modeling to predict age from methylation level at each tRNA and each CpG.
Data availability
The MeDIP-seq data supporting the results of this article are available in the EMBL-EBI European Genome-phenome Archive (EGA) under Data set Accession number EGAD00010000983 (https://www.ebi.ac.uk/ega/datasets/EGAD00010000983). The targeted BiS-sequencing data is available at: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA635108
Code availablity
Available at https://github.com/RichardJActon/tRNA_paper_code
Author Contributions
RJA designed experiments and analysed all the processed and experimental data. CGB conceived and designed the experiments. TDS, KW and JW conceived and provided TwinsUK MeDIP-seq data. YX, FG and JW produced raw MeDIP-seq data with WY and JB processing and quality controlling these data. WY contributed an analysis concept. CC, NH, ED, and KL provided MAVIDOS and Hertfordshire sample data. EB, EW, and CAM performed the targeted BiS sequencing experiment. PGH contributed additional data and discussion of results. RJA and CGB wrote the paper. All authors reviewed and approved the final manuscript.
Supplementary materials
Acknowledgements
We gratefully acknowlege the individuals from TwinsUK, Mavidos and the Hertfordshrie cohort. TwinsUK received funding from the Wellcome Trust (Ref: 081878/Z/06/Z), European Community’s Seventh Framework Programme (FP7/2007-2013), the National Institute for Health Research (NIHR)-funded BioResource, Clinical Research Facility and Biomedical Research Centre based at Guy’s and St Thomas’ NHS Foundation Trust in partnership with King’s College London. Further funding support for the EpiTwin project was obtained from the European Research Council (project number 250157) and BGI. SNP Genotyping was performed by The Wellcome Trust Sanger Institute and National Eye Institute via NIH/CIDR. The authors would like to thank Nikki Graham for her assistance with the identification and pooling of the MAVIDOS and Hertfordshire DNA samples. The authors also acknowledge the use of the IRIDIS High Performance Computing Facility, and associated support services at the University of Southampton, in the completion of this work. The MRC-LEU is supported by the Medical Research Council (MRC). CGB recieved support from Diabetes UK (16/0005454). RJA was in receipt of a MRC Doctoral fund (1820097).
Footnotes
Various changes in response to reviewers feedback, briefly: - Section on tsRNA expression dropped. - More detail on the genomic context of age-hypermethylating tRNAs. - Consolidation of figures into panels and improved legends - Additional supplementary data - Human/Mouse orthologs comparison
References
- 1.↵
- 2.↵
- 3.↵
- 4.↵
- 5.↵
- 6.↵
- 7.↵
- 8.↵
- 9.↵
- 10.↵
- 11.↵
- 12.
- 13.↵
- 14.↵
- 15.↵
- 16.↵
- 17.↵
- 18.↵
- 19.↵
- 20.↵
- 21.↵
- 22.↵
- 23.↵
- 24.
- 25.
- 26.↵
- 27.↵
- 28.↵
- 29.↵
- 30.↵
- 31.↵
- 32.↵
- 33.↵
- 34.↵
- 35.↵
- 36.↵
- 37.↵
- 38.↵
- 39.↵
- 40.↵
- 41.↵
- 42.↵
- 43.↵
- 44.↵
- 45.↵
- 46.↵
- 47.↵
- 48.↵
- 49.↵
- 50.↵
- 51.↵
- 52.↵
- 53.↵
- 54.↵
- 55.↵
- 56.↵
- 57.↵
- 58.↵
- 59.↵
- 60.↵
- 61.↵
- 62.↵
- 63.↵
- 64.↵
- 65.↵
- 66.↵
- 67.↵
- 68.↵
- 69.↵
- 70.↵
- 71.↵
- 72.↵
- 73.↵
- 74.↵
- 75.↵
- 76.↵
- 77.↵
- 78.↵
- 79.↵
- 80.↵
- 81.↵
- 82.↵
- 83.↵
- 84.↵
- 85.↵
- 86.↵
- 87.↵
- 88.↵
- 89.↵
- 90.↵
- 91.↵
- 92.↵
- 93.↵
- 94.↵
- 95.↵
- 96.↵
- 97.↵
- 98.↵
- 99.↵
- 100.↵
- 101.↵
- 102.↵
- 103.↵
- 104.↵
- 105.↵
- 106.↵
- 107.↵
- 108.↵
- 109.↵
- 110.↵
- 111.↵
- 112.↵
- 113.↵
- 114.↵
- 115.↵
- 116.↵
- 117.↵
- 118.↵
- 119.↵
- 120.↵