Abstract
Aging is associated with accumulation of somatic mutations. This process is especially pronounced in mitochondrial genomes of postmitotic cells, which accumulate large-scale somatic mitochondrial deletions with time, leading to neurodegeneration, muscular dystrophy and aging. Slowing down the rate of origin of these somatic deletions may benefit human lifespan and healthy aging. The main factors determining breakpoints of somatic mitochondrial deletions are direct nucleotide repeats, which might be considered as Deleterious In Late Life (DILL) alleles. Correspondingly, the decreased amount of these DILL alleles might lead to low production of somatic deletions and increased lifespan. Intriguingly, in the Japanese D4a haplogroup, which is famous for an excess of centenarians and supercentenarians, we found that the longest direct repeat (“common repeat”) in the human mitochondrial genome has been disrupted by a point synonymous mutation. Thus we hypothesize that the disruption of the common repeat annuls common deletion (which is the most frequent among all somatic deletions) and at least partially may contribute to the extreme longevity of the D4a Japanese haplogroup. Here, to better understand the mitochondrial components of longevity and potential causative links between repeats, deletions and longevity we discuss molecular, population and evolutionary factors affecting dynamics of mitochondrial direct repeats.
Introduction
Phenotypic effects of aging are largely determined by the expression of hundreds of alleles that are deleterious in late life (Hughes et al. 2002; Reed and Bryant 2000) and either neutral, slightly deleterious, or slightly beneficial in early life (Vermulst et al. 2008; Cortopassi 2002). These alleles can either have a direct phenotypic effect of their own or increase the rate of somatic mutations, which, in turn, can have strong phenotypic consequences. For example, the proofreading-deficient version of the nucleus-encoded mtDNA polymerase causes somatic accumulation of point mutations (Trifunovic et al. 2004) and deletions (Vermulst et al. 2008) in the mitochondrial genome (mtDNA) of mice, leading to reduced lifespan and premature onset of aging-specific phenotypes. Another example of alleles with an indirect effect on phenotype is mtDNA direct repeats. Repeats can hybridize with each other and promote harmful somatic deletions in the mitochondrial genome (Schon et al. 1989). Somatic accumulation of mtDNA with deletions is particularly pronounced in post-mitotic cells (Kowald and Kirkwood 2018). For example, clonal expansion of short (deleted) mtDNAs in neurons underlies Parkinson's disease (Bender et al. 2006) and Kearns-Sayre syndrome (Schon et al. 1989) and is also observed in healthy aging (Kraytsberg et al. 2006), while in muscle fibers it underlies human myopathies (Herbst et al. 2007; Vincent et al. 2018).
In a comparison of mammalian species, the abundance of direct nucleotide repeats in mtDNA was negatively associated with longevity (Samuels 2004; Samuels, Schon, and Chinnery 2004; Khaidakov, Siegel, and Shmookler Reis 2006), suggesting that direct repeats may constrain mammalian lifespan, probably by affecting the probability of harmful somatic deletions. Furthermore, it has been shown that the number of inverted repeats in the mitochondrial genome is correlated with mammalian longevity (J.-N. Yang, Seluanov, and Gorbunova 2013). Recently, however, it has been shown that this correlation is not driven by negative selection against direct repeats in long-lived mammals (Lakshmanan et al. 2015), calling the results of previous works into question.
Here we analyze the relationship between mtDNA direct repeats and longevity in humans. We discuss the association between the disrupted common repeat in mtDNA and the increased lifespan of the Japanese D4a haplogroup, and propose the existence of a functional link between the absence of the repeat, a deficit of somatic deletions and increased lifespan. We discuss the potential application of haplogroups with the disrupted common repeat in current mitochondrial donation technology, as well as the possibility of disrupting the common repeat by targeted editing in the future. Next, we ask whether the loss of the repeat can be considered a beneficial mutation from an evolutionary point of view, and find no current evidence to support this. We extrapolate our conclusion to all mammalian species, questioning the existence of purifying selection against direct repeats in long-lived mammals.
1. Disrupted mtDNA common repeat may increase the human lifespan and decrease the occurrence of mitochondrial encephalomyopathies
Age-related accumulation of somatic mitochondrial mutations (Khrapko and Vijg 2009) is especially pronounced in the mitochondrial genomes (mtDNA) of postmitotic cells, such as neurons and skeletal muscle fibers. Continuous turnover of the numerous mitochondrial genomes within postmitotic cells creates an additional, intracellular level of selection, which might lead to the propagation of mutant mitochondrial genomes. For example, mitochondrial genomes of substantia nigra neurons start to accumulate large-scale somatic deletions after 40 years of life (Kraytsberg et al. 2006). Initially, the intracellular fraction of these mutated mitochondrial genomes is very low (1 among 10,000 wild-type full-length copies of mtDNA inside a cell) and has no phenotypic effect. However, these short mitochondrial genomes tend to expand within a cell, and over several decades their fraction reaches a phenotypically important 50-80% heteroplasmy. Continuous accumulation of short mutated mtDNA within neurons ultimately leads to neurodegeneration, one of the important phenotypes of aging. Thus, understanding the molecular mechanisms that affect the time of origin of somatic mtDNA deletions, as well as their rate of clonal expansion, is extremely important (Khrapko 2011; Popadin et al. 2014). Slowing down these processes may postpone neurodegeneration and sarcopenia, lead to healthier aging and possibly increase human lifespan.
Analyzing somatic mtDNA deletions in human neurons we (Guo et al. 2010) and others (Samuels, Schon, and Chinnery 2004) demonstrated that the main factors determining the deletion breakpoints are long imperfect duplexes, inside which there are short stretches of direct nucleotide repeats. Direct repeats also play a key role in the formation of deletions according to the newest "copy-choice recombination" model, proposed by Maria Falkenberg lab (Persson et al. 2019). Since these repeats are associated with somatic deletions that in turn affect human health status in post-reproductive age, the repeats are considered as “Deleterious In Late Life” alleles (DILL): neutral during reproductive age (and correspondingly neutral from a pure evolutionary point of view) but deleterious in late life and correspondingly deleterious from a health point of view. We hypothesized that the decreased number of direct repeats in the mitochondrial genome can lead to low rate or late time of origin of somatic deletions and thus may make aging more healthy or postpone it.
Intriguingly, we found that the longest (and correspondingly the most severe) direct repeat in the human mitochondrial genome (a 13 base-pair “common” nucleotide repeat observed in 98% of the human population) has been disrupted by the point mutation m.8473T>C in the Japanese D4a haplogroup, which is famous for an excess of centenarians (persons who live 100 or more years) and supercentenarians (persons who live 110 or more years) (Bilal et al. 2008; Alexe et al. 2007). We hypothesized that the disruption of the common repeat has a beneficial effect on the D4a haplogroup: it decreases the probability that the corresponding somatic deletion (also called the “common” deletion because it appears very often in aged postmitotic tissues) originates during life, and thus might postpone neurodegeneration and sarcopenia, explaining at least partially the extreme longevity of the D4a haplogroup (Konstantin Popadin 2008) (Figure 1). As a proof-of-principle experiment, we analyzed frontal cortex samples of two aged individuals from haplogroup N1b, which harbors a germline variant, m.8472C>T, that is similar but not identical to the D4a variant and also disrupts the common repeat (Guo et al. 2010). In line with our hypothesis, we observed no common deletions at all in their mitochondrial genomes, which implies that the disruption of the 13 bp repeat, even by a single-nucleotide germline variant, blocks the formation of the somatic common deletion. Recently, a new model of deletion formation has been proposed (Persson et al. 2019), which postulates that the mtDNA deletion results from slipped replication during active L-strand mtDNA synthesis. Notably, the authors demonstrated in vitro that the deleted product was lost when either the 8470 or the 13447 arm of the common repeat was mutated (Persson et al. 2019).
We would like to emphasize that the results of this in vitro experiment are completely in line with our original hypothesis (Konstantin Popadin 2008) as well as with the absence of the common deletions in the N1b haplogroup (Guo et al. 2010).
Putting together several orthogonal lines of evidence: (i) an association of somatic mtDNA deletion load with neurodegeneration (Bender et al. 2006; Kraytsberg et al. 2006) and sarcopenia (Herbst et al. 2007; Vincent et al. 2018); (ii) an association between the disrupted repeat and the increased lifespan of the D4a haplogroup (Bilal et al. 2008; Alexe et al. 2007); (iii) an absence of common deletions in aged frontal cortex samples from the N1b haplogroup with the disrupted common repeat (Guo et al. 2010); and (iv) blocking of deletion formation by mutated arms of the common repeat in the in vitro experiment (Persson et al. 2019), we conclude that the disrupted common repeat can indeed confer healthier aging on its carriers by reducing the somatic deletion load (Figure 1).
2. Can we use it? Choice of mtDNA haplogroup as a part of assisted reproductive technology now and targeted modification of the common repeat in the future
We have argued above that a disrupted common repeat may increase human longevity and decrease predisposition to mitochondrial encephalomyopathies and related age-associated phenotypes such as neurodegeneration and sarcopenia. In this section we discuss current and future medical strategies that could use the beneficial properties of variants such as m.8473T>C.
Assisted reproductive technologies (ART) that aim to reduce or prevent the transmission of deleterious mtDNA mutations have a long history. Initially, embryologists introduced donor cytoplasm with “healthy” mitochondria into the recipient oocyte; the technique was subsequently banned by the FDA for safety reasons (St. John 2002). Recently, after substantial modifications, a new version of this technique called mitochondrial donation has been introduced into practical medicine to prevent the inheritance of mutant mtDNA (Kang et al. 2016; Hyslop et al. 2016; Gorman et al. 2018). Currently the technology works as follows: the nuclear genome is removed from an oocyte or zygote that carries an mtDNA mutation and transferred to an enucleated donor oocyte or zygote with wild-type mtDNA. However, even the up-to-date technology is not perfect: 15% of human embryonic stem cell lines obtained from embryos after mitochondrial donation restored the pool of mutated mitochondria that had been transferred together with the nuclear material (so-called mtDNA carryover), which significantly exceeds the 1% expected due to genetic drift (Kang et al. 2016). The mechanism by which a certain subpopulation of mitochondria expands during embryogenesis is not yet clear (Wolf, Hayama, and Mitalipov 2017), although it has been shown that certain mtDNA haplogroups tend to replicate faster than others (Kang et al. 2016). In this context, it is worth considering the choice of the mtDNA haplogroup during the mitochondrial donation procedure. When making this choice, one could consider haplogroups such as D4a, carrying m.8473T>C, which not only lack deleterious variants but also carry a potentially beneficial one that may decrease the occurrence of mitochondrial encephalomyopathies and increase longevity.
In the United Kingdom, the ART procedure with complete substitution of mitochondrial genome has been allowed only in the case of high risk of transmission of severe mitochondrial diseases and only if the embryo is a male (Craven et al. 2017). This gender restriction aims to avoid transmission of the mixed mtDNA haplogroups (heteroplasmy of recipient and donor mtDNAs) from generation to generation which is not natural and thus might have some side effects. Indeed, it has been shown that the mixture of mtDNA haplotypes has unusual genetic and behavioral effects in mice, even when each haplotype alone produces a normal phenotype (Sharpley et al. 2012; Lane 2012; Jones 2012). Interestingly, the recent discovery (although extremely questionable - see (Salas et al., n.d.)) of frequent biparental inheritance of mtDNA (Luo et al. 2018) may suggest that mixes of mtDNA haplotypes are pretty common in our species and thus there is nothing artificial in the creation of females with mixed mtDNA haplotypes.
In many countries mitochondrial donation technology is not forbidden, which has led to the birth of several children from “three parents”. The first such child was born in Mexico to a mother carrying a mitochondrial disease mutation (Leigh syndrome, m.8993T>G; Zhang et al. 2017). After this, seven children were born in Ukraine (“Webinar: ‘Is There an Alternative to Egg Donation?’ | DL-Nadiya” 2019), for whom mitochondrial donation technology was used as a tool for the rejuvenation of oocytes. Despite the mtDNA carryover problem, two of the children born in Ukraine are girls, who can transmit a mix of different haplotypes to the next generation (at the time of manuscript preparation, data on the heteroplasmy level in the children born in Ukraine had not yet been published).
There is an additional concern that donated mtDNA may have deleterious consequences, because the mitochondrial genome is placed in a novel nuclear environment and previously coevolved mito-nuclear interactions are disrupted. Analysing this problem from a population genetic point of view, Eyre-Walker concluded that mitochondrial donation is unlikely to be harmful in human populations, which are on average weakly differentiated (Eyre-Walker 2017). However, deleterious mitonuclear incompatibility has recently been described in six admixed human populations (Zaidi and Makova 2019), meaning that the problem of potential mitonuclear incompatibility as a result of mitochondrial donation is still open. To avoid both of these problems, mixes of haplotypes as well as mito-nuclear incompatibility, there is a theoretical possibility of editing the recipient's own mtDNA by introducing just one substitution, such as m.8473T>C.
Editing the mtDNA sequence in a predictable manner is an attractive idea. Unfortunately, there are currently no techniques available to do this in living cells. The heteroplasmy shift is still the only way to manipulate the mtDNA sequence (Bacman et al. 2018; Gammage et al. 2018): targeted double-strand breaks are induced in mtDNA, which leads to its elimination by the replicative machinery (Peeva et al. 2018). It is worth noting that the heteroplasmy shift has also been used on eggs as part of ART in animals (Reddy et al. 2015; Y. Yang et al. 2018; McCann et al. 2018). We would like to emphasize that the heteroplasmy shift is not considered “true” genome editing. Metaphorically, it takes the book of an organism's genome and tears out a page with information on a specific gene. Although such manipulations are often referred to as genome editing, in the words of George Church, “burning a page of the book is not editing the book” (Ledford 2016).
For “true” mtDNA editing, there are several modified RNA-guided DNA endonuclease (RGEN) systems, called base editors (Rees and Liu 2018), which do not require the generation of double-strand breaks. The limitation on the widespread use of these systems is their two-component nature: RGENs include both a protein and an RNA part. Successful delivery of the RNA moiety into mitochondria (as a part of an RGEN) is very complicated, but there is experimental proof that it is possible (Jo et al. 2015; Loutre et al. 2018; Bian et al. 2019). Thus, the prerequisites exist for mtDNA editing technology to be implemented in the future and, after some time, even introduced into the clinic as ART, as has happened with mitochondrial donation technology (Figure 2).
3. No evidence that disruption of the common repeat is evolutionarily beneficial
If disruption of the common repeat increases longevity but has no effect on fitness, it is not subject to selection and is expected to be evolutionarily neutral. However, the disruption of the common repeat may also increase fitness (i) directly, if carriers are healthier during reproductive age and/or have higher fertility, or (ii) indirectly, if increased longevity of parents and grandparents is advantageous to offspring (the grandmother effect). The potential importance of the grandmother effect in human populations has been discussed in several recent papers (Chapman et al. 2019; Engelhardt et al. 2019). Here, we focus only on direct effects, which may provide evolutionary benefits to the carriers of the disrupted common repeat. Direct mechanisms assume that variation in deletion load is important even at the low levels of heteroplasmy that are typical for reproductive age (20-50 years). If so, for example, 5% versus 10% heteroplasmy in postmitotic tissues of carriers versus controls may help to maintain these tissues in a healthier condition (decreased sarcopenia and neurodegeneration during reproductive age), or, for example, 1% versus 3% heteroplasmy in carrier versus control oocytes may increase the fertilisation rate of carriers. Uncovering such weak differences between cases and controls requires deep and large-scale phenotypic characterisation of both cohorts, which has not been done yet. An additional idea is that the average germline mutation rate in carriers might be decreased because of the lower frequency of deletions, lower oxidative stress in D4a oocytes and lower ROS production. If so, we can test this effect by analysing the human mtDNA tree and comparing the mutation rate of carriers versus controls.
To test this possibility, we used all human mitochondrial genomes available in the HmtDB database (Clima et al. 2017), reconstructed their phylogeny, called haplogroups for each genome and analyzed the two arms of the common 13 bp repeat (see Supplementary Materials). As expected, we observed that more than 98% (42641 out of 43437) of human genomes have a perfect common direct repeat, inherited from the common ancestor with chimpanzee. Among the carriers of the disrupted repeat, only the proximal arm (8470–8482 bp) was disrupted, while the distal arm (13447–13459 bp) was completely conserved. The most frequent variants of the disrupted proximal arm are presented in Table 1. Among them, 399 cases have the substitution m.8473T>C, which has occurred many times independently, leading to 5 big clusters with more than 20 genomes each and many small clusters (Figure 3). The observation that this rare m.8473T>C substitution is not unique to D4a and marks at least four additional big subtrees in the human mitochondrial tree should raise interest in both the longevity and the potentially decreased predisposition to mitochondrial encephalomyopathies of the following haplogroups: R2, U2e, H1c and U6a. Currently, we have found no evidence in the literature for increased longevity of these haplogroups. Importantly, other variants that disrupt the common repeat are also informative; for example, the D5a haplogroup, in which the common repeat is disrupted by m.8479A>G (see Table 1), is associated with increased longevity (Alexe et al. 2007).
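The check of a repeat arm described above can be illustrated with a minimal Python sketch. It assumes that the 13 bp common repeat has the sequence ACCTCCCTCACCA and that the proximal arm starts at reference position 8470; the function name `arm_variants` and its parameters are hypothetical and do not reproduce the pipeline from the Supplementary Materials.

```python
# Assumed reference sequence of the 13 bp common repeat (both arms).
COMMON_REPEAT = "ACCTCCCTCACCA"
# Assumed reference coordinate of the first base of the proximal arm.
PROXIMAL_START = 8470


def arm_variants(arm, start=PROXIMAL_START, reference=COMMON_REPEAT):
    """List substitutions in one repeat arm relative to the reference.

    `arm` is the 13-nucleotide sequence extracted from an aligned genome
    at the arm coordinates. An empty list means the arm is intact.
    """
    arm = arm.upper()
    if len(arm) != len(reference):
        raise ValueError("arm must have the same length as the reference repeat")
    return [
        f"m.{start + offset}{ref}>{alt}"
        for offset, (ref, alt) in enumerate(zip(reference, arm))
        if ref != alt
    ]


if __name__ == "__main__":
    print(arm_variants("ACCTCCCTCACCA"))  # intact arm
    print(arm_variants("ACCCCCCTCACCA"))  # arm with a T>C change at the 4th base
```

Applied to every genome in an alignment, such a function would give the per-variant counts of the kind summarised in Table 1; insertions and deletions inside the arm are not handled in this sketch.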
To evaluate the potential effect of the disrupted repeat on the germline rate of nucleotide substitutions, we focused on the five haplogroups with the m.8473T>C substitution (hereafter, cases) and assigned to each of them the closest sister subtree as a control. Next, we used a modified relative ratio test. We approximated the mutation rate by the branch lengths from the common ancestor (the ancestor of both cases and controls) to the terminal tips of cases and controls, using three subsets of weakly constrained positions (with high variant allele frequencies in the human population) and applying four substitution matrices (Supplementary Materials). We observed no universal trend: the D4a haplogroup demonstrated a decreased substitution rate, U6a and U2e demonstrated an increased substitution rate, and R2 and H1c showed no effect at all (Supplementary Materials, Table 1). The decreased germline mutation rate in D4a is very interesting and may reflect not only the disrupted mtDNA repeat discussed in this paper, but also nuclear factors (such as POLG or TWINKLE variants associated with the haplogroup) as well as environmental factors affecting both the decreased mutation rate (germline and somatic) and increased longevity. Thus, further analyses of the decreased germline and somatic mtDNA mutation rate in D4a and other mtDNA haplogroups associated with increased longevity are worthwhile. Altogether, however, in this pilot study we conclude that there is no evidence for a decreased mutation rate in all subtrees with the disrupted common repeat, and thus there is no evidence that the disruption of the common repeat is evolutionarily beneficial per se. However, we would like to emphasize that a deeper phenotypic description (occurrence of mitochondria-related conditions such as sarcopenia, neurodegeneration, etc.) of all haplogroups with the disrupted common repeat (Table 1) may shed light on the potential benefits of these substitutions (Raule et al. 2014).
4. No evidence of negative selection against direct repeats in long-lived mammals
In the previous section we found no evidence for selection favouring disruption of the common repeat in the human population. Here we extrapolate this logic to all mammalian species and question several studies claiming that the negative correlation between the abundance of direct repeats and longevity implies negative selection against direct repeats in long-lived mammals. We hypothesize that the negative correlation between repeats and longevity might appear as a result of an increased number of direct repeats in short-lived mammals due to their more asymmetrical nucleotide composition. We focus on mtDNA nucleotide content as a strong potential confounder, the effect of which may explain the majority of the previous results. It is expected that in a random nucleotide sequence with equal nucleotide content (A, T, G and C at 25% each) the abundance of repeats will be minimal, and that as soon as nucleotide frequencies deviate from 25% the probability of origin of repeats becomes higher due to the purely combinatorial nature of these repeats (in the extreme scenario, if the whole sequence is made of the same nucleotide, the whole genome will be covered by repeats). To visualise this effect and to test the potential strength of this confounder, we performed a simple in silico experiment in which we simulated random nucleotide sequences 16000 base pairs long with different nucleotide contents (changing the frequency of one nucleotide from 10 to 50% and keeping the three other nucleotides at equal frequencies) and estimated for them the abundance of direct repeats as previously (Guo et al. 2010) (Figure 4A). Indeed, the minimal abundance of repeats corresponds to 25%, and any deviation from this frequency leads to an increased abundance of direct repeats. The effect of nucleotide content on the abundance of repeats is very strong, and thus this confounder might be of great importance in mammalian mtDNAs with strongly biased nucleotide contents.
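The in silico experiment described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure behind Figure 4A: repeat abundance is approximated here by the fraction of k-mer start positions whose k-mer occurs at least twice on the same strand (overlapping occurrences are counted), which is a simplified proxy and not the repeat-calling method of Guo et al. 2010. The function names, k = 8 and the random seed are arbitrary choices.

```python
import random
from collections import Counter


def random_sequence(length, freq_a, rng):
    """Random sequence in which A has frequency `freq_a` and the other
    three nucleotides share the remainder equally."""
    other = (1.0 - freq_a) / 3.0
    return "".join(rng.choices("ACGT", weights=[freq_a, other, other, other], k=length))


def repeat_abundance(seq, k=8):
    """Proxy for direct-repeat abundance: the fraction of k-mer start
    positions whose k-mer occurs more than once in the sequence."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    repeated = sum(1 for kmer in kmers if counts[kmer] > 1)
    return repeated / len(kmers)


def scan_nucleotide_content(freqs, length=16000, k=8, seed=1):
    """Repeat abundance of one simulated sequence per nucleotide frequency."""
    rng = random.Random(seed)
    return {f: repeat_abundance(random_sequence(length, f, rng), k) for f in freqs}


if __name__ == "__main__":
    for freq, abundance in scan_nucleotide_content([0.10, 0.25, 0.40, 0.50]).items():
        print(f"frequency of A = {freq:.2f}: repeat abundance = {abundance:.3f}")
```

With these settings the abundance is expected to be lowest near 25% and to rise on both sides, because the chance that two random k-mers coincide grows with the sum of squared nucleotide frequencies, which is minimal for equal frequencies.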
It has been shown, for example, that the longevity of mammals is positively associated with mtDNA GC content, which might be explained either by selection (Lehmann et al. 2008) or by mutational bias (Mikhaylova et al., n.d.). Irrespective of the explanation of the bias, we assume that the higher the deviation from 25%, the higher the number of randomly expected direct repeats. Below we test the importance of nucleotide composition using two approaches: (i) random reshuffling and (ii) multiple linear models.
First, using 705 mammalian species with a sequenced whole mitochondrial genome and known generation length (Pacifici et al. 2013), we performed a correlation analysis between generation length and the fraction of the genome covered by direct repeats. Both generation length and the abundance of direct repeats were normalized using phylogenetically independent contrasts (Felsenstein 1985). We observed a weak negative correlation between generation length and direct repeats (Spearman’s rho = −0.076, p-value = 0.04, Figure 4B, red vertical line). Next, we were interested in the main driver of this correlation: does nucleotide content play the primary role (short-lived mammals are more A-rich, which increases the number of randomly expected repeats in short-lived mammals), or is negative selection against direct repeats in long-lived mammals also involved (which would decrease the number of repeats in long-lived mammals)? To approach this question, we randomly reshuffled all mammalian genomes 100 times, maintaining their original nucleotide content and their original generation length, and each time tested whether there is a correlation between the abundance of direct repeats and generation length. Interestingly, we observed that all 100 correlations based on reshuffled sequences were much stronger than the real one (Figure 4B). This means that species-specific nucleotide content, associated with generation length, has a strong enough effect to drive the negative correlation between repeats and generation length on its own. Using this analysis we cannot claim the absence of selection against direct repeats in long-lived mammals, but we can demonstrate that nucleotide content is an extremely important confounder, which leads to a strong negative correlation without any selection related to longevity.
We would like to emphasize that in our reshuffling approach we did not directly compare real sequences with reshuffled ones; we used reshuffled sequences only to demonstrate how important nucleotide composition might be in shaping the abundance of direct repeats.
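The reshuffling logic can be sketched as follows. This is a simplified illustration, not the analysis behind Figure 4B: it omits the phylogenetically independent contrasts, the `abundance` argument stands for any repeat-abundance estimator supplied by the user, and all function names are hypothetical.

```python
import random


def shuffle_sequence(seq, rng):
    """Permute the nucleotides of a sequence; composition is preserved exactly."""
    letters = list(seq)
    rng.shuffle(letters)
    return "".join(letters)


def ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for t in range(i, j + 1):
            result[order[t]] = average
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks.
    Fails with ZeroDivisionError if either variable is constant."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def null_correlations(genomes, generation_lengths, abundance, n_shuffles=100, seed=1):
    """Correlation between generation length and repeat abundance in
    reshuffled genomes, repeated `n_shuffles` times."""
    rng = random.Random(seed)
    rhos = []
    for _ in range(n_shuffles):
        shuffled = [abundance(shuffle_sequence(g, rng)) for g in genomes]
        rhos.append(spearman_rho(generation_lengths, shuffled))
    return rhos
```

Because each genome keeps its own nucleotide content and its own generation length, any correlation that remains after reshuffling can only come from composition, which is the point of the control.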
Second, we fitted multiple linear models in which the abundance of direct repeats was explained as a function of generation length and nucleotide content (Supplementary Materials, Table 2). In the majority of models, the effect of generation length was nonsignificant or only marginally significant, while nucleotide content was almost always significant. Importantly, for the A and T nucleotides, which have an average frequency higher than 25%, the coefficients were positive, while for the G and C nucleotides, which have a frequency lower than 25%, the coefficients were negative. In other words, an increase in the fraction of rare nucleotides is associated with a decrease in the amount of direct repeats, while an increase in the fraction of frequent nucleotides is associated with an increase in the amount of direct repeats (see Figure 4A). This result is completely in line with our notion that the abundance of direct repeats is mainly determined by nucleotide content, and thus seems to be neutral rather than under strong negative selection.
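A model of this kind can be illustrated with an ordinary least-squares fit written in plain Python. This is a sketch only: the exact model specifications are those given in the Supplementary Materials, the function name `fit_linear_model` is hypothetical, and the data in the usage example are synthetic numbers invented for illustration, not values from this study.

```python
def fit_linear_model(predictors, response):
    """Ordinary least squares via the normal equations.

    `predictors` is a list of tuples, one per species, for example
    (generation_length, frequency_of_A). Returns [intercept, coef_1, ...].
    """
    X = [[1.0, *row] for row in predictors]  # prepend the intercept column
    p = len(X[0])
    # Normal equations: (X^T X) b = X^T y, stored as an augmented matrix.
    M = [
        [sum(x[i] * x[j] for x in X) for j in range(p)]
        + [sum(x[i] * y for x, y in zip(X, response))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(p):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]


if __name__ == "__main__":
    # Synthetic example: (generation length, frequency of A) per species.
    predictors = [(1, 0.30), (2, 0.35), (3, 0.28), (5, 0.40), (8, 0.33)]
    response = [0.5 - 0.01 * g + 2.0 * a for g, a in predictors]
    print(fit_linear_model(predictors, response))
```

Note that the frequencies of all four nucleotides cannot enter one model together with an intercept, because they sum to one and are therefore collinear; fitting one nucleotide per model (or dropping one) avoids this.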
Conclusion
Combining several lines of evidence, we suggest that the disrupted common repeat in human mtDNA decreases the somatic deletion load, postponing the age-related degradation of postmitotic cells and aging per se (Figure 1). Thus, the corresponding mtDNA haplogroups with the disrupted repeat might be used in mitochondrial donation technologies (Figure 2). Interestingly, despite the beneficial nature of these disruptions from the human health point of view, we found no evidence that these disruptions are under positive selection in either humans (Figure 3) or other mammalian species (Figure 4). This category of variants, beneficial for human aging but selectively neutral, is important both for treating age-related diseases and for a deeper understanding of the evolutionary mechanisms of aging (Hughes et al. 2002).
Acknowledgments
K.P. was supported by the 5 Top 100 Russian Academic Excellence Project at the Immanuel Kant Baltic Federal University. This work was also supported by the Russian Foundation for Basic Research [No. 18-29-13055 and 18-04-01143 to K.P.] and the Russian Science Foundation [No. 17-75-20015 to I.M.].