ABSTRACT
Substitution rates in plant mitochondrial genes are extremely low, indicating strong selective pressure as well as efficient repair. Plant mitochondria possess base excision repair pathways; however, many repair pathways such as nucleotide excision repair and mismatch repair appear to be absent. In the absence of these pathways, many DNA lesions must be repaired by a different mechanism. To test the hypothesis that double-strand break repair (DSBR) is that mechanism, we maintained independent lines of plants deficient in uracil-N-glycosylase (UNG) for 10 generations to determine the repair outcomes when that pathway is missing. In the absence of UNG, there is an increase in double-strand breaks as assayed by recombination at repeated sequences. Surprisingly, given the single-seed descent bottleneck and the bottleneck of mitochondrial genomes in gametogenesis, no single nucleotide polymorphisms were fixed in any line in generation 10. The pattern of heteroplasmy was also unaltered through 10 generations. These results indicate that double-strand break repair is a general system of repair in plant mitochondria, and that it appears to be so efficient that base excision repair is nearly dispensable. The existence of this general system may explain the seemingly anomalous differences between genes and non-genes in plant mitochondria.
INTRODUCTION
Plant mitochondrial genomes have very low base substitution rates, while also expanding and rearranging rapidly (1-4). The low nonsynonymous substitution rates in protein coding genes indicate that selective pressure to maintain the genes is high, and the low synonymous substitution rates indicate that the DNA repair mechanisms are very accurate (5,6). However, little is known about the multiple pathways of DNA repair in plant mitochondria. So far, there is no evidence of nucleotide excision repair (NER) or mismatch repair (MMR) in plant mitochondria (7,8). It has been hypothesized that in plant mitochondria, the types of DNA damage that are usually repaired through NER and MMR are repaired through double-strand break repair (DSBR) (9,10). Plant mitochondria have the nuclear-encoded base excision repair (BER) pathway enzyme Uracil DNA glycosylase (UNG) (7). UNG is an enzyme that can recognize and bind to uracil in DNA and begin the process of base excision repair by enzymatically excising the uracil (U) residue from single-stranded or double-stranded DNA (11). Uracil can appear in a DNA strand due to the spontaneous deamination of cytosine, or by the misincorporation of dUTP during replication (12). Unrepaired uracil in DNA can lead to G-C to A-T transitions within the genome.
The low substitution rate and the high rearrangement rate of plant mitochondria must be due to both selection and the specific DNA damage repair mechanisms available. These mechanisms must also account for the observations of genome expansion found in land plant mitochondria. Few pathways of repair besides DSBR are known in plant mitochondria, and it is possible that many lesions, including mismatches, are repaired by creating double-strand breaks and using a template to repair both strands. Our hypothesis is that DSBR accounts for most of the repair in plant mitochondria, and that both error-prone and accurate subtypes of DSBR lead to the observed patterns of genome evolution (13). One way of testing this is to eliminate the pathway of uracil base excision repair and ask whether the G-U mispairs that occur by spontaneous deamination are instead repaired by DSBR.
The Arabidopsis thaliana mitochondrial genome contains two pairs of very large repeats (4.2 and 6.6 kb) that commonly undergo recombination (14-16), producing multiple isoforms of the genome. The mitochondrial genome also contains many smaller repeats between 50 and 600 base pairs (16-19). In wild-type plants, these repeats of unusual size (ROUS) recombine at very low rates. However, the ROUS have been shown to recombine at higher rates in several mutant lines defective in plant mitochondrial genome maintenance, such as msh1 and reca3 (20-22). In addition, ciprofloxacin (CIP) is a DNA gyrase inhibitor that can cause the formation of double-strand breaks (DSBs) in mitochondria (23). Treating plants with CIP has been shown to induce DSBs in A. thaliana mitochondria, overwhelming the DSBR homology surveillance system and leading to increased ectopic recombination between ROUS (24). Thus, recombination at the ROUS can be used as an indicator of increased DSBR. In this work we show that a loss of uracil base excision repair leads to increased DSBR, supporting our hypothesis.
Numerous proteins known to be involved in the processing of plant mitochondrial DSBs have been characterized. Plants lacking the activity of mitochondrially targeted recA homologs have been shown to be deficient in DSBR (21,25). In addition, it has been hypothesized that the plant MSH1 protein may be involved in binding to DNA lesions and initiating DSBs (9,10). The MSH1 protein contains a mismatch binding domain fused to a GIY-YIG type endonuclease domain which may be able to make DSBs (26,27). Here we provide evidence that in the absence of mitochondrial UNG activity, several genes involved in DSBR, including MSH1, are transcriptionally upregulated.
MATERIALS AND METHODS
Plant growth conditions:
Arabidopsis thaliana Columbia-0 (Col-0) seeds were obtained from Lehle Seeds (Round Rock, TX, USA). UNG T-DNA insertion hemizygous lines were obtained from the Arabidopsis Biological Resource Center, line number CS308282. Hemizygous T-DNA lines were self-crossed to obtain homozygous lines (Genotyping primers: wild-type 5’-TGTCAAAGTCCTGCAATTCTTCTCACA-3’ and 5’-TCGTGCCATATCTTGCAGACCACA-3’, ung 5’-ATAATAACGCTGCGGACATCTACATTTT-3’ and 5’-ACTTGGAGAAGGTAAAGCAATTCA-3’). All plants were grown in walk-in growth chambers under a 16:8 light:dark schedule at 22°C. Plants grown on agar were surface sterilized and grown on 1x Murashige and Skoog Basal Medium (MSA) with Gamborg’s vitamins (Sigma) with 5μg/mL Nystatin Dihydrate to prevent fungal contamination.
RT-PCR:
RNA was extracted from mature leaves of plants grown in soil during ung generation nine and from seedlings grown on MSA with and without 0.5µM CIP (28). Reverse transcription was performed using Bio-Rad iScript, and the resulting cDNA was used as a template for qPCR to measure relative transcript amounts. Quantitative RT-PCR data were normalized using UBQ11 as a housekeeping gene control. Reactions were performed in a Bio-Rad CFX96 thermocycler using 96 well plates and a reaction volume of 20µL/well. SYBRGreen mastermix (Bio-Rad) was used in all reactions. Two biological and three technical replicates were used for each amplification. Primers are listed in Supplementary Table S2. The thermocycling program for all RT-qPCR was a ten-minute denaturing step at 95°C followed by 45 cycles of 10s at 95°C, 15s at 60°C, and 13s at 72°C. Following amplification, melt curve analysis was done on all reactions to ensure target specificity. The melt curve program for all RT-qPCR was from 65°C to 95°C at 0.5°C increments for 5s each.
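As an illustration of how relative transcript amounts can be derived from Ct values normalized to a housekeeping gene such as UBQ11, the following is a minimal sketch. It assumes the widely used 2^-ΔΔCt calculation with 100% amplification efficiency; the text above does not state which calculation was used, and the function name and argument names are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in a sample versus a control.

    Assumption (not stated in the text): 2^-ddCt method with 100%
    primer efficiency. Each argument is a list of technical-replicate
    Ct values; the reference gene would be UBQ11 in this study.
    """
    # Normalize the target gene to the reference gene within each condition.
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    # Compare the normalized values between sample and control.
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


if __name__ == "__main__":
    # Made-up Ct values for illustration only, not data from this study.
    fold = relative_expression([22.0, 22.0, 22.0], [18.0, 18.0, 18.0],
                               [24.0, 24.0, 24.0], [18.0, 18.0, 18.0])
    print(f"fold change: {fold:.2f}")
```

A target that crosses threshold two cycles earlier in the sample than in the control, with an unchanged reference gene, corresponds to a four-fold increase under these assumptions.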
ROUS recombination qPCR:
DNA was collected from the mature leaves of Columbia-0 and generation ten ung plants, and from Columbia-0 seedlings grown on MSA with and without 0.5µM CIP, using the CTAB DNA extraction method (29). qPCR was performed using primers from the flanking sequences of the ROUS. Primers are listed in Supplementary Table S1. Using different combinations of forward and reverse primers, either the parental or recombinant forms of the repeat can be selectively amplified (see Figure 1A). The mitochondrially-encoded COX2 and RRN18 genes were used as standards for analysis. Reactions were performed in a Bio-Rad CFX96 thermocycler using 96 well plates with a reaction volume of 20µL/well. SYBRGreen mastermix (Bio-Rad) was used in all reactions. Numbers of biological replicates are listed in the figure descriptions; three technical replicates were used for each reaction. The thermocycling program for all ROUS recombination qPCR was a ten-minute denaturing step at 95°C followed by 45 cycles of 10s at 95°C, 15s at 60°C, and a primer-specific amount of time at 72°C (extension times for each primer pair can be found in Supplementary Table S1). Following amplification, melt curve analysis was done on all reactions to ensure target specificity. The melt curve program for all qPCR was from 65°C to 95°C at 0.5°C increments for 5s each.
DNA sequencing:
DNA extraction from frozen leaves of Columbia-0 and generation 10 ung was done by a modification of the SPRI magnetic beads method of Rowan et al. (30,31). Genomic libraries for paired-end sequencing were prepared using a modification of the Nextera protocol (32), modified for smaller volumes following Baym et al. (33). Following treatment with the Nextera Tn5 transpososome, 14 cycles of amplification were done. Libraries were size-selected to be between 400 and 800bp in length using SPRI beads (31). Libraries were sequenced with 150bp paired-end reads on an Illumina HiSeq 4000 by the Vincent J. Coates Genomics Sequencing Laboratory at UC Berkeley.
Reads were aligned using BWA-MEM v0.7.12-r1039 (34). The reference sequence used for alignment was a file containing the improved Columbia-0 mitochondrial genome (accession BK010421) (35) as well as the TAIR 10 Arabidopsis thaliana nuclear chromosomes and chloroplast genome sequences (36). If reads are aligned only to the mitochondrial genome, some reads containing nuclear or chloroplast sequences with similarity to mitochondrial sequence are incorrectly aligned to the mitochondria, leading to false SNP calls downstream in the analysis (data not shown). Using Samtools v1.3.1 (37), bam files were sorted for uniquely mapped reads for downstream analysis.
Variants were called using VarDict (38). To minimize the effects of sequencing errors, SNPs called by VarDict were filtered by the stringent quality parameters of Allele Frequency ≥ 0.05, Qmean ≥ 30, MQ ≥ 30, NM ≤ 3, Pmean ≥ 8, Pstd = 1, AltFwdReads ≥ 3, and AltRevReads ≥ 3. The mitochondrial reference genome positions corresponding to RRN18 and RRN26 were excluded from analysis because they have similarity to bacterial 16S and 23S ribosomal RNAs, respectively. Sequencing reads from contaminating soil bacteria can be misaligned to these positions and falsely called as low frequency SNPs. No other mitochondrial sequences show such similarity to bacterial genes (data not shown).
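The filtering thresholds listed above can be expressed as a simple predicate. This is a minimal sketch, not the authors' pipeline: the dictionary keys are hypothetical names chosen to mirror the parameters in the text and may not match the column names of a given VarDict output, and the excluded intervals for RRN18 and RRN26 must be supplied by the user because their coordinates are not given here.

```python
def passes_filters(rec):
    """Return True if a called SNP meets every threshold listed in the text.

    `rec` is a dict with hypothetical keys mirroring the named parameters.
    """
    return (
        rec["AF"] >= 0.05            # allele frequency
        and rec["QMean"] >= 30       # mean base quality
        and rec["MQ"] >= 30          # mean mapping quality
        and rec["NM"] <= 3           # mean mismatches per supporting read
        and rec["PMean"] >= 8        # mean position of the variant in reads
        and rec["PStd"] == 1         # position in reads is not invariant
        and rec["AltFwdReads"] >= 3  # forward-strand support
        and rec["AltRevReads"] >= 3  # reverse-strand support
    )


def in_excluded_region(pos, excluded_intervals):
    """True if `pos` lies in any (start, end) interval, inclusive.

    The intervals stand in for the RRN18 and RRN26 positions; the actual
    coordinates are an input, not something defined in this sketch.
    """
    return any(start <= pos <= end for start, end in excluded_intervals)


def filter_snps(records, excluded_intervals):
    """Keep records that pass all thresholds and fall outside excluded regions."""
    return [
        r for r in records
        if passes_filters(r) and not in_excluded_region(r["Pos"], excluded_intervals)
    ]
```

Requiring support on both strands and a minimum mean position in the read is a common way to suppress strand-specific and read-end artifacts that would otherwise resemble low frequency heteroplasmy.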
RESULTS
Experimental Design
This work was designed as a mutation accumulation study (39). We expected a relatively low mutation rate and chose 23 different ung homozygous plants derived from one hemizygous parent. These 23 plants were designated as generation 1 ung and were allowed to self-cross. The next generation was derived by single-seed descent from each line, and this was repeated until generation 10 ung plants were obtained. Leaf tissue and progeny seeds from each line were kept at each generation.
Increased Double-Strand Break Repair
If most DNA damage in plant mitochondria is repaired by double-strand break repair (DSBR), supplemented by base excision repair (7), then in the absence of the Uracil-N-glycosylase (UNG) pathway we predict an increase in DSBR. To find evidence of this we assayed ectopic recombination between homeologous repeats, which increases when DSBR is increased (21,22,24). We assayed ROUS recombination in ung generation ten lines and in wild-type by quantitative PCR (qPCR). Different combinations of primers in the unique sequences flanking the ROUS allow us to determine the relative copy numbers of parental type ROUS and low frequency recombinants (Figure 1A). The mitochondrial genes COX2 and RRN18 were used to standardize relative amplification between lines. We and others (18,24) have found that some of the ROUS are well-suited for qPCR analysis and are sensitive indicators of ectopic recombination, increasing in repair-defective mutants and when drugs are used to increase double-strand breaks. We chose to analyze recombination at the three repeats known as Repeats D, L, and B (17). All three show reductions in the parental 2/2 form, while repeat B also shows a reduction in the parental 1/1 form (Figure 1B). This indicates that ectopic recombination is increased; however, this pattern of recombination is different from what is seen in wild-type plants treated with CIP. In the presence of 0.5μM CIP, there is an increase in the recombinant 1/2 and 2/1 forms of the repeats as well (Figure 2).
Substitution Mutations
We also expected that in the absence of UNG there would be an increase in G-C to A-T substitution mutations. To test this prediction, we sequenced 23 independent lines of the ung mutant in generation 10 using an Illumina HiSeq 4000 system. Mitochondrial sequences from these independent ung lines were aligned to the Columbia-0 reference genome using BWA-MEM, and single nucleotide polymorphisms were identified using VarDict. The average depth of coverage of the mitochondrial genome was 130x and the range for different lines was 50x to 467x.
There were no SNPs that reached fixation to an allele frequency of 1 in any ung lines. Mitochondrial genomes are not diploid; each cell can have many copies of the mitochondrial genome. Therefore, it is possible that an individual plant could accumulate low frequency mutations in some of the mitochondrial genomes in the cell. VarDict (38) was used to detect heteroplasmic SNPs at allele frequencies as low as 0.05. VarDict’s sensitivity in calling low frequency SNPs scales with depth of coverage and quality of the sample, so it is not possible to directly compare heteroplasmic mutation rates in samples with different depths of coverage. However, because the activity of the UNG protein is specific to uracil, the absence of the UNG protein should not have any effect on mutation rates other than G-C to A-T transitions. Because of this, heteroplasmic mutations that are not G-C to A-T transitions can be considered the background rate of heteroplasmic SNP accumulation in plant mitochondria. We therefore compared the numbers of G-C to A-T transitions to all other mutations. If the ung mutants are accumulating heteroplasmic SNPs at a faster rate than wild-type, we would expect to see that as an increased number of G-C to A-T transitions compared to other mutation types. In contrast, there is little difference in the ratios of G-C to A-T transitions between ung and Col-0 (see Table 1). Because detection of low frequency SNPs depends on read depth, we only analyzed the 7 ung samples with an average mitochondrial read depth above 125x for this comparison. In the absence of a functional UNG protein, plant mitochondria do not accumulate cytosine deamination mutations at an increased rate.
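The comparison described above, G-C to A-T transitions against all other substitution types, can be sketched as a simple tally. This is an illustrative sketch only: the function names are hypothetical, SNPs are assumed to be given as (reference base, alternate base) pairs on the reference strand, and no statistical test is implied because the text does not specify one.

```python
def is_gc_to_at_transition(ref, alt):
    """True for C>T or G>A, the two ways a G-C pair becomes an A-T pair
    by a transition when alleles are reported on the reference strand."""
    return (ref.upper(), alt.upper()) in {("C", "T"), ("G", "A")}


def transition_counts(snps):
    """Return (n_gc_to_at, n_other) for an iterable of (ref, alt) pairs."""
    snps = list(snps)
    gc_to_at = sum(1 for ref, alt in snps if is_gc_to_at_transition(ref, alt))
    return gc_to_at, len(snps) - gc_to_at


def gc_to_at_fraction(snps):
    """Fraction of SNPs that are G-C to A-T transitions, or None if no SNPs."""
    gc_to_at, other = transition_counts(snps)
    total = gc_to_at + other
    return gc_to_at / total if total else None
```

Because every other substitution class serves as the background, the fraction can be compared between ung and wild-type samples even when the absolute number of detected SNPs differs with read depth.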
Alternative Repair Pathway Genes
Because the ung mutants show increased double-strand break repair but not an increase of G-C to A-T transition mutations, we infer that the inevitable appearance of uracil in the DNA is repaired via conversion of a G-U pair to a double-strand break and efficiently repaired by the DSBR pathway. If this is true, genes involved in the DSBR processes of breakage, homology surveillance and strand invasion in mitochondria will be up-regulated in ung mutants. To test this hypothesis, we assayed transcript levels of several candidate genes known to be involved in DSBR (8,17,20-22,24,25,40-42) in ung lines compared to wild-type using RT-PCR. MSH1 and RECA2 were upregulated in ung lines (9.68 fold ± 4.64 and 5.39 fold ± 0.98, respectively – see Figure 3). RECA3, OSB1, SSB, and RECG showed no differential expression compared to wild-type.
To investigate the differences in ROUS recombination between ung lines and wild-type lines treated with CIP, expression of DSBR genes was compared between wild-type plants with and without CIP treatment. MSH1 was downregulated in wild-type plants grown in the presence of CIP compared to untreated control plants (0.42 fold ± 0.1; see Figure 3).
DISCUSSION
In the mitochondrion as well as in the nucleus and chloroplast, cytosine is subject to deamination to uracil. This could potentially lead to transition mutations, and is dealt with by a specialized base excision repair pathway. The first step in this pathway is hydrolysis of the glycosidic bond by the enzyme Uracil-N-glycosylase (UNG), leaving behind an abasic site (11). An AP endonuclease can then cut the DNA backbone, producing a 3’ OH and a 5’ dRP. Both DNA polymerases found in Arabidopsis mitochondria, POL1A and POL1B, exhibit 5’-dRP lyase activity, allowing them to remove the 5’ dRP and polymerize a new nucleotide replacing the uracil (43). In the absence of functional UNG protein, cytosine will still be deaminated in plant mitochondrial genomes, so efficient removal of uracil must be through a different repair mechanism, most likely DSBR (9,10). We have found that in ung mutant lines, there is an increase in double-strand break formation as assayed by an increase in ectopic recombination at ROUS, as well as an increase in the expression of genes known to be involved in DSBR, consistent with this hypothesis.
Surprisingly, we have also found that ung lines do not accumulate G-C to A-T transition mutations at a higher rate than wild-type. This finding is particularly surprising given the possible bottlenecking of mitochondrial genomes during female gametogenesis, and given the deliberate bottleneck in the experimental design of maintaining the different lines through single-seed descent for 10 generations. This finding supports the hypothesis that plant mitochondria have a very efficient alternative mismatch or damage surveillance system that prevents G-C to A-T transitions from becoming fixed in the mitochondrial population, but that the side-effect of increased ectopic recombination may be the selective pressure that maintains the UNG pathway.
The angiosperm MSH1 protein consists of a DNA mismatch binding domain fused to a double-stranded DNA endonuclease domain (26,27). Although mainly characterized for its role in recombination surveillance (22), MSH1 is a good candidate for a protein that may be able to recognize and bind to various DNA lesions and make DSBs near the site of the lesion, thus funneling these types of damage into the DSBR pathway. With many mitochondria and many mitochondrial genomes in each cell there are numerous available templates to accurately repair DSBs through homologous recombination, making this a plausible mechanism of genome maintenance. Here we show that in ung lines, MSH1 is transcriptionally upregulated more than 9-fold compared to wild-type. This further supports the hypothesis that MSH1 initiates repair in plant mitochondria by creating a double-strand break at G-U pairs, and possibly other mismatches and damaged bases.
Several other proteins involved in processing plant mitochondrial DSBs have been characterized. The RECA homologs, RECA2 and RECA3, are homology search and strand invasion proteins (21,22,25,42,44-46). The two mitochondrial RECAs share much sequence similarity; however, RECA2 is dual targeted to both the mitochondria and the chloroplast, while RECA3 is found only in the mitochondria (21,22). RECA3 also lacks a C-terminal extension present on RECA2 and most other homologs. This extension has been shown to modulate the ability of RECA proteins to displace competing ssDNA binding proteins in E. coli (47). Arabidopsis reca2 mutants are seedling lethal and both reca2 and reca3 lines show increased ectopic recombination at ROUS (21). Arabidopsis RECA2 has functional properties that RECA3 lacks, such as complementing a bacterial recA mutant during the repair of UV-C induced DNA lesions (42). Here we show that in ung lines, RECA2 is transcriptionally upregulated more than 5-fold compared to the wild-type. However, RECA3 is not upregulated in ung lines. Responding to MSH1-initiated DSBs may be one of the functions unique to RECA2. The increased expression of RECA2 in the absence of a functional UNG protein is further evidence that uracil arising in DNA may be repaired through the mitochondrial DSBR pathway.
We also tested the differential expression of other genes known to be involved in processing mitochondrial DSBs. The single stranded binding protein genes OSB1 and SSB, and the helicase RECG were not found to be differentially expressed at the transcript level compared to wild-type (data not shown).
The specific patterns of recombination at mitochondrial ROUS are different between wild-type, ung mutants, and DSBR mutants. In msh1 lines, there is an increase in ROUS recombination likely due to relaxed homology surveillance in the absence of the MSH1 protein (22). In ung lines, the mitochondrial recombination machinery is still intact, so any differences in ROUS recombination between ung lines and wild-type are solely due to differences in processing uracil in DNA with or without the UNG protein. At many ROUS, when DSBs are chemically induced by ciprofloxacin, parental type repeats are reduced while recombinant type repeats increase in prevalence (24). The ung lines show reductions in parental type repeats, but no change in recombinant type repeats. One explanation for this behavior is that the upregulation of MSH1 in the ung lines leads to an increase in DSBs at deaminated cytosine and reduces the accumulation of recombinant ROUS. In contrast, there is a downregulation of MSH1 in plants exposed to 0.5µM CIP (Figure 3). This downregulation would lead to relaxation of the homology surveillance activity of MSH1, making accumulation of recombinant ROUS more common. Such asymmetric accumulation of recombination products has been documented in plant mitochondrial genomes (17,18,21,22,48).
To determine the outcomes of genomic uracil in the absence of a functional UNG protein, we sequenced the genomes of several ung lines. No fixed mutations of any kind were found in ung lines, even after 10 generations of self-crossing. Low frequency heteroplasmic SNPs were found in both wild-type and ung lines, but ung lines showed no difference in the ratio of G-C to A-T transitions to other mutation types when compared to wild-type.
Clearly the double-strand break repair pathway in plant mitochondria can repair uracil in DNA sufficiently to prevent mutation accumulation, providing evidence that DSBR is the primary mechanism of repair. Why then has the BER pathway been conserved in plant mitochondria where NER and MMR have apparently been lost? DSBR protects the genome efficiently from mutations, but also leads to an increase in ectopic recombination and rearrangements in plant mitochondria. We suggest that for DNA lesions canonically repaired through NER or MMR pathways, the slight increase in ectopic recombination due to repair by DSBR does not incur any fitness cost compared to the strong selection for accurate repair of genes. In Arabidopsis, the most common type of heteroplasmic SNP found in the mitochondrial genome is the G-C to A-T transition; cytosine deamination is therefore a very common type of DNA damage in plant mitochondria. Perhaps the increased ectopic recombination that results from the repair of these very common DNA lesions provides just enough selective pressure to maintain an intact BER pathway in plant mitochondria. The UNG protein is also transported to the plastid (7). UNG might therefore be retained in the mitochondria due to selection in the plastid.
Here we have provided evidence that in the absence of a dedicated BER pathway, plant mitochondrial genomes do not accumulate SNPs at an increased rate. Instead, DNA damage is accurately repaired by double-strand break repair which also causes an increase in ectopic recombination at homeologous repeats. Do plant mitochondria even need base excision repair? The answer appears to be “barely.” It has recently been shown that mice lacking a different mitochondrial BER protein, oxoguanine glycosylase, do not accumulate mitochondrial SNPs (49). Here we show that in plants base-excision repair by UNG is similarly unnecessary to prevent mitochondrial mutations. Perhaps a generalized system of DNA repair also exists in mammalian mitochondria similar to the broad capacity of DSBR to repair different lesions in plant mitochondria. Clearly DSBR is efficient and accurate, and the presence of the UNG pathway reduces ectopic recombination slightly. Double strand break repair and recombination are important mechanisms in the evolution of plant mitochondrial genomes, but many key enzymes and steps in the repair pathway are still unknown. Further identification and characterization of these missing steps is sure to provide additional insight into the unique evolutionary dynamics of plant mitochondrial genomes.
Data Availability
Fastq files generated from Illumina sequencing of ung lines and wild-type control are available from the Sequence Read Archive, BioProject ID PRJNA492503.
Funding
This work was supported by a grant from the National Science Foundation to A.C.C. (MCB-1413152), and used the Vincent J. Coates Genomics Sequencing Laboratory at UC Berkeley, supported by NIH S10 OD018174 Instrumentation Grant.
Conflict of Interest Statement
None declared.
Acknowledgements
We are grateful to Beth Rowan, UC Davis, for advice and protocols for DNA extraction and Illumina sequencing. Thanks to Emma Purfeerst for keeping our lab running and putting up with us during her time here and to Emily Jezewski for help with qPCR. Thanks to Sterling Ericsson and Ana Martinez-Hottovy for discussion and comments on the manuscript. Daniel Schactman helped with the disposal of leaves from generations 2 through 9 of the ung lineages.