ABSTRACT
APOBEC3 deaminases (A3s) provide mammals with an anti-retroviral barrier by catalyzing dC-to-dU deamination on viral ssDNA. Within primates, A3s have evolved diversely via gene duplications and fusions. Human APOBEC3C (hA3C) efficiently restricts the replication of viral infectivity factor (vif)-deficient Simian immunodeficiency virus (SIVΔvif), but for unknown reasons, it inhibits HIV-1Δvif weakly. In catarrhines (Old World monkeys and apes), the A3C loop 1 displays the conserved amino acid pair WE, while the corresponding consensus sequence in A3F and A3D is the largely divergent pair RK, which is also the inferred ancestral sequence for the last common ancestor of A3C|D|F in primates. Here, we report that modifying the WE residues in hA3C loop 1 to RK leads to stronger interactions with ssDNA substrate, facilitating catalytic function, which resulted in a drastic increase in both deamination activity and the ability to restrict HIV-1 and LINE-1 replication. Conversely, the modification hA3F_WE resulted only in a marginal decrease in HIV-1Δvif inhibition. The two series of ancestral gene duplications that generated A3C, A3D-CTD and A3F-CTD allowed neo/subfunctionalization: A3F-CTD maintained the ancestral RK residues in loop 1, while strong evolutionary pressure selected for the RK→WE modification in catarrhines A3C, possibly allowing for novel substrate specificity and function.
AUTHOR SUMMARY
The restriction factors of the APOBEC3 (A3) family of cytidine deaminases inhibit the replication of Vif-deficient retroviruses mainly by mutating their viral genomes. While there are seven A3 proteins (A3A-A3H) found in humans, only A3G and A3F potently inhibit HIV-1 replication. A3C in general and its retroviral restriction capacity have not been widely studied, probably due to its weak anti-HIV-1 activity; however, it displays a strong antiviral effect against SIV. Understanding the role of A3C is important because it is highly expressed in CD4+ T cells, is upregulated upon HIV-1 infection, and is distributed cell-wide. In this study, we report that replacing two residues in loop 1 of the A3C protein with conserved positively charged amino acids enhances substrate DNA binding, which markedly facilitates its deamination-dependent antiviral activity against HIV-1 and increases the restriction of LINE-1 retroelements. Furthermore, our evolutionary analysis demonstrates that the pressure that caused the loss of potential loop 1 residues occurred only in A3C but not in primate homologues. Overall, our study highlights the possibility of A3C acting as a super restriction factor; however, this was likely evolutionarily selected against to achieve a balance between anti-viral/anti-LINE-1 activity and genotoxicity.
INTRODUCTION
The APOBEC3 (A3) family of single-stranded (ss) DNA cytidine deaminases constitutes an intrinsic immune defense against retroviruses, retrotransposons, and other viral pathogens [1–4]. There are seven human A3 proteins (A3s) that possess either one (A3A, A3C, and A3H) or two (A3B, A3D, A3F, and A3G) zinc (Z)-coordinating DNA cytosine deaminase motifs, HXE[X23-28]PC[X2-4]C (where X indicates a non-conserved position) [5–7]. A3G was identified as a factor capable of restricting infection of HIV-1 lacking the Vif (viral infectivity factor) protein in non-permissive T cell lines, and its biochemical properties and biological functions have been extensively studied [3,8–11].
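As an illustration of how the motif notation HXE[X23-28]PC[X2-4]C can be read, the following minimal sketch translates it into a regular expression and scans a protein sequence for matches. The function name and the toy sequence are hypothetical and chosen only for demonstration; the toy sequence is synthetic and is not an A3 sequence.

```python
import re

# HXE[X23-28]PC[X2-4]C, with X standing for any residue.
ZDD_MOTIF = re.compile(r"H.E.{23,28}PC.{2,4}C")


def find_zdd_motifs(protein_seq):
    """Return (start, end) spans (0-based, end-exclusive) of non-overlapping motif matches."""
    return [m.span() for m in ZDD_MOTIF.finditer(protein_seq.upper())]


# Synthetic toy sequence (NOT a real A3 protein): H-A-E, 25 spacer residues,
# P-C, 3 spacer residues, C, flanked by two residues on each side.
toy = "MK" + "HAE" + "G" * 25 + "PC" + "S" * 3 + "C" + "LL"

print(find_zdd_motifs(toy))        # one motif, as in a single-domain protein
print(find_zdd_motifs(toy + toy))  # two motifs, as in a double-domain protein
```

Counting the matches in this way distinguishes single-domain from double-domain proteins in the sense used above, under the assumption that the spacer lengths fall within the stated ranges.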
The encapsidation of A3 into the viral particles is crucial for virus inhibition [12–17]. During reverse transcription, viral core-associated A3 enzymes can deaminate cytidines (dC) on the retroviral ssDNA into uridines (dU). These base modifications in the minus DNA strand cause coding changes and premature stop codons in the plus-strand viral genome (dG→dA hypermutation), which impair or suppress viral infectivity [2,9,18–21]. In addition to the mutagenic activity of virion-incorporated A3 enzymes, deaminase-independent mechanisms of restriction have been described, in which A3s impede reverse transcription or inhibit DNA integration [22–27]. To counteract A3-mediated inhibition, lentiviruses evolved the Vif protein, which physically interacts with A3s to target them for polyubiquitination and proteasomal degradation, thereby depleting the cellular A3s [28–30]. These A3-Vif interactions are often species-specific [31–35].
A3D, A3F, A3G, and A3H were shown to restrict HIV-1 lacking vif (HIV-1Δvif) [2,35–39]. Recently, mutation signatures resulting from the catalytic activity of nuclear localized A3s (especially A3A, A3B, and likely A3H) were reported in several cancer types [40, 41] (for reviews, see [42–45]). However, A3C, which is distributed in both cytoplasm and nucleus [46], seems not to be a causative agent of chromosomal DNA mutations. Human A3C is known to act as a potent inhibitor of Simian immunodeficiency virus from African green monkey (SIVagm) and from rhesus macaque (SIVmac); it limits the infectivity of herpes simplex virus, certain human papillomaviruses, murine leukemia virus, Bet-deficient foamy virus, and hepatitis B virus, and represses the replication of LINE-1 (L1) retrotransposons [46–56]. However, the restrictive role of A3C on HIV-1 is marginal and there are several contradictory findings regarding its viral packaging and cytidine deamination activity [39,47,57–59]. Notably, A3C is expressed ubiquitously in lymphoid cells [5,47,60,61]. mRNA expression levels of A3C were found to be higher in HIV-infected CD4+ T lymphocytes [39, 47], and significantly elevated in elite controllers with respect to ART-suppressed individuals [62]. A3C was found to moderately deaminate HIV-1 DNA if expressed in target cells of the virus, and it increased viral diversity rather than causing restriction [60].
The crystal structure of A3C and its HIV-1 Vif-binding interface were reported recently [63]. The study revealed several key residues in the hydrophobic V-shaped groove formed by the α2 and α3 helices of A3C that facilitate Vif binding resulting in proteasome-mediated degradation of A3C [63]. We extended this finding and identified additional Vif interaction sites in α4 helix of A3C [64]. Other than a previous study that predicted putative DNA substrate binding pockets [52], biochemical and structural aspects of A3C enzymatic activity and their relevance for antiviral activity are not well investigated to date [3, 4].
Recently, we have shown that increasing the catalytic activity of A3C by an S61P substitution (based on the structural homology found between A3C and A3F at their C-terminal domain, A3F-CTD) is not sufficient to inhibit HIV-1Δvif [65]. It is unclear why A3C can potently restrict SIVΔvif, but not HIV-1Δvif despite the fact that the wild-type human enzyme possesses reasonable catalytic activity and encapsidates efficiently into retroviral particles [65]. Here we set out to understand the function of A3C in the context of HIV-1 inhibition. We generated a synthetic open reading frame derived from sooty mangabey monkey genome (smm, Cercocebus atys (torquatus) lunulatus), encoding for an A3C-like protein (hereafter called smmA3C-like protein) capable of restricting HIV-1 to similar or higher extent than human A3G. This A3C-like protein was reported to be resistant to HIV-1 Vif-mediated depletion [64]. Using this smmA3C-like protein as a tool, here we dissect the structure-function of hA3C and identify the crucial regions of A3C that facilitate stronger inhibition of HIV-1.
RESULTS
Identification of an A3Z2 protein with enhanced antiviral activity
To determine whether A3C from non-human primates can potently restrict HIV-1Δvif propagation, we produced HIV-1Δvif luciferase reporter virus particles with A3C (an A3Z2 protein) from human, rhesus macaque, chimpanzee (cpz), African green monkey (agm), with human A3G (an A3Z2-Z1, double domain protein), or a synthetic smmA3C-like protein and tested their viral infectivity. Viral particles were pseudotyped with the glycoprotein of Vesicular stomatitis virus (VSV-G) and normalized by reverse transcriptase (RT) activity before infection. The luciferase enzyme activity of infected cells was quantified two days post infection. Figure 1A shows the level of relative infectivity of HIV-1Δvif in the presence of the tested A3C proteins and hA3G. Human, rhesus, chimpanzee, and African green monkey A3C proteins reduced the relative infectivity of HIV-1Δvif similarly by approximately 60 to 70%.
Conversely, smmA3C-like protein inhibited HIV-1Δvif replication by more than one order of magnitude (Fig. 1A). Human A3G served as a positive control. Expression of the A3s in viral vector-producing cells showed that expression levels of smmA3C-like protein and agmA3C were lower than those of A3Cs from human, rhesus, and cpz (Fig. 1B). Viral incorporation of smmA3C-like protein was found to be very similar to hA3G, but much less efficient compared to hA3C (Suppl. Fig. S1A).
The smmA3C-like construct was originally described to express A3C of smm [64]. However, using alignments of primate A3Z2 and related A3 proteins, we later found that the generated open reading frame consists of exons encoded by genes of smmA3C and smmA3F. In the smmA3C-like construct, the first “exon” (encoding for amino acids 1MNPQIR6) and last “exon” (encoding for amino acids 153FKYC to EILE190) were derived from smmA3C (smmA3C exon 1 and exon 4) while the second “exon” (encoding for amino acids 7NPMK to FRNQ58) and third “exon” (encoding for amino acids 59VDPE to VDPE151) in smmA3C-like were of smmA3F origin (smmA3F C-terminal domain, CTD, exon 5 and exon 6) (Suppl. Fig. S2). Poor annotation of the smm genome and the high sequence similarity let us fuse these exons that were derived from smmA3C and smmA3F during the amplification step. To compare smmA3C-like to the wild-type proteins, we cloned the genuine smmA3C and smmA3F-CTD and tested their activity. We found that only the smmA3C-like protein and not smmA3C protein showed enhanced cytidine deaminase activity (Suppl. Fig. S3). The smmA3F-CTD construct failed to express detectable levels of protein in transfected cells (Suppl. Fig. S3).
To study G-to-A mutations on the plus-strand of viral DNA triggered by A3 in vivo activity, we routinely used a method called “3D-PCR” [65, 66]. DNA sequences in which the cytosines were deaminated by A3 activity contain fewer GC base pairs than non-edited DNA, resulting in a lower melting temperature than that of the original, non-edited DNA. Therefore, successful amplification at lower denaturation temperatures (Td) (83.5 - 87.6°C) by 3D-PCR indicates the presence of A3-edited sequences. Because restriction of HIV-1Δvif by smmA3C-like protein was similar or slightly stronger than restriction by hA3G (Fig. 1A), we analyzed the DNA editing capacity of these A3s during infection by 3D-PCR on the viral genome. 3D-PCR amplification with samples of cells infected with HIV-1Δvif viruses encapsidating hA3C, rhA3C, cpzA3C, or agmA3C yielded amplicons down to a Td of 86.3°C, whereas the activity of smmA3C-like protein on the same substrate allowed amplification at a lower Td of 84.2°C. In control reactions using virions produced in the presence of hA3G, PCR amplification of viral DNA was detectable at lower Td (85.2°C and weakly at 84.2°C) (Fig. 1C). Importantly, using the vector control sample (no A3), PCR amplicons could be amplified only at higher Td (87.6°C). In our previous study, we compared in detail the hypermutation load and patterns induced by A3C, A3G, and A3F in retroviruses and found that in HIV-1Δvif, the G→A mutation rate induced by hA3C and A3C.S61P (Suppl. Fig. S4, an A3C mutant with enhanced deamination activity against SIVagmΔvif) was about 6%, whereas the mutation rate triggered by A3G and A3F was above 15% [65]. To study the effect of smmA3C-like protein in HIV-1Δvif, PCR products generated on smmA3C-like protein-edited samples formed at 84.2°C were cloned and independent clones were sequenced. smmA3C-like protein caused hypermutation in HIV-1Δvif with a rate of 17.16% and predominantly favored the expected GA dinucleotide context (Suppl. Fig. S5).
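To make the notions of G→A mutation rate and dinucleotide context concrete, the sketch below shows one plausible way to compute both from a reference sequence and a sequenced clone. It is not the analysis pipeline used in this study: the function name is hypothetical, the sequences are toy data, the rate is assumed to be the share of reference G positions that read A in the clone, and the two sequences are assumed to be pre-aligned without gaps.

```python
from collections import Counter


def g_to_a_stats(reference, clone):
    """Count plus-strand G->A changes and the base following each edited G.

    Assumes both sequences are pre-aligned, gap-free, and of equal length.
    Returns (rate_percent, context_counts); the rate is the percentage of
    reference G positions that read A in the clone.
    """
    if len(reference) != len(clone):
        raise ValueError("sequences must be aligned to equal length")
    ref, seq = reference.upper(), clone.upper()
    total_g = ref.count("G")
    contexts = Counter()
    edited = 0
    for i, (r, c) in enumerate(zip(ref, seq)):
        if r == "G" and c == "A":
            edited += 1
            # Dinucleotide context: the edited G plus the next reference base.
            if i + 1 < len(ref):
                contexts["G" + ref[i + 1]] += 1
    rate = 100.0 * edited / total_g if total_g else 0.0
    return rate, contexts


# Toy data (hypothetical, not taken from the study).
rate, contexts = g_to_a_stats("AGATGGCGAT", "AAATGGCAAT")
print(rate, dict(contexts))
```

In the toy example, two of the four reference G positions are edited and both lie in a GA context, so a preference for GA over GG, GC, or GT would show up directly in the returned counter.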
In addition, we have applied qualitative in vitro cytidine deamination assays using A3 proteins isolated from HIV-1Δvif and SIVagmΔvif viral particles [67, 68]. This PCR-based assay depends on the sequence change caused by A3 converting a dC→dU in an 80-nucleotide (nt) ssDNA substrate harboring the A3C-specific TTCA motif. Catalytic deamination of dC→dU by A3C is then followed by a PCR that replaces dU by dT generating an MseI restriction site. The efficiency of MseI digestion was monitored by using a similar 80-nt substrate containing dU instead of dC in the recognition site. As expected, hA3C and hA3C.S61P, encapsidated into the HIV-1Δvif particles, did not yield a considerable product resulting from ssDNA cytidine deamination [65], however, smmA3C-like protein formed high amounts of deamination products (Fig. 1D). Using smmA3C-like protein, the deamination products were observed even after transfection of 10-fold smaller amounts of expression plasmid during virus production. In contrast, A3C and A3C.S61P proteins isolated from SIVagmΔvif particles but not from HIV-1Δvif particles produced the expected deamination products, whereas smmA3C-like protein exhibited the strongest catalytic activity, regardless of the source (Fig. 1D). Taken together, we conclude that smmA3C-like protein inhibits HIV-1 by cytidine deamination causing hypermutation of the viral DNA.
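The readout logic of this assay can be illustrated with a short in silico sketch: deamination of the target dC followed by PCR turns C into T, and the product becomes cleavable only if this creates the MseI recognition sequence TTAA. The substrate below is a short hypothetical sequence, not the 80-nt oligonucleotide used here, and it assumes that the TTCA motif is followed by an A (TTCAA), so that conversion yields TTTAA. The function names are likewise illustrative.

```python
MSEI_SITE = "TTAA"  # MseI recognition sequence


def deaminate_and_amplify(ssdna, motif="TTCA"):
    """Model dC->dU at the C of every motif occurrence, then PCR replacing dU by dT."""
    seq = ssdna.upper()
    target_offset = motif.index("C")
    bases = list(seq)
    start = seq.find(motif)
    while start != -1:
        bases[start + target_offset] = "T"  # dC -> dU, read as dT after PCR
        start = seq.find(motif, start + 1)
    return "".join(bases)


def msei_cuttable(dna):
    """True if the sequence contains at least one MseI recognition site."""
    return MSEI_SITE in dna.upper()


# Hypothetical substrate: TTCA followed by A, with no pre-existing TTAA.
substrate = "GGATTGGTTCAAGGTAGG"
product = deaminate_and_amplify(substrate)

print(msei_cuttable(substrate))  # unedited substrate
print(product, msei_cuttable(product))  # edited and amplified product
```

An unedited substrate therefore stays uncut, whereas an edited one gains a restriction site, which is what allows deamination to be scored by digestion.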
Identification of the regulatory domain of smmA3C-like protein that mediates HIV-1 restriction
Amino acid sequence identity and similarity between hA3C and smmA3C-like protein reach 77.9% and 90%, respectively (Suppl. Fig. S4A). To facilitate the identification of distinct determinants of smmA3C-like protein that confer HIV-1 inhibition, ten different hA3C/smmA3C-like chimeras were constructed [64] (Fig. 2A). We first tested the anti-HIV-1Δvif activity of these A3C chimeras. Viral particles containing different chimeric proteins were produced and their infectivity was tested. As shown in Fig. 2B, chimeras C2, C4, and C8 strongly reduced the infectivity of HIV-1Δvif. Especially, chimera C2 (hA3C harboring a swap of 36 residues of the smmA3C-like protein at the N-terminal end) inhibited HIV-1Δvif replication by about two orders of magnitude. On the contrary, chimeras C6 and C9 reduced viral infectivity by 72% relative to vector control (Fig. 2B).
Next, we determined the intracellular expression and virion incorporation efficiency of the chimeras by immunoblotting. Chimeras C2, C3, C5, C7, and C9, which contain residues 37 to 76 of hA3C (Fig. 2A), were more highly expressed than C1, C4, C6, and C10 (Fig. 2C). Specifically, chimera C2 displayed higher protein levels than hA3C while C10 was below the detection threshold. Chimeras C2, C4, C6, C7, and C9 were found to be encapsidated in HIV-1Δvif (Fig. 2C, viral lysate). Notably, C3 and C5 were less efficiently packaged into viral particles although they were present at higher intracellular expression levels. Conversely, C6 produced less protein but its viral incorporation was higher than that of C3 or C5. In addition, we analyzed the in vitro cytidine deaminase activity of these chimeras as described above (Fig. 2D). Here we used lysates of transfected HEK293T cells to readily evaluate the catalytic activity of the chimeric A3Cs. As demonstrated in Fig. 2D, only C2 and C4 generated amounts of deamination products similar to those produced by smmA3C-like protein.
Taken together, chimeras C2 and C4 strongly restricted HIV-1Δvif and are characterized by corresponding in vitro deamination activity. C6, by contrast, lost any antiviral and deamination activity, suggesting that the N-terminal region of smmA3C-like protein is crucially involved in the antiviral mechanism. We speculate that residues in C2 and C4 that are absent in C6 complemented the restriction activity of these chimeras. Due to its superior antiviral activity we mainly focused on chimera C2 in our following experiments.
Synergistic effects of residues in the RKYG motif of chimera C2 and smmA3C-like protein govern their potent antiviral activity
To identify the specific residues in C2 that are essential for its anti-HIV-1 activity, we targeted two N-terminal motifs of C2, namely 13DPHIFYFH20 (abbreviated “DHIH”) and 24LRKAYG29 (named “RKYG”) as presented in the sequence alignments of Suppl. Fig. S4A, and generated more variants of C2 by swapping one, two, or four amino acids with the analogous residues of hA3C as presented in Fig. 3A. First, we cloned the C2 variants C2.DH-YG (YGTQ motif of helix α1) and C2.RKYG-WEND (WEND motif of loop 1, see A3C alignment and ribbon diagram Suppl. Fig. S4) and tested their anti-HIV-1 and deamination activity. This pilot experiment revealed that the loop 1 motif RKYG but not the α1 helix motif DHIH in C2 is essential for its activity (data not shown). Hence, we constructed the mutants C2.R25W, K26E, Y28N, and G29D (Fig. 3A) and tested them for catalytic and antiviral activity. Since the in vitro deaminase activity of the chimeras C1 to C10 correlated with their antiviral activity (Fig. 2B and 2D), we expressed these variants of C2 in HEK293T cells and performed in vitro deamination assays. The results of the deamination assay clearly demonstrated that the DH motif in C2 is not relevant for its potent catalytic activity, as C2.DH-YG acted similarly to C2 (Fig. 3B), but mutation of the RKYG motif in the RKYG-WEND variant resulted in a loss of deamination activity (Fig. 3B). Interestingly, none of the single amino acid changes in RKYG (C2.R25W, K26E, Y28N, and G29D) resulted in a loss of function of C2, although the catalytic activities of R25W and K26E were partially reduced (Fig. 3B). Consistent with the data obtained from the in vitro assay, the chimeric C2.RKYG-WEND variant failed to restrict the infectivity of HIV-1Δvif, while C2 and its point mutants strongly inhibited the virus (Fig. 3C). Immunoblot analysis of cell and viral lysates further confirmed that cellular expression and viral encapsidation of these variants were comparable (Fig. 3D).
Finally, to test the in vivo DNA editing capacity, we did 3D-PCR analysis using C2, C2.DH-YG, and C2.RKYG-WEND variants. As presented in Fig. 3E, only HIV-1Δvif particles produced in the presence of A3C chimera C2 and its mutant C2.DH-YG harbored viral DNA that was detected by PCR products at low-denaturation temperature and C2.RKYG-WEND behaved similarly to the vector control (Fig. 3E). Likewise, replacing RKYG with WEND in the smmA3C-like protein inhibited its antiviral activity (Figs. 4A and 4B) and deamination activity of HIV-1 genomes (Fig. 4C) as did the active site mutant E68A.
The WE-RK mutation in loop 1 of hA3C determines its strong deaminase-dependent antiviral function
Mutational changes of the RKYG motif to WEND residues in loop 1 of C2 and smmA3C-like protein resulted in complete loss of enzymatic functions and anti-HIV-1 activities (Figs. 3C and 4A). To identify the residues in hA3C that are critically required for the deaminase-dependent antiviral activity against HIV-1Δvif, we modified the loop 1 of hA3C with 25WE26>>25RK26 and 28ND29>>28YG29 residues and compared their antiviral capacity (please see A3C alignment and ribbon diagram Suppl. Fig. S4). As controls, we included additional mutants such as a catalytically inactive Zn2+-coordinating C97 mutant, A3C.C97S [52], and the variants A3C.S61P [65] and A3C.S188I [69] (Suppl. Fig. S4A) exhibiting enhanced deaminase activity. Compared to wild-type hA3C, WE-RK greatly enhanced inhibition of HIV-1Δvif, and the ND-YG variant behaved like wild-type A3C, while S61P and S188I have demonstrated only marginally increased HIV-1Δvif restriction (Fig. 5A). Importantly, mutant A3C.C97S did not inhibit HIV-1Δvif (Fig. 5A).
Next, we generated active site mutants to analyze if the antiviral activity of A3C.WE-RK is deamination-dependent. To achieve this, we introduced a C97S mutation in each of these constructs. Additionally, we compared the ancillary effect of mutants such as S61P [65] and S188I [69] by introducing these mutations in the WE-RK variant of A3C. As expected, the inhibitory activities of A3C.WE-RK, A3C.WE-RK.S61P, and A3C.WE-RK.S61P.S188I against HIV-1Δvif were abolished by active site ablating mutation C97S, indicating the importance of the enzymatic activity of A3C (Fig. 5B). Introducing either the single mutation S61P or the double mutation S61P.S188I did not considerably change the action of A3C.WE-RK (Fig. 5B). Immunoblot analysis of cell and viral lysates demonstrated that hA3C and all mutants (except A3C.WE-RK.S61P.S188I.C97S mutant) expressed a comparable level of protein (Fig. 5C).
However, viral incorporation of A3C.C97S, A3C.WE-RK.C97S, A3C.WE-RK.S61P.C97S, and WE-RK.S61P.S188I.C97S was slightly decreased relative to that of wild-type and mutant proteins that do not contain the C97S mutation (Fig. 5C). Moreover, we confirmed the effects of these mutants on HIV-1Δvif propagation by 3D-PCR (Fig. 5D) and deamination assay in vitro (Fig. 5E). In both assays, we found that the C97S mutation destroys the function of all A3C variants. Thus, we conclude that the loop 1-mediated enhanced activity of hA3C.WE-RK is dependent on catalytic deamination.
The RK-WE mutation in loop 1 moderately reduces the antiviral activity of hA3F
The residues 25RK26 in loop 1 of smmA3C-like protein are derived from exon 5 of the A3F gene, in which exons 5 to 7 encode the A3F-CTD, and are conserved in primate A3F proteins (Suppl. Fig. S2). Various loops within A3F-CTD were recently investigated with respect to their role in substrate binding and enzyme function [70], but it was not possible to unravel the antiviral activity of A3F-CTD, mainly due to difficulties in expressing this domain in human cells as found earlier [65, 71]. hA3C and hA3F-CTD display 77% sequence similarity, reflecting a common evolutionary origin [6]. Importantly, the antiviral activity of hA3F is mediated by its CTD [72, 73]. To test the impact of RK residues in loop 1 of the hA3F-CTD, we compared the antiviral activity of hA3F with that of A3F.RK-WE against HIV-1Δvif. hA3F and hA3F.RK-WE expressed similar amounts of protein and were equally encapsidated in HIV-1 particles (Fig. 6A). However, A3F.RK-WE exhibited an approximately two-fold decreased capacity to inhibit HIV-1Δvif compared with wild-type A3F (Fig. 6B). Consistent with this, A3F.RK-WE showed decreased mutation efficiency compared with wild-type A3F (Figs. 6C and 6D), in agreement with data presented in a recent report [70]. Thus, we conclude that loop 1 with its residues RK in the CTD of A3F is important for hA3F’s enzymatic function.
Inhibition of LINE-1 retrotransposition by A3C variants
Since A3C and A3F restrict endogenous LINE-1 (L1) retrotransposition activity by 40-75% and 66-85% [46,56,74,75], respectively, we set out to elucidate how the WE and the RK residues in loop 1 of hA3C and hA3F, respectively, affect the L1 inhibiting activity. To this end, we quantified the L1-inhibiting effect of human wild-type A3A, A3C, and A3F proteins and their mutants hA3C.WE-RK, hA3C.WE-RK.S61P, and hA3F.RK-WE by applying a dual-luciferase retrotransposition reporter assay [76]. In this cell culture-based assay, the firefly luciferase gene is used as the reporter for L1 retrotransposition and the Renilla luciferase gene is encoded on the same plasmid for transfection normalization (Fig. 7A). Consistent with previous reports, overexpression of hA3A, hA3C, and hA3F resulted in inhibition of L1 reporter retrotransposition by approximately 94%, 68%, and 56%, respectively (Fig. 7B). The mutant hA3C.WE-RK displayed an increased L1-restricting effect (from 68% to ∼96%), and the introduction of the additional mutation S61P (hA3C.WE-RK.S61P) did not further increase the ability of the enzyme to restrict L1 mobilization (Fig. 7B). Notably, hA3F and the mutant hA3F.RK-WE exhibited a comparable level of L1 restriction, indicating that regions other than loop 1 of A3F-CTD and, probably, the NTD (N-terminal domain) of hA3F are involved in L1 restriction (Fig. 7B). Immunoblot analysis of cell lysates of co-transfected HeLa-HA cells demonstrated comparable expression of the L1 reporter and of the HA-tagged A3 and A3 mutant proteins (Suppl. Fig. S7). These findings indicate that the WE-RK mutation in hA3C enhances its L1 inhibiting activity. Based on the observed antiviral activity and the L1 restricting effect of hA3C.WE-RK, we hypothesize that the introduction of these positively charged residues in hA3C significantly fosters its interaction with nucleic acids, which was recently reported to mediate its L1 inhibiting activity [56].
The positively charged residues R25 and K26 in A3C-C2 form salt-bridges with the backbone of the ssDNA
The structural model of hA3C variant C2 binding to ssDNA, which is based on the ssDNA-bound crystal structure of A3A, shows a cytidine residue in the active center of hA3C-C2 (Fig. 8A). However, the ssDNA fragment, which was co-crystallized with hA3A, is too short to interact with residues 25, 26, 28, and 29, which differ between hA3C WT and the C2 variant. Hence, this binding mode model cannot explain why C2 has a higher cytidine deaminase activity than hA3C WT. To assess the binding to a longer ssDNA fragment, we generated a complex model of ssDNA bound to the NTD of rhesus macaque A3G (rhA3G) [77], similar to the ssDNA-bound A3F-CTD model built previously [78], and aligned the crystal structure of hA3C WT and the model of C2 to this complex (Figs. 8B, 8C, and 8D). The positively charged residues R25 and K26 in C2 form salt-bridges with the backbone of the ssDNA (Fig. 8D) in contrast to hA3C WT (Fig. 8C). Additionally, Y28 of C2 can form π-π-stacking interactions with the aromatic DNA bases (Fig. 8D). Thus, these three residues can form stronger interactions with ssDNA in C2 than their counterparts in hA3C. This finding may explain the enhanced cytidine deaminase activity of C2 compared to hA3C.
Furthermore, we performed five replicas of molecular dynamics (MD) simulations of 2 µs length each for hA3C, C2, and hA3C.S61P.S188I to assess the structural impact of the substitutions. The root mean square fluctuations (RMSF), which describe atomic mobilities during the MD simulations, show distinct differences between the variants in the putative DNA-binding regions of the proteins: the RMSF of C2 and hA3C.S61P.S188I are up to 2 Å larger compared to hA3C WT in the regions carrying the substitutions (residues 21-32 for C2 and residues 55-67 for hA3C.S61P.S188I) (Suppl. Fig. S8). This effect is specifically related to the respective substitutions, as no change in RMSF occurs for a variant in a region where it is identical to A3C WT. The increased movement of ssDNA-binding residues might improve the sliding of C2 and hA3C.S61P.S188I along the ssDNA, owing to more transient interactions with the ssDNA backbone. Conversely, the RMSF of loop 7 is up to 1 Å lower in both the C2 and hA3C.S61P.S188I variants compared to the hA3C WT (Suppl. Fig. S8).
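For readers unfamiliar with the quantity, RMSF is the square root of the time-averaged squared deviation of each atom from its mean position. The following minimal sketch computes it for a toy trajectory. It is not the analysis code used for the simulations reported here: the function name and coordinates are hypothetical, and the frames are assumed to have already been superimposed on a common reference.

```python
import math


def rmsf(frames):
    """Per-atom RMSF for a trajectory given as a list of frames.

    Each frame is a list of (x, y, z) tuples, one per atom, and all frames are
    assumed to be superimposed on a common reference beforehand.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    values = []
    for atom in range(n_atoms):
        # Mean position of this atom over the trajectory.
        mean = [sum(f[atom][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that mean position.
        msd = sum(
            sum((f[atom][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        values.append(math.sqrt(msd))
    return values


# Toy trajectory (hypothetical): atom 0 is static, atom 1 oscillates by 1 Å along x.
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
print(rmsf(toy_frames))
```

Subtracting the per-residue values of the wild type from those of a variant gives the kind of difference profile discussed above, with positive values marking regions that are more mobile in the variant.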
WE-RK mutation in the loop 1 of hA3C enhances the interaction with ssDNA
To validate our structural modeling analysis (Fig. 8), and to address whether the interaction of hA3C and hA3C.WE-RK with the substrate ssDNA was differentially affected, we performed electrophoretic mobility shift assays (EMSA) using hA3C-GST (A3C fused to glutathione S-transferase, GST) and hA3C.WE-RK-GST purified from HEK293T cells (Fig. 9A). As a probe, we used a biotin-labeled ssDNA oligonucleotide that harbors a TTCA motif in its central region [65, 79]. Because hA3C-GST is known to form a stable DNA-protein complex when the protein concentration reaches ≥ 20 nM ([65] and data not shown), we decreased the amount of A3C and its mutant protein to specifically test their inherent DNA binding capacity. In a titration experiment with concentrations ranging from 2 to 8 nM in steps of 2 nM of hA3C-GST and hA3C.WE-RK-GST purified protein, we detected a clear trend in the formation of DNA–protein complexes for hA3C-GST and hA3C.WE-RK-GST (Fig. 9B). Intriguingly, DNA-protein complexes of hA3C.WE-RK-GST started appearing at the lowest protein concentration used (2 nM), while hA3C-GST-DNA complexes were detected only at protein concentrations ≥ 6 nM. The top-shifted complexes were formed only with hA3C.WE-RK-GST and not with hA3C-GST. To confirm the specificity of the DNA–protein complexes, we competed the reaction with unlabeled DNA carrying the same nucleotide sequence as the probe, added in 500-fold excess relative to that probe. The addition of the competitor DNA to the sample containing the maximum (8 nM) amount of A3C protein efficiently disrupted protein-DNA complex formation. Together, data from structural modeling and EMSA experiments allowed us to conclude that the two-amino-acid change in loop 1 of A3C boosts the ssDNA binding capacity of A3C. Importantly, the GST moiety did not affect the binding ([65] and data not shown).
Evolution of A3Z2 loop 1 regions in primates
We performed a phylogenetic reconstruction for the A3Z2 domains in primates, using the A3Z2 sequences in the northern tree shrew as outgroup. Because in primates A3D and A3F contain two Z2 domains, we performed the analysis at the level of individual A3Z2 domains (N- and C-terminal Z2) (Fig. 10A). Our results show that the A3Z2 domains underwent independent duplication in the two sister taxa, tree shrews and primates: the three A3Z2 tree shrew sequences constitute a clear outgroup to all primate A3Z2 sequences. We identified a sharp clustering of the A3D-NTD and A3F-NTD on the one hand and of A3C, A3D-CTD, and A3F-CTD on the other hand. As to New World monkeys (Platyrrhini), we could only confidently retrieve A3C sequences from the white-faced sapajou Cebus capucinus and from the Ma’s night monkey Aotus nancymaae. These A3C sequences from New World monkeys were basal to all Catarrhini (Old World monkeys and apes) A3C, A3D-CTD and A3F-CTD sequences, suggesting that the two gene duplications leading to the extant organization of A3C, A3D, and A3F occurred after the Platyrrhini/Catarrhini split 43.2 Mya (41.0 - 45.7 Mya) and before the Cercopithecoidea/Hominoidea split 29.44 Mya (27.95 - 31.35 Mya). The results show a tangled distribution within the A3D-NTD and A3F-NTD clade, and within the A3D-CTD and A3F-CTD clade. These confusing relationships are more obvious when comparing an unconstrained Z2 tree with a tree in which monophyly of the six large clades identified was enforced. Conversely, Catarrhini A3C sequences form a monophyletic taxon, and this A3C gene tree essentially adheres to the corresponding species tree (Fig. 10B). Focusing on the nodes that we could identify with confidence, we performed ancestral phylogenetic inference of the most likely amino acid sequence for the A3 loop 1 as well as consensus analysis of the extant sequences. Our results recover the well-conserved aromatic stacking stretch F[FY]FXF characteristic of all A3s.
In the A3C, A3D-CTD, and A3F-CTD clade, we identified a motif with divergent evolution flanked by conserved small hydrophobic amino acids. The most likely ancestral form is the amino acid motif LRKA, which is also the form present in extant New World monkeys A3C and the most common in extant A3F-CTD; in the extant A3D-CTD the Arg residue is less conserved in L[RLQ][KT]A; and strikingly, in the ancestor of Catarrhini A3C, this motif changed to LWEA. Only subsequently, and exclusively in the Chlorocebus lineage, this change was partly reverted to LREA by a transition TGG>CGG. This reversion should have occurred after the divergence within Cercopithecinae, around 13.7 Mya (10.7 - 16.6 Mya).
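The partial reversion described above can be checked with a small sketch that classifies a single-nucleotide change as a transition or a transversion and reports the resulting amino acid change. The function names are hypothetical, and the codon table is deliberately restricted to the two codons mentioned in the text (TGG for Trp and CGG for Arg in the standard genetic code).

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Subset of the standard genetic code, limited to the codons discussed here.
CODON_TO_AA = {"TGG": "W", "CGG": "R"}


def classify_substitution(ref_base, alt_base):
    """Return 'transition' for purine<->purine or pyrimidine<->pyrimidine, else 'transversion'."""
    if ref_base == alt_base:
        raise ValueError("bases are identical")
    both_purines = ref_base in PURINES and alt_base in PURINES
    both_pyrimidines = ref_base in PYRIMIDINES and alt_base in PYRIMIDINES
    return "transition" if both_purines or both_pyrimidines else "transversion"


def describe_codon_change(ref_codon, alt_codon):
    """List each differing codon position (1-based) with its class, plus the amino acid change."""
    changes = [
        (pos, f"{r}>{a}", classify_substitution(r, a))
        for pos, (r, a) in enumerate(zip(ref_codon, alt_codon), start=1)
        if r != a
    ]
    aa_change = (CODON_TO_AA[ref_codon], CODON_TO_AA[alt_codon])
    return changes, aa_change


print(describe_codon_change("TGG", "CGG"))
# ([(1, 'T>C', 'transition')], ('W', 'R'))
```

A single T>C change at the first codon position is thus sufficient to convert Trp to Arg, consistent with LWEA changing to LREA in one step.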
DISCUSSION
Compared to the studies conducted over the past decade on the potent HIV-1 restriction factors A3G and A3F, investigations on A3C are very limited. Only a few recent studies have addressed the catalytic activity and substrate binding capacity of A3C [65,69,80]. While the previously characterized hA3C mutants S61P and S188I boost the catalytic activity of the enzyme to a certain level, neither of these mutations is decisive because they do not reduce HIV-1Δvif infectivity to the level accomplished by A3G, nor do they directly partake in catalytic activity [65,69,80]. Because our repeated attempts to express A3F-CTD in human cells were not successful ([65] and Suppl. Fig. S3), we assayed A3C proteins from different Old World monkey species. Due to the high level of nucleotide sequence identity between the A3 paralogs in the sooty mangabey monkey genome, we unintentionally generated the smmA3C-like protein with superior anti-HIV-1 and enzymatic activity. We have identified the key role of two positively charged residues in loop 1 of the smmA3C-like protein (and of the hA3F-CTD), namely R25 and K26 in the RKYG motif. Replacing RKYG of A3C chimera C2 or smmA3C-like protein by WEND (the form of this motif in hA3C) abolished both their anti-HIV-1 and catalytic activity. Notably, the converse strategy of introducing the substitution WE-RK in loop 1 of hA3C rendered hA3C.WE-RK a potent, deaminase-dependent, anti-HIV-1 enzyme. Consistent with these observations, our EMSA data clearly demonstrate that residues in loop 1 of A3C regulate protein-DNA interaction, and we postulate that this interaction is causative for the enhanced deamination activity and the enhanced anti-HIV and anti-L1 activity. A similar model was discussed by Solomon and coworkers, who demonstrated that loop 1 residues of hA3G-CTD-2K3A-E259A (a catalytically inactive form of A3G-CTD) strongly interact with substrate ssDNA and that this distinguishes catalytic binding from non-catalytic binding [81].
Interestingly, loop 1 of A3A was found to be important for substrate specificity but not for substrate binding affinity [82], while loop 1 of A3H especially residue R26, plays a triple role for RNA binding, DNA substrate recognition, and catalytic activity likely by positioning the DNA substrate in the active site for effective catalysis [83]. In accordance with this, our study claims that 25RK26 substitution in loop 1 of A3C provides the microenvironment that drives the flexibility in substrate binding and enzymatic activity.
The binding model developed here rationalizes how A3C variant C2 can interact with the negatively charged backbone of ssDNA via the positively charged loop 1 side chains of R25 and K26 (Fig. 8D). In an approach similar to our modeling strategy, Fang et al. [78] used their binding mode model of A3F-CD2 with ssDNA to identify residues in the A3G-CTD important for ssDNA binding. Furthermore, the increased mobility of DNA binding regions carrying the substitutions in C2 and hA3C-S61P.S188I, respectively, compared to hA3C (Suppl. Fig. S8) suggests that C2 and hA3C-S61P.S188I can slide along the ssDNA better than hA3C: the higher mobility of the residues may allow them to adapt more quickly to the passing ssDNA, which, paired with likely stronger interactions with the backbone of the ssDNA, may explain the increased deaminase activity. In addition, loop 7 exhibits decreased mobility in both C2 and hA3C-S61P.S188I compared to hA3C, which was shown to be a predictor for higher deaminase activity, DNA binding, and substrate specificity of A3G and A3F, and reported to be also relevant for antiviral activity of A3B and A3D [73,84–86].
Unexpectedly, our experiments also demonstrated that LINE-1 restriction by A3C, which was reported earlier to be deaminase-independent [56], is enhanced after expression of the A3C.WE-RK variant. These data suggest that the reported RNA-dependent physical interaction between L1 ORF1p and A3C dimers might be mediated by A3C loop 1, is partly dependent on the two amino acids W25 and E26, and is enhanced by the R25 and K26 substitutions. However, L1 inhibition by A3F was not significantly altered by the A3F.RK-WE mutations, indicating that in A3F other regions (and the NTD) are likely to be relevant for L1 restriction.
Because selection likely had to balance between anti-viral/anti-L1 activity and genotoxicity of A3 proteins, we wanted to characterize loop 1 residues during the evolution of closely related A3Z2 proteins such as A3C, A3D-CTD and A3F-CTD in primates. In the most recent common ancestor of these enzymes, before the Catarrhini-Platyrrhini split some 43 Mya, the sequence of this motif in loop 1 was LRKAYG. In New World monkeys, the A3C genes were not duplicated and are basal to the three sister clades of Catarrhini A3C, A3D-CTD, and A3F-CTD. In extant A3C sequences in New World monkeys, the loop 1 motif has notably remained unchanged and reads LRKAYG. In Catarrhini, on the contrary, the ancestral A3C sequence underwent two rapid rounds of duplication that occurred after the split from the ancestor of Platyrrhini and before the split between the ancestors of Cercopithecoidea and Hominoidea, some 29 Mya. In extant A3F-CTD sequences, the consensus form of loop 1 remains LRKAYG, albeit with some variability in which the Arg residue is exchanged for other positively charged amino acids. In extant A3D-CTD enzymes, this motif has undergone erosion, is more variable, and reads L[RLQ][KT]A[YC]G. Interestingly, loop 1 in A3C has experienced strong selective pressure to exchange the positively charged RK amino acids for the largely divergent chemistry of WE, yielding LWEAYG. This selective sweep occurred very rapidly, as this is the fixed form in all Catarrhini. Notably, and exclusively in the Chlorocebus lineage, this amino acid substitution was partly reverted to LREAYG, which is the conserved sequence in the four Chlorocebus A3C entries available.
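The loop 1 motif forms listed above can be told apart with simple pattern matching. The following is a minimal sketch, not part of the analysis pipeline used in this study; the function name and the short labels are hypothetical, and only the motif strings themselves are taken from the text.

```python
import re

# Loop 1 motif forms quoted in the text. Labels are illustrative shorthand.
# Order matters: LRKAYG also matches the broader A3D-CTD pattern, so the
# more specific forms are tested first.
MOTIF_PATTERNS = [
    ("ancestral RK", re.compile(r"LRKAYG")),
    ("catarrhine A3C WE", re.compile(r"LWEAYG")),
    ("Chlorocebus A3C RE", re.compile(r"LREAYG")),
    ("A3D-CTD variable", re.compile(r"L[RLQ][KT]A[YC]G")),
]

def classify_loop1(motif: str) -> str:
    """Return the label of the first pattern that matches the whole motif."""
    for label, pattern in MOTIF_PATTERNS:
        if pattern.fullmatch(motif):
            return label
    return "unclassified"
```

Because the A3D-CTD pattern is a superset of the ancestral form, a sequence reading LRKAYG is reported as ancestral regardless of which paralog it comes from; the classification is by motif chemistry only.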
Overall, our results suggest that the two duplication events that generated the extant A3C, A3D-CTD, and A3F-CTD sequences in catarrhines released the selective pressure on two of the daughter enzymes, allowing them to explore the sequence space and to evolve via sub/neofunctionalisation, as proposed for Ohno’s in-paralogs [87]. Thus, the A3F-CTD form of loop 1 diverged little from the ancestral chemistry and possibly maintained the ancestral function, while the release in conservation pressure on A3D-CTD allowed the enzyme loop 1 to accumulate mutations and diverge from the ancestral state. In turn, A3C was rapidly engaged in a distinct evolutionary pathway, which is unique due to the highly divergent chemistry of loop 1 but also because A3C is the only A3Z2 monodomain enzyme of the A3 family.
In conclusion, we postulate that the loop 1 region of A3s might have a conserved role in anchoring ssDNA substrate for efficient catalysis and that hA3C’s weak deamination and anti-HIV-1 activity might have been the result of losing DNA interactions in loop 1 during its evolution. It is thus possible that genes encoding A3C proteins with loop 1 residues with a higher ssDNA affinity were too genotoxic for their hosts to benefit from the superior anti-viral and anti-L1 activity.
MATERIALS AND METHODS
Cell culture
HEK293T cells were maintained in Dulbecco’s high-glucose modified Eagle’s medium (DMEM) (Biochrom, Berlin, Germany), supplemented with 10% fetal bovine serum (FBS), 2 mM L-glutamine, 50 units/ml penicillin, and 50 µg/ml streptomycin at 37°C in a humidified atmosphere of 5% CO2. Similarly, HeLa-HA cells [88] were cultured in DMEM with 10% FCS (Biowest, Nuaillé, France), 2 mM L-glutamine and 20 U/ml penicillin/streptomycin (Gibco, Schwerte, Germany).
Plasmids
The HIV-1 packaging plasmid pMDLg/pRRE encodes gag-pol, and pRSV-Rev encodes the HIV-1 rev [89]. The HIV-1 vector pSIN.PPT.CMV.Luc.IRES.GFP, which expresses firefly luciferase and GFP, was reported previously [90]. HIV-1 based viral vectors were pseudotyped using the pMD.G plasmid that encodes the glycoprotein of VSV (VSV-G). The SIVagm luciferase vector system was described before [31]. All APOBEC3 constructs described here were cloned in pcDNA3.1 (+) with a C-terminal hemagglutinin (HA) tag. The smmA3C-like expression plasmid was generated by exon assembly from the genomic DNA of white-crowned mangabey (Cercocebus torquatus lunulatus), and the cloning strategy for smmA3C-like and the construction of the hA3C/smmA3C-like chimera plasmids was recently described [64]. The expression vector for A3G-HA was generously provided by Nathaniel R. Landau. Expression constructs hA3C, rhA3C, cpzA3C, agmA3C and the A3C point mutant A3C.C97S were described before [52,55,65]. smmA3C-like with a C-terminal V5 tag was cloned using the following primers: forward 5’-EcoRI-ATGAATTCGCCACCATGAATCCACAGATCAGAAAC and reverse 5’-NotI-ATGCGGCCGCCACTCGAGAATCTCCTGTAGGCGTC.
The point mutants hA3C.WE-RK, hA3C.ND-YG, hA3C.WE-RK.C97S, hA3C.WE-RK.S61P, hA3C.WE-RK.S61P.C97S, hA3C.WE-RK.S61P.S188I, hA3C.WE-RK.S61P.S188I.C97S, hA3F.RK-WE, and smmA3C-like.E68A were generated by site-directed mutagenesis. Similarly, single or multiple amino acid changes were made in expression vectors to produce chimera 2 mutants (C2.DH-YG, C2.RKYG-WEND, C2.R25W, C2.K26E, C2.Y28N, and C2.G29D) and smmA3C-like.RKYG-WEND. To clone C-terminally GST-tagged hA3C and hA3C.WE-RK, the ORFs were inserted between the HindIII and XbaI restriction sites of the mammalian expression vector pK-GST [91]. Individual exons of the authentic smmA3C, smmA3F, and smmA3F-like genes were amplified and cloned in pcDNA3.1. All primer sequences are listed in Suppl. Table 1.
Virus production and isolation
HEK293T cells were transiently transfected in 6-well plates using Lipofectamine LTX and Plus reagent (Invitrogen, Karlsruhe, Germany) with an appropriate combination of HIV-1 viral vectors (600 ng pMDLg/pRRE, 600 ng pSIN.PPT.CMV.Luc.IRES.GFP, 250 ng pRSV-Rev, 150 ng pMD.G with 600 ng A3 plasmid or replaced by pcDNA3.1, unless otherwise mentioned) or SIVagm vectors (1400 ng pSIVTan-LucΔvif, 150 ng pMD.G with 600 ng A3 plasmid). At 48 h post-transfection, virion-containing supernatants were collected; for isolation of virions, the supernatants were layered on a 20% sucrose cushion and centrifuged for 4 h at 14,800 rpm. Viral particles were re-suspended in mild lysis buffer (50 mM Tris (pH 8), 1 mM PMSF, 10% glycerol, 0.8% NP-40, 150 mM NaCl and 1X complete protease inhibitor).
Luciferase-based infectivity assay
HIV-1 luciferase reporter viruses were used to transduce HEK293T cells. Prior to infection, the amount of reverse transcriptase (RT) in the viral particles was determined by RT assay using the Cavidi HS kit Lenti RT (Cavidi Tech, Uppsala, Sweden). Viral supernatants normalized to equivalent RT amounts were used for transduction. At 48 h post-transduction, luciferase activity was measured using SteadyliteHTS luciferase reagent substrate (Perkin Elmer, Rodgau, Germany) in black 96-well plates on a Berthold MicroLumat Plus luminometer (Berthold Detection Systems, Pforzheim, Germany). Transductions were done in triplicate and at least three independent experiments were performed.
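Triplicate luciferase readings of this kind are commonly expressed relative to the vector-only control. The sketch below shows one way to do that; it assumes (this is not stated in the protocol above) that infectivity is reported as a percentage of the mean control signal, and the function name and input values are hypothetical.

```python
from statistics import mean, stdev

def relative_infectivity(sample_rlu, control_rlu):
    """Express triplicate luciferase readings as percent of the control mean.

    sample_rlu: readings for virus produced with an A3 plasmid.
    control_rlu: readings for virus produced with empty vector (assumed
                 to define 100% infectivity).
    Returns (mean percent, sample standard deviation of percent).
    """
    control_mean = mean(control_rlu)
    percents = [100.0 * value / control_mean for value in sample_rlu]
    return mean(percents), stdev(percents)
```

Since input was already normalized by RT amount before transduction, no further correction for virus quantity is applied here.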
Immunoblot analyses
Transfected HEK293T cells were washed with phosphate-buffered saline (PBS) and lysed in radioimmunoprecipitation assay buffer (RIPA, 25 mM Tris (pH 8.0), 137 mM NaCl, 1% glycerol, 0.1% SDS, 0.5% sodium deoxycholate, 1% Nonidet P-40, 2 mM EDTA, and protease inhibitor cocktail set III [Calbiochem, Darmstadt, Germany]) for 20 min on ice. Lysates were clarified by centrifugation (20 min, 14,800 rpm, 4°C). Samples (cell/viral lysate) were boiled at 95°C for 5 min with Roti load reducing loading buffer (Carl Roth, Karlsruhe, Germany) and subjected to SDS-PAGE followed by transfer (Semi-Dry Transfer Cell, Biorad, Munich, Germany) to a PVDF membrane (Merck Millipore, Schwalbach, Germany). Membranes were blocked with skimmed milk solution and probed with the appropriate primary antibody: mouse anti-hemagglutinin (anti-HA) antibody (1:7,500 dilution, MMS-101P, Covance, Münster, Germany); mouse α-V5 antibody (1:4,000 dilution; Serotec); goat anti-GAPDH (C-terminus, 1:15,000 dilution, Everest Biotech, Oxfordshire, UK); mouse anti-α-tubulin antibody (1:4,000 dilution, clone B5-1-2; Sigma-Aldrich, Taufkirchen, Germany); mouse anti-capsid p24/p27 MAb AG3.0 [92] (1:250 dilution, NIH AIDS Reagents); rabbit anti-S6 ribosomal protein (5G10; 1:10³ dilution in 5% BSA, Cell Signaling Technology, Leiden, The Netherlands). Secondary antibodies were anti-mouse (NA931V) and anti-rabbit (NA934V) horseradish peroxidase conjugates (1:10⁴ dilution, GE Healthcare) and anti-goat IgG-HRP (1:10⁴ dilution, sc-2768, Santa Cruz Biotechnology, Heidelberg, Germany). Signals were visualized using ECL chemiluminescent reagent (GE Healthcare).
To characterize the effect of the expression of A3 proteins and their mutants on LINE-1 (L1) reporter expression, HeLa-HA cells were lysed 48 h post-transfection using triple lysis buffer (20 mM Tris/HCl, pH 7.5; 150 mM NaCl; 10 mM EDTA; 0.1% SDS; 1% Triton X-100; 1% deoxycholate; 1x complete protease inhibitor cocktail [Roche]). Lysates were clarified, and 20 μg of total protein was used for SDS-PAGE followed by electroblotting. HA-tagged A3 proteins and L1 ORF1p were detected using an anti-HA antibody (Cat.# MMS-101P; Covance Inc.) in a 1:5,000 dilution and the polyclonal rabbit-anti-L1 ORF1p antibody #984 [93] in a 1:2,000 dilution, respectively, in 1xPBS-T containing 5% milk powder. β-actin expression (clone AC-74, 1:30,000 dilution, Sigma-Aldrich Chemie GmbH) served as a loading control.
Differential DNA denaturation (3D) PCR
HEK293T cells were cultured in 6-well plates and infected with DNAse I (Thermo Fisher Scientific, Schwerte, Germany) treated viruses for 12 hours. Cells were harvested and washed in PBS, the total DNA was isolated using DNeasy DNA isolation kit (Qiagen, Hilden, Germany). A 714-bp fragment of the luciferase gene was amplified using the primers 5’-GATATGTGGATTTCGAGTCGTC-3’ and 5’-GTCATCGTCTTTCCGTGCTC-3’. For selective amplification of the hypermutated products, the PCR denaturation temperature was lowered stepwise from 87.6°C to 83.5°C (83.5°C, 84.2°C, 85.2°C, 86.3°C, 87.6°C) using a gradient thermocycler. The PCR parameters were as follows: (i) 95°C for 5 min; (ii) 40 cycles, with 1 cycle consisting of 83.5°C to 87.6°C for 30 s, 55°C for 30 s, 72°C for 1 min; (iii) 10 min at 72°C. PCRs were performed with Dream Taq DNA polymerase (Thermo Fisher Scientific). PCR products were stained with ethidium bromide. PCR product (smmA3C-like sample only) from the lowest denaturation temperature was cloned using CloneJET PCR Cloning Kit (Thermo Fisher Scientific) and sequenced. smmA3C-like protein-induced hypermutations of eleven independent clones were analysed with the Hypermut online tool (https://www.hiv.lanl.gov/content/sequence/HYPERMUT/hypermut.html) [94]. Mutated sequences (clones) carrying similar base changes were omitted and only the unique clones were presented for clarity.
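The kind of comparison a hypermutation analysis performs can be illustrated with a short script that counts G-to-A changes in a clone relative to the reference and records the downstream nucleotide. This is a simplified sketch, not the Hypermut algorithm: it assumes pre-aligned, gap-free sequences of equal length, takes the dinucleotide context from the reference strand (an assumption made here for simplicity), and uses a hypothetical function name and toy sequences.

```python
from collections import Counter

def count_g_to_a(reference: str, clone: str) -> Counter:
    """Count G->A changes by reference dinucleotide context (G plus next base).

    Both sequences must be aligned, gap-free, and of equal length.
    A G->A change at the last position has no downstream base and is
    recorded under the context "G".
    """
    if len(reference) != len(clone):
        raise ValueError("sequences must have equal length")
    reference, clone = reference.upper(), clone.upper()
    contexts = Counter()
    for i, (ref_base, clone_base) in enumerate(zip(reference, clone)):
        if ref_base == "G" and clone_base == "A":
            contexts[reference[i:i + 2]] += 1
    return contexts

# Toy example: two G->A changes, one in GG and one in GA context.
example = count_g_to_a("AGGTGACGAT", "AAGTAACGAT")
```

Plus-strand GA and GG contexts correspond to deamination of TC and CC on the minus-strand cDNA, respectively, which is why the context split is informative for distinguishing A3 enzymes.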
In vitro DNA cytidine deamination assay
A3 proteins expressed in transfected HEK293T cells or virion-incorporated A3s were used as input. Cell lysates were prepared with mild lysis buffer 48 h after plasmid transfection. Deamination reactions were performed as described [67, 95] in a 10 µL reaction volume containing 25 mM Tris pH 7.0, 2 µl of cell lysate and 100 fmol single-stranded DNA substrate (TTCA: 5’-GGATTGGTTGGTTATTTGTATAAGGAAGGTGGATTGAAGGTTCAAGAAGGTGATGGAAGTTATGTTTGGTAGATTGATGG). Samples were treated with 50 µg/ml RNAse A (Thermo Fisher Scientific). Reactions were incubated for 1 h at 37°C and terminated by boiling at 95°C for 5 min. One fmol of the reaction mixture was used for PCR amplification with Dream Taq polymerase (Thermo Fisher Scientific; 95°C for 3 min, followed by 30 cycles of 61°C for 30 s and 94°C for 30 s) using primers forward 5’-GGATTGGTTGGTTATTTGTATAAGGA and reverse 5’-CCATCAATCTACCAAACATAACTTCCA. PCR products were digested with MseI (NEB, Frankfurt/Main, Germany), resolved on 15% PAGE, and stained with ethidium bromide (7.5 μg/ml). As a positive control for the restriction enzyme digestion, substrate oligonucleotides with TTUA instead of TTCA were used [65].
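The readout of this assay follows from the substrate sequence: deamination of the single cytidine converts TTCA to TTUA, which is amplified as TTTA and, together with the following adenine, creates an MseI site (T^TAA). The sketch below illustrates this on the substrate given above; the function names are hypothetical, only the first site is considered, and the fragment sizes refer to the top strand.

```python
# 80-nt substrate from the protocol; it contains a single cytidine (in TTCA).
SUBSTRATE = (
    "GGATTGGTTGGTTATTTGTATAAGGAAGGTGGATTGAAGG"
    "TTCAAGAAGGTGATGGAAGTTATGTTTGGTAGATTGATGG"
)
MSEI_SITE = "TTAA"  # MseI cuts after the first T: T^TAA

def deaminate(seq: str) -> str:
    """Model C->U deamination in the TTCA motif; U is read as T after PCR."""
    return seq.replace("TTCA", "TTTA")

def msei_fragments(seq: str) -> list[int]:
    """Return top-strand fragment lengths after MseI digestion (first site only)."""
    site = seq.find(MSEI_SITE)
    if site == -1:
        return [len(seq)]  # no site: product stays uncut
    cut = site + 1  # cleavage between the first T and TAA
    return [cut, len(seq) - cut]

uncut = msei_fragments(SUBSTRATE)
cut_products = msei_fragments(deaminate(SUBSTRATE))
```

An undeaminated substrate therefore yields only the full-length 80-bp product, whereas a deaminated one is cleaved into two smaller fragments that resolve on the 15% gel.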
L1 retrotransposition assay
Relative L1 retrotransposition activity was determined by applying a rapid dual-luciferase reporter based assay described previously [76]. Briefly, 2 × 10⁵ HeLa-HA cells were seeded per well of a six-well plate and transfected using Fugene-HD transfection reagent (Promega) according to the manufacturer’s protocol. Each well was cotransfected with 0.5 μg of the L1 retrotransposition reporter plasmid pYX017 or pYX015 [76] and 0.5 μg of pcDNA3.1 or wild-type or mutant A3 expression construct resuspended in 3 μl Fugene-HD transfection reagent and 100 μl GlutaMAX-I-supplemented Opti-MEM I reduced-serum medium (Thermo Fisher Scientific). Three days after transfection, the medium was replaced by complete DMEM containing 2.5 μg/ml puromycin to select for the presence of the L1 reporter plasmid harboring a puroR-expression cassette. On the next day, the medium was replaced once more by puromycin-containing DMEM, and 48 hours later, transfected cells were lysed to quantify dual-luciferase luminescence. Dual-luciferase luminescence measurement: Luminescence was measured using the Dual-Luciferase Reporter Assay System (Promega) following the manufacturer’s instructions. For assays in 6-well plates, 200 μl Passive Lysis Buffer was used to lyse cells in each well; for all assays, 20 μl lysate was transferred to a solid white 96-well plate and mixed with 50 μl Luciferase Assay Reagent II, and firefly luciferase (Fluc) activity was quantified using the microplate luminometer Infinite 200PRO (Tecan, Männedorf, Switzerland). Renilla luciferase (Rluc) activity was subsequently read after mixing 50 μl Stop & Glo Reagent into the cell lysate containing Luciferase Assay Reagent II. Data were normalized as described in the results section. We routinely used the retrotransposition-defective L1RP/JM111 (located on pYX015) as the reference Fluc vector and set the normalized luminescence ratio (NLR) resulting from cotransfection of pYX015 and pcDNA3.1(+) as 1.
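The normalization to the pYX015 reference can be sketched as follows. This assumes that the NLR is the Fluc/Rluc ratio of a sample divided by the Fluc/Rluc ratio of the pYX015 plus pcDNA3.1(+) reference, which is consistent with setting the reference to 1 but is not spelled out in this section; the function name and the readings are hypothetical.

```python
def normalized_luminescence_ratio(fluc, rluc, ref_fluc, ref_rluc):
    """Fluc/Rluc of a sample relative to Fluc/Rluc of the reference.

    fluc, rluc: firefly and Renilla readings of the sample well.
    ref_fluc, ref_rluc: readings of the retrotransposition-defective
                        reference (assumed here to be pYX015 + pcDNA3.1(+)).
    """
    if rluc <= 0 or ref_fluc <= 0 or ref_rluc <= 0:
        raise ValueError("luminescence readings must be positive")
    return (fluc / rluc) / (ref_fluc / ref_rluc)

# By construction, the reference itself has an NLR of 1.
reference_nlr = normalized_luminescence_ratio(10.0, 100.0, 10.0, 100.0)
```

Dividing by the Renilla signal corrects for differences in transfection efficiency and cell number between wells before samples are compared with the reference.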
Protein sequence alignment and visualization
Sequence alignment of hA3C and smmA3C-like protein was done by using Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo/). The alignment file was then submitted to ESPript 3.0 [96] (espript.ibcp.fr) to calculate the similarity and identity of residues between both proteins and to build the alignment figure. Cartoon model of the crystal structure of A3C (PDB 3VOW) was constructed using PyMOL (PyMOL Molecular Graphics System version 1.5.0.4; Schrödinger, Portland, OR).
Protein structural model building
The structural models of hA3C or C2 binding to ssDNA were generated by first aligning the X-ray crystal structure of rhA3G-NTD (PDB ID 5K82 [77]) onto the X-ray crystal structure of hA3F-CTD (PDB ID 5W2M [78]), the latter of which was co-crystallized with ssDNA. Subsequently, the hA3C X-ray crystal structure (PDB ID 3VOW [63]) was aligned onto the NTD of rhA3G, which is structurally similar to hA3C. The ssDNA and the interface region of hA3C were subsequently relaxed in the presence of each other using Maestro [97]. The same program was used to mutate hA3C to obtain the C2 and hA3C.S61P.S188I variants, which were again relaxed in the presence of the ssDNA. Similarly, we obtained a C2 ssDNA binding model based on the ssDNA-binding X-ray crystal structure of hA3A (PDB ID 5SWW [98]).
The alignment of the sequences of the crystal structures was generated with Probcons [99], accessed through Jalview [100], which was also used to visualize the alignment (Suppl. Fig. S9).
hA3C, C2, and hA3C.S61P.S188I were subjected to MD simulations. For this, the above-mentioned structures without the DNA were N- and C-terminally capped with ACE and NME, respectively. The three variants were protonated with PROPKA [101] according to pH 7.4, neutralized by adding counter ions, and solvated in an octahedral box of TIP3P water [102] with a minimal water shell of 12 Å around the solute. The Amber package of molecular simulation software [103] and the ff14SB force field [104] were used to perform the MD simulations. For the Zn²⁺ ions, the Li-Merz parameters for two-fold positively charged metal ions [105] were used. To cope with long-range interactions, the “Particle Mesh Ewald” method [106] was used; the SHAKE algorithm [107] was applied to bonds involving hydrogen atoms. As hydrogen mass repartitioning [108] was utilized, the time step for all MD simulations was 4 fs with a direct-space, non-bonded cut-off of 8 Å. In the beginning, 17,500 steps of steepest descent and conjugate gradient minimization were performed; during 2,500, 10,000, and 5,000 steps, positional harmonic restraints with force constants of 25 kcal mol⁻¹ Å⁻², 5 kcal mol⁻¹ Å⁻², and zero, respectively, were applied to the solute atoms. Thereafter, 50 ps of NVT (constant number of particles, volume, and temperature) MD simulations were conducted to heat up the system to 100 K, followed by 300 ps of NPT (constant number of particles, pressure, and temperature) MD simulations to adjust the density of the simulation box to a pressure of 1 atm and to heat the system to 300 K. During these steps, a harmonic potential with a force constant of 10 kcal mol⁻¹ Å⁻² was applied to the solute atoms. As the final step in thermalization, 300 ps of NVT-MD simulations were performed while gradually reducing the restraint forces on the solute atoms to zero within the first 100 ps of this step. Afterwards, five independent production runs of NVT-MD simulations with 2 μs length each were performed. To make the runs independent, the starting temperatures of the MD simulations at the beginning of the thermalization were varied by a fraction of a Kelvin.
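The simulation lengths quoted above translate into step counts by simple arithmetic, sketched here for orientation. The constants are taken from the text; the helper name is hypothetical, and the totals are derived values, not figures reported by the authors.

```python
FS_PER_PS = 1_000          # femtoseconds per picosecond
FS_PER_US = 1_000_000_000  # femtoseconds per microsecond
TIMESTEP_FS = 4            # time step with hydrogen mass repartitioning

def steps_for(length_fs: int, timestep_fs: int = TIMESTEP_FS) -> int:
    """Number of integration steps needed to cover length_fs."""
    return length_fs // timestep_fs

# Minimization stages with decreasing restraints sum to the stated total.
minimization_steps = 2_500 + 10_000 + 5_000

# Thermalization: 50 ps NVT + 300 ps NPT + 300 ps NVT.
thermalization_steps = steps_for((50 + 300 + 300) * FS_PER_PS)

# One 2 µs production run, and the aggregate time per variant (five runs).
production_steps_per_run = steps_for(2 * FS_PER_US)
aggregate_us_per_variant = 5 * 2
```

With a 4 fs time step, each 2 μs production run corresponds to 5 × 10⁸ integration steps, and each variant is sampled for 10 μs in total.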
Expression and purification of recombinant GST-tagged hA3C and hA3C.WE-RK from HEK293T cells
Recombinant C-terminally GST-tagged hA3C and hA3C.WE-RK were expressed in HEK293T cells and purified by affinity chromatography using Glutathione Sepharose 4B beads (GE Healthcare) as described previously [65]. Cells were lysed 48 h after transfection with mild lysis buffer [50 mM Tris (pH 8), 1 mM PMSF, 10% glycerol, 0.8% NP-40, 150 mM NaCl, and 1X complete protease inhibitor], and the lysates were incubated with GST beads. After 2 h incubation at 4°C with end-over-end rotation, GST beads were washed twice with wash buffer containing 50 mM Tris (pH 8.0), 5 mM 2-ME, 10% glycerol and 500 mM NaCl. The bound GST hA3C and hA3C.WE-RK proteins were eluted with wash buffer containing 20 mM reduced glutathione. The proteins were 90-95% pure as checked on 15% SDS-PAGE followed by Coomassie blue staining. Protein concentrations were estimated by Bradford’s method.
Electrophoretic mobility shift assay (EMSA) with hA3C-GST and hA3C.WE-RK-GST
EMSA was performed as described previously [65,79,109]. We mixed 20 fmol of 3′ biotinylated DNA (30-TTC-Bio-TEG purchased from Eurofins Genomics, Ebersberg, Germany) with 10 mM Tris (pH 7.5), 100 mM KCl, 10 mM MgCl2, 1 mM DTT, 2% glycerol, and the respective amount of recombinant proteins in a 15 μl reaction mixture, and incubated at room temperature for 30 min. The reaction mixtures containing the protein–DNA complexes were resolved on a 5% native PAGE gel on ice and transferred to a nylon membrane (Amersham Hybond-XL, GE Healthcare) using 0.5X TBE. After the transfer, the protein–DNA complexes on the membrane were cross-linked by UV radiation with a 312-nm bulb for 15 min. Chemiluminescent detection of biotinylated DNA was carried out according to the manufacturer’s instructions (Thermo Scientific LightShift Chemiluminescence EMSA Kit).
Phylogenetic inference
To study the evolution of the A3-Z2 domains, a representative set of 62 primate A3C, A3D, and A3F gene sequences was collected from GenBank (https://www.ncbi.nlm.nih.gov/genbank), as follows: 26 A3C sequences, 12 A3D sequences, and 21 A3F sequences. The phylogenetic relationships and divergence times among the species used were retrieved from http://www.timetree.org (Suppl. Fig. S10). A3 sequences from the northern tree shrew Tupaia belangeri were included as an outgroup to the primate ones. As A3D and A3F sequences each contain two Z2 domains, they were split into the corresponding N- and C-termini. The alignments were performed at the amino acid level using MAFFTv7.380 (http://mafft.cbrc.jp/alignment/software/) [110]. Phylogenetic inference was performed using RAxMLv8 [111], at either the nucleotide level under the GTR+Γ model or at the amino acid level under the LG+Γ model. Node support was evaluated applying 5,000 bootstrap cycles. Phylogenies at the nucleotide level were also calculated after introducing constraints in the tree, forcing monophyly of each clade: A3D N- and C-termini, A3F N- and C-termini, New World monkey A3C, and catarrhine A3C. Differences in maximum likelihood between alternative topologies for the same alignment were evaluated by the Shimodaira-Hasegawa test. Ancestral state reconstruction of amino acids in loop 1 of the A3-Z2 domains was performed only for the supported clades using RAxMLv8. A tanglegram with the two phylogenies was drawn with Dendroscope v3.6.3 [112]. Final layouts were done with Inkscape 0.92.4.
Statistical analysis
Data are represented as the mean with SD in all bar diagrams. Statistically significant differences between two groups were analyzed using the unpaired Student’s t-test with GraphPad Prism version 5 (GraphPad Software, San Diego, CA, USA). A p-value threshold of 0.05 was used for statistical significance.
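For readers without access to GraphPad Prism, the test statistic of the unpaired Student’s t-test can be reproduced with the Python standard library. This is a minimal sketch assuming equal variances (the classical pooled-variance form); it returns the t statistic and degrees of freedom only, since converting these to a p-value requires the t-distribution, which the standard library does not provide. The function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def unpaired_student_t(group_a, group_b):
    """Pooled-variance (equal-variance) two-sample t statistic.

    Returns (t, degrees_of_freedom). Each group needs at least two values.
    """
    n_a, n_b = len(group_a), len(group_b)
    dof = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / dof
    standard_error = sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, dof

t_value, dof = unpaired_student_t([1, 2, 3], [4, 5, 6])
```

The resulting t value can be compared against a t-distribution table at the chosen threshold for the given degrees of freedom.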
FUNDING
This work was supported by a grant from the research commission of the medical faculty of the Heinrich-Heine-University Düsseldorf (grant #2019-13 to CM and HG). KB is supported by the German Academic Exchange Service (DAAD). ZZ was supported by China Scholarship Council (CSC). CK and GGS are supported by the German Ministry of Health (grant # G115F020001). CM is supported by the Heinz-Ansmann foundation for AIDS research. The Center for Structural Studies is funded by the Deutsche Forschungsgemeinschaft (DFG Grant number 417919780 and INST 208/761-1 FUGG).
Competing interests
The authors have declared that no competing interests exist.
AUTHOR CONTRIBUTIONS
Conceptualization: AAJV and CM
Data curation: AAJV, KB, CGWG, FB, UH, CK, SB, GGS, IGB, DH, and HG
Formal analysis: AAJV, KB, CGWG, FB, ZZ, AS, UH, SB, CK, GGS, IGB, DH, HG, and CM
Funding acquisition: AAJV, CM, and HG
Investigation: AAJV, KB, CGWG, FB, UH, CK, HG, and IGB
Methodology: AAJV, KB, CGWG, SB, HG, and CM
Project administration: CM
Resources: GGS, IGB, HG, and CM
Supervision: CM, DH, and AAJV
Validation: AAJV and CM
Visualization: AAJV, KB, CGWG, FB, CK, GGS, IGB, HG, and CM
Writing – original draft: AAJV
Writing – review and editing: AAJV, KB, CGWG, FB, ZZ, AS, UH, CK, SB, GGS, DH, IGB, HG, and CM
ACKNOWLEDGMENTS
We thank Wioletta Hörschken for excellent technical assistance. We thank Michael Emerman, Jens-Ove Heckel, Henning Hofmann, Yasumasa Iwatani, Nathaniel R. Landau, Neeltje Kootstra, Bryan Cullen, Jonathan Stoye, Harald Wodrich, and Jörg Zielonka for reagents. The following reagents were obtained through the NIH AIDS Research and Reference Reagent Program, Division of AIDS, NIAID, NIH: a monoclonal antibody to HIV-1 p24 (AG3.0) from Jonathan Allan. HG is grateful for computational support and infrastructure provided by the “Zentrum für Informations- und Medientechnologie” (ZIM) at the Heinrich-Heine-University Düsseldorf and the computing time provided by the John von Neumann Institute for Computing (NIC) to HG on the supercomputer JUWELS at Jülich Supercomputing Centre (JSC) (user ID: HKF7).