Abstract
Epigenome editing is an attractive way to manipulate gene expression. However, editing efficiencies depend on the DNA sequence context in a manner that remains poorly understood. Here we developed a novel system in which any protein can be recruited at will to a GFP reporter. We named it ParB/ANCHOR-mediated Inducible Targeting (PInT). Using PInT, we tested how CAG/CTG repeat size affects the ability of histone deacetylases to modulate gene expression. We found that repeat expansion reduces the effectiveness of silencing brought about by HDAC5 targeting. This repeat-length specificity was abolished when we inhibited HDAC3 activity. Our data guide the use of these histone deacetylases in manipulating chromatin. PInT can be adapted to study the effect of virtually any sequence on epigenome editing.
Introduction
Chromatin structure impinges on every DNA-based transaction, from replication and DNA repair to transcription. Thus, it is not surprising that epigenome editing is being harnessed both to understand basic molecular mechanisms and to treat disease1. Epigenome editing is now most commonly carried out by fusing a domain of a chromatin modifier, or EpiEffector, to a catalytically dead Cas9 (dCas9). The fusion protein is targeted to a locus of choice by way of a customizable single guide RNA (sgRNA)2-10. Examples of dCas9-mediated epigenome editing include altering chromatin states by targeting either the Krüppel-associated box (KRAB)6 or the histone acetyltransferase domain of p3002, thereby reducing or promoting enhancer function, respectively. Moreover, epigenome editing using Cas9-based approaches has been used to modify disease phenotypes in cells and in vivo11,12.
It is currently not possible to predict whether targeting a specific locus with a particular dCas9-EpiEffector fusion will result in efficient chromatin modification and alteration of gene expression. Several reasons have been proposed to account for this, including the sequence of the sgRNA and the distance of its target from the transcriptional start site3-5, the chromatin structure already present at the target locus13-19, and the exact EpiEffector used2,4,10,17,19. Indeed, the same EpiEffector targeted at different loci can have very different effects10,17, highlighting that DNA context affects EpiEffectors in ways that are not understood.
Some DNA sequences can have profound effects on nucleosome positioning and chromatin structure20. A prime example of this is the expansion of CAG/CTG repeats, which causes 14 different neurological and neuromuscular diseases21,22. In healthy individuals, these sequences have fewer than 35 repeats at any one disease locus. However, they can expand and reach up to thousands of triplets. Once expanded, CAG/CTG repeats lead to changes in gene expression in their vicinity and to a heterochromatin-like state23-26. How these repetitive sequences might affect epigenome editing is unknown.
Here, we developed a method to understand how DNA sequence context can influence epigenome editing efficiency. We named the system ParB/ANCHOR-mediated induced targeting (PInT). With PInT, any protein of interest can be targeted near a sequence of choice. ParB, a bacterial protein, forms oligomers once it nucleates at its non-repetitive binding site, INT27. Fusing a protein of interest to ParB leads to the recruitment of many of the desired molecules to the INT locus. The targeting is inducible as we coupled ParB/ANCHOR to a chemically induced proximity (CIP) system derived from plants28. The target sequence is embedded in a GFP mini gene29 such that the effect of targeting on gene expression is easily monitored. Using PInT, we uncovered an unexpected effect of expanded CAG/CTG repeats on the effectiveness of histone deacetylase 5 (HDAC5) to modulate gene expression and found that this was due to the catalytic activity of HDAC3.
Results
ParB/ANCHOR-mediated induced targeting (PInT)
We designed PInT (Fig. 1) to be modular and highly controllable. It contains a GFP mini gene that harbours two GFP exons flanking the intron of the mouse Pem1 gene29,30. A doxycycline-inducible TetOn promoter drives the expression of the reporter. This cassette is always inserted at the same genomic location as a single copy integrant in T-Rex Flp-In HEK293 cells. Inside the intron, we inserted a 1029 bp non-repetitive sequence, INT, that contains four binding sites for dimers of the Burkholderia cenocepacia ParB protein. Once bound to INT, ParB oligomerizes in a sequence-independent manner, recruiting up to 200 ParB molecules31. This ParB/ANCHOR system was first used in live yeast cells to visualize double-strand break repair27. More recently, it has been used to monitor the mobility of a genomic locus upon activation of transcription32 and to visualize viral replication33 in live mammalian cells. We made the system inducible by fusing ParB to a plant protein called ABSCISIC ACID INSENSITIVE (ABI), which dimerizes with PYRABACTIN RESISTANCE1-LIKE (PYL) upon addition of abscisic acid (ABA) to the culture medium28. ABA is a plant hormone that is not toxic to human cells, making this CIP system especially convenient. Within 319 bp of the INT sequence, there is a cloning site that can be used to insert any DNA motif. Thus, fusing any protein of interest to PYL allows for full temporal control over its recruitment near a DNA sequence of choice.
It was important to determine whether the components of PInT affect the expression of the GFP reporter. We first tested whether ABA changed GFP expression in GFP(CAG)0 cells29. These cells carry the GFP mini gene without the INT sequence or any additional sequences in the intron (see Table S1 and Fig. S1 for details on cell line construction). We found that treatment with up to 500 µM of ABA, which induces the dimerization between PYL and ABI, did not affect GFP expression (Fig. S2AB). We also transiently transfected GFP(CAG)0 cells with plasmids expressing the ParB-ABI fusion. This had no detectable effect on the behaviour of the reporter (Fig. S2C). We next inserted the INT sequence inside the Pem1 intron and integrated this construct using site-directed recombination, generating GFP-INT cells. These cells contain INT but no additional sequence within the intron. They do not express ParB-ABI. We found that the insertion of the INT sequence had little, if any, discernible effect on GFP expression (Fig. S2D). We conclude that by themselves, the individual components of PInT do not significantly interfere with GFP expression.
We then stably integrated both the GFP-INT reporter and the ParB-ABI fusion to generate GFP-INT-B cells. We found a decrease in GFP expression that correlated with high levels of ParB-ABI (Fig. S2EFG), suggesting that the binding of ParB-ABI has a predictable effect on the GFP reporter. To avoid any complication, we integrated ParB-ABI early in the construction pipeline such that all the cell lines presented here contain the same amount of ParB-ABI (Fig. S1).
Next, we determined the efficiency of targeting PYL to the INT sequence and the consequences on GFP expression. We used nB-Y cells, which contain the GFP mini gene with the INT sequence, stably express both ParB-ABI (B) and PYL (Y), and contain n CAG repeats, in this case either 16, which is in the normal range, or an expanded repeat of 91 triplets (Fig. 2A, Fig. S3A). We found, using chromatin immunoprecipitation followed by qPCR (ChIP-qPCR), that only 0.1% of the input DNA could be precipitated when we treated the cells with the vehicle, DMSO, alone. By contrast, the addition of ABA to the cell media increased the association of PYL to the INT locus significantly, reaching 1.9 and 2.5% of the input chromatin pulled down in 16B-Y or 91B-Y cells, respectively (Fig. 2B). These results demonstrate the inducible nature of the system and show that the presence of the expansion does not interfere with the targeting efficiency. Importantly, PYL targeting had no effect on GFP expression as measured by flow cytometry (Fig. 2C). We conclude that PInT works as an inducible targeting system and that PYL targeting is efficient and does not further affect gene expression.
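The cell line names used from here on follow a compact pattern: the number of CAG repeats, then B for ParB-ABI, optionally Y for PYL, and optionally the protein fused to PYL. The following is a minimal sketch of that convention in Python. The `CellLine` type and `parse_cell_line` function are hypothetical helpers introduced only for illustration, and the pattern covers names of the form nB, nB-Y, and nB-Y-X, not GFP(CAG)0, GFP-INT, or GFP-INT-B.

```python
import re
from typing import NamedTuple, Optional


class CellLine(NamedTuple):
    """Hypothetical record for a PInT cell line name such as 59B-Y-HDAC5."""
    cag_repeats: int          # n: number of CAG repeats in the GFP reporter
    has_pyl: bool             # True when the name carries the Y (PYL) component
    effector: Optional[str]   # protein fused to PYL, or None for PYL alone


# n, then B (ParB-ABI), then optional -Y (PYL), then optional -EFFECTOR
_NAME_PATTERN = re.compile(r"^(\d+)B(-Y(?:-([A-Za-z0-9]+))?)?$")


def parse_cell_line(name: str) -> CellLine:
    """Split a name like '16B-Y' or '59B-Y-HDAC5' into its components."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized cell line name: {name!r}")
    repeats, pyl_part, effector = match.groups()
    return CellLine(int(repeats), pyl_part is not None, effector)


if __name__ == "__main__":
    for name in ("40B", "16B-Y", "91B-Y", "59B-Y-HDAC5"):
        print(name, parse_cell_line(name))
```

Under this reading, 16B-Y and 91B-Y differ only in repeat number, which is what makes them suitable for isolating the effect of repeat size.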
HDAC5 silencing depends on CAG/CTG repeat size
We next sought to test whether we could manipulate GFP expression using HDAC5. This class IIa deacetylase impacts gene silencing and heterochromatin maintenance34,35 as well as cell proliferation35,36. The PYL-HDAC5 fusion was functional since GFP-INT cells transiently expressing this fusion had slightly lower GFP expression than those expressing PYL alone (Fig. S4A). We created isogenic nB-Y-HDAC5 cells that stably express a PYL-HDAC5 fusion and have 16 or 59 CAG repeats within the GFP reporter (Fig. 3A). We found that adding ABA to these cells led to an increase in pull-down efficiency of PYL-HDAC5 at the INT locus from 0.06% to 2.2% in 16B-Y-HDAC5 cells and from 0.1% to 3% of input in the presence of 59 repeats (Fig. 3B). This was accompanied by a significant 2-fold decrease in GFP expression in 59B-Y-HDAC5 cells, whereas the decrease was 3-fold in 16B-Y-HDAC5 cells (Fig. 3C, P=0.001 and P=0.0015 using a paired Student’s t-test comparing conditions with ABA to those with DMSO only in 16B-Y-HDAC5 and 59B-Y-HDAC5 cells, respectively). Remarkably, the decrease in expression was significantly smaller in the context of an expanded repeat (Fig. 3C, P=0.0001 comparing the decrease in expression upon ABA addition between the 16B-Y-HDAC5 and 59B-Y-HDAC5 cells using a Student’s t-test). Targeting efficiency of PYL-HDAC5 does not account for the repeat size-dependent effect since it was slightly higher in 59B-Y-HDAC5 than in 16B-Y-HDAC5 cells (Fig. 3B). To determine whether the effect is due to targeting at the INT locus, we transiently expressed PYL-HDAC5 in GFP(CAG)0B cells, which have no INT in their GFP reporter but express ParB-ABI. Adding ABA to these cells did not affect GFP expression (Fig. S4BC), suggesting that the presence of the INT sequence is essential.
Moreover, PYL-HDAC5 targeting reduced the levels of acetylated histone H3 (acH3) (P=0.0001 and P=0.024 comparing DMSO-treated and ABA-treated 16B-Y-HDAC5 and 59B-Y-HDAC5, respectively, using a Student’s t-test), as measured by ChIP-qPCR. The decrease in acH3 upon targeting was greater in 16B-Y-HDAC5 than in 16B-Y cells (P=0.006 using a Student’s t-test), consistent with a role for HDAC5 in silencing gene expression.
Interestingly, the acH3 levels at the INT sequence were similar between 16B-Y and 91B-Y and between 16B-Y-HDAC5 and 59B-Y-HDAC5 (Fig. 3DE, P=0.95 and P=0.25, respectively, using a Student’s t-test), suggesting that the acH3 levels are unaffected by the expansion. We conclude that PYL-HDAC5 targeting silences the lines with the shorter repeats more effectively.
The N-terminal domain of HDAC5 mediates silencing
Class I HDACs derive their catalytic activity in vitro from a conserved tyrosine residue that helps coordinate a zinc ion essential for catalysis37. By contrast, class IIa enzymes, like HDAC5, have a histidine instead of tyrosine at the analogous site, which considerably lowers HDAC activity37. In fact, the H1006Y mutant had a more than 30-fold increase in its HDAC activity compared to the wild-type enzyme37. To determine whether the catalytic activity of HDAC5 potentiates the decrease in GFP expression upon targeting, we compared the silencing activity of wild-type PYL-HDAC5, the H1006A loss-of-function mutant, and the H1006Y gain-of-function mutant by transient transfection in 40B cells, which contain the GFP-INT reporter with 40 CAGs and express ParB-ABI (Fig. 4A). The effect on silencing seen upon targeting of the wild-type PYL-HDAC5 fusion was lower when delivered by transient transfection compared to the stable cell lines. Nevertheless, under these conditions, targeting PYL-HDAC5-H1006A or PYL-HDAC5-H1006Y could both silence the transgene compared to targeting PYL alone (Fig. 4B; P=0.01 and 0.0008, respectively, using a Student’s t-test), suggesting that tampering with the catalytic activity of HDAC5 does not influence silencing of our GFP reporter. Moreover, targeting PYL fused to the catalytic domain of HDAC5 did not shift GFP expression (Fig. 4B). Indeed, we found that the silencing activity was contained within the N-terminal part of HDAC5, which characterizes class IIa enzymes. Further truncations (Fig. 4AB) are consistent with a model by which the coiled-coil domain of HDAC5, which is responsible for homo- and heterodimerization of class IIa enzymes in vitro38, contains the silencing activity.
PYL-HDAC3 targeting enhances GFP expression independently of its catalytic activity
HDAC5 is thought to mediate histone deacetylation by recruiting other HDACs, including HDAC339. Therefore, we hypothesized that PYL-HDAC3 targeting should have the same effect on GFP expression as PYL-HDAC5 targeting. To address this directly, we made a PYL-HDAC3 fusion and overexpressed it in 40B cells without targeting (Fig. S4D). We found that there was a slight decrease in GFP expression, suggesting that the construct could silence gene expression. Next, we generated nB-Y-HDAC3 cells and compared GFP intensities with and without ABA. Contrary to our initial hypothesis, we found that targeting PYL-HDAC3 in both 16B-Y-HDAC3 and 89B-Y-HDAC3 increased GFP expression by 1.5-fold (Fig. S5AB, P=0.0004 and P=0.001 using paired Student’s t-tests comparing ABA and DMSO treatments in 16B-Y-HDAC3 and 89B-Y-HDAC3, respectively). The effect appeared direct since adding ABA to GFP(CAG)0B cells transiently expressing PYL-HDAC3 did not affect GFP expression (Fig. S4E). The increase in GFP expression in nB-Y-HDAC3 cells was accompanied by an efficient targeting of the PYL-HDAC3 fusion (Fig. S5C) and an increase in acH3 levels (Fig. S5D). However, treatment with the HDAC3-specific small molecule inhibitor RGFP96640 did not affect the increase in GFP expression in either 16B-Y-HDAC3 or 89B-Y-HDAC3 cells (Fig. S5E). We conclude that targeting PYL-HDAC3 increases GFP expression independently of its HDAC activity, consistent with the observation that HDAC3 has an essential role during development that does not involve its HDAC activity41.
HDAC3 activity is required for the repeat size-specificity upon HDAC5-mediated silencing
Although HDAC3 targeting did not have the expected effect on GFP expression, evidence shows that its catalytic activity is implicated in HDAC5-mediated silencing39. To determine the potential catalytic role of HDAC3 in this context, we targeted PYL (Fig. 5A) or PYL-HDAC5 (Fig. 5B) to our GFP reporter in nB-Y and nB-Y-HDAC5 cells while cultivating the cells in the presence of RGFP966. We found that although this treatment had no effect on PYL targeting (Fig. 5A), it abolished the allele-length specificity of PYL-HDAC5 targeting, leading to a silencing efficiency of 2.4- and 2.5-fold for 16B-Y-HDAC5 and 59B-Y-HDAC5, respectively (Fig. 5B, P=0.77 using a Student’s t-test). This is in contrast to the RGFP966-free conditions, where targeting PYL-HDAC5 silenced the normal-sized allele more effectively (Fig. 3). These results suggest that HDAC3 mediates the CAG repeat size-dependency upon PYL-HDAC5 targeting.
Discussion
We presented here a novel assay to investigate the effect of a DNA sequence of interest on the efficiency of a chosen EpiEffector in altering gene expression. As an example of how the DNA context may affect the activity of an EpiEffector, we showed that expanded CAG/CTG repeats decrease the silencing efficiency of HDAC5. Moreover, we determined that this allele-length specificity depends on HDAC3 activity, highlighting the potential of PInT in uncovering unique mechanistic insights. These data provide evidence that local DNA sequence context is an important determinant of epigenome editing, independently of the efficiency or mode of targeting.
PInT could be used for many different applications. First, the intron can host sequences beyond CAG/CTG repeats. Indeed, the GFP mini gene we used here, without the targeting components, was recently used to monitor the effect of an RNA polymerase III gene on RNA polymerase II-mediated transcription42. Second, it is often difficult to determine whether a chromatin modifier changes gene expression through a local effect on chromatin structure or indirectly through changes in the transcriptome. PInT allows making that distinction thanks to its inducible nature. Indeed, we found that overexpressing PYL-HDAC5 had a small effect on gene expression at the GFP reporter and that targeting it further decreased expression. We could conclude that PYL-HDAC5 can act locally to silence the transgene. This is useful in dissecting the mechanisms of action of EpiEffectors. Third, we demonstrated, using mutants and truncations of HDAC5, that we can quickly screen for protein domains and mutants that are most effective in modulating gene expression. Thus, PInT could be used to design peptides with sufficient activity to be useful in downstream epigenomic editing applications, for example when using dCas9 fusions in vivo. A current limitation of the S. pyogenes Cas9 for in vivo applications is its large size, which is at the limit of what adeno-associated viral vectors can accommodate43. Even with the smaller orthologues, fitting a dCas9 fusion inside a gene delivery vector is a challenge. Therefore, being able to trim an EpiEffector down to a small domain may help optimize downstream applications and translation.
The observation that HDAC5 targeting has a differential effect on gene expression depending on the size of the repeat tract is surprising. Our data suggest that the deacetylase activity of HDAC3 is required for this effect. Importantly, we cannot currently rule out that RGFP966 may inhibit other HDACs that would be responsible for this effect. Nevertheless, this small molecule is highly selective for HDAC340, making this HDAC the most likely candidate for driving allele-specific silencing. HDAC3 could be setting up an asymmetry between the two size alleles in several ways. For instance, it could deacetylate histones (those residues not recognized by the pan-acetylated histone H3 antibody that we used) or non-histone proteins in the vicinity of the expanded CAG/CTG repeat prior to HDAC5 targeting. More work is required to understand further the mechanism of the repeat length-specific silencing.
Several studies have suggested that the ectopic insertion of an expanded CAG/CTG repeat in mice could induce changes in chromatin structure in the abutting sequences. An early example was the random insertion of arrays of transgenes, each carrying 192 CAGs44, which led to the silencing of the transgenes independently of the site of genomic integration. In addition, inserting a 40 kb human genomic region containing the DMPK gene along with an expansion of 600 CTGs45, or a 13.5 kb region containing the human SCA7 gene with 92 CAGs46, led to changes in chromatin marks near the expansion. It has been unclear, however, whether the presence of endogenous sequence elements, like CpG islands47 and CTCF binding sites26,48, is necessary for this effect. Our data show that 91 CAGs, without the flanking sequences normally present at the DMPK gene from which this repeat was cloned49, do not lead to significant changes in the levels of acetylated histone H3 in their vicinity. These data suggest that the flanking sequence elements play important roles in the induction and/or maintenance of heterochromatic marks surrounding expanded CAG/CTG repeats.
Recently, a number of studies have proposed that silencing the expanded repeat allele without affecting the expression of the normally sized allele may lead to a novel therapeutic approach for expanded CAG/CTG repeats50-52. However, only one factor, which is essential for mouse development52, has been identified so far. We speculate that PInT may be adapted to screen for allele length-specific silencers, which could help uncover novel therapeutic options for expanded CAG/CTG repeat disorders.
Materials and Methods
Cell culture conditions and cell line construction
The majority of the cell lines used, including all the parental lines, were genotyped by Microsynth AG (Switzerland) and found to be HEK293.2sus. They were free of mycoplasma as assayed by the Mycoplasma check service of GATC Biotech. The cells were maintained in DMEM containing 10% FBS, penicillin, and streptomycin, as well as the appropriate selection markers at the following concentrations: 15 µg ml⁻¹ blasticidin, 1 µg ml⁻¹ puromycin, 150 µg ml⁻¹ hygromycin, 400 µg ml⁻¹ G418, and/or 400 µg ml⁻¹ zeocin. The incubators were set at 37 °C with 5% CO2. Whereas FBS was used to maintain the cells, dialyzed calf serum was used at the same concentration for all the experiments presented here. The ABA concentration used was 500 µM, unless otherwise indicated. Doxycycline (dox) was used at a concentration of 2 µg ml⁻¹ in all experiments.
A schematic of cell line construction and pedigree is found in Figure S1, and the lines are listed in Table S1. This table includes the plasmids made for cell line construction. The plasmids used for transient transfections are found in Table S2. For each cell line, single clones were picked and tested for expression of ParB-ABI and PYL-fusions by western blotting using the protocol described previously30. Briefly, whole cell extracts were obtained, and their protein content was quantified using the Pierce BCA Protein Assay Kit (ThermoScientific). Proteins were then run on Tris-glycine 10% SDS PAGE gels before being transferred onto nitrocellulose membrane (Axonlab). The membranes were blocked using the Blocking Buffer for Fluorescent Western Blotting (Rockland), and primary antibodies were added overnight. Membranes were then washed, followed by the addition of the secondary antibody (diluted 1 to 2000). The fluorescent signal was detected using an Odyssey Imaging System (LI-COR). All antibodies used are found in Table S3. To assess repeat sizes, we amplified the repeat tracts using oVIN-0459 and oVIN-0460 with the UNG and dUTP-containing PCR as described53. The products were then Sanger-sequenced by Microsynth AG (Switzerland). The sequences of all the primers used in this study are found in Table S4.
The ParB-INT sequence system used here is the c2 version described previously27, except that the ParB protein was codon-optimized for expression in human cells. It is also called ANCHOR1 and is distributed by NeoVirTech. ParB-ABI (pBY-008), PYL (pAB-NEO-PYL), PYL-HDAC5 (pAB(EXPR-PYL-HDAC5-NEO)) and PYL-HDAC3 (pAB(EXPR-PYL-HDAC3-NEO)) constructs were randomly inserted and single clones were then isolated (Table S1). GFP-reporter cassettes were inserted using Flp-mediated recombination according to the manufacturer’s instructions (Thermo Scientific). Single colonies were picked and screened for zeocin sensitivity to ensure that the insertion site was correct.
Targeting assays
For targeting assays involving transient transfections, cells were plated onto poly-D-lysine-coated 12-well plates at a density of 6 × 10⁵ cells per well and transfected using 1 µg of DNA per well and Lipofectamine 2000 or Lipofectamine 3000 (Thermo Fisher Scientific). Six hours after transfection, the medium was replaced with one containing dox and ABA or DMSO. 48 hours after the transfection, the cells were split, and fresh medium with dox and ABA or DMSO was added. On the fifth day, samples were detached from the plate with PBS + 1 mM EDTA for flow cytometry analysis.
In the case of the stable cell lines, cells were seeded at a density of 4 × 10⁵ per well in 12-well plates. The medium included dox and ABA or DMSO. The medium was changed 48 hours later and the cells were left to grow for another 48 hours. The cells were then resuspended in 500 µl PBS + 1 mM EDTA for flow cytometry analysis.
Flow cytometry and analysis
We used an Accuri C6 flow cytometer from BD and measured the fluorescence in at least 12 500 cells for each treatment. The raw data was exported as FCS files and analyzed using FlowJo version 10.0.8r1.
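The fold changes in GFP expression reported in the Results can be illustrated with a short calculation. The sketch below is an assumption about the arithmetic rather than the exact analysis performed in FlowJo: it takes the mean per-cell GFP intensity as the summary statistic and expresses silencing as the ratio of the DMSO value to the ABA value. The function name `fold_decrease` and the example numbers are hypothetical.

```python
from statistics import mean
from typing import Sequence


def fold_decrease(dmso: Sequence[float], aba: Sequence[float]) -> float:
    """Return mean(DMSO) / mean(ABA) for per-cell GFP intensities.

    A value above 1 indicates lower GFP expression upon ABA addition
    (silencing); a value below 1 indicates higher expression.
    The choice of the mean as summary statistic is an assumption.
    """
    if not dmso or not aba:
        raise ValueError("both samples must contain at least one cell")
    return mean(dmso) / mean(aba)


if __name__ == "__main__":
    # Made-up intensities, for illustration only
    dmso_cells = [1000.0, 1200.0, 1100.0]
    aba_cells = [500.0, 600.0, 550.0]
    print(fold_decrease(dmso_cells, aba_cells))
```

With real data, each sample would hold the intensities of the at least 12 500 cells measured per treatment, and one fold value would be obtained per replicate.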
Chromatin immunoprecipitation
For chromatin immunoprecipitation, the cells were treated as for the targeting experiments except that we used 10 cm dishes and 4 × 10⁶ cells. After 96 hours of incubation, paraformaldehyde was added to the medium to a final concentration of 1% and the cells were incubated for 10 minutes at room temperature. The samples were then quenched with 0.125 M PBS-glycine for 5 minutes at room temperature. Samples were then centrifuged, the supernatant was discarded, and the cell pellets were washed with ice-cold PBS twice. The samples were split into aliquots of 10⁷ cells and either used immediately or stored at –75 °C for later use. Sonication was done using a Bioruptor for 25 to 30 min. DNA shearing was visualized by agarose gel electrophoresis after crosslink reversal and RNase treatment. 20% of sonicated supernatant was used per IP, with 3 μg anti-FLAG (M2, Sigma), anti-PAN acetylated H3 (Merck), or anti-IgG (3E8, Santa Cruz Biotechnology) on Protein G Sepharose 4 Fast Flow beads (GE Healthcare). The samples were incubated at 4 °C overnight and then washed with progressively more stringent conditions. After the IP, the samples were de-crosslinked and purified using a QIAquick PCR purification kit (Qiagen) and analyzed by qPCR.
Quantitative PCR
Quantitative PCR was performed with the FastStart Universal SYBR Green Master Mix (Roche) using a 7900HT Fast Real-Time PCR System in a 384-Well Block Module (Applied Biosystems™). Primers used to detect enrichment at the INT sequence and at the ACTA1 gene are listed in Table S4. Ct values were analyzed using the SDS Software v2.4. The percentage of input reported was obtained by dividing the amount of precipitated DNA for the locus of interest by the amount of DNA for the same locus in the input sample, and multiplying the result by 100.
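As an illustration of the percentage-of-input calculation, the following sketch converts Ct values into a percent input. It relies on assumptions that are not stated in the text: a perfect doubling of product per PCR cycle, and an input sample representing a known fraction of the chromatin used in the IP. The function name `percent_input` and the default `input_fraction` of 1% are hypothetical.

```python
import math


def percent_input(ct_ip: float, ct_input: float,
                  input_fraction: float = 0.01,
                  efficiency: float = 2.0) -> float:
    """Percent of input DNA recovered in an IP, computed from Ct values.

    ct_ip          -- Ct of the immunoprecipitated sample at the locus
    ct_input       -- Ct of the input sample at the same locus
    input_fraction -- fraction of the IP chromatin kept as input
                      (hypothetical default of 1%)
    efficiency     -- amplification factor per cycle (2.0 assumes
                      perfect doubling, which is an assumption)
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    # Shift the input Ct to the value expected for 100% of the chromatin
    adjusted_input_ct = ct_input - math.log(1 / input_fraction, efficiency)
    # Each cycle of difference corresponds to one amplification factor
    return 100.0 * efficiency ** (adjusted_input_ct - ct_ip)


if __name__ == "__main__":
    # An IP with the same Ct as a 1% input recovers 1% of the input
    print(percent_input(ct_ip=25.0, ct_input=25.0))
```

An IP sample that crosses the threshold one cycle earlier than the input would, under the same assumptions, correspond to twice that recovery.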
Statistics
We determined statistical significance in the targeting experiments using a two-tailed paired Student’s t-test because the samples treated with DMSO and ABA were from the same original population and treated side-by-side. For the ChIP samples, the test used was a two-tailed Student’s t-test. All the statistical analyses were done using R studio version 3.4.0. We concluded that there was a significant difference when P < 0.05.
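For readers who want to see what the paired test computes, here is a minimal sketch in Python rather than R, using only the standard library. It is an illustration, not the script used for the analyses: the function names are hypothetical, the example values are made up, and the P value is obtained by numerically integrating the Student's t density with Simpson's rule instead of calling a statistics package.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def t_density(x: float, df: int) -> float:
    """Probability density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-tailed P value, integrating the density over [0, |t|] (Simpson's rule)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)


def paired_t_test(dmso: Sequence[float], aba: Sequence[float]) -> Tuple[float, float]:
    """Two-tailed paired t-test on side-by-side DMSO and ABA measurements.

    Raises ZeroDivisionError if all paired differences are identical.
    """
    if len(dmso) != len(aba) or len(dmso) < 2:
        raise ValueError("need at least two pairs of equal-length samples")
    diffs = [a - d for d, a in zip(dmso, aba)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, two_tailed_p(t, n - 1)


if __name__ == "__main__":
    # Made-up replicate GFP intensities, for illustration only
    t_stat, p_value = paired_t_test([1000.0, 1100.0, 950.0, 1050.0],
                                    [480.0, 560.0, 500.0, 510.0])
    print(t_stat, p_value, p_value < 0.05)
```

Pairing matters here because each DMSO and ABA sample comes from the same starting population, so the test is run on the per-replicate differences rather than on the two groups separately.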
Author contributions
BY performed the experiments except for those presented in Fig. 4, which were done by ACB. ACB and LA helped BY in generating the cell lines. BY, ACB, and VD designed the experiments. BY and VD wrote the paper and prepared the figures.
Acknowledgements
We thank John H. Wilson and Kerstin Bystricky for sharing reagents as well as Fisun Hamaratoglu, Helder C. Ferreira, Ana C. Marques, Johanna E. Martin, and Nastassia Gobet for critical reading of the manuscript. This work was funded by SNSF professorship grants #144789 and #172936 to V.D.