Abstract
The methylcytosine dioxygenase Tet3 is highly expressed as a specific isoform in oocytes and zygotes but essentially absent from later stages of mouse preimplantation development. Here, we show that Tet3 expression promotes transdifferentiation of embryonic stem cells to trophoblast-like stem cells. By genome-wide analyses we demonstrate that TET3 associates with and co-occupies chromatin with RNA Polymerase II. Tet3 expression induces a global increase of transcription and total RNA levels, and its presence further enhances chromatin accessibility in regions open for transcription. This novel function of TET3 is not specific to the oocyte isoform, is independent of its catalytic activity, the CXXC domain, and its interaction with OGT, and is localised to its highly conserved exon 4. We propose a more general role for TET3 promoting open chromatin and enhancing global transcription during changes of cellular identity, separate from its catalytic function.
Introduction
The Ten-eleven translocation methylcytosine dioxygenase (TET) family of proteins catalyse the oxidation of 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC) and further oxidation products (Tahiliani et al. 2009; Ito et al. 2010, 2011; He et al. 2011). Dilution or removal of these modified bases is one mechanism by which DNA methylation can be reset in the genome (Branco et al. 2012; Kohli and Zhang 2013; Rasmussen and Helin 2016). In addition to the catalytic domain, TET1 and TET3 encode a CXXC domain, which is proposed to aid in targeting the enzymes to specific genomic regions (Xu et al. 2011, 2012; Long et al. 2013). While less studied, TET proteins also have non-catalytic roles, in particular by recruiting other chromatin and epigenetic modifiers, most notably O-linked N-acetylglucosamine transferase (OGT), to their genomic targets (Vella et al. 2013; Chen et al. 2013; Deplus et al. 2013; Ito et al. 2014).
Of the three family members, Tet3 is the least understood, partially due to its unique expression pattern. In contrast to Tet1 and Tet2, Tet3 is robustly expressed in mouse oocytes and zygotes, after which it is rapidly silenced and largely absent by the 4-cell stage (Gu et al. 2011; Iqbal et al. 2011; Wossidlo et al. 2011), coinciding with overall degradation of maternal RNA (Li et al. 2010, Fig. EV1A). In the current model, TET3 is responsible in part for demethylation of the paternal genome and also contributes to the removal of 5mC in the maternal pronucleus (Gu et al. 2011; Iqbal et al. 2011; Wossidlo et al. 2011; Santos et al. 2013; Guo et al. 2014; Peat et al. 2014; Shen et al. 2014). However, this view has recently been challenged: Amouroux and co-workers argue that initial loss of paternal 5mC is mechanistically uncoupled from 5hmC formation (Amouroux et al. 2016). Given TET3’s striking expression pattern and its proposed role in the resetting of epigenetic marks in the early embryo, it was rather unexpected that lack of TET3’s catalytic function was compatible with pre-implantation development (Gu et al. 2011; Peat et al. 2014; Tsukada et al. 2015). We were therefore curious to investigate potential other functions of TET3 unrelated to DNA demethylation.
Here, through integration of transcriptome analysis, assessment of chromatin accessibility and genome-wide chromatin-immunoprecipitation, together with proteomic analysis and global transcription/RNA assays, we provide detailed evidence that TET3 induces hypertranscription and open chromatin independent of its catalytic function. We discuss these results in the light of a role for TET3 in facilitating changes of cell identity.
Results
Tet3 is expressed during cellular transitions and promotes change of cell identity
Oocytes produce very high levels of Tet3 (Gu et al. 2011; Iqbal et al. 2011; Wossidlo et al. 2011). Analysis of RNA-seq data from oocytes (Smallwood et al. 2011) and quantitative reverse-transcription PCR indicated the existence of an oocyte specific promoter and a characteristic splice isoform of Tet3. By analysing deep RNA-sequencing (RNA-seq) data of mouse oocytes (Veselovska et al. 2015), we found that the Tet3 transcript produced in oocytes indeed originates from an upstream promoter which adds a small oocyte specific exon and mostly omits exon 2 which encodes the CXXC domain (Fig. 1A) yet retains the catalytic domain. The promoters upstream of exon 2 and exon 4 which are active in other tissues are not used in oocytes. Supporting our analysis, an oocyte specific Tet3 isoform has also recently been reported by Jin et al. (2016).
This led us to hypothesise that the oocyte specific isoform of Tet3 may have a different function to somatically expressed Tet3. Since its expression window coincides with the establishment of totipotency in vivo, we explored the possibility that expression of Tet3 is involved in the fundamental reorganisation from gametes to zygote and potentially enhances cellular potency. To test this, we ectopically expressed the oocyte specific isoform of Tet3 (Tet3 for brevity) in ES cells which normally express only very low levels of the gene (Wossidlo et al. 2011) and analysed their potential to transdifferentiate into trophoblast stem-like (TSL) cells.
ES cells are derived from the inner cell mass of the blastocyst at a developmental stage at which the first lineage segregation has already occurred: the inner cell mass is the precursor of all embryonic lineages but cannot contribute to the extra-embryonic lineages as a firm lineage barrier has already been established (Rossant 2008). Analogously, in vitro, ES cells do not readily differentiate into their extra-embryonic counterpart, trophoblast stem (TS) cells (Ng et al. 2008). However, transdifferentiation assays, which employ defined culture conditions, can coax ES cells into resembling TS cells (TS-like cells), albeit at very low frequency (Fig. 1B, C). Transdifferentiation of ES into TS-like cells can be boosted by DNA demethylation (Ng et al. 2008), the ablation of embryonic pluripotency factors such as Oct4, or ectopic overexpression of early trophoblast transcription factors including Cdx2 (Cambuli et al. 2014). We compared the transdifferentiation ability of TET3 expressing ES cells to control ES cells (E14) and well characterised transdifferentiation models (constitutive activation of the Ras GTPase or Oct4 knock-out (Cambuli et al. 2014)). In contrast to control ES cells, many colonies of TET3 expressing cells subjected to transdifferentiation conditions displayed a flat, epithelial-like morphology reminiscent of TS cells (Fig. EV1B). Additionally, the percentage of cells expressing the surface marker CD40 was substantially elevated in Tet3 overexpressing ES cells (24 % compared to 5 %), nearly reaching the level of the Ras-induced transdifferentiation (30 %, Fig. 1C). The cell surface marker CD40 is expressed in TS cells, but not in ES cells (Rugg-Gunn et al. 2012). Of note, the ability of cells to acquire features of TS-like cells was dependent on the level of TET3, with Tet3 high-expressing cells moving more towards the trophoblast lineage than Tet3 low-expressing cells (Fig. 1C, EV1C).
Interestingly, Tet3 expression is very low in ES cells (Wossidlo et al. 2011) and not substantially upregulated in TS cells (expression data from Adachi et al. 2013). TET3 is therefore unlikely to induce expression of a lineage-specific set of genes. Thus, we interpret the finding that TET3 enhances transdifferentiation from ES to TS-like cells as an ability to expand cellular potency and/or facilitate changes of cell identity.
Intriguingly, apart from its roles in early embryonic development, TET3 expression changes have also been reported in a number of seemingly unrelated studies. For example, while Tet3 levels are low in both ES and epiblast stem (EpiES) cells, its transcript is temporarily upregulated during differentiation from ES to EpiES cells (Veillard et al. 2014). Moreover, Tet3 is higher in embryos produced by somatic cell nuclear transfer compared to in vitro fertilisation in bovine (Hosseini et al. 2016), relating its upregulation to somatic reprogramming. Also, its expression was found to be beneficial for intracytoplasmic sperm injection (ICSI) outcome in humans (Ni et al. 2016). The unifying theme of these reports is the presence and potential role of Tet3 at transition stages.
Extending these observations, we monitored Tet3 expression in an experimental system in which not only the endpoints but also intermediate stages are well characterised, namely reprogramming from mouse embryonic fibroblasts (MEFs) to induced pluripotent stem (iPS) cells. Using data from Milagre et al. (2017), we found that Tet3 is strongly upregulated during early reprogramming (day 6) but almost absent in late stages (day 21, 29) and fully reprogrammed iPS cells (Fig. 1D). Importantly, at day 6, cells that will go on to be reprogrammed express high levels of Tet3, whereas those that prove refractory do not, suggesting that the peak in Tet3 expression may be required for successful initiation of reprogramming.
To further explore a potential role in cell identity transitions, we investigated Tet3 levels during primordial germ cell (PGC) development in vivo (data from Seisenberger et al. 2012). Tet3 transcript levels increase substantially from E11.5 to E16.5 (Fig. 1E). Of note, this occurs in both male and female PGCs, arguing against an early build-up of transcript for the high levels observed in the mature oocyte.
Since these examples of Tet3 expression coincide with a change, but not necessarily an increase, in cellular potency, and the molecular analysis of different Tet3 isoforms did not support a unique function for oocyte specific TET3 (see below), we propose that TET3 can facilitate changes in cell identity. The transition from highly specialised gametes to totipotent zygote would present an extreme case of reworking cellular identity, and is accompanied by the highest Tet3 expression levels.
TET3 expression increases transcription and global RNA levels
We next analysed whether the increased ability to change cell fate was a result of specific transcriptional changes. For this we performed total RNA-seq on TET3 overexpressing ES and control cells (E14 stably transfected with a Tet3 expression construct or empty vector, see Methods and Table S6). Tet3 expression levels were similar to expression levels in oocytes (Figure EV2). To our initial surprise only nine genes were robustly differentially expressed (Fig. 2A, Table S1; for analysis and interpretation of differentially expressed genes using less stringent filtering see Text S1, Fig. S1). No gene ontology groups or biological pathways were prominent, and we could not infer any biological significance of the differentially expressed genes. Interestingly, analysis of the repetitive fraction of the transcriptome showed a dramatic upregulation of MERVL endogenous retroviral elements (Fig. 2B). MERVL expression is characteristic for the transcriptome of 2-cell embryos (Kigami et al. 2003; Evsikov et al. 2004; Peaston et al. 2004) and drives a network of early embryonic genes (Macfarlan et al. 2011). While genes part of this network are generally upregulated in Tet3 overexpressing cells, they do not pass stringent thresholds for differential expression. MERVL elements are also expressed during iPSC reprogramming at a time when Tet3 expression peaks (Eckersley-Maslin et al. 2016).
Strikingly, however, we noticed a globally increased exon to intron ratio in TET3 expressing cells suggesting that there was overall more mature relative to nascent transcript (Fig. 2C). Amongst other possibilities, a higher exon/intron ratio would be observed in RNA-seq data if the cells contained more RNA in total since mature transcripts comprise the bulk of the total read count to which samples are normalised. This could be due to a global change in transcription, RNA processing and/or turnover. Unfortunately, since we had not anticipated a global effect, our experimental setup did not feature controlled cell numbers nor external references which made it impossible to discriminate between these possibilities (Percharde et al. 2017b). However, the finding prompted us to investigate these further.
We next carefully measured the RNA content in a defined number of cells using sensitive fluorimetric quantitation. Indeed, we found that TET3 expressing cells contained 1.5-fold more total RNA than control cells (29 pg/cell vs. 20 pg/cell, Fig. 2D). Additionally, to directly measure transcriptional output of TET3 overexpressing versus control cells and distinguish between an increase in transcription rate and RNA processing and/or turnover, we monitored the production of nascent RNA using a fluorescence based assay (Jao and Salic 2008). Importantly, TET3 overexpressing cells showed 2-fold higher signals for nascent RNA than control cells (Fig. 2E, F; specificity of antibody Fig. S2A). Taken together, data from RNA-seq, nascent transcription assays and quantification of total RNA demonstrate that TET3 substantially increases transcriptional output and raises the RNA content per cell.
TET3 binds to chromatin regions occupied by RNA Polymerase II
We next asked which genomic regions were bound by TET3 using chromatin immunoprecipitation followed by sequencing (ChIP-seq) in ES cells ectopically expressing TET3. Binding was widespread throughout the genome and predominantly occurred at promoter regions and CpG islands, but was also enriched at active enhancers and transcribed genes (Fig. 3A, B; antibody specificity Fig. S2B).
Intriguingly, we noted that TET3 was enriched at genomic features known to be bound by RNA Polymerase II (PolII; Fig. EV3; PolII ChIP data from Rahl et al. 2010). Furthermore, the general binding pattern on the genic as well as the genomic level was remarkably similar (Fig. 3B). We therefore explored whether there was a quantitative relationship between TET3 and PolII binding. Indeed, promoters that were not bound by PolII also did not show TET3 occupancy, while progressively higher PolII binding was associated with higher TET3 occupancy (Fig. 3C, D). Of note, TET3 and PolII not only co-occupied genomic regions highly bound by the respective proteins, but also regions which showed moderate PolII enrichment, such as gene bodies of actively transcribed genes (Fig. 3B, E). Co-occupancy of genomic regions by TET3 and PolII was confirmed by rapid immunoprecipitation mass spectrometry of endogenous proteins (RIME; Mohammed et al. 2013, 2016), in which immunoprecipitation of TET3 co-purified several subunits of the PolII complex (Fig. 3F, Table S2, antibody specificity Fig. S2C).
TET3 enhances chromatin accessibility in RNA Polymerase II bound regions
Prompted by the global increase in transcription and RNA levels, we investigated chromatin accessibility by performing ATAC-seq. This assay exploits the preferential activity of transposase on exposed stretches of DNA in chromatin (Buenrostro et al. 2013). Overall, the pattern of accessible chromatin was very similar between TET3 overexpressing and control ES cells (Fig. 4A, Fig. EV4), which is in line with the observation that the transcriptional pattern does not change dramatically between the samples. Importantly, however, accessible chromatin tended to have a higher ATAC-seq signal in TET3 overexpressing cells (Fig. 4A). Accordingly, for the vast majority of accessible regions identified by MACS peak calling (Zhang et al. 2008) in both sample groups, the ATAC-seq signal was higher in TET3 overexpressing cells (Fig. 4B). When MACS peaks were called independently for control and TET3 overexpressing cells, 85 % of control peaks were found in both samples (8893 out of 10451, Fig. 4C), with an additional 9920 peak regions called in TET3 overexpressing cells. Upon closer inspection, the latter were regions of moderate chromatin accessibility which, while present, did not pass the peak threshold in control cells (Fig. EV4). Moreover, chromatin was significantly more accessible in TET3 overexpressing cells at regions of high PolII occupancy (Fig. 4D), where chromatin is open for transcriptional activity (Fig. 4A) which underlines the link between TET3 and PolII in modulating global transcription. Taken together, ChIP and ATAC-seq results indicate that TET3 colocalises with PolII at open chromatin and enhances accessibility of those regions.
The global effect of TET3 on transcription is independent of catalytic activity and CXXC domain but encoded by evolutionarily conserved exon 4
TET3 is best known for its ability to convert 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC) and its role in DNA demethylation (Pastor et al. 2013). We therefore investigated whether TET3’s ability to globally enhance transcription was linked to DNA cytosine methylation or its oxidation products. Surprisingly, forced expression of TET3 in ES cells did not result in global changes of 5mC or 5hmC levels (Fig. 5A), which is likely due to the naturally high abundance of TET1 and TET2 in these cells (Yue et al. 2014). To further explore the relationship with DNA methylation we overexpressed a deletion mutant lacking the entire catalytic domain (TET3trunc, Fig. 5B, Table S3). Remarkably, this caused the same global shift in exon/intron ratios as full-length Tet3 (Fig. 5C). In contrast, the catalytic domain on its own did not induce the characteristic transcriptional changes (Fig. EV5A). Moreover, like TET3, TET3trunc was also able to enhance conversion of ES to TS-like cells in a transdifferentiation assay (Fig. EV5B) suggesting that catalytic activity was not required to facilitate cellular transitions. Additionally, to rule out the possibility that catalytic activity was recruited through interaction with other TET proteins, we introduced TET3trunc into ES cells deleted for all three TET proteins (TET triple knock out cells, TET TKO, Dai et al. 2016). Analysis of total nascent RNA showed a global increase of transcription in these cells (Fig. 5D). We conclude that the observed effects uncover a novel function of TET3 which is independent of the protein’s catalytic activity or domain.
TET3 has previously been shown to partner with O-linked N-acetylglucosamine transferase (OGT), thereby influencing transcription independent of its oxidising function (Vella et al. 2013; Chen et al. 2013; Deplus et al. 2013; Ito et al. 2014). We confirmed TET3’s strong interaction with OGT by RIME (Table S2); however, this interaction was lost upon deletion of the catalytic domain (Fig. EV5D, Table S4, Deplus et al. 2013; Ito et al. 2014). Furthermore, genes identified by RNA-seq as upregulated were still activated by TET3 upon OGT knock-down (Fig. EV5E). Thus, the novel function described here is not mediated by the interaction of TET3 and OGT.
We next aimed to determine if the TET3 variant produced in oocytes was functionally different from the somatic variant. The most striking difference between the protein variants is the presence or absence of a CXXC domain encoded in exon 2 (Fig. 1A, 5B, Table S3). While in somatic tissues the CXXC domain is predominantly present (referred to as TET3CXXC), this exon is skipped in the Tet3 transcript found in oocytes (referred to as TET3). CXXC domains bind CpG rich sequences in DNA (Long et al. 2013) and have thus been proposed to be responsible for TET protein targeting. Strikingly, when we compared genomic binding profiles of TET3 and TET3CXXC by ChIP-seq, we found them to be virtually identical (Fig. 5E, F, EV5C), arguing that the CXXC domain does not influence targeting of TET3 in this system. Moreover, forced expression of TET3CXXC induced the same transcriptional changes as TET3 (Fig. EV5F) for a panel of genes affected by TET3 overexpression.
We attempted to further narrow down the region responsible for TET3’s transcriptional function. Tet3trunc consisting of the oocyte exon, and exons 3 and 4 (Fig. 5B) was further truncated to exon 4, as the other two encode only 11 and 19 amino acids, respectively. Indeed, specific overexpression of exon 4 alone was able to elicit the same transcriptional effect on genes affected by overexpressing full-length Tet3 (Fig. EV5G). Interestingly, while generally exons from Tet genes are similarly conserved across placental mammals, exon 4 displays much higher conservation scores than the corresponding exons in Tet1 and Tet2 (Fig. 5G, EV5H). Furthermore, comparison of TET1/2/3 at the protein level revealed that while the amino acid sequences of their catalytic domains are very similar, the sequences encoded by the large exon 4 are not (Fig. 5H). Overall, this supports the idea that Tet3 exon 4 carries a conserved function that is unique within the TET proteins.
Discussion
Defining and preserving cell identity is crucial for multicellular organisms. Once established, cell identity is protected by several levels of epigenetic regulation including DNA methylation, chromatin modifications and remodelling, and spatial arrangement of the genome (Barrero et al. 2010; Allis and Jenuwein 2016). However, cell identity must not be immutable as the life cycle of an organism requires cells to transition between states. Often these cell fate transitions entail changes in developmental potency of the cell and are accompanied by major alteration of the chromatin environment and transcriptional landscape (Chen and Dent 2014; Lee et al. 2014; Apostolou and Hochedlinger 2013). The beginning of a new generation marks the most extreme case of cell identity change: Upon fusion of two highly specialised, transcriptionally quiescent cells, the totipotent zygote is created which completely re-organises the parental genomes to start its own transcriptional program (Clift and Schuh 2013; Borsos and Torres-Padilla 2016; Percharde et al. 2017a). Much more frequent than this unique event, however, are cell identity changes during differentiation that restrict cellular potency. These occur for example at the exit from pluri- or multipotency during development and adult life of an organism (Krishnakumar and Blelloch 2013; Lee et al. 2014; Soufi and Dalton 2016). A common theme to all these transitions is widespread restructuring of chromatin and the rewiring of transcriptional networks. It has recently been proposed that during such transitions, global transcription is temporarily upregulated, a phenomenon termed hypertranscription (Percharde et al. 2017a). Here, we provide evidence that TET3 increases transcription, RNA levels and chromatin accessibility genome-wide and may thereby be involved in promoting changes of cell identity.
In this manuscript we show examples of Tet3 being expressed or upregulated during major cellular transitions. Tet3’s prominent presence in very early embryos can be viewed in this light: Highly abundant Tet3 transcript in oocytes/zygotes is rapidly and completely depleted by the 4-cell stage (Tan and Shi 2012) compatible with a function during this profound cellular reorganisation that later on is no longer required and possibly detrimental. Interestingly, global transcriptional activation mediated by TET3 also resulted in the upregulation of MERVL elements which orchestrate expression of a network of early embryonic genes (Macfarlan et al. 2011) critical for early preimplantation development (Kigami et al. 2003). This is in line with the possibility of TET3 playing a role in zygotic genome activation. Other examples of TET3 upregulation when cells undergo identity changes include exit from pluripotency (Veillard et al. 2014), iPS reprogramming (Milagre et al. 2017), SCNT reprogramming (Hosseini et al. 2016) and PGC specification (Seisenberger et al. 2012). In fact, PGCs have recently been shown to exhibit increased RNA levels, elevated transcription, increased cell size and upregulation of ribosomal protein transcripts (Percharde et al. 2017b), features that are mirrored in Tet3 expressing ES cells. An active role rather than a mere correlation is supported by the ability of ectopically expressed TET3 to enhance transdifferentiation from ES to TS cells, a system in which TET3 is not normally present. Our findings support the proposal that global hypertranscription may be an important component of cell identity changes during development (Percharde et al. 2017a).
Our results group TET3 with other factors globally enhancing transcription. For example, the chromatin remodeller Chd1 has been shown to enhance the activity of PolII by removing nucleosomal barriers (Skene et al. 2014; Guzman-Ayala et al. 2015). Upon deletion, global transcriptional output is reduced and embryos fail to sustain epiblast development (Guzman-Ayala et al. 2015). In a mechanistically different fashion, the global regulator c-Myc amplifies transcription in cancer cells by stimulating transcriptional pause release via P-TEFb (Rahl et al. 2010). Interestingly, hypertranscription in PGCs is also dependent on Myc/Max and P-TEFb (Percharde et al. 2017b). It will be interesting to explore further whether TET3 uses the same pathways or acts in a parallel manner.
In contrast to our expectations, we found that the transcriptional changes brought about by the oocyte and somatic isoforms of Tet3, as well as their chromatin occupancy were almost indistinguishable. This argues against a specific function of TET3 in the oocyte/zygote but supports the idea that TET3 generally promotes cellular transitions. Like many genes, Tet3 has an oocyte specific promoter (Veselovska et al. 2015) which likely functions to ensure temporally restricted high expression of Tet3 rather than producing a protein with a different function.
TET3 is clearly a multifunctional protein and its oxidase activity is only one mode of action. Several other reports have shown additional non-catalytic functionalities of the TET proteins (for an overview see Lian et al. 2016, Figure 3). Most notably, like TET1 and TET2, TET3 interacts strongly with O-linked N-acetylglucosamine transferase (OGT) and plays an important role in the recruitment of OGT to chromatin, where it influences transcription through GlcNAcylation of histones and chromatin modifying complexes such as SET1/COMPASS (Vella et al. 2013; Chen et al. 2013; Deplus et al. 2013; Ito et al. 2014). However, the transcription and chromatin changes induced by TET3 are independent of its association with OGT. TET3 has also been shown to interact with REST and H3K36 methyltransferases to mediate transcriptional activation (Perera et al. 2015). Additionally, TET3 activity is regulated by CRL4 (Yu et al. 2013) and its intracellular location affected by post-translational modification through OGT (Zhang et al. 2014). It is noteworthy that the transcriptional function of Tet3 is independent of its catalytic and CXXC domains since other previously reported non-catalytic activities of TET proteins are still mediated by the catalytic domain (Deplus et al. 2013; Yu et al. 2013; Zhang et al. 2014; Ito et al. 2014).
Interestingly, and presumably as a result of the multiple global activities of TET3, previous studies have reported different and sometimes contradictory loss of function effects. The comparison is potentially confounded by the use of different knock-out strategies which have deleted different parts of the transcript (Table S5). Given the evidence for a functional truncated transcript in vivo (Ensembl transcript Tet3-002, ENSMUST00000056191.1, HAVANA project), certain knock-out designs may not result in the absence of the entire protein and therefore preserve some of the non-catalytic functions of TET3. For instance, the knock-out generated in our lab produces a frame shift and premature stop codon upstream of the catalytic domain (Santos et al. 2013; Peat et al. 2014), and while this completely abolishes catalytic activity, a truncated transcript including exon 4 is still present and translated. Two separate studies have deleted exon 4 (referred to as exon 3 in Tsukada et al. 2015 and exon 2 in Kang et al. 2015). Interestingly, Tsukada and colleagues report that TET3 contributes to the fine-tuning of zygotic transcription after DNA synthesis, although their findings point to an inhibitory effect. Kang and colleagues report increased transcriptome variability in Tet1/3 double knock-outs in individual 8-cell blastomeres and blastocysts, concomitant with variable delayed or aborted development.
Taken together, our findings support TET3 as a multifaceted protein with potentially complementary functions; whereas the C-terminal half promotes the resetting of epigenetic marks, we provide evidence that the largely uncharacterised exon 4 can globally increase transcription. While in vitro temporary hypertranscription is sufficient to facilitate cellular transitions, in vivo it may function in concert with the removal of DNA methylation and deposition of chromatin modifications. We propose that TET3’s combined functions create a favourable chromatin environment for transitioning between cell identities. Importantly, we envision TET3 as a facilitator rather than a driver of change, which still allows the protein to exert its catalytic function in systems not primed for change, for example the brain (Hahn et al. 2013). It is clear we are only beginning to understand the complex interplay of chromatin remodelling, epigenetic marks on DNA, and global transcription that enables change of the epigenetically well-guarded cellular identity.
Materials and Methods
Cloning of Tet3 variants
Tet3 variants were cloned from cDNA using MII oocytes (oocyte specific Tet3) or embryoid bodies (Tet3CXXC) and recombined into pDONR221 (Invitrogen) by Gateway cloning. Several expression constructs were used: Constitutive strong expression of the Tet3 transgene was achieved using vectors based on PB-DST-BSD which was a kind gift from Jose Silva. Transgenes can be inserted into this backbone via Gateway cloning and can then be expressed from a CAG promoter. Resulting plasmids from the pIG300 series (this study) do not contain a fluorescent marker, plasmids from the pIG400 series (this study) create a C-terminal eGFP fusion for the transgene. Inducible Tet3 expression was achieved using constructs based on PB_TAG_PB_tetO2_iresGFP which was a kind gift from Peter Rugg-Gunn (pIG200 series in this study). Transgenes can be inserted via Gateway cloning, and expression is driven by a tetracycline responsive CMV minimal promoter with eGFP being expressed from an internal ribosome entry site (IRES). A list of plasmids used for the generation of cell lines is provided in Table S6 and sequences are available on request.
Cell culture and cell line construction
E14 mouse embryonic stem cells were grown under standard serum/LIF conditions (DMEM, 4,500 mg/l glucose, 4 mM L-glutamine, 110 mg/l sodium pyruvate, 15 % fetal bovine serum, 1 U/ml penicillin, 1 mg/ml streptomycin, 0.1 mM nonessential amino acids, 50 mM β-mercaptoethanol, and 1000 U/ml LIF). TET3 KO cells and the parental cell line were a gift from Fabio Spada and Guo-Liang Xu and were cultured under the following conditions: DMEM, 4,500 mg/l glucose, 4 mM L-glutamine, 110 mg/l sodium pyruvate, 20 % fetal bovine serum, 1 U/ml penicillin, 1 mg/ml streptomycin, 0.1 mM nonessential amino acids, 50 mM β-mercaptoethanol, 2 mM GlutaMAX (Gibco), 1000 U/ml LIF, 1 μM PD0325901 and 3 μM CHIR99021. E14 ES lines harbouring Tet3 variants were generated by transfection with FuGENE6 (Promega), plasmid integration via the piggyBac system (Kim and Pyykko 2011) and selection with the relevant drug. Control cell lines carry the empty vector and were selected in the same way. For the inducible expression system, piggyBac vectors were co-transfected with a plasmid carrying a reverse tetracycline responsive transactivator (rtTA2s-M2). Unless otherwise indicated, the pool of stable integrants was used, and TET3 expressing cells were enriched by flow sorting for GFP fluorescence using a BD Influx High-Speed Cell Sorter. For the generation of clonal lines, individual colonies were picked and expanded.
RNA isolation, qPCR and total RNA-sequencing
RNA was isolated using Qiagen RNeasy Micro columns and treated with DNaseI (Ambion). cDNA was generated using 0.5 – 1 μg RNA (Invitrogen SuperScript II) and qPCR performed using the Brilliant III SYBR mix (Agilent Technologies). Relative quantification was performed using the comparative CT method with normalisation to Hsp90 and Atp5b levels. Primer sequences are available on request. Opposite strand-specific total RNA libraries with RiboZero rRNA depletion (Illumina TruSeq Kit) were prepared from 200 ng – 1 μg DNase treated RNA by the Sanger Institute Illumina bespoke pipeline. Sequencing was performed as paired-end 75 bp runs on the Illumina HiSeq 2500 Rapid Run platform. At least three biological replicates were used, generating 28-177 × 10⁶ mapped reads per sample (Table S7).
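The comparative CT calculation mentioned above can be illustrated with a minimal sketch. It assumes 100 % amplification efficiency and averages the CT values of the two reference genes, which corresponds to the geometric mean of their expression levels; the text does not state how the two references were combined, so this choice, the function name and all CT values are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target, ct_refs, ct_target_ctrl, ct_refs_ctrl):
    """Fold change of a target gene versus a control sample (2^-ddCT).

    ct_refs / ct_refs_ctrl hold the CT values of the reference genes
    (here standing in for Hsp90 and Atp5b) in sample and control.
    """
    # Averaging CT values equals taking the geometric mean of the
    # reference expression levels (assumption, not stated in the text).
    d_ct_sample = ct_target - mean(ct_refs)
    d_ct_control = ct_target_ctrl - mean(ct_refs_ctrl)
    dd_ct = d_ct_sample - d_ct_control
    # Assumes perfect doubling per cycle (100 % efficiency).
    return 2.0 ** (-dd_ct)


# Hypothetical CT values for illustration only.
fold = relative_expression(24.0, [18.0, 20.0], 27.0, [18.0, 20.0])
print(fold)
```

A target that crosses the threshold three cycles earlier than in the control, at equal reference levels, is reported as an eight-fold increase.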
hmC and mC mass spectrometry
Approximately 350 ng DNA was digested to single nucleotides using the DNA Degradase Plus kit (Zymo). LC-MS/MS was performed on a Q-Exactive mass spectrometer (Thermo Scientific) fitted with a nanoelectrospray ion-source (Proxeon). Mass spectral data were acquired in MS/MS mode at a relative collision energy of 10%, selecting the parent ions at m/z 228.1 (C), 242.1 (mC) and 258.1 (hmC) with a 1 amu isolation width. Fragment ion spectra were acquired over the m/z range 100-300 at a nominal resolution setting of 35,000 (at m/z 200), and peak areas for the fragment ions 112.0505 (C), 126.0662 (mC) and 142.0611 (hmC) were obtained from extracted ion chromatograms of the relevant scans.
RNA-seq analysis
Total RNA-Seq reads were trimmed using Trim Galore v0.4.2 (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) using default parameters to remove the standard Illumina adapter sequence. They were then mapped to the mouse GRCm38 genome assembly using Tophat 2.0.12 guided by the gene models from the Ensembl v70 release.
Figure 2: Prior to RNA isolation cell lines carrying different Tet3 constructs were enriched for cells with TET3 expression by flow sorting for GFP fluorescence using a BD Influx High-speed Cell Sorter. Data analysis was performed in SeqMonk (http://www.bioinformatics.babraham.ac.uk/projects/seqmonk/) using the following parameters: Reads in exons were quantitated with the RNA-seq quantitation pipeline, counts normalised per million total reads per sample and log2 transformed. Raw counts were used for DESeq2 analysis (Love et al. 2014). Differentially expressed genes were identified using the intersection of DESeq2 statistics (FDR<0.05) and an intensity difference filter (p<0.05 with Benjamini and Hochberg multiple testing correction, https://www.bioinformatics.babraham.ac.uk/projects/seqmonk/Help/5%20Filtering/5.2%20Statistical%20Filters/5.2.4%20Intensity%20Difference%20Filter.html). Functional enrichment analysis was performed using g:profiler (http://biit.cs.ut.ee/gprofiler/).
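As a rough illustration of the quantitation and filtering logic for Figure 2, the sketch below normalises exon read counts per million total reads, log2 transforms them, and keeps only genes that pass both significance filters. It is not SeqMonk's or DESeq2's implementation: the pseudocount, the function names and the gene identifiers are hypothetical, and the FDR and p-values are assumed to have been computed elsewhere.

```python
import math


def log2_cpm(exon_counts, total_reads, pseudocount=0.01):
    """Log2 of counts per million total reads for each gene.

    The pseudocount avoids log2(0); its value is an assumption.
    """
    scale = 1e6 / total_reads
    return {
        gene: math.log2(count * scale + pseudocount)
        for gene, count in exon_counts.items()
    }


def intersect_hits(deseq2_fdr, intensity_p, alpha=0.05):
    """Genes significant in both tests (both inputs already corrected)."""
    deseq2_hits = {g for g, q in deseq2_fdr.items() if q < alpha}
    intensity_hits = {g for g, p in intensity_p.items() if p < alpha}
    return deseq2_hits & intensity_hits


# Hypothetical values for illustration only.
hits = intersect_hits(
    {"geneA": 0.01, "geneB": 0.20, "geneC": 0.03},
    {"geneA": 0.04, "geneB": 0.01, "geneC": 0.50},
)
print(sorted(hits))
```

Requiring both filters is the more conservative choice: a gene called by only one of the two tests is not reported as differentially expressed.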
Figure S1: Differentially expressed genes were identified using the intersection of DESeq2 statistics (FDR<0.05) and a dynamic fold-change filter (https://www.bioinformatics.babraham.ac.uk/projects/seqmonk/Help/5%20Filtering/5.2%20Statistical%20Filters/5.2.4%20Intensity%20Difference%20Filter.html). For the comparison of the groups ‘up’, ‘down’ and ‘random’, only genes showing a minimum coverage by RNA-seq reads (log2 read count > -2) were used. For quantitative data (e.g. gene length), two random sets of genes were generated, of which one is shown (random samples were very similar throughout). For categorical data (e.g. fraction of paused genes), three random sets of genes were generated and the average with standard deviation is shown. Statistical analysis was performed in SeqMonk or GraphPad Prism 6.
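The random control sets for categorical data can be sketched as follows. This is a minimal illustration, assuming simple random sampling without replacement from the genes that pass the coverage threshold; the function names, the seed and the example data are hypothetical and do not reproduce the tools actually used.

```python
import random
from statistics import mean, stdev


def covered_genes(log2_counts, min_log2=-2.0):
    """Genes passing the minimum coverage filter (log2 read count > -2)."""
    return sorted(g for g, v in log2_counts.items() if v > min_log2)


def random_set_fraction(background, flagged, set_size, n_sets=3, seed=0):
    """Mean and standard deviation of the flagged fraction over random sets."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    fractions = []
    for _ in range(n_sets):
        sample = rng.sample(background, set_size)
        fractions.append(sum(g in flagged for g in sample) / set_size)
    return mean(fractions), stdev(fractions)


# Hypothetical data: 100 genes, 10 of them below the coverage threshold.
log2_counts = {f"gene{i}": (-3.0 if i < 10 else 1.0) for i in range(100)}
background = covered_genes(log2_counts)
paused = set(background[:30])  # stand-in for a 'paused genes' category
avg, sd = random_set_fraction(background, paused, set_size=20)
print(len(background))
```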
ChIP-Seq
Chromatin immunoprecipitation was performed essentially as described in (Schmidt et al. 2009). Six 15 cm dishes of ES cell lines expressing different TET3 isoforms (301, 305, 307, see Table S6) were cross-linked and harvested. 10 μg of a TET3 antibody developed with Millipore (ABE290) was used per precipitation, and cross-linked, sonicated nuclear material was used as input control. Libraries for Illumina sequencing from precipitated DNA were prepared using the Diagenode MicroPlex Library Preparation Kit and were sequenced on 2 lanes of an Illumina HiSeq2500 as 50 bp single-end reads. For samples and their respective read counts see Table S7.
ChIP-Seq reads were trimmed using Trim Galore v0.4.2 (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) using default parameters to remove the standard Illumina adapter sequence. They were mapped to the mouse GRCm38 genome assembly using Bowtie 2 (v2.2.5, default parameters).
Figure 3: Data analysis was performed in SeqMonk (http://www.bioinformatics.babraham.ac.uk/projects/seqmonk/) using the following parameters: Reads were quantitated in running windows (1 kb), counts normalised to the total number of sequences per sample and log2 transformed. Probes were filtered for those showing at least some background binding (normalised log2 read count per 1 kb >3). ChIP-seq data for N-terminal RNA PolII was taken from (Rahl et al. 2010). Feature analysis was performed using the following annotations: promoters: 500 bp upstream of mRNA; enhancers: H3K4me1 signature from (Creyghton et al. 2010); active enhancers: H3K27ac signature from (Creyghton et al. 2010); active genes: promoters bound by RNA PolII (data from Rahl et al. 2010) and transcribed (RNA-seq data from this study); inactive genes: not bound by RNA PolII (data from Rahl et al. 2010) and not transcribed (RNA-seq data from this study). Statistical analysis was performed in SeqMonk or GraphPad Prism 6.
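The window-based quantitation can be illustrated with a short sketch. It assumes non-overlapping 1 kb windows, assigns each read to a window by its start position, and scales counts per million reads; the text only states that counts were normalised to the total number of sequences, so the scaling constant, the read-assignment rule and all names here are assumptions rather than SeqMonk's implementation.

```python
import math
from collections import Counter


def window_log2_counts(read_starts, total_reads, window=1000, per=1e6):
    """Log2 normalised read count per window.

    read_starts: iterable of (chromosome, 0-based start position) tuples.
    Windows without reads are omitted because log2(0) is undefined.
    """
    counts = Counter((chrom, pos // window) for chrom, pos in read_starts)
    scale = per / total_reads
    return {key: math.log2(n * scale) for key, n in counts.items()}


def filter_background(windows, threshold=3.0):
    """Keep windows with at least background-level signal (log2 count > 3)."""
    return {key: value for key, value in windows.items() if value > threshold}


# Hypothetical reads: 16 in the first window of chr1, 4 in the second.
reads = [("chr1", 10)] * 16 + [("chr1", 1500)] * 4
windows = window_log2_counts(reads, total_reads=1_000_000)
kept = filter_background(windows)
print(sorted(kept))
```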
ATAC-seq
ATAC-seq was performed as per (Buenrostro et al. 2013) using 10,000 cells for each of three biological replicates and 15 cycles of PCR amplification. Samples were barcoded, pooled and sequenced across two lanes of a HiSeq2500 as 75 bp paired-end reads. For samples and their read counts see Table S7. ATAC-seq reads were quantified as log2 read count per million reads over 1 kb running windows. Visualisation, quantitation and statistical analysis were performed in SeqMonk and GraphPad Prism 6.
Conservation analysis
Conservation scores across placental mammals for Tet family members were obtained using the PhastCons track in the UCSC genome browser (Siepel et al. 2005), and a zoomed-in view of Tet3 exon 4 and the corresponding exons of other Tet genes is shown in Fig. 4. Similarity at the protein level between TET proteins was analysed using plotcon (http://emboss.sourceforge.net/apps/release/6.6/emboss/apps/plotcon.html).
Nascent transcription imaging assay
Newly synthesised RNA was visualised using the Click-iT Plus Alexa Fluor Picolyl Azide Toolkit (Molecular Probes) according to the instructions of the manufacturer. In brief, cells were pulsed with 5-ethynyl uridine (EU) at 1 mM for 20 minutes and then fixed in 2 % formaldehyde for 30 min at room temperature. Cells were cytospun onto poly-lysine slides and permeabilised using PBS/0.5% Triton X-100 for 30 min. TET3 was visualised by immunofluorescence using anti-TET3 antibody (ABE290, Millipore, 1:500) as primary and anti-rabbit Alexa Fluor 647 (A31573, Molecular Probes, 1:500) as secondary antibody. EU incorporated into RNA was labelled with picolyl azide Alexa Fluor 555 via Click-iT reaction using a 1:1 ratio of CuSO4 and copper protectant. DNA was stained using DAPI (1:1000). Imaging was done on a Nikon A1R+ resonant scanning confocal microscope and signal intensities were quantified using Volocity software. To exclude effects due to slide-to-slide variation of signal intensity, TET3 expressing and control cells were mixed on slides at the beginning of the experiment and later identified by TET3 immunofluorescence. EU pulsing was performed twice and up to five slides were stained per experiment. Absolute fluorescence differed between slides (as expected) but overall results were consistent. Statistical analysis was done with GraphPad Prism 6 and signal distributions on one slide are shown.
Quantification of RNA content
Defined numbers of TET3 expressing and control cells were collected by flow cytometry using a BD Influx High-speed Cell Sorter. Five independent sorts were performed on cells from different passages and the same number of cells collected for both cell lines. Numbers per experiment were 116, 200, 200, 180 and 181 × 10³. For each cell line only cells showing high GFP fluorescence were collected with gates constant between cell lines and experiments. RNA was isolated using Qiagen RNeasy Micro Kit according to the instructions of the manufacturer (which includes DNase treatment). RNA was quantitated as 1:100 and 1:200 dilutions using the Qubit RNA HS Assay Kit (Thermo Fisher Scientific).
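To make the arithmetic behind a per-cell RNA estimate explicit, the sketch below converts a concentration measured on a diluted sample into picograms of RNA per sorted cell. The elution volume, the readings and the averaging of the two dilutions are hypothetical assumptions; the text gives only the dilutions and the cell numbers.

```python
from statistics import mean


def rna_per_cell_pg(conc_ng_per_ul, dilution_factor, elution_volume_ul, n_cells):
    """RNA per cell in pg from the concentration of a diluted sample."""
    # Back-calculate the undiluted concentration, then the total yield.
    total_ng = conc_ng_per_ul * dilution_factor * elution_volume_ul
    return total_ng * 1000.0 / n_cells  # 1 ng = 1000 pg


def mean_rna_per_cell_pg(readings, elution_volume_ul, n_cells):
    """Average over several (concentration, dilution factor) readings."""
    return mean(
        rna_per_cell_pg(conc, dilution, elution_volume_ul, n_cells)
        for conc, dilution in readings
    )


# Hypothetical readings for 200 x 10^3 sorted cells eluted in 14 ul:
# 0.2 ng/ul at 1:100 and 0.1 ng/ul at 1:200.
estimate = mean_rna_per_cell_pg([(0.2, 100), (0.1, 200)], 14.0, 200_000)
print(round(estimate, 3))
```

Because equal cell numbers were collected for both cell lines in each sort, a difference in this value between TET3 expressing and control cells reflects RNA content per cell rather than input.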
Transdifferentiation assay
TS base media consisting of RPMI 1640 (Gibco) supplemented with 20 % FBS (Gibco), 1 mM sodium pyruvate (Gibco), 50 U/ml penicillin-streptomycin (Gibco) and 0.05 mM beta-mercaptoethanol (Gibco) was conditioned by incubation with irradiated MEF cells on cell culture dishes for two days and then passed through a 0.22 μm filter. Complete TS cell medium was prepared by combining 70 % conditioned media, 30 % TS base media, 20 ng/ml beta-foetal growth factor and 1 μg/ml heparin. The indicated ES cell lines were seeded at a density of 400 cells/cm² onto a layer of irradiated MEF cells in full TS medium without antibiotic selection. Media was changed every 48 hours. After six days, cells were first visually examined by phase-contrast microscopy and then harvested with trypsin. Cells were incubated with anti-CD40 antibody (AF440, R&D systems, 1:50) for 30 min, then washed and incubated with anti-goat Alexa Fluor 647 antibody (A21447, Thermo Fisher Scientific, 1:500) and anti-THY1-PE (12-0900-81, eBioscience, 1:2500, feeder detection). To assess cell viability, 1 μg/ml DAPI (Sigma) was added. Immunofluorescent signal was then analysed on a BD LSRFortessa flow sorter.
RIME
RIME proteomics experiments were performed as previously described (Mohammed et al. 2013, 2016). For SILAC RIME experiments, E14 ES cells were grown in R/K deficient SILAC DMEM (PAA E15-086) supplemented with 800 μM L-Lysine ¹³C₆¹⁵N₂ hydrochloride and 482 μM L-Arginine ¹³C₆¹⁵N₄ hydrochloride (Sigma-Aldrich) for “heavy”-labelled media or 800 μM L-Lysine ¹²C₆¹⁴N₂ hydrochloride and 482 μM L-Arginine ¹²C₆¹⁴N₄ hydrochloride for “light”-labelled media. Table S2 contains metrics for proteins that were identified by proteomics after TET3 pull-down but not in the IgG control. Table S4 contains proteins identified by SILAC RIME and the log2 ratio of peptide counts of TET3 over TET3trunc samples.
Accession numbers
A GEO reviewer link has been created for record GSE94688.
Conflict of interest
CK, JP, TH and WR are named inventors on the patent application WO2014096800 entitled “Novel Method” filed by the Babraham Institute with priority date 17th December 2012.
Author contributions
CK, JP and WR conceived and designed the study. CK performed experiments, analysed data and wrote the paper. JP performed experiments and analysed data. ME-M performed experiments, analysed data and edited the paper. TH and JP characterised the oocyte isoform of Tet3 and did evolutionary comparisons. HM performed RIME experiments. SA helped analyse sequencing data. WD and WR helped interpret data and supervised the study. WR edited the paper.
Acknowledgements
We thank Ines Milagre, Lenka Veselovska and Sebastien Smallwood for sharing RNA-seq data before publication. We would also like to thank Francesco Cambuli, Dominika Dudzinska and Myriam Hemberger for reagents and advice on the transdifferentiation assays, Felix Krueger for bioinformatics processing, David Oxley for mass spectrometry analysis of nucleosides, Hanneke Okkenhaug for quantification of fluorescent imaging signals and Anne Segons-Pichon for statistical advice. We thank Millipore for the generation of the TET3 antibody, and Fatima Santos for initial tests. We thank the Wellcome Trust Sanger Institute Bespoke Sequencing Team for RNA-seq library preparation and sequencing, the Proteomics Core Facility at the Cancer Research UK Cambridge Institute and the Babraham Institute Flow Cytometry Facility. We thank Fabio Spada and Guo-Liang Xu for TET3 TKO cells. We are grateful to Irene Min and John Lis for sharing their lists of genes for RNA Polymerase pausing categories. We thank all members of the Reik lab for helpful discussions. This work was funded by the Wellcome Trust (095645/Z/11/Z) and the BBSRC (BB/K010867/1). ME-M is supported by a Marie Sklodowska-Curie Individual Fellowship and JP by the Rutherford Foundation Trust.