Summary
The mammalian FACT complex is a highly conserved histone chaperone with essential roles in transcription elongation, histone deposition, and maintenance of stem cell state. FACT is essential for viability in pluripotent cells and cancer cells, but otherwise dispensable for most mammalian cell types. FACT deletion or inhibition can block reprogramming of fibroblasts to induced pluripotent stem cells, yet the molecular mechanisms through which FACT regulates cell fate decisions remain unclear. To determine the mechanism by which FACT regulates stem cell identity, we used the auxin-inducible degron system to deplete murine embryonic stem cells of FACT subunit SPT16 and subjected depleted cells to genome-wide factor localization, nascent transcription analyses, and genome-wide nucleosome profiling. Inducible depletion of SPT16 reveals a critical role in regulating targets of the master regulators of pluripotency: OCT4, KLF4, MYC, NANOG, and SOX2. Depletion of SPT16 leads to increased nucleosome occupancy at genomic loci occupied by these transcription factors, as well as gene-distal regulatory sites defined by DNaseI hypersensitivity. This heightened occupancy suggests a mechanism of nucleosome filling, wherein the sites typically maintained in an accessible state by FACT are occluded through loss of FACT-regulated nucleosome spacing. Of the transcription arising from gene-distal regions bound by these factors, 47% is directly dependent on FACT, and putative gene targets of these non-coding RNAs are highly enriched for pluripotency in pathway analyses. Upon FACT depletion, transcription of Pou5f1 (OCT4), Sox2, and Nanog is downregulated, suggesting that FACT not only co-regulates expression of the encoded proteins’ targets, but also the pluripotency factors themselves. We find that FACT maintains cellular pluripotency through a complex regulatory network of both coding and non-coding transcription.
Introduction
The process of transcription, or polymerase-driven conversion of a DNA template to RNA, is essential to all life and is highly regulated at all stages (reviewed in (Cramer, 2019; Kornberg and Lorch, 1999; Liu et al., 2013; Roeder, 2019)). A major barrier to transcription by RNA Polymerase II (RNAPII) is the presence of assembled nucleosomes occluding access to the DNA template (reviewed in (Kujirai and Kurumizaka, 2020; Kwak and Lis, 2013; Lorch and Kornberg, 2020; Lorch and Kornberg, 2017; Venkatesh and Workman, 2015)). A nucleosome consists of a tetramer of two copies each of histones H3 and H4 and two H2A-H2B heterodimers, which together form the histone octamer, with ~147 base pairs of DNA wrapped around it (Lorch and Kornberg, 2020; Luger et al., 1997). Nucleosomes are the basic unit that facilitates DNA compaction into a structure known as chromatin (Lorch and Kornberg, 2020; Luger et al., 1997). Chromatin is highly dynamic and carefully regulated to promote or repress expression of certain genes as dictated by cell signaling, environmental conditions, and master regulators of cell fate. The nucleosome can be altered through inclusion of histone variants and histone modifications (reviewed in (Henikoff and Ahmad, 2005; Kouzarides, 2007; Martire and Banaszynski, 2020)). Histone modifications are epigenetic post-translational marks that signify particular regions of chromatin; for example, trimethylation of histone H3 at lysine residue 4 (H3K4me3) is found at regions of active transcription, while acetylation of histone H3 at lysine 27 (H3K27ac) marks canonical active enhancers (reviewed in (Bannister and Kouzarides, 2011; Kouzarides, 2007; Marmorstein and Zhou, 2014)).
In addition to histone variants and histone modifications, chromatin regulation also comes in the form of chromatin-modifying enzymes, including nucleosome remodeling factors that translocate DNA to permit mobilization of nucleosomes to regulate accessibility, and histone chaperones, noncatalytic proteins that are responsible for adding and removing histone components, including both core histones and their variant substitutes (reviewed in (Avvakumov et al., 2011; De Koning et al., 2007; Hammond et al., 2017; Ransom et al., 2010; Venkatesh and Workman, 2015)). To create an RNA product, RNA Polymerase II (RNA Pol II) coordinates with these histone chaperones to overcome the physical hindrance of nucleosome-compacted DNA (reviewed in (Formosa, 2012; Hsieh et al., 2013; Kujirai and Kurumizaka, 2020; Kulaeva et al., 2013; Petesch and Lis, 2012)). RNAPII itself can facilitate this nucleosome disassembly (Ranjan et al., 2020), but the polymerase is often assisted by the various histone chaperones that can facilitate removal of H2A/H2B dimers (as well as other combinations of histone proteins) and subsequent replacement after the polymerase has passed (Fei et al., 2018; Lee et al., 2017; Liu et al., 2020; Wang et al., 2018). One prominent histone chaperone is the FAcilitates Chromatin Transcription (FACT) complex.
The mammalian FACT complex is a heterodimer composed of a catalytic subunit, Suppressor of Ty 16 homolog (SPT16) and an HMG-containing subunit that facilitates localization and DNA binding, Structure-Specific Recognition Protein 1 (SSRP1) (Belotserkovskaya et al., 2003; Liu et al., 2020; Orphanides et al., 1998; Orphanides et al., 1999). In S. cerevisiae, the system in which much FACT characterization has been done, Suppressor of Ty 16 (SPT16) forms a complex with two proteins, Nhp6 and Pob3, that fulfill the roles of SSRP1 (Brewster et al., 1998; 2001; Formosa et al., 2001; Orphanides et al., 1998; Orphanides et al., 1999; Wittmeyer and Formosa, 1997). FACT regulates passage through the nucleosomal roadblock for both RNAPII and replication machinery (Abe et al., 2011; Belotserkovskaya et al., 2003; Belotserkovskaya et al., 2004; Formosa, 2008; 2012; Formosa and Winston, 2020; Hsieh et al., 2013; Orphanides et al., 1998; Orphanides et al., 1999; Tan et al., 2006; Tettey et al., 2019). Given these dual roles in transcription and DNA replication, FACT has been thought to be crucial for cell growth and proliferation (Abe et al., 2011; Belotserkovskaya et al., 2004; Formosa et al., 2001; Garcia et al., 2011; Hertel L., 1999; Orphanides et al., 1999; Tan et al., 2006). More recent data has shown that while FACT is not required for cell growth in most healthy adult cell types, FACT is highly involved in cancer-driven cell proliferation as a dependency specific to cancerous cells; this dependency has been targeted using a class of FACT inhibitors known as curaxins, with promising results in anticancer drug treatment studies (Chang et al., 2019; Chang et al., 2018; Garcia et al., 2011; Gasparian et al., 2011; Kantidze et al., 2019). Curaxins inhibit FACT through a trapping mechanism whereby FACT is redistributed away from transcribed regions to other genomic loci, while the complex tightly binds to nucleosomes and cannot be easily removed (Chang et al., 2018). 
While cancer cell proliferation is FACT-dependent, FACT expression is nearly undetectable in most non-cancerous adult mammalian tissues; indeed, FACT appears to be dispensable for cell viability and growth in most non-cancerous and differentiated cell types (Garcia et al., 2011; Garcia et al., 2013; Safina et al., 2013). Formosa and Winston have recently suggested a unifying model for FACT action wherein cellular FACT dependency results from chromatin disruption and tolerance of DNA packaging defects within the cell (Formosa and Winston, 2020).
While FACT did not initially seem essential for cell proliferation outside of the context of cancer, more recent work has demonstrated heightened FACT expression and novel requirement in undifferentiated (stem) cells (Garcia et al., 2011; Garcia et al., 2013; Kolundzic et al., 2018; Mylonas and Tessarz, 2018; Shen et al., 2018). Stem cell chromatin is highly regulated by well-characterized features, including a largely accessible chromatin landscape and bivalent chromatin, which is epigenetically decorated with both active (e.g., H3K4me3) and repressive (e.g., H3K27me3) modifications (Azuara et al., 2006; Bernstein et al., 2006; de Dieuleveult et al., 2016; Harikumar and Meshorer, 2015; Klein and Hainer, 2020; Meshorer and Misteli, 2006; Vastenhouw and Schier, 2012; Voigt et al., 2013; Young, 2011). Embryonic stem (ES) cells specifically regulate their chromatin to prevent differentiation from occurring until appropriate, thereby preserving their pluripotent state. Pluripotency, or the capacity to mature into any cell type in an adult organism, is maintained by a suite of master regulators that work to repress differentiation-associated genes and maintain expression of genes that promote this pluripotent state, including the well-studied transcription factors OCT4, SOX2, KLF4, MYC, and NANOG, often referred to as master regulators of pluripotency (Chambers et al., 2003; Ding et al., 2012; Hall et al., 2009; Kim et al., 2018; Klein and Hainer, 2020; Masui et al., 2007; Mitsui et al., 2003; Pardo et al., 2010; Romito and Cobellis, 2016). While the main functions of the master regulators are to maintain pluripotency and prevent improper differentiation through regulation of gene expression, a majority of their chromatin binding sites are to gene-distal genomic regions, likely performing important regulatory functions at these locations (Lodato et al., 2013). 
These transcription factors, along with chromatin modifiers, form the foundation of gene regulation and provide a molecular basis for pluripotency. FACT has been shown to interact with several pluripotency- and development-associated factors, including OCT4 (Ding et al., 2012; Pardo et al., 2010), WNT (Hossan et al., 2016), and NOTCH (Espanola et al., 2020); in particular, affinity mass spectrometry has demonstrated a direct, physical interaction between FACT and OCT4 (Ding et al., 2012; Pardo et al., 2010). In addition, FACT has recently been functionally implicated in maintaining stem cells in their undifferentiated state (Kolundzic et al., 2018; Mylonas and Tessarz, 2018; Shen et al., 2018). Indeed, FACT depletion by SSRP1 shRNA knockdown led to faster differentiation into neuronal precursor cells, along with increased expression of genes involved in neural development and embryogenesis (Mylonas and Tessarz, 2018). In both C. elegans and murine embryonic fibroblasts (MEFs), FACT was shown to impede transition between pluripotent and differentiated states; in C. elegans, FACT was identified as a barrier to cellular reprogramming of germ cells into neuronal precursors, while in MEFs, FACT inhibition prevented reprogramming to induced pluripotent stem cells (Kolundzic et al., 2018; Shen et al., 2018). These experiments have confirmed a dependency for FACT in pluripotent cells that is not found in differentiated fibroblasts (Kolundzic et al., 2018; Shen et al., 2018). While these data establish FACT as essential in pluripotent cells, the mechanism through which FACT acts within undifferentiated cells to maintain their state is currently unclear. Interestingly, SSRP1-knockout murine ES cells are viable and show no effect on expression of the pluripotency factor OCT4 (Chen et al., 2020); however, conditional knockout of SSRP1 in mice is lethal due to a loss of progenitor cells resulting in hematopoietic and intestinal failures (Goswami et al., 2021).
These disparities may be related to described FACT-independent roles of SSRP1 (Li et al., 2007; Marciano et al., 2018) but nonetheless highlight inconsistencies regarding the role of FACT in pluripotent cells.
Here, we establish a molecular mechanism by which the FACT complex maintains pluripotency in murine ES cells using auxin-inducible degron-tagged SPT16 for proteasomal degradation coupled with genomic and transcriptomic techniques. As the majority of OCT4 binding occurs at gene-distal regulatory sites, we sought to determine whether FACT may regulate OCT4, SOX2, and NANOG, along with their regulatory targets, at non-genic locations (Lodato et al., 2013). We identify frequent co-regulation between FACT and master regulators of pluripotency and altered nucleosome positioning following FACT depletion. Furthermore, we identify extensive regulation of non-coding transcription by the FACT complex. SPT16 binding is highly enriched at putative enhancers, and transcription of enhancer RNAs (eRNAs) from 42% of putative enhancers is altered upon FACT depletion, including eRNAs transcribed from enhancers of Pou5f1, Sox2, and Nanog.
Results
Inducible depletion of the FACT complex triggers a loss of pluripotency
To determine whether FACT regulation of pluripotency-associated genes is critical to stem cell identity, we performed proteasomal degradation of the FACT subunit SPT16 via the auxin-inducible degron (AID) system (Fig. 1A). Briefly, we used Cas9-directed homologous recombination to insert a 3XV5 tag and a modified 39 amino acid AID tag (based on the AID47 and AID* sequences (Brosh et al., 2016; Morawska and Ulrich, 2013)) at the C-terminus of endogenous Supt16 in ES cells that have osTIR1 already integrated within the genome (see Methods) (Baker et al., 2016). Throughout the following described experiments, the osTIR1 cell line, without any AID-tagged proteins, is used as the control cell line (hereafter referred to as “Untagged”). Following 24 hours of treatment with 3-IAA, SPT16 protein levels were effectively reduced by proteasomal degradation relative to the vehicle treatment control (EtOH; Fig. 1B, Fig. S1A), whereas 6-hour treatment produced modest to no reduction in SPT16 levels. We note that, as previously established, depletion of SPT16 triggers a corresponding loss of expression of SSRP1 protein (Fig. 1B, Fig. S1B) (Safina et al., 2013). Consistent with a role for FACT in stem cell identity and viability, within 24 hours, ES cell colonies began to show phenotypic changes indicative of cellular differentiation, including a loss of alkaline phosphatase activity and morphological changes (Fig. 1C, Fig. S1C). This phenotypic change was most apparent between 24 and 48 hours of FACT depletion; however, most cells could not survive 48 hours of FACT depletion. While it was previously suggested that FACT requirement in stem cells is a result of cellular stress induced by trypsinization (Shen et al., 2018), we note that cells had been left undisturbed for 48 hours prior to protein depletion, suggesting that trypsinization is unrelated to the differentiation defect or the requirement for FACT.
To confirm that FACT requirement is a condition of ES cell state, and not culture conditions, we conducted a growth assay in which cells were undisturbed for 72 hours before being treated with either 3-IAA or EtOH over a 24-hour timecourse. Indeed, these experiments showed cell death (~50%) after 24 hours of FACT depletion that did not occur in vehicle-treated or untagged cells (Fig. S1D).
The FACT complex is enriched at pluripotency factor binding sites
To determine where FACT is acting throughout the genome, we performed the chromatin profiling technique CUT&RUN on the V5-tagged SPT16 protein (Skene and Henikoff, 2017). Importantly, attempts at profiling SPT16 or SSRP1 localization with antibodies targeting the proteins directly were not successful in our hands. SPT16-V5 CUT&RUN recapitulates some known FACT binding trends, including those identified through ChIP-seq (Fig. 2A). However, CUT&RUN also provides heightened sensitivity, allowing for higher resolution profiling and investigation of FACT binding (Hainer et al., 2019; Hainer and Fazzio, 2019; Meers et al., 2019a; Skene and Henikoff, 2017). Individual SPT16-V5 CUT&RUN replicates display a higher Pearson correlation than FACT ChIP-seq data, suggesting greater replicability (Fig. S2A). Overall, FACT ChIP-seq data and SPT16-V5 CUT&RUN data are generally in agreement at peaks called from the orthogonal dataset (Fig. 2A, Fig. S2B). In both the SPT16-V5 CUT&RUN data and FACT subunit ChIP-seq, we see strong complex binding at the pluripotency-regulating genes Nanog and Sox2 (Fig. 2B). We compared peaks called from CUT&RUN data using SEACR and ChIP-seq data using HOMER and identified generally similar patterns of localization to genomic features (Fig. 2C) (Heinz et al., 2010; Meers et al., 2019b). We identified 18,910 nonunique peaks called from SPT16-V5 CUT&RUN data, 112,781 nonunique peaks from SSRP1 ChIP-seq data, and 51,827 nonunique peaks from SPT16 ChIP-seq data. CUT&RUN data were more enriched at promoters and unclassified regions, while ChIP-seq datasets were more enriched at repetitive regions and intergenic regions (Fig. 2D).
While we note that more peaks were called from both ChIP-seq datasets, we caution against interpreting raw peak numbers due to greatly differing sequencing depth and false discovery rates employed by the respective peak-calling algorithms.
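The replicate comparison mentioned above rests on a Pearson correlation between replicate signal tracks. As a minimal sketch of that calculation, assuming each replicate has already been summarized as read counts over the same genomic bins (the binning scheme and the toy counts below are illustrative assumptions, not values from this study):

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of binned signal."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2 bins)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Covariance and variances (unnormalized; the 1/n factors cancel).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical read counts over five shared bins for two replicates.
replicate_1 = [12, 40, 7, 95, 3]
replicate_2 = [10, 44, 9, 90, 5]
print(round(pearson(replicate_1, replicate_2), 3))
```

Because Pearson correlation is sensitive to a few very high-signal bins, genome-wide comparisons are often run on log-transformed counts; whether that was done here is not stated in the text.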
Having identified FACT binding sites, we subjected genic peaks called from CUT&RUN data to Gene Ontology (GO) term analysis (Fig. 2D) (Zhou et al., 2019). GO term analysis identified numerous pluripotency- and development-associated pathways. To assess this association in an orthogonal way, we performed sequence motif analysis of CUT&RUN peaks using HOMER (Fig. 2E) (Heinz et al., 2010). The top three most enriched sequence motifs were those recognized by the transcription factors SOX2, KLF5, and OCT4-SOX2-TCF-NANOG, all of which regulate cellular pluripotency or differentiation (Bourillot and Savatier, 2010; Chambers et al., 2003; Hall et al., 2009; Klein and Hainer, 2020; Masui et al., 2007; Mitsui et al., 2003; Pardo et al., 2010). Together, these results suggest that FACT is key in maintaining pluripotency of mES cells through coordinated co-regulation of target genes with the master regulators of pluripotency.
FACT regulates expression of the master regulators of pluripotency as well as their targets
While we established co-regulation of pluripotency-associated targets by chromatin binding, it remained unclear whether FACT directly regulates the expression of the master regulators of pluripotency themselves. We therefore performed nascent RNA-sequencing (TT-seq) following depletion of SPT16 for a direct readout of FACT’s effects on transcription of these regulators. FACT depletion after 24 hours of 3-IAA treatment significantly altered the expression of 12,992 annotated genes, displaying both derepression (34% up, 6,783) and impaired maintenance of target genes (31% down, 6,209). Significantly downregulated transcripts include those encoding OCT4 (Pou5f1), SOX2, NANOG, and KLF4, while transcription elongation factors were upregulated, such as subunits of the Polymerase-Associated Factors (PAF1) complex, the DRB Sensitivity Inducing Factor (DSIF) member SPT4A, and the histone chaperone SPT6 (Fig. 3A-B, Fig. S3A-B). Intriguingly, SPT6 has been shown to maintain mES cell pluripotency through Polycomb opposition and regulation of superenhancers (Wang et al., 2017). Heightened expression of transcription elongation factors may be the result of a compensatory mechanism through which FACT-depleted cells attempt to overcome this deficiency, or the result of direct repression of these factors by FACT.
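The split of significantly altered genes into derepressed and downregulated classes can be sketched as follows. This is an illustration only: it assumes a per-gene table of log2 fold changes (3-IAA versus vehicle) and adjusted p-values, and it assumes a significance threshold of padj < 0.05 (the cutoff stated later for PROMPTs; the cutoff used for genes is not given in this section). The gene names and values are hypothetical.

```python
from collections import Counter

def classify_gene(log2_fold_change, padj, alpha=0.05):
    """Label one gene 'up', 'down', or 'unchanged' after FACT depletion."""
    # Genes without a usable adjusted p-value are treated as unchanged.
    if padj is None or padj >= alpha:
        return "unchanged"
    if log2_fold_change > 0:
        return "up"
    if log2_fold_change < 0:
        return "down"
    return "unchanged"

def summarize(results, alpha=0.05):
    """Return {label: (count, percent of all tested genes)}."""
    counts = Counter(classify_gene(lfc, padj, alpha) for lfc, padj in results.values())
    total = len(results)
    return {label: (counts[label], 100.0 * counts[label] / total)
            for label in ("up", "down", "unchanged")}

# Hypothetical results: gene -> (log2 fold change, adjusted p-value).
toy_results = {
    "geneA": (1.8, 0.001),
    "geneB": (-2.3, 0.0004),
    "geneC": (0.2, 0.6),
    "geneD": (0.9, 0.03),
}
print(summarize(toy_results))
```

Applied to the counts reported above, the two significant classes sum as expected: 6,783 upregulated plus 6,209 downregulated genes gives the 12,992 altered genes.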
To determine whether the reduced transcription of pluripotency factors was due to FACT action or another mechanism of cellular differentiation, we treated cells with 3-IAA for 3 or 6 hours to deplete cells of FACT protein more acutely, prior to morphological indicators of cellular differentiation, and performed RT-qPCR (Fig. 3C, Fig. S3C). Importantly, FACT protein levels are only modestly reduced at these time points (Fig. S1A), and transcript levels are largely unaltered. Furthermore, expression of pluripotency regulators was not affected, suggesting that moderate levels of FACT protein are sufficient to sustain pluripotency. Between 6 and 24 hours of 3-IAA treatment, however, cells begin to differentiate, and transcription of pluripotency factors is severely reduced (Fig. 1C, Fig. S1C, Fig. 3A-B).
FACT co-regulates targets of master pluripotency regulators
Having established that FACT regulates expression of the master pluripotency regulators and their targets, we next sought to identify whether this regulation occurs at the genes themselves, or at distal regulatory elements. As a majority of OCT4, SOX2, and NANOG binding sites are gene-distal (Lodato et al., 2013) and FACT subunit ChIP-seq correlates poorly with genes that change expression upon SSRP1 knockdown (Mylonas and Tessarz, 2018), we hypothesized that FACT may also bind at gene-distal regulatory sites. Indeed, both SPT16-V5 CUT&RUN and previously published FACT subunit ChIP-seq (Mylonas and Tessarz, 2018) show strong occupancy over gene-distal OCT4 ChIP-seq peaks, suggesting co-regulation of pluripotency factor targets (Fig. 3D). As FACT is a general elongation factor, and many gene-distal elements have transcription initiating from within the element (such as enhancers), we wanted to determine whether pluripotency factors are also enriched at FACT binding sites. We therefore visualized published OCT4, SOX2, and NANOG ChIP-seq data (Marson et al., 2008) over SPT16-V5 CUT&RUN peaks (Fig. 3E). All three pluripotency factors display extensive binding at SPT16-V5 binding sites, supporting the idea of FACT and pluripotency factor co-regulation.
Finally, as there is a known interaction between OCT4 and acetylation of histone H3 at lysine 56 (H3K56ac) (Tan et al., 2013; Xie et al., 2009), we hypothesized that FACT binding may correlate with H3K56ac. In support of this hypothesis, FACT and H3K56ac are known to interact in S. cerevisiae (McCullough et al., 2019). As such, we examined whether this interaction is conserved in mES cells, although H3K56ac occurs at less than 1% of total H3 loci in mammalian cells. We plotted published H3K56ac ChIP-seq data over SPT16-V5 CUT&RUN peaks (Fig. 3F; ChIP-seq data from GSE47387 (Tan et al., 2013)). While H3K56ac does not appear enriched directly at FACT binding sites, the mark is highly enriched in flanking regions, particularly on directly adjacent histones. The association between FACT and H3K56ac further highlights FACT’s role in pluripotency maintenance, given the known interplay between OCT4 and H3K56ac.
FACT depletion moderately disrupts nucleosome positioning genome-wide
As FACT’s biochemical function is exchange of histone H2A/H2B dimers, we hypothesized that FACT may maintain pluripotency via appropriate nucleosome occupancy and positioning, including at the gene-distal sites where OCT4, SOX2, and NANOG frequently bind. To address nucleosome positioning directly, we performed micrococcal nuclease digestion followed by deep sequencing (MNase-seq) following FACT depletion after 24 hours of 3-IAA treatment. MNase-seq results suggest a consistent mechanism of nucleosome-filling at FACT-bound regulatory regions genome-wide. Visualizing MNase-seq data at peaks called from SPT16-V5 CUT&RUN, we observe an increase in nucleosome occupancy directly over SPT16-V5 peaks following SPT16 depletion (Fig. 4A). At OCT4 binding sites, we observe an increase in nucleosome occupancy following FACT depletion (Fig. 4B-C). Intriguingly, this mechanism of nucleosome filling is not restricted to genic FACT-binding sites; at TSS-distal DNaseI hypersensitive sites (DHSs), used as a proxy for gene-distal regulatory regions, a similar phenomenon of nucleosome filling occurs (Fig. 4D-E). Merged technical replicates of non-differential MNase-seq data are plotted separately for each condition and biological replicate over TSS-distal DHSs in Fig. S4.
To further characterize the effects of FACT depletion on gene-distal nucleosome regulation, we classified FACT binding at a number of features defining regulatory regions, including H3K27ac ChIP-seq peaks (Fig. 5A), H3K4me1 ChIP-seq peaks (Fig. 5B), gene-distal DHSs (Fig. 5C), and H3K56ac ChIP-seq sites (Fig. 5D). At each of these sites marking putative regulatory regions (typically enhancers), FACT is bound, according to both SPT16-V5 CUT&RUN data and FACT subunit ChIP-seq data. To confirm that FACT is present at putative enhancers, we defined DHSs that were also decorated by either H3K27ac or H3K4me1, two putative enhancer marks, and visualized FACT localization profiling at these sites (Fig. 5E). Indeed, both CUT&RUN and previously published ChIP-seq showed enrichment of FACT binding at putative enhancers. Although FACT binds many regulatory regions marked by DHSs, we note that FACT binding is not enriched at putative silencers, defined by the presence of a TSS-distal DHS and an H3K27me3 ChIP-seq peak (Fig. S5). To determine whether FACT binding contributes to nucleosome maintenance at gene-distal DHSs, we examined nucleosome occupancy over FACT-bound gene-distal DHSs (Fig. 5F). Upon 24-hour 3-IAA treatment for FACT depletion, we see a marked increase in nucleosome occupancy directly over the DHS, suggesting a possible mechanism of nucleosome-filling, wherein FACT is typically responsible for maintaining accessible chromatin at gene-distal regulatory elements. Genome-wide, however, FACT-mediated nucleosome disruption is relatively minor, and is likely tied directly to transcription by RNA Pol II, as is the case in S. cerevisiae (Feng et al., 2016); this disruption may itself be the result of transcription by RNA Pol II that cannot be repaired by FACT, as recent work has suggested (Farnung et al., 2021; Formosa and Winston, 2020; Liu et al., 2020).
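The definition of putative enhancers used above (a DHS that overlaps either an H3K27ac or an H3K4me1 peak) amounts to an interval-overlap filter. A minimal sketch, assuming peaks are represented as (chromosome, start, end) tuples with half-open coordinates; the function names and toy coordinates are hypothetical, and the simple pairwise scan stands in for whatever interval tool was actually used:

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def dhs_with_any_mark(dhs_sites, *mark_peak_sets):
    """Keep DHSs that overlap a peak from at least one of the given mark sets."""
    all_peaks = [peak for peaks in mark_peak_sets for peak in peaks]
    return [dhs for dhs in dhs_sites
            if any(overlaps(dhs, peak) for peak in all_peaks)]

# Hypothetical gene-distal DHSs and histone-mark peaks.
dhs_sites = [("chr1", 1000, 1200), ("chr1", 5000, 5200), ("chr2", 300, 500)]
h3k27ac_peaks = [("chr1", 1100, 1600)]
h3k4me1_peaks = [("chr2", 100, 350)]
h3k27me3_peaks = [("chr1", 4900, 5100)]

putative_enhancers = dhs_with_any_mark(dhs_sites, h3k27ac_peaks, h3k4me1_peaks)
putative_silencers = dhs_with_any_mark(dhs_sites, h3k27me3_peaks)
print(putative_enhancers)
print(putative_silencers)
```

The same helper covers the putative silencer definition (a TSS-distal DHS overlapping an H3K27me3 peak). The pairwise scan is quadratic, so at the scale of tens of thousands of DHSs a sorted sweep or a dedicated interval library would be the practical choice.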
FACT depletion alters non-coding transcription at gene-distal regulatory sites
While FACT binding is strongly enriched at many promoters of genes displaying expression changes following FACT depletion but not at unchanged genes, there are still other promoters of genes with altered expression following FACT depletion that do not appear to be bound by FACT (Fig. S6A-F); as such, FACT may maintain or repress expression of these target genes through gene-distal regulatory elements. As gene-distal DHSs are often sites of non-protein-coding transcription, including enhancers where enhancer RNAs (eRNAs) are produced, we sought to determine whether this mechanism of accessibility maintenance by FACT may regulate non-coding transcription known to arise from these regions, specifically focusing on enhancers (reviewed in (Kaikkonen and Adelman, 2018; Li et al., 2016; Patty and Hainer, 2020)). Out of 70,586 putative regulatory regions (defined as gene-distal DNaseI hypersensitive sites), 57,954 were sites of nascent transcription detected in our TT-seq datasets, the majority of which are likely to encode eRNAs. In analyzing our TT-seq data after FACT depletion, we identified 15,410 FACT-regulated ncRNAs (26.6% of transcribed regions), with more ncRNAs derepressed (16%) than repressed (10.6%) by FACT depletion. Taking only the ncRNAs transcribed from regions marked by both a DHS and either H3K4me1 or H3K27ac as putative eRNAs, we identified 14,889 transcripts, with 22% of putative eRNAs derepressed and 20% stimulated upon FACT depletion. Since the majority of OCT4 binding sites are gene-distal, and because FACT binds at both gene-distal DHSs (Fig. 5C) and gene-distal OCT4 binding sites (Fig. 3D) and regulates chromatin accessibility at these regions (Fig. 5E), we sought to determine whether FACT regulates these ncRNAs as a possible means of pluripotency maintenance.
Therefore, to examine trends at well-defined enhancers of pluripotency factors, we determined nascent transcription from previously annotated superenhancers known to be marked by eRNA transcription (Blinka et al., 2016; Li et al., 2014; Whyte et al., 2013) (Fig. 6A, Fig. S7A-B). Assuming that each ncRNA is paired with (and potentially regulates) its nearest gene, we performed pathway analysis on putative ncRNA regulatory targets (Fig. 6B). Among the most significantly enriched categories for putative targets of upregulated ncRNAs were mechanisms associated with pluripotency, white fat cell differentiation, and WNT signaling, while putative targets of downregulated ncRNAs were enriched for pluripotency networks, TGF-ß signaling, and WNT signaling (Fig. 6B). In line with effects on coding genes, FACT appears to have both stimulatory and repressive roles on ncRNAs in close proximity to genes associated with pluripotency.
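The nearest-gene assumption used for the pathway analysis above can be sketched as a lookup of the closest annotated TSS on the same chromosome. This is an illustration under stated assumptions: each ncRNA is reduced to a single midpoint coordinate, distance is measured to the TSS only, and the gene names and positions are hypothetical.

```python
import bisect

def nearest_gene(chrom, midpoint, tss_by_chrom):
    """Return the gene whose TSS is closest to an ncRNA midpoint.

    tss_by_chrom maps chromosome -> list of (tss_position, gene_name),
    sorted by position. Returns None if the chromosome has no genes.
    """
    tss_list = tss_by_chrom.get(chrom, [])
    if not tss_list:
        return None
    positions = [pos for pos, _ in tss_list]
    # Index of the first TSS at or downstream of the midpoint.
    i = bisect.bisect_left(positions, midpoint)
    # Only the flanking TSSs on either side can be the nearest.
    candidates = tss_list[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda entry: abs(entry[0] - midpoint))[1]

# Hypothetical annotation: sorted TSS positions per chromosome.
tss_by_chrom = {
    "chr1": [(1_000, "GeneA"), (50_000, "GeneB"), (120_000, "GeneC")],
}
print(nearest_gene("chr1", 48_500, tss_by_chrom))
print(nearest_gene("chr1", 90_000, tss_by_chrom))
```

Nearest-gene pairing is a heuristic: enhancers can act over long distances and skip intervening genes, which is why the targets are described as putative.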
To determine whether FACT depletion may stimulate transcription from all regulatory elements marked by DHSs, we examined FACT binding and transcription from putative silencers (defined by gene-distal DHSs that overlap H3K27me3 ChIP-seq peaks). FACT does not appear to be capable of stimulating transcription from putative silencers, as there is no discernable enrichment for FACT binding, nor is there an increase in transcription from these regions following FACT depletion (Fig. S5).
We next sought to identify putative regulation by FACT of genes via proximal regulatory elements, specifically promoter upstream transcripts (PROMPTs). PROMPTs were identified by genomic location (within 1 kb of an annotated TSS and transcribed divergently to the mRNA); 5,522 PROMPTs were significantly altered by FACT depletion out of 23,257 expressed putative PROMPTs (padj < 0.05). More PROMPTs were repressed by FACT than stimulated, with 14% significantly increasing (3,345) and 9.4% significantly decreasing (2,177). Regulation of approximately 24% of putative PROMPTs remains in line with known roles for transcriptional regulation by FACT, and repression of PROMPTs is consistent with FACT’s known role in preventing cryptic transcription in S. cerevisiae (Jeronimo et al., 2015; Mason and Struhl, 2003).
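The positional definition of a PROMPT given above (within 1 kb of an annotated TSS and on the opposite strand to the mRNA) can be written as a simple predicate, followed by the arithmetic behind the reported fractions. This is a sketch: the record layout and field names are hypothetical, and only the two stated criteria are checked, with no additional test that the transcript lies upstream of the TSS.

```python
def is_putative_prompt(transcript, gene, window=1000):
    """True if the transcript starts within `window` bp of the gene's TSS
    and is transcribed from the opposite strand."""
    if transcript["chrom"] != gene["chrom"]:
        return False
    if transcript["strand"] == gene["strand"]:
        return False  # same strand as the mRNA, so not divergent
    return abs(transcript["five_prime_end"] - gene["tss"]) <= window

def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

# Hypothetical gene and candidate transcripts.
gene = {"chrom": "chr3", "strand": "+", "tss": 20_000}
divergent = {"chrom": "chr3", "strand": "-", "five_prime_end": 19_700}
same_strand = {"chrom": "chr3", "strand": "+", "five_prime_end": 19_700}
too_far = {"chrom": "chr3", "strand": "-", "five_prime_end": 17_500}
print([is_putative_prompt(t, gene) for t in (divergent, same_strand, too_far)])

# Counts reported in the text.
expressed, increased, decreased = 23_257, 3_345, 2_177
print(percent(increased + decreased, expressed))
print(percent(increased, expressed), percent(decreased, expressed))
```

The counts are internally consistent: 3,345 increased plus 2,177 decreased gives the 5,522 altered PROMPTs, which is about 23.7% of the 23,257 expressed putative PROMPTs.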
In sum, FACT displays both repressive and permissive effects on transcription arising from genes and gene-distal regulatory regions (Fig. 3B, Fig. 6B-E). While FACT stimulates and impedes transcription through direct action at some gene promoters, a large class of genes with FACT-regulated transcription are not bound by FACT, suggesting gene-distal regulatory mechanisms (Fig. S6A-F). Given the overlap between FACT binding and various enhancer-associated histone modifications (e.g., H3K27ac, H3K4me1, H3K56ac; Fig. 5A), it is likely that this gene-distal regulation occurs predominantly through association with enhancers of FACT-regulated genes. Among the most affected classes of FACT-regulated genes are those that regulate pluripotency and stem cell identity (Fig. 3A-B, Fig. 3D). Expression of these pluripotency factors is regulated by enhancers and superenhancers; as eRNA transcription from these gene-distal regulatory regions is compromised following FACT depletion (Fig. 6A, Fig. S7A-B), the mechanism through which FACT regulates stem cell pluripotency appears to depend on these enhancers.
Discussion
FACT is an essential regulator of stem cell pluripotency
The role for FACT in pluripotent cells has drawn recent interest but remained mechanistically unclear. Here we provide an analysis of FACT function in murine embryonic stem (mES) cells. Using a combination of genome-wide localization, transcriptomic, and nucleosome profiling methods, our data indicate that FACT regulates pluripotency factors through maintenance of master pluripotency regulators themselves and through gene-distal mechanisms. Given the genomic loci at which FACT binds and the effects of FACT depletion on their transcription, FACT likely performs dual roles in transcriptional regulation: facilitation of pluripotency through both coding and non-coding pluripotency-promoting elements, and repression of differentiation-promoting elements. Based on these data, we propose a model where FACT maintains paused RNA Pol II at transcribed regions to repress transcription of differentiation-associated genes and non-coding RNAs that may themselves repress pluripotency factors (Fig. 7). Simultaneously, FACT maintains expression of pluripotency factors, through both genic (RNA Pol II pause release) and gene-distal (enhancer) mechanisms. FACT tends to repress transcription of both coding and non-coding elements at approximately 1.5 times the amount the complex stimulates transcription of coding and non-coding elements. The amounts of FACT-dependent mRNA transcription (both stimulated and repressed) are largely consistent between our data and experiments performed in S. cerevisiae (Feng et al., 2016), mES cell lines (Chen et al., 2020), and in a mouse model (Goswami et al., 2021).
FACT regulates gene-distal DNaseI hypersensitive sites to alter transcription arising from regulatory elements
Elucidating a mechanism of FACT action remains complicated by the duality of the complex’s roles: at some loci, FACT represses transcription of regulatory elements, while at others it promotes transcription of regulatory elements and thereby of their genic targets (Fig. 6B, D). Indeed, FACT’s role at gene-distal regulatory elements seems to mirror the complex’s role at genic regions: facilitating removal of nucleosomes to maintain expression when necessary, and reconstruction of nucleosomes to limit expression. While our data indicate that FACT’s more prominent role at gene-distal DHSs is repression of transcription, the complex both facilitates and impedes coding and non-coding transcription, including through direct mechanisms (Fig. S2B). The classes of RNAs regulated by FACT do not appear to be categorized solely by ES cell requirement, however, as GO-term analysis identified many distinct pathways among the most enriched for each class of RNA (Fig. 6A-C).
FACT likely either occludes binding sites for master regulators or represses transcription arising following action by these regulators
It is tantalizing to speculate that FACT must maintain accessible chromatin for interaction by the master regulators of pluripotency themselves; however, established pioneering activity by OCT4 and SOX2 suggests that the master regulators are not entirely dependent on FACT action (Dodonova et al., 2020; Michael et al., 2020; Soufi et al., 2015; Tan and Takada, 2020). FACT depletion has been shown to redistribute histone marks in D. melanogaster and S. cerevisiae; therefore, disruption of pluripotency-relevant histone marks (e.g., H3K56ac) may be one mechanism through which pluripotency maintenance is affected in FACT-depleted cells (Ding et al., 2012; Jeronimo et al., 2019; Pardo et al., 2010; Tan et al., 2013; Tettey et al., 2019; Xie et al., 2009). Furthermore, acetylation is increased at nucleosomes predicted to stall RNA Pol II (Martin et al., 2021), and therefore altered transcription-associated disruption caused by FACT depletion may be responsible for histone modification shuffling. This shuffling of histone modifications likely disrupts recruitment of factors that maintain gene expression by sensing histone marks (e.g., recognition of methylated lysine residues on histones by CHD1 and CHD2); this disrupted factor recruitment and retention may explain many reductions in transcript abundance following FACT depletion. As FACT binding correlates with CHD1, CHD2, and gene expression (Mylonas and Tessarz, 2018) and may remove CHD1 from partially unraveled nucleosomes (Farnung et al., 2021; Jeronimo et al., 2020), CHD1 may also become trapped on chromatin without FACT-dependent displacement, thereby reducing expression of target genes.
FACT-mediated transcriptional repression may be due to loss of RNA Pol II pausing
RNA Pol II pausing is a phenomenon that occurs at the promoters of coding genes, as well as at eRNAs and PROMPTs (Gressel et al., 2019; Henriques et al., 2018; Tettey et al., 2019). FACT has been shown to maintain pausing of RNA Pol II at coding promoters (Tettey et al., 2019), and therefore a plausible model emerges through which FACT represses transcription from these regions by maintaining RNA Pol II pausing to silence improper transcription. Given the enrichment of pluripotency- and differentiation-associated pathways found for the putative targets of these non-coding elements, this RNA Pol II pausing-mediated silencing may be the mechanism through which FACT prevents changes in cellular identity (i.e., reprogramming to iPSCs from fibroblasts) (Kolundzic et al., 2018; Mylonas and Tessarz, 2018; Shen et al., 2018; Tettey et al., 2019).
As many groups have suggested, the act of transcription by RNA Pol II itself may be responsible for destabilization of nucleosomes, creating a genomic conflict for FACT to resolve (Farnung et al., 2021; Formosa and Winston, 2020; Goswami et al., 2021; Jeronimo et al., 2020; Liu et al., 2020); as FACT interaction with the nucleosome is promoted by transcription in S. cerevisiae (Martin et al., 2018), transcription-promoted conflict resolution is a unifying mechanism of FACT action. With FACT depleted, this nucleosome destabilization likely compounds issues created by failure to maintain RNA Pol II pausing; it is likely that this combination of genome destabilization and failure to reassemble is responsible for the vast majority of derepressed transcription following FACT depletion. This model is further strengthened by a lack of FACT binding enrichment at putative silencers (Fig. S5A, S5C), and these regions do not display improper transcription after FACT depletion (Fig. S5B, D), suggesting that derepression by FACT depletion is not sufficient to induce transcription alone, but requires pre-initiated and paused RNA Pol II.
Together the work presented here supports prior studies and enhances our understanding of the mechanistic role for FACT in mammalian pluripotent systems. Future work should aim to address the interplay between FACT, pluripotency factors, and histone modifications (such as H3K56ac) and the potential redistribution of modifications in contributing to alteration in cis-regulatory elements when FACT is lost or altered in disease settings.
Author Contributions
D.C.K. and S.J.H. designed the study and wrote and edited the manuscript. D.C.K. performed most experiments. K.N.M. created the cell lines. S.M.L. performed CUT&RUN experiments. D.C.K. analyzed the data with assistance from S.J.H.
Declaration of interests
The authors declare no conflicting interests related to this project.
METHODS
Resource availability
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Sarah Hainer (sarah.hainer@pitt.edu).
Materials availability
Plasmids and cell lines generated in this study are available on request. All resources generated in this study must be acquired via a Material Transfer Agreement (MTA) granted by the University of Pittsburgh.
Data and code availability
This paper analyzes existing, publicly available data. These accession numbers for the datasets are listed throughout the manuscript. Any additional information required to analyze the data reported in this paper is available from the lead contact upon request.
METHOD DETAILS
Cell Lines
Mouse embryonic stem cells were derived from E14 (Hooper et al., 1987). Male E14 murine embryonic stem cells were grown in feeder-free conditions on 10 cm plates gelatinized with 0.2% porcine skin gelatin type A (Sigma) at 37°C and 5% CO2. Cells were cultured in Dulbecco’s Modified Eagle Medium (Gibco), supplemented with 10% Fetal Bovine Serum (Sigma, 18N103), 0.129 mM 2-mercaptoethanol (Acros Organics), 2 mM glutamine (Gibco), 1X nonessential amino acids (Gibco), 1000 U/mL Leukemia Inhibitory Factor (LIF), 3 μM CHIR99021 GSK inhibitor (p212121), and 1 μM PD0325901 MEK inhibitor (p212121). Cells were passaged every 48 hours using trypsin (Gibco) and split at a ratio of ~1:8 with fresh medium. Routine anti-mycoplasma cleaning was conducted (LookOut DNA Erase spray, Sigma) and cell lines were screened by PCR to confirm the absence of mycoplasma.
Auxin Inducible Degradation
Cell lines were constructed in an E14 murine ES cell line with osTIR1 already integrated into the genome. SPT16 was C-terminally tagged using a 39 amino acid mini-AID construct also containing a 3xV5 epitope tag (Kubota et al., 2013; Natsume et al., 2016; Nishimura et al., 2009; Nishimura and Kanemaki, 2014). Two homozygous isolated clones were generated using CRISPR-mediated homologous recombination with Hygromycin B drug selection and confirmed by PCR and Sanger Sequencing.
Cells were depleted of AID-tagged SPT16 protein by addition of 500 μM 3-Indole Acetic Acid (3-IAA, Sigma) dissolved in 100% EtOH and pre-mixed in fresh medium. Cells were incubated with 3-IAA or 0.1% EtOH (vehicle) for 24 hours to effectively deplete the FACT complex, and depletion was confirmed by Western blotting. Importantly, cells were cultured on 10 cm plates undisturbed for 48 hours prior to AID depletion, ensuring that relevant effects are not due to passaging-related disturbances.
Alkaline Phosphatase Staining
Cells were treated with EtOH or 3-IAA as described above, and alkaline phosphatase staining was performed after 6, 24, and 48 hours. Treated cells were washed twice in 1X Dulbecco’s Phosphate-Buffered Saline (DPBS, Gibco) and crosslinked in 1% formaldehyde (Fisher) in DPBS for five minutes at room temperature. Crosslinking was quenched with 500 mM glycine and cells were washed twice in 1X DPBS. Cells were stained with the VECTOR Red Alkaline Phosphatase Staining Kit (Vector Labs) per manufacturer’s instructions in a 200 mM Tris-Cl buffer, pH 8.4. Working solution (8 mL) was added to each 10 cm plate and incubated in the dark for 30 minutes before cells were washed with DPBS and imaged.
Western blotting
Western blotting was performed using a mouse monoclonal anti-V5 epitope antibody (Invitrogen 46-0705, lot 1923773), a mouse monoclonal anti-SSRP1 antibody (BioLegend 609702, lot B280320), and a mouse monoclonal anti-beta-actin loading control (Sigma). Secondary antibody incubations were performed with goat polyclonal antibodies against either rabbit or mouse IgG (BioRad 170-6515, lot, BioRad 170-6516, lot). Crude protein extractions were performed using RIPA buffer (150 mM NaCl, 1% NP-40 CA-630, 0.5% sodium deoxycholate, 0.1% sodium dodecyl sulfate, 25 mM Tris-Cl, pH 7.4) with freshly added protease inhibitors (Thermo Fisher), and extracts were flash-frozen immediately after extraction. Samples were quantitated using the Pierce BCA Protein Assay kit (Thermo Fisher). 20 μg of protein was diluted in RIPA buffer with 10 mM dithiothreitol (DTT) and Laemmli sample buffer before being loaded on 7.5% Tris-acrylamide gels for Western blotting. Proteins were transferred to nitrocellulose membranes (BioTrace) via a Criterion tank blotter (BioRad) at 100 V for one hour and stained with 0.5% Ponceau S (Sigma) in 1% acetic acid to confirm proper transfer. Membranes were blocked in 5% milk in PBST prior to overnight primary antibody incubation at 4°C. Membranes were then washed and incubated in secondary antibody (Bio-Rad) for one hour at room temperature, washed, and developed with SuperSignal West Pico chemiluminescent reagent (Thermo) for 5 minutes at room temperature.
CUT&RUN
CUT&RUN was performed as described (Hainer et al., 2019; Hainer and Fazzio, 2019; Patty and Hainer, 2021; Skene and Henikoff, 2017), using recombinant Protein A/Protein G-MNase (pA/G-MN) (Meers et al., 2019a). Briefly, 100,000 nuclei were isolated from cell populations using a hypotonic buffer (20 mM HEPES-KOH, pH 7.9, 10 mM KCl, 0.5 mM spermidine, 0.1% Triton X-100, 20% glycerol, freshly added protease inhibitors) and bound to lectin (Concanavalin A)-coated magnetic beads (200 μL bead slurry per 500,000 nuclei) (Polysciences). Immobilized nuclei were treated with EDTA-containing blocking buffer (20 mM HEPES, pH 7.5, 150 mM NaCl, 0.5 mM spermidine, 0.1% BSA, 2 mM EDTA, fresh protease inhibitors) and washed in wash buffer (20 mM HEPES, pH 7.5, 150 mM NaCl, 0.5 mM spermidine, 0.1% BSA, fresh protease inhibitors). Nuclei were incubated in wash buffer containing primary antibody (anti-V5 mouse monoclonal, Invitrogen 46-0705, lot 1923773) for one hour at room temperature with rotation, followed by incubation in wash buffer containing recombinant pA/G-MN for 30 minutes at room temperature with rotation. Controls lacking a primary antibody were subjected to the same conditions but incubated in wash buffer without antibody prior to incubation with pA/G-MN. Samples were equilibrated to 0°C and 3 mM CaCl2 was added to activate pA/G-MN cleavage. After suboptimal digestion for 15 minutes, the reaction was stopped by chelation with 20 mM EDTA and 4 mM EGTA, and 1.5 pg MNase-digested S. cerevisiae mononucleosomes were added as a spike-in control. Genomic fragments were released after an RNase A treatment. After released fragments were separated by centrifugation, the isolated fragments were used as input for a library build consisting of end repair and adenylation, NEBNext stem-loop adapter ligation, and subsequent purification with AMPure XP beads (Agencourt). Barcoded fragments were then amplified by 14 cycles of high-fidelity PCR and purified using AMPure XP.
Libraries were pooled and sequenced on an Illumina NextSeq500 to a depth of ~10 million mapped reads.
CUT&RUN data analysis
Paired-end fastq files were trimmed to 25 bp and mapped to the mm10 genome with bowtie2 (options -q -N 1 -X 1000) (Langmead and Salzberg, 2012). Mapped reads were duplicate-filtered using Picard (Picard Tools, Broad Institute) and filtered for mapping quality (MAPQ ≥ 10) using SAMtools (Li et al., 2009). Size classes corresponding to FACT footprints (1-120 bp) were generated using SAMtools (Li et al., 2009). Reads were converted to bigWig files using deepTools (options -bs 1 --normalizeUsing RPGC --effectiveGenomeSize 2862010578) (Ramirez et al., 2014), with common sequencing read contaminants filtered out according to ENCODE blacklisted sites for mm10. Heatmaps were generated using deepTools computeMatrix (options -a 2000 -b 2000 -bs 20 --sortRegions keep --missingDataAsZero) and plotHeatmap (options --sortRegions keep --colorMap coolwarm --dpi 300) (Ramirez et al., 2014). Peaks were called from CUT&RUN data using SEACR, a CUT&RUN-specific peak-calling algorithm, with relaxed stringency and with controls lacking primary antibody used in lieu of input data (Meers et al., 2019a). Motifs were then called from these peaks using HOMER with default settings (Heinz et al., 2010). Pathway analysis was performed on peaks present in at least 2/4 SPT16-V5 CUT&RUN experiments using HOMER and the WikiPathways database, then plotted in GraphPad Prism 10, with the y-axis representing rank of enrichment (Heinz et al., 2010).
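The RPGC option rescales coverage so that the genome-wide average is 1x. The following is a rough sketch of that arithmetic, following the definition given in the deepTools documentation. The function name and the example read count and fragment length are illustrative assumptions, not values taken from this pipeline.

```python
# Illustrative sketch of 1x (RPGC) coverage scaling; not the authors' code.
EFFECTIVE_GENOME_SIZE_MM10 = 2_862_010_578  # value passed to deepTools in the text


def rpgc_scale_factor(mapped_reads, fragment_length, effective_genome_size):
    """Return the factor that rescales raw coverage to 1x average depth."""
    if mapped_reads <= 0 or fragment_length <= 0 or effective_genome_size <= 0:
        raise ValueError("all inputs must be positive")
    # Average depth expected if fragments were spread evenly over the genome.
    average_depth = mapped_reads * fragment_length / effective_genome_size
    return 1.0 / average_depth


# Hypothetical example: 10 million mapped fragments of 100 bp each.
factor = rpgc_scale_factor(10_000_000, 100, EFFECTIVE_GENOME_SIZE_MM10)
print(round(factor, 3))
```

Multiplying each bin's raw coverage by this factor gives a track in which a value of 1 corresponds to the average expected coverage, which makes samples of different depth comparable.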
One-dimensional heatmaps were generated by the same pipeline for CUT&RUN and ChIP-seq data. Matrices generated using deepTools computeMatrix as above were averaged by position relative to the reference point using plotProfile with the option --outFileNameMatrix. Average position scores per technical replicate were then averaged together and translated to colorimetric scores using ggplot2.
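The two averaging steps behind a one-dimensional heatmap can be sketched in a few lines. This is a minimal illustration with hypothetical function names and toy numbers; in the pipeline described above the first step is performed by plotProfile, and the color mapping done in ggplot2 is not shown.

```python
# Sketch of the two averaging steps; names and data are hypothetical.

def mean_profile(matrix):
    """Collapse a regions x bins matrix to one mean score per bin."""
    n_regions = len(matrix)
    n_bins = len(matrix[0])
    return [sum(row[i] for row in matrix) / n_regions for i in range(n_bins)]


def average_replicates(profiles):
    """Average per-bin profiles across replicates (all of equal length)."""
    n_replicates = len(profiles)
    return [sum(values) / n_replicates for values in zip(*profiles)]


# Toy example: two replicates, each with two regions and three bins.
replicate_1 = mean_profile([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
replicate_2 = mean_profile([[0.0, 2.0, 4.0], [2.0, 2.0, 4.0]])
combined = average_replicates([replicate_1, replicate_2])
print(combined)
```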
Transient Transcriptome Sequencing
TT-seq was performed using a modified method (Dolken et al., 2008; Duffy et al., 2015; Radle et al., 2013; Schwalb et al., 2016). 500 mM 4sU (Carbosynth T4509) was dissolved in 100% DMSO (Fisher). Following protein depletion as above, cells were washed with 1X DPBS (Corning), resuspended in medium containing 500 μM 4sU, and incubated at 37°C and 5% CO2 for five minutes to label nascent transcripts. After washing cells with 1X DPBS, RNA was extracted with TRIzol and fragmented using a Bioruptor Pico for one cycle at high power. Thiol-specific biotinylation of 100 μg of total RNA was carried out using 10X biotinylation buffer (100 mM Tris-Cl, pH 7.4, 10 mM ethylenediaminetetraacetic acid) and EZ-Link Biotin-HPDP (Pierce 21341) dissolved in dimethylformamide at 1 mg/mL. Biotinylation was carried out for 2 h away from light with 1000 rpm shaking at 37°C. RNA was extracted with chloroform and precipitated using NaCl and isopropanol. Labeled RNA was separated from unlabeled RNA via a streptavidin C1 bead-based pulldown (DynaBeads, Thermo). In brief, beads were washed in bulk in 1 mL of 0.1 N NaOH with 50 mM NaCl, resuspended in binding buffer (10 mM Tris-Cl, pH 7.4, 0.3 M NaCl, 1% Triton X-100) and bound to RNA for 20 minutes at room temperature with rotation. Beads bound to labeled RNA were washed twice with high salt wash buffer (5 mM Tris-Cl, pH 7.4, 2 M NaCl, 1% Triton X-100), twice with binding buffer, and once in low salt wash buffer (5 mM Tris-Cl, pH 7.4, 1% Triton X-100). Nascent RNA was recovered from beads using two elutions with 100 mM dithiothreitol at 65°C for five minutes with 1000 rpm shaking. Recovered nascent RNA was then extracted with PCI and chloroform, then isopropanol precipitated.
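Going from the 500 mM 4sU stock to a 500 μM labeling medium is a 1:1000 dilution. The sketch below works through that arithmetic with C1·V1 = C2·V2. The helper name and the 10 mL medium volume are assumptions chosen for illustration, since the text does not state the labeling volume.

```python
# Worked dilution sketch (C1 * V1 = C2 * V2); the medium volume is hypothetical.

def stock_volume_ul(stock_mM, final_uM, final_volume_mL):
    """Microliters of stock needed to reach the final concentration."""
    stock_uM = stock_mM * 1000.0            # mM -> uM
    final_volume_ul = final_volume_mL * 1000.0  # mL -> uL
    return final_uM * final_volume_ul / stock_uM


# 500 mM stock diluted to 500 uM in an assumed 10 mL of medium.
volume = stock_volume_ul(stock_mM=500, final_uM=500, final_volume_mL=10)
print(volume)
```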
Strand-specific nascent RNA-seq libraries were built using the NEBNext Ultra II Directional Library kit, with the following modifications: 200 ng of fragmented RNA was used as input for ribosomal RNA removal via antisense tiling oligonucleotides and digestion with thermostable RNase H (MCLabs) (Adiconis et al., 2013; Morlan et al., 2012). rRNA-depleted RNA samples were treated with Turbo DNase (Thermo) and purified by silica column (Zymo RNA Clean & Concentrator). RNA was fragmented at 94°C for five minutes and subsequently used as input for cDNA synthesis and strand-specific library building according to manufacturer protocol. Libraries were pooled and sequenced via Illumina NextSeq 500 to a sequencing depth of a minimum of 40 million mapped reads.
TT-seq data analysis
Paired-end fastq files were trimmed and filtered using Trim Galore (Krueger, 2015), then aligned to the mm10 mouse genome using STAR (options --outSAMtype SAM --outFilterMismatchNoverReadLmax 0.02 --outFilterMultimapNmax 1). Feature counts were generated using HTSeq (options --stranded=reverse -f bam -r pos -m union) for genes, PROMPTs, and DHSs based on Gencode VM25 genomic coordinates (see next paragraph) (Anders et al., 2015). Reads were imported to R and downstream analysis was conducted using DESeq2 (Love et al., 2014) and plotted using EnhancedVolcano (Blighe K, 2021). Pathway analysis was performed on all significantly up- and downregulated genes separately using HOMER with the WikiPathways database (Heinz et al., 2010). Significance was defined as DESeq2 adjusted p-value < 0.05. The top five enriched categories were plotted in GraphPad Prism 10 against −log10 p-value, with manually curated categories added from the top 50 hits. Y-axes indicate pathway enrichment ranking.
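The adjusted p-value cutoff can be illustrated with a small stand-alone sketch of the Benjamini-Hochberg procedure, which is the default adjustment method in DESeq2. This is not the DESeq2 implementation (which also applies independent filtering before adjustment), and the gene names and p-values below are invented for illustration.

```python
# Stand-alone Benjamini-Hochberg sketch; gene names and p-values are made up.

def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def significant(names, p_values, alpha=0.05):
    """Names whose adjusted p-value falls below alpha."""
    padj = benjamini_hochberg(p_values)
    return [name for name, q in zip(names, padj) if q < alpha]


hits = significant(["g1", "g2", "g3"], [0.001, 0.5, 0.03])
print(hits)
```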
Non-coding transcripts were identified by removing all transcription within 1kb of annotated mm10 coding genes from the previously described gene-distal DNaseI hypersensitive sites (GSM1014154) (Consortium, 2012; Davis et al., 2018; Thurman et al., 2012). PROMPTs were called by genomic location (within 1 kb of an annotated mm10 TSS and divergently transcribed to the TSS). ncRNAs and PROMPTs were assigned to the closest coding gene and pathway analysis was conducted as above.
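The gene-distal filter and closest-gene assignment described here can be sketched with plain interval arithmetic. The sketch assumes BED-like half-open coordinates and uses hypothetical function names and toy gene records; strand handling for PROMPTs is omitted, and the actual analysis may have used dedicated interval tools.

```python
# Interval sketch of the gene-distal filter; coordinates and names are toy data.

def is_gene_distal(site, genes, window=1000):
    """True if the site lies at least `window` bp from every gene on its chromosome."""
    chrom, start, end = site
    for g_chrom, g_start, g_end, _name in genes:
        if g_chrom != chrom:
            continue
        # Overlap test against the gene interval padded by `window` on each side.
        if start < g_end + window and end > g_start - window:
            return False
    return True


def closest_gene(site, genes):
    """Name of the nearest gene on the same chromosome, or None if there is none."""
    chrom, start, end = site
    best_name, best_dist = None, None
    for g_chrom, g_start, g_end, name in genes:
        if g_chrom != chrom:
            continue
        # Gap between the two intervals; 0 when they overlap.
        dist = max(0, g_start - end, start - g_end)
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    return best_name


genes = [("chr1", 10000, 20000, "GeneA"), ("chr1", 50000, 60000, "GeneB")]
sites = [("chr1", 20500, 20700), ("chr1", 30000, 30200)]
distal = [s for s in sites if is_gene_distal(s, genes)]
assignments = {s: closest_gene(s, genes) for s in distal}
print(assignments)
```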
Reverse Transcription and quantitative PCR (RT-qPCR)
RT-qPCR was performed as previously described (Hainer et al., 2015). Briefly, RNA was extracted from cells using TRIzol following treatment with either 3-IAA or EtOH for 0, 3, and 6 hours. 1 μg of RNA was used as input for reverse transcription, and quantitative PCR was performed using 5 μM PCR primers targeting the gene of interest with KAPA SYBR green master mix. Technical replicates shown represent the average of three individual qPCR reactions for each treatment/target/condition group. Error bars shown represent the standard deviation of two replicates for each combination.
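The replicate summary described here (averaging three reactions per replicate, then taking the standard deviation across two replicates) can be sketched as follows. The values are invented, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
# Sketch of the replicate summary; values are hypothetical.
from statistics import mean, stdev


def summarize(replicates):
    """Return (mean, SD) of per-replicate means.

    `replicates` is a list of lists, one inner list of qPCR
    reaction values per replicate.
    """
    replicate_means = [mean(reactions) for reactions in replicates]
    # stdev() is the sample standard deviation (n - 1 denominator).
    return mean(replicate_means), stdev(replicate_means)


average, sd = summarize([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
print(average, sd)
```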
Micrococcal Nuclease Sequencing (MNase-seq)
MNase-seq was performed as previously described (Hainer et al., 2015). In brief, 5 million cells were depleted of FACT proteins using a 24-hour treatment with EtOH (vehicle) or 500 μM 3-IAA, crosslinked using 1% formaldehyde for 15 minutes at room temperature, and quenched with 500 mM glycine. Cells were lysed in hypotonic buffer (10 mM Tris-Cl, pH 7.5, 10 mM NaCl, 2 mM MgCl2, 0.5% NP-40, 0.3 mM CaCl2, and 1X protease inhibitors) and subjected to 5 minutes of digestion with MNase (TaKaRa) at 37°C before chelation with EDTA and EGTA. Samples were treated with RNase A (Thermo) for 40 minutes at 37°C with 1000 rpm shaking. Crosslinks were reversed overnight at 55°C and chromatin was digested with Proteinase K, then used as input for a paired-end library build.
10 pg S. cerevisiae MNase-digested DNA was added to 1 μg input DNA for library builds and treated with Quick CIP (NEB) for 30 minutes and heat-inactivated. End repair was then performed using T4 DNA Polymerase (NEB), T4 Polynucleotide Kinase (NEB), and Klenow DNA Polymerase (NEB) simultaneously. A-overhangs were added to sequences via treatment with Klenow Polymerase without exonuclease activity and Illumina paired-end TruSeq adapters were added using Quick Ligase (NEB). Barcoded DNA was purified using AMPure XP beads (Agencourt) and amplified by high-fidelity PCR (KAPA). Completed libraries were subjected to silica column purification (Zymo DNA Clean & Concentrator) and sequenced via Illumina NextSeq 500 to a sequencing depth of ~50 million mapped reads.
MNase-seq data analysis
Paired-end fastq files were trimmed to 25 bp and mapped to the mm10 genome with bowtie2 (using the options -q -N 1 -X 1000) (Langmead and Salzberg, 2012). Mapped reads were duplicate-filtered using Picard (Picard Tools, Broad Institute) and filtered for mapping quality (MAPQ ≥ 10) using SAMtools (Li et al., 2009). Reads were then sorted into nucleosome- (135-165 bp), subnucleosome- (100-130 bp), and transcription factor- (1-80 bp) sized fragments using SAMtools (Li et al., 2009). Nucleosome-sized reads were converted to bigWig files using deepTools (options -bs 1 --normalizeUsing RPGC --effectiveGenomeSize 2862010578), with common sequencing read contaminants filtered out according to ENCODE blacklisted sites for mm10 (Ramirez et al., 2014). Differential bigWigs were generated using deepTools bigwigCompare (default options) (Ramirez et al., 2014). Heatmaps were generated using deepTools computeMatrix (options --referencePoint TSS -a 2000 -b 2000 -bs 20 --sortRegions keep --missingDataAsZero) and plotHeatmap (options --sortRegions keep --colorMap coolwarm --dpi 300) (Ramirez et al., 2014). Differences in nucleosome occupancy were plotted by generating matrices in deepTools as above, then dividing average scores for each individual bin by the absolute value of the sum of the dataset, creating a measure of changes in relative occupancy. Relative occupancy changes were plotted as a metaplot using GraphPad Prism 10.
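Two steps of this analysis lend themselves to a short sketch: sorting fragments into size classes after the MAPQ filter, and the relative-occupancy normalization. The code below is a minimal illustration with hypothetical function names. It assumes the size ranges are inclusive at both ends and operates on plain numbers rather than on BAM or bigWig files.

```python
# Sketch of size-class sorting and relative-occupancy scaling; names are hypothetical.

SIZE_CLASSES = {
    "nucleosome": (135, 165),
    "subnucleosome": (100, 130),
    "transcription_factor": (1, 80),
}


def classify_fragment(length, mapq, min_mapq=10):
    """Return the size class of a fragment, or None if it is filtered out."""
    if mapq < min_mapq:
        return None
    for name, (low, high) in SIZE_CLASSES.items():
        if low <= length <= high:
            return name
    return None  # falls between the defined classes


def relative_occupancy(bin_scores):
    """Divide each bin's average score by |sum of all bins|."""
    total = abs(sum(bin_scores))
    if total == 0:
        raise ValueError("bin scores sum to zero; cannot normalize")
    return [score / total for score in bin_scores]


print(classify_fragment(147, 30))
print(relative_occupancy([1.0, -3.0]))
```

Dividing by the absolute value of the sum keeps the sign of each bin intact, so bins with gains and losses in occupancy remain distinguishable after scaling.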
QUANTIFICATION AND STATISTICAL ANALYSIS
Statistical details for each experiment shown can be found in the accompanying figure legends. Where indicated, “n” designates technical replicates for the same biological sample, while biological replicates are referred to as “clone 1” and “clone 2” to differentiate between independently targeted cell lines. Statistical tests were used in TT-seq analyses as per the default parameters for DESeq2, with a correction applied to minimize fold change of lowly-expressed transcripts (lfcShrink), as well as in motif analysis (default HOMER parameters) and peak-calling (default SEACR and HOMER parameters for CUT&RUN and ChIP-seq datasets, respectively). Any error bars shown represent one standard deviation in both directions. Significance was defined as a p-value < 0.05 by the respective test performed (indicated with “*”). No data or subjects were excluded from this study. Average values for CUT&RUN, ChIP-seq, and MNase-seq datasets were determined by computing the mean of coverage at each base pair throughout the genome between replicates. “Merged replicates” indicates the sum of read coverage-normalized tracks generated for each individual replicate.
Acknowledgments
We thank members of the Hainer Lab for critical reading of the manuscript. We thank the ENCODE Consortium, the ENCODE production laboratories, and all other members of the scientific community who generated datasets that were essential to the completion of this study. We thank the Stewart lab for generation of the osTIR1-integrated ES cell line. This project used the NextSeq500 available at the University of Pittsburgh Health Sciences Sequencing Core at UPMC Children’s Hospital of Pittsburgh for sequencing, with special thanks to its director, William MacDonald. This research was supported in part by the University of Pittsburgh Center for Research Computing through the resources provided. This work was supported by the Samuel and Emma Winters Foundation, 2018-2019 (to S.J.H.) and the National Institutes of Health Grant Number R35GM133732 (to S.J.H.).