Abstract
Much effort has been devoted to understanding how chromatin modification regulates development and disease. Despite recent progress, however, it remains difficult to achieve high sensitivity and reliability of chromatin-immunoprecipitation-coupled deep sequencing (ChIP-seq) to map the epigenome and global transcription factor binding sites in cell populations of low cell abundance. We present a new Atlantis dsDNase-based technology, aFARP-ChIP-seq, that provides accurate profiling of genome-wide histone modifications in as few as 100 cells. By mapping histone H3 lysine 4 trimethylation (H3K4me3) and H3K27Ac in group I innate lymphoid cells from different tissues, aFARP-ChIP-seq uncovers potentially distinct active promoter and enhancer landscapes of several tissue-specific NK and ILC1. aFARP-ChIP-seq is also highly effective in mapping transcription factor binding sites in small numbers of cells. Since aFARP-ChIP-seq offers reproducible DNA fragmentation, it should allow multiplexing ChIP-seq of both histone modifications and transcription factor binding sites for low cell samples.
Introduction
Chromatin immunoprecipitation coupled with deep sequencing (ChIP-seq) is a powerful technique for genome-wide mapping of the binding of chromatin regulators and epigenetic modifications, which has contributed greatly to both basic and translational research (Park, 2009; Furey, 2012). For example, the accurate mapping of epigenome changes in cell populations at distinct developmental stages facilitates our understanding of the epigenetic mechanisms by which different cell lineages establish their unique transcriptional programs. Unfortunately, two major limitations restrict the utility of the conventional ChIP-seq method in studying rare cell types isolated directly from tissues. The first is fragmentation. Although sonication is the most commonly used approach for chromatin fragmentation in ChIP-seq, it can result in epitope damage and thus reduce the immunoprecipitation efficiency, especially when the initial material is limited (Stathopulos et al., 2004). The inconsistencies in sonication-based chromatin fragmentation result in low-throughput processing of samples because each sample needs to be tested for specific settings of sonication power and time. This impedes its adaptation for reliable processing of multiple samples. Although micrococcal nuclease (MNase) has been used as an alternative to sonication for chromatin fragmentation, MNase often causes chromatin over-digestion (Brind’Amour et al., 2015). The second limitation is chromatin loss during the multiple steps of the ChIP-seq procedure, which makes it difficult to obtain high-quality mapping in a small number of cells (Park, 2009).
Several strategies have been employed to reduce the number of cells needed to produce high-quality ChIP-seq. One method is based on increasing DNA amplification cycles during sequencing library building, which allowed high-quality ChIP-seq in thousands of cells (Adli et al., 2010; Adli and Bernstein, 2011; Shankaranarayanan et al., 2011; Ng et al., 2013). The major deficiency of the amplification-based method is that some low-abundance ChIPed chromatin may be underrepresented or lost, which makes the method not applicable to ultralow cell numbers. Another method is barcoding of the fragmented chromatin in individual samples followed by sample pooling (Lara-Astiaso et al., 2014; Rotem et al., 2015; van Galen et al., 2016). By pooling multiple barcoded samples, the increased total ChIPed chromatin helps to reduce chromatin loss in subsequent steps. The low efficiency of ligating barcode adaptors to chromatin fragments (<10%), however, significantly limits the application of the method for low cell numbers (Gury-BenAri et al., 2016b), because the majority of chromatin fails to be barcoded. Indeed, between 10,000 and 20,000 sorted hematopoietic cells were needed for the barcode-based iChIP-seq studies (Lara-Astiaso et al., 2014).
By combining microfluidics and DNA barcoding, Rotem et al. reported the mapping of chromatin states at single-cell resolution and identified a spectrum of heterogeneity defined by differences in chromatin signatures of pluripotency and differentiation priming among mouse embryonic stem cells (mESCs) (Rotem et al., 2015). While this single-cell method is useful for identifying subpopulations of cells with the aid of single-cell RNA-seq data, the low number of valid sequencing reads (500-10,000) per cell makes the method insufficient for de novo regulatory site identification (Rotem et al., 2015). Another recently published method, ChIPmentation, utilizes Tn5 transposase-mediated adapter addition to immunoprecipitated chromatin still bound to beads. This greatly simplified the ChIP procedure, thereby reducing the time and cell input requirement (Schmidl et al., 2015). However, since chromatin loss still occurs during the sonication step, ChIPmentation requires ~10,000 cells for accurate mapping of the histone modifications H3K4me3 and H3K27me3.
Innate lymphoid cells (ILCs) are a family of recently defined lymphoid cells that represent the innate counterpart of lymphocytes (Rankin et al., 2013; Diefenbach et al., 2014; Eberl et al., 2015). ILCs are present throughout the body and are often found at the barrier surfaces of tissues. These cells play important roles in early defense against pathogens and promote tissue repair and maintenance. Their inappropriate activation could contribute to inflammation and autoimmune diseases (Rankin et al., 2013; McKenzie et al., 2014). Three classes of ILCs, all developing from the common ILC progenitor, have been defined: group 1, group 2, and group 3 ILCs. These ILCs are characterized by cytokine-production profiles analogous to those of the adaptive T cell subsets (Diefenbach et al., 2014). The group 1 ILCs consist of conventional natural killer (NK) cells and ILC1. Classification within this group remains controversial because the two cell types have functional similarities and share expression of many cell surface markers (Jiao et al., 2016). Recent global transcriptome profiling of group 1 ILCs isolated from different peripheral tissues has begun to facilitate the subdivision of the NK and ILC1 subsets (Robinette et al., 2015). For example, the small intestine intraepithelial (siIEL) ILC1 has been identified as a unique subset with distinct developmental and functional properties (Fuchs et al., 2013). The unique transcription and cytokine profiles suggest that siIEL ILC1 may belong to a distinct ILC1 subset different from some of the other ILC1 or NK subsets (Robinette et al., 2015). However, the lack of high-quality epigenome and chromatin protein binding profiles, owing to the difficulty of performing ChIP-seq in the low-abundance siIEL ILC1, has limited additional in-depth study of the lineage identity and origin of these cells (Sciumè et al., 2017).
We recently reported two techniques, recovery via protection (RP)-ChIP-seq and favored amplification RP-ChIP-seq (FARP-ChIP-seq), for low-cell-number epigenome profiling (Zheng et al., 2015). These two ChIP-seq methods are based on the idea that if a rare cell population is not lost during the initial fixation and wash steps, and if chromatin loss is minimized during ChIP and library building, it should be possible to recover low-abundance chromatin without increasing DNA amplification cycles. Indeed, by employing these two methods, we were able to obtain reproducible mapping in as few as 500 cells. Here, we report a new dsDNase-based FARP-ChIP-seq method for high-fidelity genome-wide profiling using as few as 100 cells, with broad applications such as profiling of epigenetic differences of group 1 ILCs from different tissue origins and transcription factor binding in splenic B cells. The reliability and consistency of dsDNase-based chromatin fragmentation among different samples in our method also allow multiplexing of ChIP-seq operations.
Results
Atlantis dsDNase-based FARP-ChIP-seq (aFARP-ChIP-seq) offers better mapping of histone marks than FARP-ChIP-seq
We recently reported RP-ChIP-seq and FARP-ChIP-seq, which allow high-quality epigenome mapping in only 500 cells. However, the increase in sequencing depth required due to the presence of carrier DNA leads to an increase in mapping costs (Zheng et al., 2015). To reduce both the sequencing costs and the number of cells needed for successful FARP-ChIP-seq, we wished to further improve the recovery of chromatin of interest. We reasoned that sonication used for chromatin fragmentation could destroy the epitopes for subsequent immunoprecipitation, thereby resulting in a significant loss of chromatin of interest, especially when the input cell number is low. To overcome this limitation, we tested enzymatic digestion-based chromatin fragmentation. We found that the commonly used micrococcal nuclease (MNase)-based fragmentation is compatible with FARP-ChIP-seq and resulted in a ~70% increase (from 16% to 27%) of H3K4me3 reads mapping to the mouse genome compared to FARP-ChIP-seq at the same read depth for 500 mESCs (Figure 1A and B). However, MNase activity is challenging to control, and its propensity to over-digest chromatin, owing to its exonuclease activity, impedes reliable parallel processing of multiple samples, especially when the samples differ (Brind’Amour et al., 2015; van Galen et al., 2016). Therefore, we sought to identify an alternative DNase for chromatin fragmentation.
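As a quick check of the ~70% figure quoted above for MNase-based fragmentation, the relative increase can be computed from the two mapping rates (16% and 27%). The helper name `relative_increase` is ours and purely illustrative.

```python
def relative_increase(before: float, after: float) -> float:
    """Return the fractional increase going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (after - before) / before


# Mapping rates quoted in the text: 16% (sonication-based FARP-ChIP-seq)
# versus 27% (MNase-based fragmentation), both at the same read depth.
gain = relative_increase(0.16, 0.27)
print(f"relative increase: {gain:.0%}")
```

The exact value is 68.75%, which the text rounds to ~70%.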
Among the DNases, the Atlantis dsDNase (Zymo Research) is a double-stranded DNA-specific endonuclease that cleaves phosphodiester bonds in DNA to yield fragments with 5’-phosphate and 3’-hydroxyl termini, a property well suited to chromatin fragmentation. Our tests demonstrated that Atlantis dsDNase digestion of unfixed or paraformaldehyde-fixed nuclei from different cell types and cell numbers at 0.5 units (U) for 20-30 min at 37 °C yielded consistent chromatin fragmentation (Figure S1 and data not shown). This reliable fragmentation and suitable DNA length distribution are optimal for ChIP-seq (Park, 2009) and are amenable to simultaneous fragmentation of chromatin in different samples (Figure 1A). Thus, the Atlantis dsDNase may replace the conventional MNase for chromatin fragmentation in a variety of genome studies.
We next incorporated Atlantis dsDNase for chromatin fragmentation into our FARP-ChIP-seq and refer to this method as Atlantis-FARP-ChIP-seq (aFARP-ChIP-seq). Since the total chromatin marked by H3K4me3 is much lower compared to other histone modifications, obtaining sufficient chromatin for high-quality H3K4me3 profiling from low cell numbers is particularly challenging. Thus, we initially applied aFARP-ChIP-seq to map H3K4me3 in 500 mESCs (Figure 1A; also see the method section). We found that aFARP-ChIP-seq resulted in a ~3-fold increase of DNA reads of interest compared to FARP-ChIP-seq at the same read depth for 500 mESCs (Figure 1B). The two biological replicates were highly consistent (Figure 1C), demonstrating the reproducibility of this method. To further evaluate aFARP-ChIP-seq, we performed genome-wide comparisons of H3K4me3 signal intensity against the datasets generated from the standard ChIP-seq of 10⁷ mESCs (Jia et al., 2012). This revealed that aFARP-ChIP-seq using 500 mESCs was highly correlated with the standard ChIP-seq of 10⁷ mESCs (Figure 1D). Analyses of specific chromatin regions revealed that aFARP-ChIP-seq could uncover H3K4me3 peaks reliably. More importantly, aFARP-ChIP-seq generated higher H3K4me3 signal intensity compared to FARP-ChIP-seq (Figure 1E). Thus, aFARP-ChIP-seq yields accurate and consistent epigenome profiles and performs better than FARP-ChIP-seq relying on sonication-based or MNase-based chromatin fragmentation in 500 cells (Figure 1B).
aFARP-ChIP-seq epigenome mapping in as few as 100 cells
The improved mapping efficiency of aFARP-ChIP-seq suggests that it could be used with fewer than 500 cells. To test this, we mapped H3K4me3 using 100 mESCs. We digested the fixed mESCs for 20 min. We found that the H3K4me3 mapping in 100 mESCs showed good consistency between the two biological replicates (Figure 2A). Importantly, despite increased noise, the chromatin profiles generated from 100 mESCs were still informative and were similar to the 500-cell aFARP-ChIP-seq (Figure 2B). When the MACS2 program was run with identical parameters, the peaks called for our previous standard ChIP-seq datasets of 10⁷ mESCs (Jia et al., 2012) and for aFARP-ChIP-seq of 100 mESCs also showed a good degree of overlap (Figure 2C).
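To illustrate what a peak-overlap comparison of this kind measures, the following minimal sketch computes the fraction of peaks in one set that overlap at least one peak in a reference set. It is not the pipeline used in this study; the function name, the half-open coordinate convention, and the toy coordinates are assumptions made for illustration.

```python
from collections import defaultdict

# A peak is (chromosome, start, end) with half-open coordinates (assumption).
Peak = tuple[str, int, int]


def fraction_overlapping(query: list[Peak], reference: list[Peak]) -> float:
    """Fraction of query peaks overlapping at least one reference peak."""
    ref_by_chrom: dict[str, list[tuple[int, int]]] = defaultdict(list)
    for chrom, start, end in reference:
        ref_by_chrom[chrom].append((start, end))

    hits = 0
    for chrom, start, end in query:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < r_end and r_start < end
               for r_start, r_end in ref_by_chrom.get(chrom, [])):
            hits += 1
    return hits / len(query) if query else 0.0


# Toy coordinates, invented purely to exercise the function.
low_input = [("chr1", 100, 300), ("chr1", 1000, 1200), ("chr2", 50, 150)]
standard = [("chr1", 250, 400), ("chr1", 900, 950), ("chr2", 500, 600)]
print(fraction_overlapping(low_input, standard))
```

The comparison is quadratic per chromosome, which is acceptable for a sketch; genome-scale peak sets are normally intersected with a dedicated interval tool.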
Next, we used the receiver-operating characteristic (ROC) curve to compare the H3K4me3 maps obtained by FARP-ChIP-seq or aFARP-ChIP-seq. The Area Under the ROC curve (AUC) is a standard metric for quantifying balanced sensitivity and specificity. By using different cutoffs to calculate the true-positive and false-positive rates, we plotted ROC curves for each method, which showed that aFARP-ChIP-seq using 100 or 500 cells provides reliable performances (Figure 2D). These analyses demonstrate that aFARP-ChIP-seq enables analysis of as few as 100 cells.
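The cutoff sweep described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the analysis code of this study: each genomic window is assumed to carry a signal score and a binary label (1 if it overlaps a reference peak, 0 otherwise), and the AUC is obtained by the trapezoidal rule. All names and the toy values are ours.

```python
def roc_curve(scores, labels):
    """Return (fpr, tpr) lists obtained by sweeping a cutoff over the scores."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need both positive and negative windows")

    # Highest score first; windows with tied scores share one cutoff.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        cutoff = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == cutoff:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
               for k in range(1, len(fpr)))


# Toy scores and labels, invented for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
fpr, tpr = roc_curve(scores, labels)
print(auc(fpr, tpr))
```

An AUC of 1 means that every reference-positive window scores above every negative window, whereas 0.5 corresponds to random ranking.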
The H3K4me3 profiling of group 1 innate lymphocyte lineages reflects their distinct functionalities
To test the applicability of aFARP-ChIP-seq, we applied it to challenging in vivo biological samples by profiling histone modifications in innate lymphoid cell (ILC) types, which typically make up 1-5% of total lymphocytes in peripheral non-gut tissues (Diefenbach et al., 2014). Recent studies indicate that tissue-specific signals have significant impacts on gene expression and activity of ILCs. The influence of local tissue microenvironments on chromatin and gene regulatory landscapes, however, remains poorly understood (Gury-BenAri et al., 2016a; Shih et al., 2016; Sciumè et al., 2017), in part due to the difficulty in mapping chromatin and gene regulatory landscapes in ILCs isolated from individual animals. We focused our study on the IFN-γ-producing group 1 ILCs, including conventional NK and ILC1, isolated from individual mice (Cortez and Colonna, 2016).
Since unique and consistent markers for NK cells and ILC1 cells across organs are lacking, we used different sorting strategies according to previously published protocols for each tissue, including spleen, mesenteric lymph nodes (mLN), liver and small intestine intraepithelia (Figure 3A and Figure S2A) (Robinette et al., 2015). We then applied aFARP-ChIP-seq to profile H3K4me3 using NK or ILC1 sorted from each tissue from one mouse without further in vitro culturing. For each aFARP-ChIP-seq analysis, we used 1000-2000 cells, estimated from the cell numbers sorted by Fluorescence Activated Cell Sorting (FACS). We found consistent maps for H3K4me3 (Figure S2B-H) between biological replicates in the ILC1 and NK cells in different tissues, which allowed us to examine the chromatin landscapes in these two group 1 ILC sub-lineages.
We identified 789 up- and 271 down-regulated H3K4me3 peaks in spleen, 604 up- and 557 down-regulated peaks in mLN, and 966 up- and 490 down-regulated H3K4me3 peaks in liver in ILC1 compared to NK cells (fold change > 1.5, FDR < 0.05) (Figure 3B-D). We found that the differential H3K4me3 peaks correlated with the lineage-specific developmental program and functionality of each subset, for example at IL7r (ILC1), Eomes (NK), and Gzma (NK, cytotoxic machinery) (Figure 3E).
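The thresholds quoted above (fold change > 1.5, FDR < 0.05) can be applied as in the following sketch. The statistical test that produced the p-values is not specified in this section, so the sketch simply assumes one p-value per peak and uses the Benjamini-Hochberg procedure as the FDR correction; that choice, the function names, and the toy values are assumptions for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def classify_peaks(peaks, fc_cutoff=1.5, fdr_cutoff=0.05):
    """peaks: list of (name, signal_ilc1, signal_nk, pvalue); signals must be > 0."""
    qvalues = benjamini_hochberg([p[3] for p in peaks])
    calls = {}
    for (name, ilc1, nk, _), q in zip(peaks, qvalues):
        fold = ilc1 / nk
        if q < fdr_cutoff and fold > fc_cutoff:
            calls[name] = "up"
        elif q < fdr_cutoff and fold < 1 / fc_cutoff:
            calls[name] = "down"
        else:
            calls[name] = "unchanged"
    return calls


# Toy peaks with invented signals and p-values.
toy = [
    ("peak_1", 40.0, 10.0, 0.001),
    ("peak_2", 5.0, 20.0, 0.002),
    ("peak_3", 30.0, 29.0, 0.8),
    ("peak_4", 12.0, 20.0, 0.2),
]
print(classify_peaks(toy))
```

Note that `peak_4` passes the fold-change cutoff but not the FDR cutoff, so it is not called as differential; both criteria must hold.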
Specifically, the H3K4me3 level on Eomes is more than twofold higher in NK cells than in ILC1 cells in all tissues analyzed (Figure 3E and Table S1), which is consistent with the previous finding that Eomes is a marker for NK cells (Gordon et al., 2012; Daussy et al., 2014; Zhang et al., 2018). Interestingly, we found that siIEL ILC1 isolated from small intestines exhibited H3K4me3 peaks on both IL7r and Eomes (Figure 3E). This suggests functional plasticity of this unique ILC1, probably due to their constant exposure to varied environmental signals from the microbiome and nutrients in the gut. The Multi-Dimensional Scaling (MDS) plot also revealed siIEL ILC1 to be distinct from both ILC1 and NK derived from different peripheral tissues (Figure 3F). The elevated H3K4me3 peak at the TGF-β locus in siIEL ILC1 compared to NK and other ILC1 cells (Figure 3G) is consistent with the unique roles of TGF-β in the development and function of siIEL ILC1 (Robinette et al., 2015; Cortez et al., 2016).
H3K27Ac mapping reveals differential enhancer landscapes in the siIEL ILC1 compared to the other ILC1
It is well known that gene expression programs in ILC subsets in different tissues can reflect distinct patterns of enhancer activity that in turn reflect the differential transcription factor binding profiles (Hallikas et al., 2006; Spitz and Furlong, 2012; Heinz et al., 2015). To probe the regulatory circuitry that specifies siIEL ILC1 and its functions, we next mapped the active enhancer mark H3K27Ac (Creyghton et al., 2010) in the same set of NK and ILC1 isolated from individual mice as described above (see Figure 3 and S2) using an estimated 1000-2000 cells (based on FACS sorting) for each ChIP-seq experiment. The biological replicates of our maps were highly consistent with one another (Figure 4A and Figure S3A-F). We then focused on analyzing the active enhancers in ILC1 from spleen, lymph nodes, liver, and small intestine. We identified a total of 21417 enhancers that are active in ILC1 from at least one of the four tissues profiled (Figure 4B and Table S2). Consistent with the previous finding of an early developmental acquisition of common chromatin organization in ILCs (Shih et al., 2016), the ILC1 cells from siIEL shared 9892 enhancers with the other ILC1 from spleen, lymph node, and liver (Figure 4B). We also identified 2100 enhancers that are unique to siIEL ILC1 (Figure 4B).
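The partition of enhancers into total, shared, and tissue-unique sets amounts to simple set operations once each enhancer has a common identifier across tissues. The sketch below assumes such identifiers already exist (for example, after merging overlapping H3K27Ac peaks) and interprets "shared" as present in all other tissues; both points are assumptions, and the enhancer IDs are invented.

```python
def partition_enhancers(active: dict[str, set[str]], focus: str) -> dict[str, set[str]]:
    """Split the enhancers of `focus` relative to all other tissues."""
    others = [ids for tissue, ids in active.items() if tissue != focus]
    in_any_other = set().union(*others)
    in_every_other = set.intersection(*others) if others else set()
    return {
        # Active in at least one tissue.
        "total": set().union(*active.values()),
        # Active in the focus tissue and in every other tissue.
        "shared_with_all": active[focus] & in_every_other,
        # Active in the focus tissue only.
        "unique": active[focus] - in_any_other,
    }


# Toy enhancer identifiers, invented for illustration.
active = {
    "siIEL": {"e1", "e2", "e3", "e5"},
    "spleen": {"e1", "e2", "e4"},
    "mLN": {"e1", "e2"},
    "liver": {"e1", "e3", "e4"},
}
result = partition_enhancers(active, focus="siIEL")
print({name: len(ids) for name, ids in result.items()})
```

If "shared" were instead taken to mean present in at least one other tissue, `active[focus] & in_any_other` would be the corresponding set.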
By analyzing the top five genes located most proximally to the up-regulated enhancers in the siIEL ILC1 (compared to the other three ILC1), we found Ahrr, Cnih3 (Figure 4C and Figure S3G), Ccny, Sec24d, and Rin2. Interestingly, a recent study reported that the expression of AhRR in colonic intraepithelial lymphocytes prevents excessive IL-1β production and Th17/Tc17 differentiation, implicating the physiologic importance of AhRR in balancing intestinal inflammation (Brandstätter et al., 2016). Our finding that the Ahrr gene is marked strongly by active enhancers in siIEL ILC1 suggests that siIEL ILC1 could also use Ahrr in modulating inflammation in the small intestines. Our analyses also indicate that the other genes, such as Cnih3 and Ccny, could play important roles in siIEL ILC1. Although additional studies are required to validate this possibility, the high-quality enhancer profiling achieved using aFARP-ChIP-seq in small numbers of cells directly isolated from tissues should facilitate the identification of candidate genes that function in lineage specification or functional plasticity of cells in vivo, such as the ILC1 subsets in different tissue microenvironments in individual mice.
Since transcription factor binding is a key determinant of enhancer activity, we next attempted to identify potential transcription factors that could regulate the enhancer landscape in siIEL ILC1 by searching for the enrichment of transcription factor binding motifs in H3K27Ac peaks identified in these cells. This allowed the identification of a full set of transcription factor signatures of siIEL ILC1. Among these, we found significant enrichment of sequence motifs known to be bound by Fli1 (or ETS), IRF1, RunX1, Zfx and Gata3 (Figure 4D), which have been shown to play important roles in the development and function of ILC1 in general (Rankin et al., 2013; Diefenbach et al., 2014; Tanriver and Diefenbach, 2014). These results suggest that shared transcriptional regulatory elements underlie the development or functionality of ILC1 subpopulations across the different tissue origins. Together, our analyses show that the high-quality H3K4me3 and H3K27Ac datasets we generated for NK and ILC1 in different tissues can serve as valuable resources.
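One common way to quantify motif enrichment in a set of peaks is a hypergeometric test against a background set of regions. The motif-analysis tool used here is not named in this section, so the following is only a sketch of that general idea; the function name and the counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total: int, total_with_motif: int,
                           sample: int, sample_with_motif: int) -> float:
    """Upper-tail probability P(X >= sample_with_motif).

    `total` regions (background plus peaks) contain `total_with_motif`
    motif-bearing regions; `sample` of them are the peaks of interest.
    """
    upper = min(sample, total_with_motif)
    tail = sum(
        comb(total_with_motif, k) * comb(total - total_with_motif, sample - k)
        for k in range(sample_with_motif, upper + 1)
    )
    return tail / comb(total, sample)


# Hypothetical counts: 20 regions in total, 5 carry the motif,
# and 3 of the 5 peaks of interest carry it.
p = hypergeom_enrichment_p(total=20, total_with_motif=5, sample=5, sample_with_motif=3)
print(f"{p:.4f}")
```

A small p-value indicates that the motif occurs in the peaks more often than expected from its background frequency; with many motifs tested, the p-values would additionally need multiple-testing correction.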
aFARP-ChIP-seq is applicable for high quality mapping of transcription factor binding sites in small numbers of cells
Genome-wide profiles of transcription factor binding sites have been largely restricted to tissue culture cell lines due to the requirement of a large number of cells (>10⁶ cells) as the starting material for the traditional ChIP-seq approach (Valouev et al., 2008; Ouyang et al., 2009). To test whether aFARP-ChIP-seq can facilitate the mapping of transcription factor binding sites in relatively small numbers of cells sorted from tissues, we mapped the binding sites of the ETS-family transcription factor PU.1. PU.1 is widely expressed in hematopoietic lineages and plays key regulatory roles in early hematopoiesis and B-cell development (Klemsz et al., 1990). As proof of principle, we performed genome-wide mapping of PU.1 in splenic B cells using 1×10⁵ or 1×10⁴ cells in each aFARP-ChIP-seq. Several lines of evidence suggest that our profiling identified bona fide PU.1 binding sites in the isolated B cells. First, our data showed proper read distributions around genomic loci of genes that have previously been reported to be bound by PU.1 in B cells, such as Blnk (encoding B cell linker protein), Btk (encoding Bruton tyrosine kinase), and Fcgr2b (encoding FcγRIIb), and our parallel mapping using isolated T cells showed that these genes do not have PU.1 binding, as expected (Figure 5A and Figure S4) (Schweitzer et al., 2006; Xu et al., 2012; Solomon et al., 2015). Additionally, we observed a PU.1 binding pattern around the TNF locus similar to that obtained using iChIP-seq in 1×10⁴ dendritic cells derived by in vitro differentiation of mouse bone marrow cells (Figure 5B) (Lara-Astiaso et al., 2014).
To further validate the results of our aFARP-ChIP-seq in the isolated splenic B cells, we screened the sequences of the read-enriched regions for potential transcription factor binding motifs. We found that the top motif for PU.1-bound regions contains a canonical ETS motif, 5′-GGAA-3′ (Figure 5C) (Solomon et al., 2015). Therefore, the reproducibility and sensitivity of aFARP-ChIP-seq should allow genome-wide characterization of chromatin binding proteins, including transcription factors, in small numbers of cells directly sorted from tissues.
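As a simple illustration of what locating the ETS core motif in a bound region involves, the sketch below scans a DNA sequence for 5′-GGAA-3′ on both strands. It is not a substitute for de novo motif discovery; the function names and the example sequence are invented for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_core_sites(seq: str, core: str = "GGAA") -> list[tuple[int, str]]:
    """Return (0-based position, strand) for each occurrence of `core`.

    A '-' hit means the reverse complement of `core` (TTCC for GGAA)
    was found on the given strand at that position.
    """
    seq = seq.upper()
    rc_core = reverse_complement(core)
    hits = []
    for i in range(len(seq) - len(core) + 1):
        window = seq[i:i + len(core)]
        if window == core:
            hits.append((i, "+"))
        if window == rc_core:
            hits.append((i, "-"))
    return hits


# Invented example sequence with one site on each strand.
print(find_core_sites("ACGGAAGTTTCCA"))
```

Because a four-base core occurs frequently by chance, such hits are only meaningful in combination with enrichment over background regions.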
Discussion
By searching for DNases that allow reliable chromatin fragmentation under a wider range of conditions than the commonly used MNase, we found that the Atlantis dsDNase can fragment chromatin reproducibly in different cell types and in different numbers of cells using a relatively wide range of incubation times without over-digesting chromatin. Thus, the Atlantis dsDNase is a superior alternative to MNase in applications involving chromatin fragmentation.
By applying FARP-ChIP-seq to the Atlantis dsDNase-generated chromatin (aFARP-ChIP-seq), we are able to produce high-quality ChIP-seq datasets in as few as 100 cells. We have shown that addition of carriers in FARP-ChIP-seq allows capture of low-abundance chromatin without excessive PCR amplification, thereby greatly improving the fidelity and reproducibility of genome-wide chromatin mapping in small numbers of cells. However, a ~5-fold increase of sequencing depth is required to obtain sufficient reads of interest by FARP-ChIP-seq (Zheng et al., 2015), which increases sequencing costs. The improved chromatin fragmentation by Atlantis dsDNase in aFARP-ChIP-seq greatly increased the recovery of low-abundance chromatin compared to FARP-ChIP-seq. Indeed, we show that aFARP-ChIP-seq offers a ~3-fold increase of DNA reads compared to FARP-ChIP-seq at the same read depth for H3K4me3 in 500 mESCs. This increase in sequence recovery significantly reduces the DNA reads needed and thus the sequencing cost, thereby facilitating the use of the carrier approach for ChIP-seq in different applications.
Not much effort has been devoted to multiplexing of ChIP-seq in small numbers of cells because of the difficulty in obtaining high-quality, high-fidelity reads and because of the inconsistency of chromatin fragmentation by sonication, tagmentation, or MNase in different samples. We show that optimal chromatin fragmentation and high reproducibility are achieved by Atlantis dsDNase in a range of conditions and with different cell numbers and types (Figure 1 and Figure 3-4). Therefore, aFARP-ChIP-seq offers a good opportunity for multiplexing ChIP-seq library preparations. This, coupled with the high-quality ChIP-seq data and reduced sequencing need, should allow parallel epigenome mapping of different cell types isolated from different tissues without further amplification of cells or excessive rounds of PCR amplification of DNA fragments. The simplicity of the aFARP-ChIP-seq workflow allows it to be easily established in any lab without specialized equipment, such as the microfluidic devices used in MOWChIP-seq and single-cell ChIP-seq (Cao et al., 2015; Rotem et al., 2015). Considering that aFARP-ChIP-seq is fundamentally different from other high-sensitivity ChIP technologies, which rely on either excessive DNA amplification or chromatin indexing, we believe that it offers a viable alternative approach to further reduce the read depth and cell number needed in a high-throughput format.
By sorting the group 1 ILC subsets in mesenteric lymph node (mLN), spleen, small intestine, and liver from one mouse, we obtained high-quality mapping of H3K4me3 and H3K27Ac in the NK and ILC1 cells. This should allow more accurate understanding of how the same kind of immune cell subsets differ in different organs in the same mouse. By obtaining genome-wide maps of the epigenome and of chromatin binding sites for proteins in different cell types in different mice, aFARP-ChIP-seq should also allow efficient comparisons of the same cell subsets in one organ to reveal how different environments and genotypes influence the genome features. For example, studies suggested that microenvironments in different tissues play important roles in shaping both gene expression and enhancer activities, resulting in tissue-specific identities of macrophages (Lavin et al., 2014). Consistent with the notion that different tissue microenvironments influence tissue-resident immune cells, our H3K4me3 MDS analyses reveal a clear separation of NK cells and ILC1s isolated from the spleen, small intestines, mLN, or liver in one mouse (Figure 3F).
We also observed pronounced epigenome differences in siIEL ILC1 isolated from the intraepithelial compartment of the small intestine compared to the ILC1s from the other three tissues we analyzed, which is consistent with the recently reported transcriptome differences in the group I ILCs (Robinette et al., 2015). Interestingly, we show that siIEL ILC1 exhibits H3K4me3 peaks on Eomes, a key lineage-determining transcription factor for NK cells. One explanation for this finding is that the siIEL ILC1 identified by cell-surface markers (NKp46+NK1.1+) is a heterogeneous population that may be composed of both unidentified NK subsets and the ILC1 lineage in the gut intraepithelial compartment. Alternatively, these siIEL ILC1 may exhibit lineage and functional plasticity due to the unique gut environment. Consistently, an ILC3-derived ILC1 population has been reported in the mouse gut under the influence of inflammatory stimuli (Li et al., 2016). Importantly, NK cells could be converted into ILC1 by tumor microenvironment-derived TGF-β signaling (Cortez et al., 2017; Gao et al., 2017), indicating that a fraction of siIEL ILC1 could also be derived from NK cells in the unique gut compartment. Identification of signaling molecules involved in the conversion of NK to siIEL ILC1 within the gut epithelial environment could be a key to understanding the functional plasticity of this unique ILC1 population. Given that the gut microbiome is thought to exert a remarkable impact on the regulatory landscape of ILCs (Gury-BenAri et al., 2016a), aFARP-ChIP-seq should greatly aid further study of how ILCs differentially integrate signals from the microbial microenvironment in individual mice to generate phenotypic and functional differences, thereby influencing the health status of individuals.
The ability to achieve accurate maps of active enhancer landscapes, as revealed by H3K27Ac profiling in individual mice in this study, should allow the identification of candidate genes that may function in specific cell types under different tissue microenvironments or the external environments that different mice experience. For example, compared to ILC1 from the other peripheral tissues studied, the siIEL ILC1 displays significantly up-regulated H3K27Ac peaks for the gene Ahrr, which has been reported to play an important role in balancing colon inflammation (Brandstätter et al., 2016). This suggests that Ahrr in siIEL ILC1 could play a similar role in the small intestines. Additionally, the H3K27Ac profiling allowed the identification of transcription factor binding signatures enriched in siIEL ILC1. Deep mining of the H3K27Ac datasets we mapped in NK and ILC1 should allow the identification of additional candidate transcription factor binding in different tissues and cell subsets, which would facilitate further study of the regulatory logic underlying the developmental and homeostatic processes of group I ILCs.
Our proof-of-concept PU.1 profiling by aFARP-ChIP-seq using 1×10⁴ isolated splenic B cells demonstrates the broad usage of this method for genome-wide mapping of transcription factor binding. The requirement of only 1×10⁴ cells should allow successful mapping for the majority of cell types of the hematopoietic lineage in individual mice. This is especially important for the mapping of early hematopoietic progenitors because of the limited cell number per mouse. Additional optimization, including further optimizing digestion conditions and the use of improved ChIP-grade antibodies, should allow the reduction of the number of cells needed, thereby further increasing the feasibility and success rate of discovering novel transcriptional regulatory networks governing each stage of the developmental process.
Materials and methods
Cell lines and animals
E14 mouse embryonic stem cells (mESCs) were cultured in DMEM with 15% fetal calf serum, penicillin/streptomycin, β-mercaptoethanol, L-glutamine, nonessential amino acids, and recombinant leukemia inhibitory factor (1000 U/ml, Millipore).
C57BL/6J mice were obtained and maintained at the mouse facility of Carnegie Institution’s Embryology Department. Adult female mice (8 weeks of age) were used in all experiments. Mice were housed under a strict 12 hr light-dark cycle with food and water ad libitum. All mouse procedures in this study were in accordance with protocols approved by the Institutional Animal Care and Use Committee of the Carnegie Institution for Science.
Antibodies
The following antibodies were obtained from Biolegend: Fluorescein isothiocyanate (FITC)-conjugated anti-NKp46 (clone 29A1.4), PE/Cy7-conjugated anti-NK1.1 (clone PK136), APC-conjugated anti-CD127 (IL-7Rα) (clone A7R34), PerCP/Cy5.5-conjugated anti-CD49b (clone HMa2), PerCP/Cy5.5-conjugated anti-mouse Ly-6A/E (Sca-1) (clone D7), PE-conjugated anti-CD253 (TRAIL) (clone N2B2), PE/Cy7-conjugated anti-mouse CD117 (c-Kit) (clone 2B8), FITC-conjugated anti-mouse CD25 (clone 3C7), Lineage mixed cocktail antibodies (Ter119, Gr1, CD11b, B220, CD3, CD4, CD8), PE-conjugated anti-F4/80 (clone BM8), APC-conjugated anti-CD11c (clone N418), and isotype-matched control monoclonal antibodies.
Antibody against histone H3 lysine 4 trimethylation (H3K4me3) (clone C42D8, #9751S) was from Cell Signaling. Antibodies against acetylated histone H3 lysine 27 (H3K27ac, #ab4729) and control mouse IgG were from Abcam. Antibodies against PU.1 (clone T-21, #sc-352X) were from Santa Cruz Biotechnology. The above antibodies were used at the working concentrations recommended by the manufacturers in an assay-dependent manner, unless otherwise specified below or in the text and figure legends.
Cell identification, isolation, and flow cytometry
All cells from mouse tissues or organs were collected, stained, and sorted according to the published standard protocol (Halim and Takei, 2014). Spleens, mesenteric lymph nodes, liver, and small intestine were extracted from C57BL/6J female mice and dissociated into single-cell suspensions by passing them through a 100 µm Falcon cell strainer. After washing with FACS buffer (PBS with 0.3% BSA and 2 mM EDTA), the cell suspension was incubated in red blood cell lysis solution (#R7757, Sigma) for 5 min on ice. For spleen and mesenteric lymph nodes, anti-CD3 (#130-095-130) and anti-CD19 (#130-052-201) microbeads (Miltenyi Biotec) were used to remove T cells and B cells, respectively, following the manufacturer's guidelines. For liver, lymphocytes were enriched at the interface between a gradient of 40% and 80% Percoll in Hank’s balanced salt solution. For small intestine, the intestine was cut first longitudinally and then laterally into pieces approximately 1-2 cm long in a Petri dish. The tissues were transferred into a 50 ml tube with 20 ml PBS containing 1 mM EDTA, and the sample was incubated at 37°C for 20 min under continuous rotation at 120 rpm. Lymphocytes in small intestine were enriched at the interface between a gradient of 40% and 80% Percoll in PBS. The cells were then stained with DAPI and with NKp46, NK1.1, CD127, CD49b, TRAIL, CD25, CD117, Sca-1, Lineage mixed cocktail (Ter119, Gr1, CD11b, B220, CD3, CD4, CD8), CD11c, I-A/I-E, and F4/80 antibodies for 20 min at 4°C. After washing, cells were sorted with a FACSAria™ III cell sorter (BD Bioscience). All data were analyzed with FlowJo 9.3.2 software (Tree Star). The cell populations collected were identified as:
Spleen- and mesenteric lymph node-ILC1: CD3-, CD19-, NKp46+, NK1.1+.
Spleen- and mesenteric lymph node-NK cells: CD3-, CD19-, NKp46+, NK1.1+, CD127-.
Liver-ILC1 cells: NKp46+, NK1.1+, CD49b-, TRAIL+.
Liver-NK cells: NKp46+, NK1.1+, CD49b+, TRAIL-.
siIEL-ILC1 cells: CD3-, CD19-, NKp46+, NK1.1+.
Spleen-B cells: CD3-, B220+, CD19+.
Spleen-T cells: CD3+, B220-, TCRβ+.
FARP-ChIP-seq
Standard FARP-ChIP-seq was performed according to the conditions previously described by our lab (Zheng et al., 2015). Briefly, 500 mESCs were mixed with ~5×10⁸ DH5α E. coli and fixed with 1% formaldehyde, followed by quenching with 0.125 M glycine. After washes, the sample mixtures were sonicated with a 1/16-inch probe for 15 min at 3 watts using a tip sonicator (Misonix sonicator 3000) to obtain chromatin fragments. Protein G beads (#10004D, Life Technologies) and M-280 streptavidin beads (#11206D, Life Technologies) were pre-blocked with ~5×10⁸ fixed and sonicated E. coli lysate overnight at 4°C. After blocking, 5 ng carrier biotin-DNA was coupled to 10 µl M-280 streptavidin beads. The treated protein G and streptavidin beads were combined and used to ChIP H3K4me3 in the sonicated E. coli+mESC lysates overnight at 4°C. After de-crosslinking, the precipitated genomic DNA and biotin-DNA were purified with AMPure XP beads (#A63881, Beckman Coulter). Libraries were built following the Illumina TruSeq protocol, with 0.25 µM blocker oligo included at the final library amplification step.
MNase-FARP-ChIP-seq
500 mESCs were mixed with ~1×10⁸ DH5α E. coli and fixed with 1% formaldehyde, followed by quenching with 0.125 M glycine. After fixation and washing, the mixtures were resuspended directly in nuclear isolation buffer (#NUC101, Sigma). Chromatin was fragmented using 2 U/µl MNase (#M2047, NEB) at 25°C for 5 min according to the published protocol (Brind’Amour et al., 2015). ChIP, genomic DNA recovery, and sequencing library generation were performed following the FARP-ChIP-seq procedure.
Atlantis-FARP-ChIP-seq (aFARP-ChIP-seq)
mESCs or sorted immune cells were mixed with ~1×10⁸ E. coli (DH5α) and crosslinked with 1% formaldehyde for 8 min at room temperature with moderate shaking. After fixation, glycine was added to a final concentration of 0.125 M and the mixture was incubated for 5 min at room temperature to stop the crosslinking by quenching the free formaldehyde. After washing, the mixture was resuspended in nuclear isolation buffer (#NUC101, Sigma). The samples were then digested with 0.5 U/100 µl Atlantis dsDNase (#E2030, Zymo Research) for 30 min at 37°C (20 min at 37°C for 100 mESCs). The reactions were stopped with 0.5 M EDTA. ChIP, genomic DNA recovery, and sequencing library generation were performed following the standard FARP-ChIP-seq procedure.
ChIP sequencing and peak finding
Pooled libraries were sequenced on an Illumina NextSeq 500 to a depth of ~15-20 million aligned reads per sample. Libraries were prepared in triplicate or duplicate. Reads were mapped to the mouse genome (mm9) using the ‘bowtie’ program with the -v 2 parameter. Only tags that mapped uniquely to the genome were used for further analysis. ChIP-seq peaks were called using the MACS program (Zhang et al., 2008) with default parameters.
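The mapping and peak-calling steps can be sketched as command lines assembled in Python. This is only an illustrative sketch, not the exact invocation used in this study: the text specifies bowtie with -v 2, uniquely mapped tags, and MACS, while the use of bowtie's -m 1 option to enforce unique mapping, the `macs14` executable name, the `-g mm` genome-size shortcut, and all file names below are assumptions.

```python
def bowtie_command(index_prefix, fastq, sam_out, threads=1):
    """Build a bowtie (version 1) command allowing up to 2 mismatches (-v 2).

    Assumption: -m 1 is used here to keep only reads with a single
    reportable alignment; the text states only that uniquely mapped
    tags were retained, not how this was enforced.
    """
    return ["bowtie", "-v", "2", "-m", "1", "-S", "-p", str(threads),
            index_prefix, fastq, sam_out]


def macs_command(treatment, name, control=None, pvalue=None):
    """Build a MACS 1.4 command (assumed executable name: macs14).

    With pvalue=None the MACS default cutoff applies; an explicit
    cutoff can be passed for transcription factor peak calling.
    """
    cmd = ["macs14", "-t", treatment, "-g", "mm", "-n", name]
    if control is not None:
        cmd += ["-c", control]
    if pvalue is not None:
        cmd += ["-p", str(pvalue)]
    return cmd


# Hypothetical file names, for illustration only.
map_cmd = bowtie_command("mm9", "sample.fastq", "sample.sam", threads=4)
peak_cmd = macs_command("sample.sam", "sample_H3K4me3")
print(" ".join(map_cmd))
print(" ".join(peak_cmd))
```

The lists can be passed to `subprocess.run(cmd, check=True)` once the tools and a bowtie index for mm9 are installed.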
Promoter correlation, whole-genome correlation, and ROC analysis
Log2 values of H3K4me3 enrichment were plotted for all TSSs (2 kb upstream and downstream of each TSS) to generate the correlation plots. For the H3K4me3 ROC analysis, we used the top 25,000 2-kb windows as “True” to mimic the ~25,000 peaks in the benchmark dataset. Then, by adjusting the “threshold” to include progressively more of the top-ranked 2-kb windows in the test data, we calculated the true-positive rate and false-positive rate at each threshold to obtain the ROC curve.
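The ROC procedure described above can be illustrated with a minimal pure-Python sketch. It assumes that each 2-kb window has one enrichment score in the benchmark dataset and one in the test dataset, and that "True" windows are the top-scoring benchmark windows; the function names, the tie handling, and the toy scores are illustrative and are not taken from the original analysis code.

```python
def roc_from_windows(benchmark_scores, test_scores, n_true=25000):
    """Return (fpr, tpr) lists for window-based ROC analysis.

    The n_true highest-scoring benchmark windows are labelled "True".
    Test windows are then ranked by score, and the threshold is relaxed
    one window at a time to trace the curve.
    """
    n = len(benchmark_scores)
    if n != len(test_scores):
        raise ValueError("both datasets must cover the same windows")
    if not 0 < n_true < n:
        raise ValueError("n_true must be between 1 and n - 1")

    by_benchmark = sorted(range(n), key=lambda i: benchmark_scores[i], reverse=True)
    true_windows = set(by_benchmark[:n_true])
    by_test = sorted(range(n), key=lambda i: test_scores[i], reverse=True)

    n_false = n - n_true
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i in by_test:
        if i in true_windows:
            tp += 1
        else:
            fp += 1
        tpr.append(tp / n_true)
        fpr.append(fp / n_false)
    return fpr, tpr


def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return sum((fpr[k + 1] - fpr[k]) * (tpr[k + 1] + tpr[k]) / 2
               for k in range(len(fpr) - 1))


# Toy example with six windows and three "True" windows.
benchmark = [5.0, 4.0, 3.0, 2.0, 1.0, 0.0]
test = [4.5, 3.9, 0.2, 2.5, 0.8, 0.1]
fpr, tpr = roc_from_windows(benchmark, test, n_true=3)
print(area_under_curve(fpr, tpr))
```

A test dataset that ranks the windows exactly as the benchmark does gives an area of 1, while a random ranking gives an area near 0.5.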
Analyses of ChIP-seq of transcription factor binding in splenic B cells
PU.1 peaks in B cells from mouse spleen were called by MACS with a p-value threshold of 10⁻⁵. Motifs were identified by running Homer (Heinz et al., 2010) on the PU.1 peak regions.
Supplementary Figure legends
Supplementary Table legends
Table S1. Peaks called from H3K4me3 mapping of ILC1 and NK cells from indicated tissues. Related to Figure 3. Replicate datasets are pooled before peak calling.
Table S2. Peaks called from H3K27ac mapping of ILC1 and NK cells from indicated tissues. Related to Figure 4. Replicate datasets are pooled before peak calling.
Table S3. PU.1 peaks called from mapping of B cells from mouse spleen. Related to Figure 5.