ABSTRACT
The challenge of linking intergenic mutations to target genes has limited molecular understanding of human diseases. Here, we show that H3K27ac HiChIP generates high-resolution contact maps of active enhancers and target genes in rare primary human T cell subtypes and coronary artery smooth muscle cells. Differentiation of naïve T cells to T helper 17 cells or regulatory T cells creates subtype-specific enhancer-promoter interactions, specifically at regions of shared DNA accessibility. These data provide a principled means of assigning molecular functions to autoimmune and cardiovascular disease risk variants, linking hundreds of noncoding variants to putative gene targets. Target genes identified with HiChIP are further supported by CRISPR interference and activation at linked enhancers, by the presence of expression quantitative trait loci, and by allele-specific enhancer loops in patient-derived primary cells. The majority of disease-associated enhancers contact genes beyond the nearest gene in the linear genome, leading to a four-fold increase of potential target genes for autoimmune and cardiovascular diseases.
Gene expression programs are intimately linked to the hierarchical organization of the genome. In mammalian cells, each chromosome is organized into hundreds of megabase-sized topologically associated domains (TADs), which are conserved from early stem cells to differentiated cell types1. Within this invariant TAD scaffold, cell type-specific enhancer-promoter (E-P) interactions establish regulatory gene expression programs2. Standard methods require tens of millions of cells to obtain high-resolution interaction maps and confidently assign E-P contacts3-5. Thus, the principles that govern E-P conformation in disease-relevant patient samples are incompletely understood. This gap in understanding is particularly problematic for interpreting the molecular functions of inherited risk factors for common human diseases, which reside in intergenic enhancers or other non-coding DNA features in up to 90% of cases6-9. Such disease-relevant enhancers may not influence the expression of the nearest gene (often reported as the default target in the literature), and instead act in a cell-type specific manner on distant target genes residing up to hundreds of kilobases (kb) away2,10-14. Recently, systematic perturbations of regulatory elements in select gene loci have shown that effects of individual regulatory elements on gene activity can be predicted from the combination of (i) enhancer activity [marked by histone H3 lysine 27 acetylation (H3K27ac) level] and (ii) enhancer-target looping5,15. Here we leverage this insight to capture the combination of these two types of information genome-wide in a single assay, mapping the enhancer connectome in disease-relevant primary human cells.
RESULTS
H3K27ac HiChIP identifies functional enhancer interactions
We recently developed HiChIP, a method for sensitive and efficient analysis of protein-centric chromosome conformation16. Cohesin HiChIP in GM12878 cells identified similar numbers of loops as in situ Hi-C (~10,000) with high correlation (R = 0.83), demonstrating that HiChIP captures loops with high sensitivity and specificity. Here, we evaluated the enhancer and promoter-associated mark H3K27ac17-19 as a candidate factor to selectively interrogate E-P interactions genome-wide. We performed H3K27ac HiChIP in mouse embryonic stem (mES) cells to compare to cohesin HiChIP (Supplementary Fig. 1a, Supplementary Table 1)16. 3,552 of 4,191 H3K27ac HiChIP loops in mES cells were also identified by cohesin HiChIP. The H3K27ac-biased loops (log2 fold-change > 1) spanned shorter distances than cohesin-biased loops, and were enriched for H3K27ac ChIP-seq peaks (78.9%; Supplementary Fig. 1b-f, Supplementary Table 2). Moreover, systematic titration of input material showed H3K27ac HiChIP retained high signal fidelity and reproducibility from 25 million to 50,000 cells as input material (loop signal correlation = 0.918; Supplementary Figs. 2 and 3). Therefore, H3K27ac HiChIP identifies high-confidence chromatin loops focused around enhancer interactions from limited cell numbers.
To capture (i) conformational change during T cell differentiation and (ii) cell type-specific chromatin contacts of autoimmune risk variants in protective and pathogenic T cell types, we performed H3K27ac HiChIP on primary human Naïve T cells (CD4+CD45RA+CD25-CD127hi), regulatory T cells (Treg; CD4+CD25+CD127low) and T helper 17 cells (TH17; CD4+CD45RA-CD25-CD127hiCCR6+CXCR5-) directly isolated from donors (Fig. 1a,b and Supplementary Fig. 4a)20,21. TH17 cells were sorted to include autoimmune disease-relevant pathogenic TH17 cells and to exclude follicular helper T cells with a distinct surface phenotype and immune function (Supplementary Fig. 4a)22-24. Peripheral blood CD4+ T cells were obtained from three healthy subjects, sorted by FACS, and subjected to H3K27ac HiChIP. HiChIP libraries from each subset were high quality; greater than 40% of the reads represented unique paired-end tags (PETs) (Supplementary Fig. 4b-d and Supplementary Table 1). Furthermore, libraries exhibited high 1D signal enrichment at enhancers and promoters, and globally recapitulated publicly available H3K27ac ChIP-seq datasets (74.7% overlap of ChIP-seq and 1D HiChIP peaks; Fig. 1c)25. Inspection of the interaction matrix at progressively higher resolution revealed chromatin compartments, TADs, and focal loops, as previously reported in high-resolution Hi-C and HiChIP analyses from cell lines (Fig. 1b)4,16. Importantly, H3K27ac HiChIP maps were capable of identifying focal interactions at 1 kb resolution, which is comparable to in situ Hi-C maps generated from 100-fold more cells and sequenced to 13-fold greater depth4 (Fig. 1b).
Previous saturation perturbation screens demonstrated that functional enhancers can be identified by integrating H3K27ac ChIP-seq signal with chromosome conformation contact strength (Hi-C)5. Since H3K27ac HiChIP combines these two components into one assay, we reasoned that HiChIP signal, which we term Enhancer Interaction Signal (EIS), should identify functional regulatory elements. To validate this prediction, we first generated H3K27ac HiChIP maps in a chronic myelogenous leukemia cell line (K562) as a direct comparison to published high-resolution CRISPR interference (CRISPRi) screens5. We then examined the 3D enhancer landscape of the MYC and GATA1 loci using virtual 4C (v4C) analysis, where a specific genomic position is set as an anchor viewpoint, and all interactions occurring with that anchor are visualized in 2D16. v4C analysis of the MYC promoter demonstrated that EIS in K562 cells captured all functional enhancers identified in the CRISPRi screen (Fig. 2a). Analysis of the GATA1 locus demonstrated a similar agreement between both methods (Fig. 2b). Quantitatively, EIS in K562 cells was significantly correlated with CRISPRi score in the same cell type, whereas EIS in GM12878 (GM; B cell lymphoblast) cells was not correlated with K562 CRISPRi (Spearman’s rho = 0.332 and 0.145; p-value = 9.25 x 10-5 and 0.1246; Fig. 2c).
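The virtual 4C idea described above can be illustrated with a minimal sketch. It assumes that contacts are already binned at a fixed resolution on one chromosome and stored as (bin_i, bin_j, count) triples; the function name virtual_4c and the toy counts are hypothetical and are not taken from the HiChIP analysis pipeline, which may normalize and smooth the signal differently.

```python
def virtual_4c(contacts, anchor_bin, n_bins):
    """Return a 1D profile of contact counts between anchor_bin and every bin.

    contacts: iterable of (bin_i, bin_j, count) triples for one chromosome.
    The order of the two bins within a triple does not matter.
    """
    profile = [0.0] * n_bins
    for i, j, count in contacts:
        if i == anchor_bin:
            profile[j] += count      # partner bin is j (covers i == j too)
        elif j == anchor_bin:
            profile[i] += count      # partner bin is i
    return profile


# Toy example (made-up counts): bin 3 plays the role of the promoter viewpoint.
toy_contacts = [(0, 3, 5), (3, 7, 12), (2, 3, 4), (1, 2, 9), (3, 3, 20)]
toy_profile = virtual_4c(toy_contacts, anchor_bin=3, n_bins=8)
print(toy_profile)
```

In this toy profile, bins 0, 2 and 7 carry signal to the viewpoint, while the contact between bins 1 and 2 is ignored because it does not involve the anchor.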
We found the enhancer landscapes of the MYC promoter to be highly cell-type specific. v4C analysis of the MYC promoter in GM and My-La (CD4+ T cell leukemia) cells showed dramatically different regulatory interactions with the promoter compared to K562 cells (Fig. 2d). To validate EIS specificity, we performed CRISPRi experiments in GM cells using sgRNAs targeting enhancers identified in either GM or My-La HiChIP maps as well as a positive control sgRNA targeting the MYC promoter and a negative control sgRNA targeting lambda phage sequence (Fig. 2e). As expected, we found that CRISPRi of GM, but not My-La, enhancers impacted MYC expression and cell growth in GM cells (Fig. 2e).
Finally, we focused on the CD69 locus, where a high-resolution CRISPRa screen identified three enhancers upstream of the transcription start site26. These sites were also identified by Naïve T cell H3K27ac HiChIP. Moreover, HiChIP identified four additional distal enhancers that were outside the region spanned by the sgRNA tiling array (Fig. 2f and Supplementary Fig. 5). To functionally validate these novel enhancers, we performed CRISPRa experiments in Jurkat cells with sgRNAs targeting these enhancers, the CD69 promoter, the KLRF2 promoter as a locus negative control, and a non-human genome-targeting negative control. We observed a significant increase in CD69 RNA and protein levels in the four HiChIP enhancers compared to negative controls (Fig. 2g and Supplementary Fig. 5). Interestingly, two of the four identified novel enhancers were within promoter regions of distant genes. These findings are in line with previous reports that identified widespread distal gene regulatory functions of promoters genome-wide27,28. Altogether, these results suggest that H3K27ac HiChIP EIS identifies functional regulatory elements, and that enhancers that regulate a gene of interest can differ significantly between cell-types.
Landscape of enhancer interactions in primary T cells
We examined global features of the enhancer connectome associated with cellular differentiation from Naïve T cells to either TH17 cells or Treg cells. We identified a total of 10,706 high confidence loops in the union set of the three cell types (Supplementary Table 2). Analysis of loop read support between biological replicates demonstrated high reproducibility (Supplementary Fig. 4c), and ~91% of loop anchors were associated with either a promoter or enhancer29, as expected, with a median distance of 130 kb (Supplementary Fig. 6a,b). Importantly, high-resolution E-P connectivity maps revealed several features that could not be discerned from 1D epigenomic data (i.e. H3K27ac ChIP-seq or ATAC-seq; Fig. 3a). These features included: (i) ‘enhancer skipping’: enhancers that have stronger EIS with a more distal target promoter, (ii) higher order structures such as ‘enhancer cliques’ (related to loop cliques30): multiple regulatory elements that have strong EIS with a single target promoter, (iii) promoter to promoter interactions13,31, and (iv) ‘enhancer switching’: enhancers that exhibit differential EIS with a target promoter in a cell type-specific manner (Fig. 3a).
We found that EIS contacts were very cell type-specific. After quantile-quantile normalization of contact reads at high-confidence loops (correcting for false positives caused by 1D fragment visibility; Methods), we focused on the top and bottom 5% of EIS ranked by cell-type bias for each pair-wise comparison (Supplementary Figs. 6c-g and 7, Supplementary Tables 3-4). Cell type-specific enhancer loop anchors revealed genes encoding canonical T cell subtype TFs and effector molecules (Fig. 3b, Supplementary Figs. 8 and 9). Deeper v4C analysis of shared and cell type-specific loci pinpointed regulatory elements interacting with each gene promoter of interest as well as local conformational landscape changes (Supplementary Figs. 8 and 9). TF motifs located within cell type-specific loop anchors were enriched for TFs known to drive T cell subtype differentiation and nominated novel TFs involved in regulation (Fig. 3c). Furthermore, cell type EIS bias was associated with differential expression of genes located within corresponding EIS anchors for the same cell type (Spearman’s rho = 0.242 and 0.207; p-value = 4 x 10-15 and 2 x 10-11; Fig. 3d).
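The ranking step described above can be sketched as follows. This is a minimal illustration, assuming two equal-length vectors of loop read counts (one per cell type), a standard two-sample quantile normalization, and a log2 ratio with a pseudocount of 1 as the bias score; the function names, the pseudocount, and the tie handling are assumptions and may differ from the procedure in the Methods.

```python
import math


def quantile_normalize(a, b):
    """Map two equal-length count vectors onto a shared distribution.

    Each value is replaced by the mean of the two sorted vectors at its rank.
    """
    n = len(a)
    order_a = sorted(range(n), key=lambda k: a[k])
    order_b = sorted(range(n), key=lambda k: b[k])
    reference = [(a[i] + b[j]) / 2 for i, j in zip(order_a, order_b)]
    norm_a, norm_b = [0.0] * n, [0.0] * n
    for rank, k in enumerate(order_a):
        norm_a[k] = reference[rank]
    for rank, k in enumerate(order_b):
        norm_b[k] = reference[rank]
    return norm_a, norm_b


def biased_loops(a, b, fraction=0.05, pseudocount=1.0):
    """Return (indices biased to a, indices biased to b, log2 fold changes)."""
    norm_a, norm_b = quantile_normalize(a, b)
    lfc = [math.log2((x + pseudocount) / (y + pseudocount))
           for x, y in zip(norm_a, norm_b)]
    ranked = sorted(range(len(lfc)), key=lambda k: lfc[k])
    n_top = max(1, int(len(lfc) * fraction))
    return ranked[-n_top:], ranked[:n_top], lfc


# Toy example: 20 loops with opposite count trends in two cell types.
counts_a = list(range(20))
counts_b = list(range(19, -1, -1))
a_biased, b_biased, lfc = biased_loops(counts_a, counts_b)
print(a_biased, b_biased)
```

With 20 loops and a 5% cutoff, one loop is reported for each direction of bias; real comparisons would apply the same ranking to thousands of high-confidence loops.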
Cell type-specific EIS may be driven by cell type-specific enhancer activation (based on H3K27ac ChIP-seq) or stable enhancer activation with cell type-specific looping (Hi-C) in a gene specific manner. We first examined H3K27ac ChIP-seq at differential EIS anchors and found that many biased H3K27ac HiChIP interactions also exhibited biased ChIP-seq signal, as expected. 58.5% of Naïve-biased loops contain at least one Naïve-biased ChIP-seq peak (log2 fold change > 1) located on the anchors. Similarly, 66.7% of TH17-biased and 67.8% of Treg-biased interaction anchors were cell type-specific in 1D (Supplementary Fig. 10a). Therefore, while on average ~64% of the differential EIS corresponded to change in 1D data, ~36% were likely also driven by change in 3D chromatin loop strength. To further assess the contribution of cell type-specific 3D signal to EIS, we examined HiChIP 1D signal at differential EIS anchors. We found that HiChIP 1D signal correlated better with ChIP-seq signal than EIS, with a higher likelihood of differential ChIP-seq signal overlapping differential HiChIP 1D signal compared to 3D, suggesting EIS bias is in part driven by 3D changes (Supplementary Fig. 10b).
We asked whether the integration of reference cell line Hi-C data with primary T cell H3K27ac ChIP-seq could recapitulate HiChIP EIS in primary T cells. We binned GM Hi-C loops with increasing primary T cell ChIP-seq signal at loop anchors and then determined the overlap of loops in each bin with loops derived from H3K27ac HiChIP. As expected, increased ChIP-seq signal at the Hi-C anchors led to increased overlap with the HiChIP loops. However, the overlap was lower in all T cell subtypes compared to the same analysis performed using GM HiChIP data. These observations demonstrate that cell-type specific 3D interactions can impact EIS independent of differences in 1D ChIP-seq signal (Supplementary Fig. 10c). Similarly, previously generated enhancer-promoter maps obtained from bulk T cells did not identify T cell subtype-specific interactions obtained using H3K27ac HiChIP. To assess the unique information obtained through cell type-specific interaction maps, we compared promoter Capture Hi-C maps in bulk CD4+ T cells to H3K27ac HiChIP maps in Naïve, TH17, and Treg cells14. Strikingly, the most cell type-specific loops in TH17 and Treg (16-fold enriched) demonstrated a low discovery rate in promoter Capture Hi-C T cells (11.83% in 415 loops and 13.83% in 373 loops, respectively; Supplementary Fig. 10d). Many of these subset-specific interactions included genomic loci encoding functionally important effector genes, such as LRRC32. The LRRC32 locus contains Treg-specific loops that are neither visualized in HiChIP maps from Naïve or TH17 cells nor in bulk CD4+ promoter Capture Hi-C maps (Supplementary Fig. 10e). Since primary human TH17 and Treg cells are present in human blood at low frequency, it would also be challenging to generate subset-specific promoter Capture Hi-C maps with published promoter Capture Hi-C protocols.
In summary, EIS is derived from a combination of 1D ChIP-seq and 3D interaction signal and cannot be accurately predicted from 3D maps in reference cell lines or unsorted primary cell datasets.
Cell type-specific EIS can occur at sites of shared chromatin accessibility. Paired chromatin accessibility profiles by Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq)32 from each T cell subset revealed that most cell type-specific loop anchors had equivalent chromatin accessibility across all three cell types (Fig. 3e-g). To illustrate this finding, we examined the BACH2 promoter, which exhibits shared chromatin accessibility at enhancers, but increased EIS in Naïve cells (Fig. 3e). Globally, only 14.2%, 27.8%, and 16.5% of Naïve-, TH17-, and Treg-biased loops, respectively, contained at least one biased ATAC-seq peak (log2 fold change > 1) located on the anchors. Furthermore, the majority of cell type-specific TF motifs were observed in shared ATAC-seq peaks within differential interactions, highlighting that these regions are functioning in T cell differentiation (Fig. 3f-g). Altogether, these results suggest that in highly related, yet functionally distinct, cell types, a portion of transcriptional control is achieved through differential chromosome looping, rather than differential chromatin accessibility. This finding is consistent with previous studies which demonstrated that T cell subset-specific TFs, such as Foxp3, act predominantly at pre-accessible chromatin sites to establish subset-specific gene expression33.
Enhancer interactions link disease variants to target genes
The high specificity of EIS enabled us to identify putative target genes of autoimmune disease risk loci in functionally relevant T cell subsets. To achieve this, we used a previously described list of putatively causal variants associated with 21 autoimmune diseases, known as PICS SNPs, which were fine-mapped based on dense genotyping data25. We determined that PICS autoimmune SNPs were significantly enriched in T cell loop anchors, with specific autoimmune diseases showing greater than 5-fold enrichment compared to a shuffled control loop set (Supplementary Fig. 11). Next, we constructed a set of all possible connections between autoimmune risk SNPs and TSS within 1 Mb and measured the EIS for each SNP-TSS pair (Fig. 4a). We aggregated these signals to determine the overall interaction activity in each T cell subtype in each disease (Fig. 4b). We observed high interaction strength enrichments and cell type specificity in autoimmune disease SNPs, but low enrichment and cell specificity in non-immune traits (Fig. 4b). To further visualize HiChIP bias in shared or differential enhancers, we analyzed SNP-TSS interactions grouped by their presence near H3K27ac ChIP-seq peaks (Supplementary Fig. 12a,b). We observed a large number of active SNP-TSS pairs that were present in regulatory regions that were shared between T effector cell types (Treg and TH17), while relatively less EIS signal was observed in SNPs located in cell-type specific enhancers, supporting the concept that many autoimmune disease variants impact common T cell effector/activation pathways25,34. Notably, SNPs present in enhancers shared across all three cell types could still be distinguished by HiChIP bias (Supplementary Fig. 12a,b). For example, although we could not detect cell type bias at risk loci for Alopecia Areata using H3K27ac ChIP-seq (Supplementary Fig. 12a,b and ref. 3), H3K27ac HiChIP identified increased SNP-TSS activity in Treg cells among shared T cell enhancers, consistent with several studies identifying the crucial role of this cell type in disease pathogenesis35. Importantly, autoimmune signal enrichments were not readily apparent from 1D H3K27ac ChIP-seq peaks, aggregated ChIP-seq signal within the TAD containing the SNP, nor cell line H3K27ac HiChIP datasets (Fig. 4b and Supplementary Fig. 12c). Therefore, examining 3D disease variant interactions may capture cell type biases more robustly than 1D epigenomic data. Finally, to validate our findings with an orthogonal dataset, we performed SNP-TSS EIS analysis on an overlapping set of autoimmune disease-associated SNPs obtained from the NHLBI GRASP catalog and observed similar enrichments of specific T cell subsets (Supplementary Fig. 12d).
We leveraged HiChIP to identify potential gene targets of intergenic SNPs, which have classically been paired to the nearest neighboring gene. We overlapped the SNP-TSS pairs with loops to call a discrete set of target pairs. We then performed differential analysis on the SNP-TSS loops to ascertain bias for specific T cell subsets (Fig. 4c and Supplementary Table 5). Examples of biased SNP-TSS pairs included FOXO1 in Naïve T cells (rs9603754), BATF (rs2300604) in Memory T cells, CTLA4 (rs10186048) in Treg cells, and IL2 (rs7664452) in TH17 cells (Fig. 4c and Supplementary Table 5). Next, we sought to characterize the connectivity landscape of the SNP-TSS loops. We identified an average of 1.75 gene targets per autoimmune SNP (ranging from 0 to over 10 target genes), while non-immune traits did not demonstrate an increase in targets (0.33 genes per SNP; Supplementary Fig. 12e). For 684 autoimmune intergenic SNPs, we identified a total of 2,597 HiChIP target genes, representing a four-fold increase in target genes for known disease SNPs (Fig. 4d). Only 367 (~14%) of all targets were the nearest gene to the SNP, while approximately 86% of SNPs skipped at least one gene to reach a predicted target TSS (Supplementary Fig. 12e). Furthermore, approximately 45% of SNP to HiChIP target interactions had increased signal compared to the same SNP to nearest gene, despite distance biases.
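The pairing logic in this section (enumerate SNP-TSS pairs within 1 Mb, keep those supported by a loop, then ask whether the target is the nearest gene) can be sketched as below. This is a simplified illustration under stated assumptions: SNPs and TSSs are single coordinates, loops are pairs of half-open anchor intervals, and all identifiers (rs_example, GENE_NEAR, GENE_FAR) and coordinates are hypothetical rather than taken from Supplementary Table 5.

```python
def snp_tss_pairs(snps, tss, max_dist=1_000_000):
    """All (snp_id, gene) pairs on the same chromosome within max_dist bp."""
    pairs = []
    for snp_id, (s_chrom, s_pos) in snps.items():
        for gene, (t_chrom, t_pos) in tss.items():
            if s_chrom == t_chrom and abs(s_pos - t_pos) <= max_dist:
                pairs.append((snp_id, gene))
    return pairs


def in_anchor(chrom, pos, anchor):
    """True if the position falls inside a (chrom, start, end) anchor."""
    a_chrom, start, end = anchor
    return chrom == a_chrom and start <= pos < end


def looped_targets(snps, tss, loops, max_dist=1_000_000):
    """Keep SNP-TSS pairs whose two ends sit in opposite anchors of a loop."""
    targets = {}
    for snp_id, gene in snp_tss_pairs(snps, tss, max_dist):
        s, t = snps[snp_id], tss[gene]
        for left, right in loops:
            if (in_anchor(*s, left) and in_anchor(*t, right)) or \
               (in_anchor(*s, right) and in_anchor(*t, left)):
                targets.setdefault(snp_id, set()).add(gene)
                break
    return targets


def nearest_gene(snp, tss):
    """Gene whose TSS is closest to the SNP in linear distance."""
    chrom, pos = snp
    dist = {g: abs(pos - p) for g, (c, p) in tss.items() if c == chrom}
    return min(dist, key=dist.get) if dist else None


# Hypothetical toy locus: the loop connects the SNP to the more distant gene.
snps = {"rs_example": ("chr1", 500_000)}
tss = {"GENE_NEAR": ("chr1", 510_000), "GENE_FAR": ("chr1", 800_000)}
loops = [(("chr1", 495_000, 505_000), ("chr1", 795_000, 805_000))]

targets = looped_targets(snps, tss, loops)
nearest = nearest_gene(snps["rs_example"], tss)
print(targets, nearest, nearest in targets["rs_example"])
```

In the toy locus the loop-supported target is GENE_FAR while the nearest gene is GENE_NEAR, which is the "skipping" situation counted in the text. The reported fractions follow from the same bookkeeping: 367 of 2,597 targets is about 14%.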
Target gene validation by eQTL and CRISPRi
HiChIP enhancer-target gene interactions can be validated using previously identified point mutations that alter expression at distantly located genes in T cells, i.e., expression quantitative trait loci (eQTL)36. For example, the celiac disease-associated SNP rs2058660 impacts the expression of the inflammatory cytokine receptor genes IL18RAP, IL18R1, IL1RL1, and IL1RL2, which are known regulators of intestinal T cell differentiation and response37. HiChIP EIS revealed contacts between rs2058660 and each of these predicted gene promoters (Supplementary Fig. 13a). Similarly, the Crohn’s disease risk variant rs6890268 and the multiple sclerosis (MS) risk variant rs12946510 impact the expression of PTGER4 and IKZF3, respectively, and H3K27ac HiChIP also demonstrated clear contacts between these SNPs and their predicted promoters (Supplementary Fig. 13a). Globally, HiChIP contact signal was increased in eQTLs in T cells compared to a distance-matched background loop set (p-value < 2.2 x 10-16; Fig. 4e) or to eQTLs identified in an unrelated cell type (liver; p-value < 2.2 x 10-16). The overlap of HiChIP and eQTL loci provides support for chromosome interactions as a physical basis for distal eQTLs10-12 and further validates the HiChIP approach to assign enhancer-target gene relationships.
We next sought to directly validate HiChIP SNP-gene targets using CRISPRi in My-La cells. First, we focused on three loci of interest in primary T cells and then confirmed that the SNP-TSS loops were also present in My-La cells (Fig. 4f and Supplementary Fig. 13b). We then targeted sgRNAs to these SNP-containing enhancers, as well as positive control sgRNAs to the HiChIP target gene promoters and a negative control sgRNA targeting lambda phage sequence. As expected, we observed a significant reduction of RNA levels in the HiChIP target genes upon CRISPRi of its SNP-containing enhancer (Fig. 4f).
Fine-mapping of disease-associated DNA variants
Since SNP-TSS HiChIP signal is capable of identifying target genes of candidate SNPs, we asked whether TSS-SNP HiChIP signals could also be used to nominate functional causal variants within haplotype blocks in a reciprocal manner. We first performed a proof-of-principle analysis using fine-mapped SNPs associated with inflammatory bowel disease (IBD)38 or Type 1 Diabetes (T1D)39 as well as high confidence PICS SNPs and examined EIS from putatively causal SNPs to all gene promoters within 300 kb. EIS from putatively causal SNPs to gene promoters was significantly higher than EIS from a distance-matched set of SNPs within the same LD block to gene promoters (p-value = 2.4 x 10-15, 8.7 x 10-8, 3.9 x 10-3 for IBD fine-mapped SNPs, T1D fine-mapped SNPs, and high confidence PICS, respectively; Fig. 5a and Supplementary Fig. 14a). Next, we assessed the fine-mapping ability of HiChIP EIS at individual loci of interest. We focused on IBD- and MS-associated SNPs neighboring the PTGER4 and SATB1 loci and performed v4C analysis anchored at the gene promoters. We calculated EIS signal at 1 kb resolution and identified specific regions within the linkage disequilibrium (LD) blocks that contained the highest EIS to the target promoters, positioning the likely causal SNPs within these regions (Fig. 5b and Supplementary Fig. 14b). For example, at the PTGER4 locus (Fig. 5b), the ~160 kb genomic interval spanned by LD SNPs in association with Crohn’s disease is refined to two bins of 3 kb and 4 kb, which both contain PICS SNPs.
We asked whether complex disease-associated loci containing more than one gene could be fine-mapped using HiChIP. We focused on two disease-associated enhancers in between the STAT1 and STAT4 gene promoters (Fig. 5c). These two genes encode transcription factors with distinct roles in immune regulation. Signal transducer and activator of transcription 1 (STAT1) is critical for type I IFN and IFNγ signaling, whereas STAT4 induces TH1 differentiation and IFNγ expression40. We investigated bias of these enhancers to STAT1 and STAT4 and found that, despite comparable linear distance and 1D signal at the promoters, the enhancers were significantly biased to interact with STAT4. Next, we fine-mapped the disease associated SNPs within this locus using 1 kb resolution EIS from the STAT4 promoter, and narrowed down candidate functional variants within the two enhancers (Fig. 5c). In summary, HiChIP EIS can nominate functional causal variants within haplotype blocks, and two-way analysis of target gene identification from an enhancer of interest and high-resolution interaction maps of that enhancer with its target gene can be used to fine-map disease-associated loci containing several candidate genes.
Allelic target gene bias of cardiovascular disease variants
Finally, we asked whether this approach could be applied broadly to other categories of human disease, and whether we could directly test SNP-TSS associations using allele-specific HiChIP. We generated high-resolution E-P maps from primary human coronary artery smooth muscle cells (HCASMC), which can be used to inform variants linked to cardiovascular diseases41. First, to validate cell type specificity, we examined the TCF21 gene promoter, a transcription factor required for the differentiation of HCASMC42 and observed enrichment in HCASMC EIS relative to Naïve T cells (Fig. 6a). We next examined the 9p21.3 locus, which harbors risk associations with several cardiovascular disorders43-45. We found that the promoters of all three genes in the locus interact with one another and with CAD variant-containing enhancers located approximately 100 kb upstream of the CDKN2B promoter (Supplementary Fig. 15). We then generated SNP-TSS target lists using CAD SNPs identified in the CARDIoGRAMplusC4D study46. We again performed differential analysis on the SNP-TSS loops to ascertain bias for HCASMC versus Naïve T cells (Fig. 6b). Overall, 75.1% of biased HCASMC SNP-TSS pairs were CAD SNPs, while only 5.5% of Naïve T cell biased SNP-TSS pairs were CAD SNP-TSS loops. Next, we examined the connectivity of the HCASMC SNP-TSS contacts and identified 1,062 gene targets, of which only 120 (~11%) mapped to the nearest gene. Furthermore, approximately 89% skipped at least one gene to reach a predicted target TSS, and 64% of SNPs were mapped to more than a single gene target.
We took advantage of genome phasing information in HCASMC to measure E-P interactions at allele-specific CAD SNPs, allowing us to examine the functional consequence of a risk variant compared to its alternative allele in the same nucleus. First, 4.2% of high confidence loops in HCASMC with no observed mapping bias in the anchors exhibited significant allelic bias (FDR < 0.05, Fig. 6c), consistent with frequency of allelic imbalance of RNA expression and prior evidence of allele-specific regulation of specific E-P interactions47,48. We leveraged this global E-P allelic bias to examine the effect of a risk variant compared to its control alternative allele for a set of CAD-associated SNP-target gene pairs (Fig. 6d)49. We found that many risk alleles disrupt enhancer-target gene interactions, but a subset of pathogenic SNPs increased enhancer-target gene interaction. At CAD risk variant rs1537373 in the 9p21.3 locus, the risk allele (T) showed increased EIS to the CDKN2A promoter as well as an additional enhancer within the lncRNA ANRIL relative to the reference allele (G) (Fig. 6e). We further observed increased EIS of the CAD risk variant rs4562997 to an additional SMAD3 enhancer 10 kb downstream of the TSS (Fig. 6e). The ability to resolve enhancer connectomes of the risk and reference alleles in the same nucleus demonstrates that the mutated base in the risk allele suffices to alter enhancer looping in cis in disease-relevant primary cells.
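One way to picture the allelic-bias call is as a per-loop test of whether phased reads split evenly between the two alleles, followed by multiple-testing correction. The sketch below assumes an exact two-sided binomial test against a 50:50 expectation and Benjamini-Hochberg adjustment; the text states only the FDR threshold, so both choices, the function names, and the read counts are illustrative assumptions rather than the procedure actually used.

```python
from math import comb


def binom_two_sided(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than k.
    """
    probs = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    cutoff = probs[k] * (1 + 1e-9)   # small tolerance for float ties
    return min(1.0, sum(q for q in probs if q <= cutoff))


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):     # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up (risk allele reads, reference allele reads) for three loops.
allele_counts = [(30, 2), (10, 10), (8, 12)]
pvals = [binom_two_sided(risk, risk + ref) for risk, ref in allele_counts]
fdr = benjamini_hochberg(pvals)
significant = [q < 0.05 for q in fdr]
print(significant)
```

Only the first toy loop, with 30 of 32 reads on one allele, passes the FDR < 0.05 threshold. A real analysis would also need to exclude anchors with mapping bias, as noted in the text.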
DISCUSSION
Here, we developed an approach to define the high-resolution landscape of E-P regulation in primary human cells. We find that E-P contacts are highly dynamic in related cell types and often involve genomic elements with shared accessibility. Accordingly, many complex features of the 3D enhancer connectome cannot simply be predicted from 1D, which demonstrates that mapping conformation in primary cells can identify novel regulatory connections underlying gene function in human disease. We take advantage of this principle to chart the connectivity of autoimmune and cardiovascular GWAS SNPs and link SNPs to hundreds of potential target genes. Although non-genic SNPs have previously been paired with their closest neighboring gene, we find that the majority of these variants can engage in long-distance interactions, including skipping several promoters to predicted target genes, connecting to multiple genes, or acting in concert with enhancer cliques to contact a single gene. Further use of this approach will help to clarify hidden mechanisms of human disease that are driven by genetic perturbations in non protein-coding DNA elements, which can now be linked to their cognate gene targets in primary cells.
AUTHOR CONTRIBUTIONS
M.R.M., A.T.S., W.J.G., and H.Y.C. conceived the project. M.R.M., A.T.S., J.T., and R.L. performed all genomics assays with help from T.N., M.R.C., N.S., and R.A.F. A.T.S. performed all sorting for experiments. B.G.G., S.W.C., M.R.M., M.L.N., K.R.K., and D.R.S. performed all CRISPR validation experiments. E.A.B., C.D., M.R.M., and J.X. analyzed HiChIP data. J.G., A.T.S., and Y.W. analyzed ATAC-seq data. A.J.R. and P.G.G. analyzed GWAS SNPs in HiChIP data. A.K., P.A.K., A.M., J.E.C., T.Q., W.J.G., and H.Y.C. guided experiments and data analysis. M.R.M., A.T.S., E.A.B., C.D., W.J.G., and H.Y.C. wrote the manuscript with input from all authors.
DATA AVAILABILITY STATEMENT
Raw and processed data available at NCBI Gene Expression Omnibus, accession number GSE101498.
T cell ATAC and HiChIP datasets can be visualized in the WashU Epigenome Browser with the following link:
http://epigenomegateway.wustl.edu/browser/?genome=hg19&session=YAIzYBfrl9&statusld=1698051079
COMPETING FINANCIAL INTEREST
The authors declare no competing financial interests.
ACKNOWLEDGEMENTS
We thank members of the Chang and Greenleaf laboratories for helpful discussions and Justin Tumey for artwork. This work was supported by National Institutes of Health (NIH) P50HG007735 (H.Y.C. and W.J.G.) and U19AI057266 (W.J.G.), Human Frontier Science Program (W.J.G.), Rita Allen Foundation (W.J.G.), and the Scleroderma Research Foundation (H.Y.C.). M.R.M. and E.A.B. acknowledge support from the National Science Foundation Graduate Research Fellowship. A.T.S. is a Cancer Research Institute Irvington Fellow supported by the Cancer Research Institute. B.G.G. was supported by an Ō-AstraZeneca Postdoctoral Fellowship. M.R.C. is supported by a grant from The Leukemia & Lymphoma Society Career Development Program. J.E.C. was supported by the Li Ka Shing Foundation and the Heritage Medical Research Institute. W.J.G. and A.M. are Chan Zuckerberg Biohub investigators. A.M. serves as an advisor to Juno Therapeutics and PACT Therapeutics and the Marson lab has received sponsored research support from Juno Therapeutics and Epinomics. A.M. and J.E.C. are founders of Spotlight Therapeutics. Sequencing was performed by the Stanford Functional Genomics Facility (NIH S10OD018220). We thank Agilent Technologies for generating oligo pools for cloning of the CRISPRa gRNAs. We thank the UC Berkeley High Throughput Screening Facility and Flow Cytometry Facility. H.Y.C. and W.J.G. are founders of Epinomics and members of its scientific advisory board.
Footnotes
13 These authors jointly directed this work.