Abstract
The spatial organisation of interphase chromosomes is known to affect genomic function, yet the principles behind such organisation remain elusive. Here, we first compare and then combine two well-known biophysical models, the transcription factor (TF) and loop extrusion (LE) models, and dissect their respective roles in organising the genome. Our results suggest that extrusion and transcription factors play complementary roles in folding the genome: the former are necessary to compact gene deserts or “inert chromatin” regions, the latter are sufficient to explain most of the structure found in transcriptionally active or repressed domains. Finally, we find that to reproduce interaction patterns found in HiC experiments we do not need to postulate an explicit motor activity of cohesin (or other extruding factors): a model where cohesin molecules behave as molecular slip-links sliding diffusively along chromatin works equally well.
Interphase chromosomal organisation is intimately linked to gene regulation and cellular integrity [1–3]. Distinct genomic architectures can be found in cells undergoing differentiation and ageing or in those affected by disease [4, 5]. Recent years have seen major developments in a number of techniques to investigate the 3-D conformation assumed by interphase [6–9] and mitotic chromosomes [10]. The most widely employed technique to date is “HiC” - a high-throughput, genome-wide version of “chromosome conformation capture” whose natural output is a map quantifying the probability of interaction between different genomic loci within a population of cells [69, 11]. These maps, constructed for different organisms and cell types [7], naturally lend themselves to comparison with those predicted by “bottom-up” computational models based on polymer physics principles [12–14].
Two main classes of biophysical models are currently popular in the field: the “transcription factor” (TF) model [12, 15–17] (also known as the “strings-and-binders” model [18]); and the “loop extrusion” (LE) [13, 19, 20] model. The former postulates that multivalent chromatin-binding proteins mediate chromatin-chromatin interactions, creating loops and driving 3-D folding. Examples of proteins that play key roles within this framework are transcription factors associated with active chromatin [21], as well as polycomb group [22] and HP1 [23] proteins. This model naturally explains the large-scale (micro)phase separation of the genome into active and inactive (also known as A and B) compartments [12] and the formation of nuclear bodies [24], both driven by a mechanism known as the “bridging-induced attraction” [16]. The second model posits that the SMC complex cohesin and the CCCTC-binding factor (CTCF) are the master organisers of the genome, suggesting that cohesin acts as a loop extruding factor [25] which actively creates expanding loops, but halts when it meets a bound CTCF. This model can account for the striking bias in favour of convergent CTCF loops [9] and it can also rationalise the “topologically-associated-domain” (TAD) patterns observed in HiC maps [13]. However, a motor activity has yet to be observed in experiments probing the motion of DNA-bound cohesin in vitro [26–28] and the convergent loop bias can also be explained by a model of diffusive loop extrusion (dLE) where cohesin slides diffusively along the chromatin rather than actively moving unidirectionally [29]. A third possibility is that the diffusive motion is enhanced by ATP consumption resulting in an active bidirectional motion.
The TF and (d)LE models each explain different aspects of genome organisation. While the TF model describes a functional level of genome organisation, intimately linked to the local transcriptional activity and chromatin state, the LE model describes a level of organisation independent of these. The reality may well be a combination of the two, in which case one would expect that disrupting either transcription factor or cohesin binding would give rise to distinct changes in chromosomal architecture. Indeed, very high resolution conformation studies of the globin loci using Capture C [17, 31] revealed completely different conformations in erythroid cells, where these genes are very active, and stem cells, where they are inactive - i.e. changes in protein binding sites result in changes in conformation. Likewise, cohesin or CTCF knock-outs result in the disruption of the observed loops and loop-domains [25, 32–34], but appear to leave the underlying chromatin states mostly unchanged [34].
In this computer simulation study, we first compare the TF and (d)LE models in terms of their ability to predict chromosome organisation. We focus our attention on a 30 Mbp section of human chromosome 7, which includes large gene deserts (regions of “inert chromatin”, where transcriptional activity is sparse and which is void of active or repressive histone modifications), as well as facultative and constitutive heterochromatin, and active regions. We find that neither the TF nor the LE model can, by itself, give a satisfactory account of the observed folding of the entire chromosome segment. The (d)LE model accurately predicts the domain pattern locally, but fails to capture larger-scale interactions. On the contrary, the TF model poorly predicts the fine detail of local interactions (especially within gene deserts/inert chromatin), but captures long-range contacts more faithfully.
A combination of the TF and (d)LE models reproduces many of the HiC features, suggesting that TFs and cohesin (or other LE factors) indeed have complementary roles in genome organisation. We show that LEs are required to create TADs within inert chromatin, while our simulations suggest that TFs are sufficient to organise active/inactive domains, where cohesin-mediated loops play a more minor role.
Intriguingly, however, a naïve superposition of the standard TF and (d)LE models still leaves some key qualitative discrepancies between simulated and HiC interaction maps; for example the simulations tend to show too high a signal for medium to long range interactions. Qualitative agreement improves when the TF model is enhanced by including a non-equilibrium “switching” mechanism [24]. This switching-TF (sTF) model encodes a dynamic level of control on TFs; it might represent post-translational modification of the proteins affecting their binding affinity to chromatin [35]. The combined model with switching TFs is also consistent with single-molecule microscopy experiments on the dynamics of active/inactive chromatin domains, chromatin loops and protein clusters [36–38], and it correctly predicts the main observations of recent knockout experiments [34].
METHODS
Following our previous work [12, 17, 24, 29, 39–41], we employ a simulation scheme based on polymer physics: the chromatin fibre is represented as a chain of “beads” connected by springs. Details are shown schematically in Figure 1. Beads are “coloured” according to the underlying chromatin state based on histone modifications (ChIP-seq data are obtained from the ENCODE project [30], see Suppl. Methods below). In this way our chromatin becomes a co-polymer whose segments interact with freely diffusing beads mimicking explicit bridge-forming protein complexes. In our TF model, we consider three species of bridge proteins, representing transcription factors associated with euchromatin, HP1, and polycomb repressive complexes (PRC) respectively. Thus, during the course of a simulation, these proteins can bind and form bridges between chromatin beads bearing the associated epigenetic marks (see Fig. 1). Loop extruding factors, which might represent the cohesin complex, or a pair of cohesin rings, are represented as additional transient springs between non-adjacent beads (see Fig. 1 and Suppl. Methods below for more details).
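The bead “colouring” step described above can be illustrated with a short sketch. This is not the pipeline used in this work: the function name `colour_beads`, the bead size, and in particular the mapping from histone marks to bridge-protein species (H3K27ac to euchromatin TFs, H3K9me3 to HP1, H3K27me3 to PRC) are assumptions made here for illustration; the actual rules are given in the Suppl. Methods.

```python
# Assumed, illustrative mapping from histone marks to the three bridge species.
MARK_TO_COLOUR = {"H3K27ac": "active", "H3K9me3": "HP1", "H3K27me3": "PRC"}

def colour_beads(n_beads, bead_size_bp, peaks):
    """Assign to each bead the set of bridge species it can bind.

    peaks: list of (start_bp, end_bp, mark) with half-open intervals.
    A bead with an empty set corresponds to inert chromatin.
    """
    colours = [set() for _ in range(n_beads)]
    for start, end, mark in peaks:
        colour = MARK_TO_COLOUR.get(mark)
        if colour is None:
            continue  # marks outside the assumed mapping are ignored
        first = max(start // bead_size_bp, 0)
        last = min((end - 1) // bead_size_bp, n_beads - 1)
        for bead in range(first, last + 1):
            colours[bead].add(colour)
    return colours

# Toy input: three hypothetical peaks on a five-bead fibre.
peaks = [(0, 1500, "H3K27ac"), (1200, 1300, "H3K9me3"), (3000, 4000, "H3K27me3")]
colours = colour_beads(n_beads=5, bead_size_bp=1000, peaks=peaks)
```

Using a set per bead lets a single bead carry more than one colour, while beads left with an empty set play the role of inert chromatin.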
RESULTS
In this paper we focus on the first 30 Mbp of human chromosome 7 in a human lymphoblastoid cell line GM12878, for which high-resolution HiC data are available [9]. We chose this region as it contains both active and repressed regions, as well as large regions devoid of most histone modifications, which we call gene deserts or inert chromatin. Inert chromatin is AT-rich and gene-poor, so that it bears some of the signatures of heterochromatin, though it is not characterised by an enrichment of either the H3K27me3 or H3K9me3 histone modifications. Below we describe the results we obtain by applying the TF and (d)LE models, either independently or in combination.
Neither the TF, nor the LE model alone can satisfactorily predict the observed HiC map
By applying the TF model (see Methods) we obtain the contact map shown in Figure 2(a). One way to measure how well the simulated map predicts the HiC data is to simply count the number of correctly predicted domain boundaries. From our previous work, we expect that the formation of domains which bear different epigenetic marks will be well captured by the TF model: they phase separate into distinct 3-D compartments, and clusters of like proteins form [16]. [Such clusters are visible in Figure 2(a) (see also Suppl. Movie 1) - they resemble liquid-like clusters formed by heterochromatin [23, 42] and transcription factories self-assembled within euchromatin [21].] Indeed the model does correctly capture a large fraction of boundaries in active and inactive regions (see, e.g., the 20–30 Mbp segment in Fig. 2(a)), as well as the pattern of longer-range interactions between segments bearing similar histone marks. These features are a natural consequence of the spatial segregation (or more precisely “microphase separation”, i.e. phase separation into domains with self-limiting size [24]) between active and inactive chromatin, which leads to A/B compartmentalisation (see Fig. S1 below). As well as boundaries between regions in different compartments, alternating binding and non-binding chromatin regions can also give rise to boundaries even between two adjacent active (or inactive) domains - which is why in more active regions, such as in chromosome 19, the TF model correctly predicts an even larger fraction of boundaries [12]. However, the TF model clearly fails to capture the folding of the inert chromatin regions (which is why the total fraction of correctly predicted boundaries is only ~ 36%).
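The boundary-counting score mentioned above can be sketched as follows. This is a minimal illustration rather than the analysis code used here: the function name `fraction_boundaries_matched`, the tolerance of two bins and the toy boundary positions are all assumptions, and the boundary-calling step itself is not shown.

```python
def fraction_boundaries_matched(hic_boundaries, sim_boundaries, tolerance=2):
    """Fraction of HiC boundaries that have a simulated boundary nearby.

    Boundaries are given as bin indices; `tolerance` (in bins) is an
    assumed matching window, not a value taken from the paper.
    """
    if not hic_boundaries or not sim_boundaries:
        return 0.0
    matched = 0
    for b in hic_boundaries:
        # Distance to the closest simulated boundary.
        nearest = min(abs(b - s) for s in sim_boundaries)
        if nearest <= tolerance:
            matched += 1
    return matched / len(hic_boundaries)

# Toy example: two of the four HiC boundaries are recovered within 2 bins.
hic = [10, 50, 90, 130]
sim = [11, 48, 100]
score = fraction_boundaries_matched(hic, sim)
```

The score is normalised by the number of HiC boundaries, so it measures recall of the experimental boundaries and does not penalise extra boundaries in the simulated map.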
Compared to the TF model, the LE model gives a better prediction of local TAD formation, especially within inert chromatin (where 82% of domain boundaries are correctly predicted), but it performs less well in capturing the higher-order organisation of active and inactive regions. Although a similar number of boundaries (83%) are correctly predicted in those regions, the contact maps obtained with the LE model distinctly lack the long-range interactions between domains, which are associated with compartmentalisation.
The LE model also clearly cannot capture enhancer-promoter interactions within domains unless there are CTCF sites in the vicinity of those regulatory elements. To highlight this, we considered a virtual 4C experiment by selecting HiC interactions for a locus corresponding to a promoter, and compared the interaction patterns predicted by the TF and LE models. The results clearly show that the LE model fails to capture the pattern qualitatively (see Fig. S2): the correlation between the virtual 4C interaction profiles for the simulated and HiC data is −0.003 for the LE model, compared to 0.27 for the TF model.
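A virtual 4C comparison of this kind can be sketched in a few lines. The helper names `virtual_4c` and `pearson`, and the toy 4×4 maps, are hypothetical; the text does not specify which correlation coefficient or normalisation was used, so a plain Pearson coefficient on raw counts is assumed here.

```python
import math

def virtual_4c(contact_map, viewpoint):
    """Interaction profile of one viewpoint bin, excluding the self-contact."""
    return [c for j, c in enumerate(contact_map[viewpoint]) if j != viewpoint]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

# Toy symmetric maps; bin 1 plays the role of the promoter viewpoint.
hic = [[0, 5, 2, 1], [5, 0, 4, 2], [2, 4, 0, 6], [1, 2, 6, 0]]
sim = [[0, 10, 3, 1], [10, 0, 8, 4], [3, 8, 0, 5], [1, 4, 5, 0]]

r = pearson(virtual_4c(hic, 1), virtual_4c(sim, 1))
```

In this toy case the simulated profile at the viewpoint is an exact rescaling of the HiC one, so the coefficient is 1 up to rounding; real profiles would give intermediate values such as those quoted above.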
A naïve combination of TF and LE models improves the qualitative agreement with HiC, yet some issues remain
Since each of the models captures different features of chromosome folding, one expects that a combination of the two should perform much better than either on its own. Indeed we find that the combined TF+LE model (see Methods for implementation details) yields an improvement, as now both inert and active/inactive regions are in fair qualitative agreement with HiC (Fig. 2(c)); however some discrepancies remain - these are discussed in more detail below.
An important result is that, within our simulations, extruders are not necessary for local folding within regions that display well-defined patterns of histone modification - their organisation is mainly driven by TF bridging (Fig. 2(c)). For instance, the simulated contact maps for the TF and TF+LE model in the 20 - 30 Mbp region, which is rich in active and inactive domains, are highly correlated (Pearson’s correlation r = 0.76). The main reason for this is that when bridges bind they tend to compact a whole stretch of chromatin, creating many more contacts compared to extruders, each of which only forms a single loop.
Notwithstanding the improved agreement with HiC, a visual inspection of the contact maps in Figure 2 reveals that there are some remaining qualitative discrepancies. Most notably, there are substantially more interdomain interactions far from the diagonal in the simulated contact maps, whereas these features are much weaker in the HiC map (see Fig. 2(c), bottom zoom, between 20 – 30 Mbp).
A model with switchable TFs shows qualitatively better agreement with HiC and fluorescence microscopy experiments
We now consider a variation of the TF model which gives improved qualitative agreement with HiC experiments [9]. In the model discussed above, active and inactive factors interact with chromatin beads thermodynamically - i.e., there is an attractive binding interaction between the TF and respective chromatin beads. A TF can bind chromatin when it diffuses into contact, and then unbinds due to the thermal motion in the system. The residence time depends on the interaction strength (see Suppl. Methods) and it is strongly modulated by emergent behaviour such as the bridging-induced attraction [16]. More specifically, once a multivalent factor reaches a configuration where it can form multiple chromatin interactions, it remains bound for a time which increases exponentially with the number of interactions. This is because unbinding requires climbing over a potential energy barrier whose height increases linearly with the number of interactions. Within our baseline model, typical residence times can encompass the total simulation time, thus the model fails to capture the rapid turn-over of TFs observed in vivo (typically of the order of minutes - see below and Refs. [37, 43]).
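The exponential dependence described above can be written compactly. As a sketch only, assume that each chromatin contact contributes the same binding energy \epsilon and write \tau_0 for a microscopic attempt time (both symbols are introduced here for illustration and are not defined in the text). A factor holding n contacts then faces a barrier that grows linearly with n, and an Arrhenius-type estimate of its residence time is:

```latex
\Delta E(n) \simeq n\,\epsilon ,
\qquad
\tau(n) \simeq \tau_0 \exp\!\left(\frac{n\,\epsilon}{k_B T}\right)
```

Under this assumption each additional contact multiplies the residence time by the same factor \exp(\epsilon / k_B T), which is why multivalent bridges can remain bound for times comparable to the whole simulation.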
Many TFs and other proteins which are relevant to our modelling are observed in stable foci which also exhibit rapid protein turnover. These two features are difficult to reconcile, but possible explanations are that there is ongoing post-translational modification which affects binding affinities (e.g. phosphorylation [35, 44]), that there is active protein degradation, or, in the case of PolII, that transcription-termination signals lead to unbinding [1]. A generic way to model these non-equilibrium processes is to consider TFs that switch between an “on” (binding) and an “off” (non-binding) state at rate kswitch (see Fig. 3(a)). We have recently shown [24] that this switching-TF (sTF) model gives rise to the formation of dynamic protein clusters, reminiscent of nuclear bodies [45], and can affect chromatin interaction patterns.
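The two-state switching described above can be illustrated with a minimal stochastic sketch. This is not the simulation code used here: the function names, the time step `dt`, the parameter values and the choice of the per-step switching probability 1 − exp(−k_switch·dt) are assumptions made for illustration, and the coupling between the “on” state and chromatin binding is left out.

```python
import math
import random

def switch_probability(k_switch, dt):
    """Probability that a factor changes state during one time step dt."""
    return 1.0 - math.exp(-k_switch * dt)

def simulate_switching(k_switch, dt, n_steps, seed=0):
    """Trajectory of one TF toggling between on (True) and off (False)."""
    rng = random.Random(seed)
    p = switch_probability(k_switch, dt)
    state = True  # start in the binding ("on") state
    states = []
    for _ in range(n_steps):
        if rng.random() < p:
            state = not state  # same rate for on->off and off->on
        states.append(state)
    return states

# Illustrative parameters in arbitrary simulation units.
states = simulate_switching(k_switch=0.01, dt=1.0, n_steps=200_000)
on_fraction = sum(states) / len(states)
```

Because the same rate is used in both directions, a factor spends about half of its time in each state and the mean time between switches is 1/k_switch; this caps how long any factor can stay bound, whatever the number of contacts it holds.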
Figure 3(b) shows the qualitative effect of TF switching on the contact maps for different values of the switching rate kswitch. The most striking difference between the combined TF+LE models with and without switching is that switching markedly attenuates long-range inter-domain, but not intra-domain, interactions: active domains which are far apart along the genome are less likely to interact. This reduces the intensity of the off-diagonal features in our predicted contact maps, rendering them qualitatively more similar to the HiC. This is also shown by the decrease in the ratio of non-local to local contacts with an increasing kswitch (see Fig. 3(c)): the TF+LE model with switching is needed to predict the correct balance between long- and short-range interactions.
The sTF+LE model also conforms much better with observations from live-cell fluorescence microscopy experiments which probe dynamical information inaccessible to HiC. In the absence of switching, the TF dynamics in our model is slow and glassy, whereas it is much more rapid with switching (Fig. 4, and Suppl. Movies 1, 2). Whilst high-throughput experiments showing the dynamics of chromatin interactions over time are not yet possible, the more dynamical picture emergent from the switching model is consistent with fluorescence recovery after photo-bleaching (FRAP, see above and [24]) and single-molecule imaging experiments, which suggest that TF binding is short-lived and lasts for not more than minutes [37, 43]. The differences are clear if we examine the trajectories of individual TFs (Fig. 4(a)): without switching, a TF diffuses until it joins a cluster (of like proteins and binding sites), where it tends to stay for the remainder of the simulations; with switching a TF joins a cluster for a short time, then undergoes a period of free diffusion, before joining another cluster. This hopping between clusters manifests as a long tail in the distribution of mean squared displacement for a given time interval (Fig. 4(b)). Supplementary Movies 1 and 2 also show that the macroscopic dynamics of liquid-like domains formed by non-switching and switching proteins are profoundly different: in the latter case, we observe many more events corresponding to clusters splitting and reforming, and also the clusters are smaller.
A quantitative analysis confirms that the sTF+LE model gives the best agreement with HiC
We now turn to a more quantitative analysis of the agreement between HiC and the set of models considered. For this we consider a series of parameters.
First, we compare the model performance in domain boundary generation. For the sTF+LE model, we find that across the whole simulated region, ~ 84% of boundaries are correctly predicted (see Fig. 5(c)). This performance is similar to that of the LE and TF+LE models (~ 83% and ~ 82% respectively), but substantially better than the TF model (~ 36%). Previous studies with only TFs [12], or only extruders [13] found similarly high values, however neither focused on chromosome regions containing both inert and active/inactive regions, as we have done here.
Second, we consider a measure of the long range active-active, inactive-inactive, and active-inactive interactions to assess how well each model captures compartmentalisation and formation of promoter-enhancer hubs [21]. To do this we label each chromatin bead as active or inactive according to whether it binds to active or inactive TFs (which is in turn based on histone modification data); we label beads which can bind to both as “mixed”, and which bind neither as “inert”. Figure 5(a) shows the fraction of active, inactive and mixed beads which each chromatin bead interacts with, for each of the models. In the HiC data there is an enrichment of active-active and inactive-inactive contacts, which is associated with compartmentalisation. This is captured by models with TFs (where there is a high correlation with the HiC - see Fig. 5(b)), but not by the LE model (which shows essentially no correlation). A Spearman correlation test shows that the combined model with switching performs better than the combined model without switching (although it is not significantly better than the TF only model). A similar conclusion is reached by inspection of virtual 4C interaction profiles for promoters/enhancers (see example in Fig. S2).
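The contact-type measure described above can be sketched as follows. The function name `contact_fractions` and the toy map are hypothetical, and the sketch pools contacts by bead type rather than reporting a value per bead, which is a simplification of the analysis behind Figure 5(a).

```python
from collections import defaultdict

def contact_fractions(contact_map, labels):
    """For each bead type, the fraction of its contacts made with each type.

    labels: one entry per bead, e.g. "active", "inactive", "mixed", "inert".
    """
    totals = defaultdict(lambda: defaultdict(float))
    n = len(labels)
    for i in range(n):
        for j in range(n):
            if i != j:  # ignore self-contacts on the diagonal
                totals[labels[i]][labels[j]] += contact_map[i][j]
    fractions = {}
    for a, row in totals.items():
        s = sum(row.values())
        fractions[a] = {b: (v / s if s > 0 else 0.0) for b, v in row.items()}
    return fractions

# Toy map with two active and two inactive beads.
labels = ["active", "active", "inactive", "inactive"]
cmap = [[0, 3, 1, 0],
        [3, 0, 1, 0],
        [1, 1, 0, 4],
        [0, 0, 4, 0]]
fractions = contact_fractions(cmap, labels)
```

In the toy map 75% of the contacts made by active beads are with other active beads, the kind of like-with-like enrichment associated with compartmentalisation.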
Third, we assess the relative balance between local and non-local contacts in the various models and in experiments (Fig. 5(d)). This further supports the idea that the TF and LE model separately cannot fully account for HiC data, and also shows that to capture the right decay of the non-local to local fraction a model with switching is required (see also Fig. 3(c)).
An active extrusion mechanism is not necessary to create domains and CTCF loops
Having noted that CTCF and extrusion appear to be fundamental to the creation of domain boundaries, especially in inert chromatin, we now ask whether active extrusion (where LEs move unidirectionally due to some motor effect as in Refs. [13, 20]) is necessarily required, or whether diffusive extruders (dLE) behave similarly. This is currently a relevant question as single molecule experiments on cohesin loaded onto DNA or reconstituted chromatin [26–28] have not yet found evidence of a direct motor activity. [Indirect motor activity, e.g. by a transcribing polymerase, is a distinct and plausible alternative, and has been suggested on the basis of simulations [46]; however it is difficult to find direct evidence for this in vivo.]
Simulating large chromosome regions with a dLE model in 3-D requires using either infeasibly long simulation times, or using substantially coarser resolution. Therefore we first studied a 1-D model of dLE (see Suppl. Methods). CTCF sites were positioned as in the 3-D simulations with active LEs, and, like before, we assume that the diffusing LEs interact strongly and directionally with CTCFs (see Methods for more details). Contact maps can be computed within this 1-D model by assuming a HiC interaction between the position of each pair of monomers in a diffusing loop extruder (cohesin dimer). The resulting interaction maps are plotted in Figure 6(a), together with maps from active LE simulations, computed in the same “1-D fashion” (see Suppl. Methods). Results show that dLE is essentially indistinguishable from active LE, both visually and quantitatively: 83% of the HiC domain boundaries were correctly predicted by the 1-D dLE model.
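A 1-D diffusive extruder of the kind described above can be sketched as a pair of random walkers on a lattice. This is an illustration under stated assumptions, not the model of the Suppl. Methods: it follows a single extruder with no loading or unloading, treats the CTCF interaction as infinitely strong (a head that reaches a suitably oriented site stays there), and uses hypothetical names and parameters throughout.

```python
import random

def simulate_dle(n_sites, ctcf_forward, ctcf_reverse, n_steps, seed=0):
    """Diffuse the two heads of one slip-link and record their contacts.

    ctcf_forward: sites that hold the left head (assumed rule).
    ctcf_reverse: sites that hold the right head (assumed rule).
    Returns a dict mapping (left, right) to the number of steps spent there.
    """
    rng = random.Random(seed)
    left = n_sites // 2
    right = left + 1
    contacts = {}
    for _ in range(n_steps):
        move_left_head = rng.random() < 0.5
        step = rng.choice((-1, 1))  # unbiased: no motor activity
        if move_left_head:
            new = left + step
            # Heads cannot leave the lattice or cross each other.
            if left not in ctcf_forward and 0 <= new < right:
                left = new
        else:
            new = right + step
            if right not in ctcf_reverse and left < new < n_sites:
                right = new
        contacts[(left, right)] = contacts.get((left, right), 0) + 1
    return contacts

# Toy lattice with one convergent CTCF pair at sites 5 and 15.
contacts = simulate_dle(n_sites=20, ctcf_forward={5}, ctcf_reverse={15},
                        n_steps=5000)
```

Each entry of `contacts` plays the role of a HiC interaction between the two positions held by the extruder, which is how a contact map is built “in a 1-D fashion”; an active extruder would differ only in replacing the unbiased step with a directed one.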
We also performed 3-D simulations of dLE in the region between 10 and 20 Mbp, with 25 kbp resolution per chromatin bead - this lower resolution allows sufficiently long simulations for the dLE maps to reach steady state. Results confirm that dLE does reproduce most of the boundaries and peaks shown in HiC (see Fig. 6(b); for this case, simulations also include a non-specific attraction between all beads to qualitatively account for the effect of macro-molecular crowding [20]).
The combined sTF-LE model correctly predicts the effects of various protein knock-outs
We next use our combined sTF+LE model to simulate the effect of cohesin removal and targeted CTCF degradation, which were both recently explored experimentally. We find that simulations qualitatively reproduce the experimental observations.
Cohesin removal in the simulations leads to loss of folding in inert chromatin regions, leaving little structure in the contact maps (Fig. 7(a)i). This mirrors observations from experiments that knocked out NIPBL, which is required for cohesin loading in mammalian cells [25, 32]. On the other hand, domains organised by active and inactive switching factors are only subtly affected in our model. This is qualitatively consistent with the results of Ref. [25], which found some residual structure in active/inactive compartments (but not inert ones) following cohesin removal in mouse liver cells (see Suppl. Fig. 5 in Ref. [25]). More specific to our work, we show in Figure S3 HiC data for an active region in a similar chromosome region as considered here: it can be seen that some peaks and the overall contact pattern remain in the NIPBL knock-out. Like in the experiments, the simulated interaction map reveals stronger compartmentalisation upon cohesin removal, with a decrease in the number of interactions between domains with different epigenetic marks, and an enhancement of the interactions between like domains (see Fig. S4(a)).
To better assess the qualitative agreement with experiments, we extracted from the interaction maps the ratio of non-local to local interactions as a function of the genomic separation threshold for “locality”. Figure 7(a)ii shows the plots comparing KO and WT cases, for the simulated and HiC maps, obtained from mouse liver cell experiments [25]. There are two distinct regimes: for thresholds below the TAD range (~ 700–800 kbp) there is a loss in non-local interactions upon cohesin removal, and above the TAD range there is a loss in local interactions. Our model captures these features up to a threshold ~ 1100 kbp. Above that the simulation predictions deviate from the experimental observations - our WT model yields a higher ratio of non-local to local interactions. This is due to the choice of low concentrations in our model (which avoids non-physical confinement effects), which allows the polymer to change its conformation faster, meaning that loop extrusion will in fact favour a more compact structure and therefore more non-local interactions (see simulation snapshots in Fig. 2).
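The ratio of non-local to local interactions as a function of the locality threshold can be computed as in the following sketch. The function name, the use of bin units for the threshold and the toy map are assumptions; any normalisation applied to the maps before this step is not shown.

```python
def nonlocal_to_local_ratio(contact_map, threshold_bins):
    """Ratio of contacts beyond the threshold to contacts within it.

    Only the upper triangle is used, so each pair is counted once and
    the diagonal (self-contacts) is excluded.
    """
    local = 0.0
    non_local = 0.0
    n = len(contact_map)
    for i in range(n):
        for j in range(i + 1, n):
            if j - i <= threshold_bins:
                local += contact_map[i][j]
            else:
                non_local += contact_map[i][j]
    return non_local / local if local > 0 else float("inf")

# Toy symmetric map whose contacts decay with genomic separation.
cmap = [[0, 4, 2, 1],
        [4, 0, 4, 2],
        [2, 4, 0, 4],
        [1, 2, 4, 0]]
ratios = [nonlocal_to_local_ratio(cmap, t) for t in (1, 2)]
```

Scanning the threshold, as done here for two values, gives the curve that is compared between knock-out and wild-type maps; multiplying the threshold in bins by the bin size converts it to a genomic separation.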
Experiments also reported the formation of superenhancer hubs following cohesin removal [32]. Superenhancers are genomic regions containing a high linear density of enhancer elements and high levels of the associated H3K27ac histone modification. Interactions between superenhancers - including interchromosomal interactions - were found to increase after cohesin removal, and examination of HiC ligation events revealed a higher instance of triplets of these loci appearing together [47] (i.e. three of these loci were in close proximity at the same time). To assess whether extruders qualitatively affect the network of active chromatin contacts in our simulations, we show in Figure 7(a)iii circos diagrams for enhancer/promoter chromatin beads only, for the KO and WT simulations. The chromatin beads are ordered according to their genomic position along the outer circumference in the clock-wise direction. Upon cohesin KO there is an 18% increase in the number of non-local interactions (genomic separation > 2 Mbp). This is further supported by analysing clusters of TFs binding active euchromatin, formed through the bridging-induced attraction. These indeed involve more non-local interactions between binding sites after LE removal: the mean genomic separation of chromatin beads associated with such clusters rises by over 10% from 837 kbp to 948 kbp. It is therefore tempting to associate these active protein clusters with the superenhancer hubs found experimentally. Our simulations also show that cohesin loss results in a minor decrease in the sizes of TF clusters (however the change is not statistically significant according to a Kolmogorov-Smirnov test).
CTCF removal leads to a loss of “hot-spots” in the contact map, which in wild-type nuclei correspond to convergent CTCF loops (Fig. 7(b)i). Domains and boundaries become much less well-defined within the inert chromatin region, but are relatively unaffected elsewhere (see Venn diagrams for identified boundaries in Fig. S4(b)). The spatial distribution of cohesin on the chromosomes is also strongly affected (see Fig. 7(b)ii). These findings are in agreement with experiments knocking out CTCF [33, 34] in mouse embryonic stem cells. In the wild-type simulations, cohesin localises mostly at CTCF sites, consistent with ChIP-seq data [48]. In the CTCF knock-out simulations, cohesin is distributed uniformly across the chromosome segment. In experiments, cohesin instead accumulates at transcription start-sites upon CTCF loss. One possible reason for this discrepancy is that cohesin might have preferred loading sites on the chromatin (some preferential binding of NIPBL, required for loading, has been observed at transcription start sites [48]). We have previously shown that including preferred loading sites in simulations would, in the absence of CTCF, lead to an enrichment of cohesin at those sites [29].
We also compared the ratio of non-local to local interactions as a function of the genomic separation threshold for the KO and WT cases (see Fig. 7(b)iii), for the simulated and HiC maps, obtained from mouse embryonic stem cell experiments [34]. CTCF removal has a minor effect on these ratios, slightly favouring more non-local contacts. Our simulations qualitatively agree with the experimental observations for all analysed contact separation threshold values.
DISCUSSION AND CONCLUSIONS
In this work we have studied chromosome folding by using a combination of two popular and successful models for mammalian genome organisation: the transcription factor [12, 15, 16, 18] and loop extrusion [13] models. The TF model is motivated by the abundance of multivalent architectural chromatin-binding proteins or complexes (e.g., HP1, PRC1, TF/PolII complexes etc.), which are known to form loops within the genome, and organise it into active and inactive regions. The TF model naturally explains the observations of transcription factories [21] and nuclear bodies [24] as multivalent TFs generically cluster through the bridging-induced attraction [16]. The LE model is motivated by the evidence that cohesin mediates chromatin looping between convergent CTCF sites in the genomes of mammals [9].
Our simulation results suggest that TFs and cohesin play complementary roles in genome organisation. On the one hand, cohesin is necessary to organise and compact regions of inert chromatin (gene deserts) where depletion of most histone marks is consistent with minimal TF binding. Accordingly, cohesin is required to account for many of the TAD boundaries in the region of human chromosome 7 we focussed on here (which contains a large gene desert). On the other hand, activating and repressive TF factors are sufficient to organise active and repressed regions respectively, as knocking out extrusion leaves largely similar contact patterns (Fig. 7).
Importantly, we find that an active mechanism for extrusion is not the only model which can generate TADs within inert chromatin: a similar number of HiC boundaries are correctly predicted by a diffusive LE model where cohesin slides along chromatin with no preferred direction (Fig. 6). This conclusion is robust, and applies to different genomic regions, for instance we analysed the folding of the segment between 20.3 and 22.6 Mbp in chromosome 4, which was considered in Ref. [20]: results (see Fig. S5) confirm that diffusive and active LE give very similar contact patterns. That the diffusive loop extrusion model works well for TAD formation is of interest since to date there is no direct evidence of unidirectional motion of cohesin on chromatin [26–28].
We found the best concordance between simulations and the available experimental evidence for a model which includes a biochemical “switching” reaction for TFs. This on⟷off switching drives the system away from thermodynamic equilibrium, and allows TFs both to bind strongly, and yet be able to dissociate frequently. The switching model gives a better prediction of long-range contacts, which would otherwise decay too slowly. More importantly, switching is necessary to reconcile simulations with fluorescence microscopy experiments which measure fast dynamics for both transcription factors [37, 43, 49, 50] and other protein clusters [24].
Our combined sTF+LE model reproduces qualitatively the effect of recent knock-out experiments. Cohesin degradation leads to unfolding and the disappearance of domain boundaries in inert chromatin regions, but results in smaller changes within active/inactive chromatin [25]. CTCF knock-out also mainly affects inert chromatin regions, and homogenises the distribution of cohesin along the chromatin fibre [33, 34].
We note that a limitation of the current work is that its methodology relies on previous knowledge of the TFs responsible for folding. A recent approach [51] has introduced a possible way to circumvent this problem, by using polymer physics and machine learning to infer the optimal, minimal number and type of TFs required to reproduce the HiC matrix within a given accuracy. Unlike the current work, though, this approach requires the HiC data as an input.
We also highlight here a recent simulation work [52], where the loop extrusion model was combined with a block copolymer model [53], which postulates a weak direct attractive interaction between all inactive regions (B compartments). Whilst this related work also found that both components of the model are required to get good agreement with HiC data, it was suggested there that extrusion may compete against compartmentalisation - e.g., if a convergent CTCF loop spans domains belonging to different compartments. This interference mechanism is appealing because it is consistent with the observation that cohesin or CTCF removal leads to an enhancement of non-local A/B compartmentalisation [25]. In the present work we did not find evidence of significant competition between chromatin-state and cohesin-mediated folding at a local level. For example, there is little difference between the TF and TF+LE models in the 20-30 Mbp region - the LEs do not interfere with the ability of TFs to organize active/repressed regions. Similarly there is no significant change in the LE loop length distribution between the LE and TF+LE models in either inert or active/repressed regions - the TFs do not interfere with extrusion. Our simulations are still fully consistent with experimental results, and the TFs and LEs do have an effect on each other with respect to longer-ranged interactions. For instance, experiments showed that cohesin loss leads to the formation of hubs of superenhancers involving very long-range contacts [32], which is associated with the increase in compartmentalisation. This result sits well within our model, as we find that protein-mediated interactions between active chromatin beads associated with promoters or enhancers become longer-range in the LE knockout.
In summary, our results suggest that these transcription factors and cohesin complexes provide two complementary mechanisms for chromosome organization, and that they are more or less important in different regions of the genome. The question of how this “division of labour” is functionally relevant remains open: we speculate that cohesin-mediated folding of inert chromatin may be useful to facilitate the transition to mitosis, where (condensin-associated) loops are likely much more abundant. We also note that our work focuses on a single cell type during interphase, where histone modification patterns are already established. We do not consider, instead, how particular patterns of chromatin state are set up during differentiation, or re-established on exit from mitosis [40, 41]. It remains possible that LEs and TFs may have a more complex relationship in such situations, when the underlying epigenetic landscape is dynamic, and we hope to address this issue in the future.
Author contributions
MCFP, MN, DMa designed the research project. MCFP, CAB, DMi, CA, SB, AMC developed the in-house codes. CAB retrieved and treated the experimental data. MCFP, CA, SB, AMC performed simulations. MCFP, CAB, DMi carried out data analysis. MCFP, CAB, DMi, CA, MN and DMa wrote the manuscript.
Acknowledgements
This work was supported by ERC (CoG 648050, THREEDCELLPHYSICS). MCFP acknowledges studentship funding from EPSRC under grant no. EP/L015110/1. This work was also supported by grants from the NIH ID 1U54DK107977-01, by CINECA ISCRA Grant HP10CYFPS5 and HP10CRTY8P, computer resources at INFN and Scope at the University of Naples (M.N.), and by the Einstein BIH Fellowship Award to M.N.
REFERENCES
- [1].
- [2].
- [3].
- [4].
- [5].
- [6].
- [7].
- [8].
- [9].
- [10].
- [11].
- [12].
- [13].
- [14].
- [15].
- [16].
- [17].
- [18].
- [19].
- [20].
- [21].
- [22].
- [23].
- [24].
- [25].
- [26].
- [27].
- [28].
- [29].
- [30].
- [31].
- [32].
- [33].
- [34].
- [35].
- [36].
- [37].
- [38].
- [39].
- [40].
- [41].
- [42].
- [43].
- [44].
- [45].
- [46].
- [47].
- [48].
- [49].
- [50].
- [51].
- [52].
- [53].
- [54].
- [55].
- [56].
- [57].
- [58].
- [59].
- [60].
- [61].
- [62].