Abstract
The CCCTC-binding zinc finger protein (CTCF)-mediated network of long-range chromatin interactions is important for genome organization and function. Although this network has been considered largely invariant, we found that it exhibits extensive cell-type-specific interactions that contribute to cell identity. Here we present Lollipop, a machine-learning framework that predicts CTCF-mediated long-range interactions using genomic and epigenomic features. Using ChIA-PET data as a benchmark, we demonstrate that Lollipop accurately predicts CTCF-mediated chromatin interactions both within and across cell types, and outperforms methods that rely only on CTCF motif orientation. Predictions were confirmed computationally and experimentally by chromosome conformation capture (3C). Moreover, our approach reveals novel determinants of CTCF-mediated chromatin wiring, such as gene expression within the loops. Our study contributes to a better understanding of the principles underlying CTCF-mediated chromatin interactions and their impact on gene expression.