PT - JOURNAL ARTICLE
AU - Sofie Demeyer
AU - Tom Michoel
TI - Graph-based data integration predicts long-range regulatory interactions across the human genome
AID - 10.1101/004622
DP - 2014 Jan 01
TA - bioRxiv
PG - 004622
4099 - http://biorxiv.org/content/early/2014/04/29/004622.short
4100 - http://biorxiv.org/content/early/2014/04/29/004622.full
AB - Transcriptional regulation of gene expression is one of the main processes that affect cell diversification from a single set of genes. Regulatory proteins often interact with DNA regions located distally from the transcription start sites (TSS) of the genes. We developed a computational method that combines open chromatin and gene expression information for a large number of cell types to identify these distal regulatory elements. Our method builds correlation graphs for publicly available DNase-seq and exon array datasets with matching samples and uses graph-based methods to filter findings supported by multiple datasets and remove indirect interactions. The resulting set of interactions was validated with both anecdotal information of known long-range interactions and unbiased experimental data deduced from Hi-C and CAGE experiments. Our results provide a novel set of high-confidence candidate open chromatin regions involved in gene regulation, often located several Mb away from the TSS of their target gene.