RT Journal Article
SR Electronic
T1 PREDICTD: PaRallel Epigenomics Data Imputation with Cloud-based Tensor Decomposition
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 123927
DO 10.1101/123927
A1 Timothy J. Durham
A1 Maxwell W. Libbrecht
A1 J. Jeffry Howbert
A1 Jeff Bilmes
A1 William Stafford Noble
YR 2017
UL http://biorxiv.org/content/early/2017/04/04/123927.abstract
AB The Encyclopedia of DNA Elements (ENCODE) and the Roadmap Epigenomics Project have produced thousands of data sets mapping the epigenome in hundreds of cell types. However, the number of cell types remains too great to comprehensively map given current time and financial constraints. We present a method, PaRallel Epigenomics Data Imputation with Cloud-based Tensor Decomposition (PREDICTD), to address this issue by computationally imputing missing experiments in collections of epigenomics experiments. PREDICTD leverages an intuitive and natural model called “tensor decomposition” to impute many experiments simultaneously. Compared with the current state-of-the-art method, ChromImpute, PREDICTD produces lower overall mean squared error, and combining methods yields further improvement. We show that PREDICTD data can be used to investigate enhancer biology at non-coding human accelerated regions. PREDICTD provides reference imputed data sets and open-source software for investigating new cell types, and demonstrates the utility of tensor decomposition and cloud computing, two technologies increasingly applicable in bioinformatics.