RT Journal Article
SR Electronic
T1 Crunch: Completely Automated Analysis of ChIP-seq Data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 042903
DO 10.1101/042903
A1 Severin Berger
A1 Saeed Omidi
A1 Mikhail Pachkov
A1 Phil Arnold
A1 Nicholas Kelley
A1 Silvia Salatino
A1 Erik van Nimwegen
YR 2016
UL http://biorxiv.org/content/early/2016/03/09/042903.abstract
AB Today experimental groups routinely apply ChIP-seq technology to quantitatively characterize the genome-wide binding patterns of any molecule associated with the DNA. Here we present Crunch, a completely automated procedure for ChIP-seq data analysis, starting from raw read quality control, through read mapping, peak detection and annotation, and including comprehensive DNA sequence motif analysis. Among Crunch's novel features are a Bayesian mixture model that automatically fits a noise model and infers significantly enriched genomic regions in parallel, as well as a Gaussian mixture model for decomposing enriched regions into individual binding peaks. Moreover, Crunch uses a combination of de novo motif finding with binding site prediction for a large collection of known regulatory motifs to model the observed ChIP-seq signal in terms of novel and known regulatory motifs, extensively characterizing the contribution of each motif to explaining the ChIP-seq signal, and annotating which combinations of motifs occur in each binding peak.
To make Crunch easily available to all researchers, including those without bioinformatics expertise, Crunch has been implemented as a web server (crunch.unibas.ch) that only requires users to upload their raw sequencing data and provides all results within an interactive graphical web interface. To demonstrate Crunch's power, we apply it to a collection of 128 ChIP-seq datasets from the ENCODE project, showing that Crunch's de novo motifs often outperform existing motifs in explaining the ChIP-seq signal, and that Crunch successfully identifies binding partners of the proteins that were immuno-precipitated.