Abstract
Crosslinking and immunoprecipitation (CLIP) is used to determine the transcriptome-wide binding sites of RNA-binding proteins (RBPs). Here we present RCRUNCH, an end-to-end solution to CLIP data analysis that enables the reproducible identification of binding sites as well as the inference of RBP sequence specificity. RCRUNCH can analyze not only reads that map uniquely to the genome, but also those that map to multiple genome locations or across splice boundaries. Furthermore, RCRUNCH can consider various types of background in the estimation of read enrichment. By applying RCRUNCH to the eCLIP data from the ENCODE project, we have constructed a comprehensive and homogeneous resource of in vivo-bound RBP sequence motifs. RCRUNCH automates the reproducible analysis of CLIP data, enabling studies of post-transcriptional control of gene expression. RCRUNCH is available at: https://github.com/zavolanlab/RCRUNCH.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Abbreviations
- AAE: Alu antisense element
- AS: alignment score
- cDNA: complementary DNA
- ChIP: chromatin immunoprecipitation
- CLIP: crosslinking and immunoprecipitation
- DNA: deoxyribonucleic acid
- FAIR principles: findable, accessible, interoperable, reusable
- FDR: false discovery rate
- mRNA: messenger RNA
- ncRNA: non-coding RNA
- PCR: polymerase chain reaction
- PWM: positional weight matrix
- RBDs: RNA-binding domains
- RBPs: RNA-binding proteins
- RIC: RNA-interactome capture
- RNA: ribonucleic acid
- RNA-seq: RNA sequencing
- RNPs: ribonucleoprotein complexes
- UMI: unique molecular identifier
- rRNA: ribosomal RNA
- tRNA: transfer RNA
- snRNA: small nuclear RNA
- SMI: size-matched input