TY  - JOUR
T1  - Rail-RNA: Scalable analysis of RNA-seq splicing and coverage
JF  - bioRxiv
DO  - 10.1101/019067
SP  - 019067
AU  - Abhinav Nellore
AU  - Leonardo Collado-Torres
AU  - Andrew E. Jaffe
AU  - José Alquicira-Hernández
AU  - Jacob Pritt
AU  - James Morton
AU  - Jeffrey T. Leek
AU  - Ben Langmead
Y1  - 2015/01/01
UR  - http://biorxiv.org/content/early/2015/08/11/019067.abstract
N2  - RNA sequencing (RNA-seq) experiments now span hundreds to thousands of samples. Current spliced alignment software is designed to analyze each sample separately. Consequently, no information is gained from analyzing multiple samples together, and it is difficult to reproduce the exact analysis without access to original computing resources. We describe Rail-RNA, a cloud-enabled spliced aligner that analyzes many samples at once. Rail-RNA eliminates redundant work across samples, making it more efficient as samples are added. For many samples, Rail-RNA is more accurate than annotation-assisted aligners. We use Rail-RNA to align 667 RNA-seq samples from the GEUVADIS project on Amazon Web Services in under 16 hours for US$0.91 per sample. Rail-RNA produces alignments and base-resolution bigWig coverage files, ready for use with downstream packages for reproducible statistical analysis. We identify expressed regions in the GEUVADIS samples and show that both annotated and unannotated (novel) expressed regions exhibit consistent patterns of variation across populations and with respect to known confounders. Rail-RNA is open-source software available at http://rail.bio.
ER  - 