TY  - JOUR
T1  - Rapid and efficient analysis of 20,000 RNA-seq samples with Toil
JF  - bioRxiv
DO  - 10.1101/062497
SP  - 062497
AU  - John Vivian
AU  - Arjun Rao
AU  - Frank Austin Nothaft
AU  - Christopher Ketchum
AU  - Joel Armstrong
AU  - Adam Novak
AU  - Jacob Pfeil
AU  - Jake Narkizian
AU  - Alden D. Deran
AU  - Audrey Musselman-Brown
AU  - Hannes Schmidt
AU  - Peter Amstutz
AU  - Brian Craft
AU  - Mary Goldman
AU  - Kate Rosenbloom
AU  - Melissa Cline
AU  - Brian O’Connor
AU  - Megan Hanna
AU  - Chet Birger
AU  - W. James Kent
AU  - David A. Patterson
AU  - Anthony D. Joseph
AU  - Jingchun Zhu
AU  - Sasha Zaranek
AU  - Gad Getz
AU  - David Haussler
AU  - Benedict Paten
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/07/07/062497.abstract
N2  - Toil is portable, open-source workflow software that supports contemporary workflow definition languages and can be used to securely and reproducibly run scientific workflows efficiently at large-scale. To demonstrate Toil, we processed over 20,000 RNA-seq samples to create a consistent meta-analysis of five datasets free of computational batch effects that we make freely available. Nearly all the samples were analysed in under four days using a commercial cloud cluster of 32,000 preemptable cores.
ER  - 