TY  - JOUR
T1  - Graph Theory Approaches for Optimizing Biomedical Data Analysis Using Reproducible Workflows
JF  - bioRxiv
DO  - 10.1101/074708
SP  - 074708
AU  - Gaurav Kaushik
AU  - Sinisa Ivkovic
AU  - Janko Simonovic
AU  - Nebojsa Tijanic
AU  - Brandi Davis-Dusenbery
AU  - Kural Deniz
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/09/12/074708.abstract
N2  - As biomedical data becomes increasingly easy to generate in large quantities, the methods used to analyze it have proliferated rapidly. However, for the insights gained from these analyses to be meaningful, the analysis methods themselves must be transparent and reproducible. To address this issue, numerous groups have developed workflow standards or specifications that provide a common framework with which to describe a given analysis method. The diversity of methods demands that the specification be robust and flexible to accurately describe a specific biomedical data analysis. However, a powerful specification alone is insufficient to drive reproducible analysis – even the most completely described workflow must also be ‘runnable’ on diverse architectures. Thus the complete reproducible workflow ecosystem includes one or more well defined workflow definitions or specifications as well as the software components needed to implement these specifications. Such implementations allow adaptation to diverse environments, provide optimizations to workflow execution, improve computing efficiency, and support reproducibility through task logging and provenance. To meet these requirements, we have developed the Rabix Executor, an open-source workflow engine which utilizes graph theory approaches to enable computational optimization of complex, dynamic workflows, and supports reproducibility in biomedical data analysis.
ER  - 