TY  - JOUR
T1  - Generalised empirical Bayesian methods for discovery of differential data in high-throughput biology
JF  - bioRxiv
DO  - 10.1101/011890
SP  - 011890
AU  - Thomas J. Hardcastle
Y1  - 2014/01/01
UR  - http://biorxiv.org/content/early/2014/11/28/011890.abstract
N2  - High-throughput data are becoming ubiquitous in biological research, and rapidly changing technologies and applications mean that novel methods for detecting differential behaviour that account for a ‘small n, large P’ setting are required at an increasing rate. Such methods are, in general, developed on an ad hoc basis, which requires further development cycles and leads to a lack of standardization between analyses. We present here a generalized method for identifying differential behaviour within high-throughput biological data through empirical Bayesian methods. This approach is based on our baySeq algorithm for identification of differential expression in RNA-seq data based on a negative binomial distribution, and in paired data based on a beta-binomial distribution. Here we show that the same empirical Bayesian approach can be applied to any parametric distribution, removing the need for lengthy development of novel methods for differently distributed data. We compare the application of these generic methods to methods developed specifically for particular distributions, and show equivalent or better performance. We additionally present a number of enhancements to the baySeq algorithm and a set of strategies to reduce the computational time required for complex data sets. The methods are implemented in the R baySeq (v2) package, available at http://www.bioconductor.org/packages/release/bioc/html/baySeq.html.
ER  - 