RT Journal Article
SR Electronic
T1 How well do RNA-Seq differential gene expression tools perform in a eukaryote with a complex transcriptome?
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 090753
DO 10.1101/090753
A1 Kimon Froussios
A1 Nick J. Schurch
A1 Katarzyna Mackinnon
A1 Marek Gierliński
A1 Céline Duc
A1 Gordon G. Simpson
A1 Geoffrey J. Barton
YR 2017
UL http://biorxiv.org/content/early/2017/03/13/090753.abstract
AB RNA-seq experiments are usually carried out in three or fewer replicates. In order to work well with so few samples, Differential Gene Expression (DGE) tools typically assume the form of the underlying distribution of gene expression. A recent highly replicated study revealed that RNA-seq gene expression measurements in yeast are best represented as being drawn from an underlying negative binomial distribution. In this paper, the statistical properties of gene expression in the higher eukaryote Arabidopsis thaliana are shown to be essentially identical to those from yeast despite the large increase in the size and complexity of the transcriptome: Gene expression measurements from this model plant species are consistent with being drawn from an underlying negative binomial or log-normal distribution and the false positive rate performance of nine widely used DGE tools is not strongly affected by the additional size and complexity of the A. thaliana transcriptome. For RNA-seq data, we therefore recommend the use of DGE tools that are based on the negative binomial distribution.