RT Journal Article
SR Electronic
T1 Modelling dropouts allows for unbiased identification of marker genes in scRNASeq experiments
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 065094
DO 10.1101/065094
A1 Tallulah S. Andrews
A1 Martin Hemberg
YR 2016
UL http://biorxiv.org/content/early/2016/07/21/065094.abstract
AB Single-cell RNASeq (scRNASeq) differs from bulk RNASeq in that a large number of genes have zero reads in some cells, but relatively high expression in the remaining cells. We propose that these zeros, or dropouts, are due to failure of the reverse transcription, and we model the process using the Michaelis-Menten (MM) equation. We show that the MM equation provides an equivalent or superior fit to existing scRNASeq datasets compared to other models. In addition, identifying genes significantly to the right of the MM curve is a fast and accurate method to distinguish differentially expressed genes without prior identification of subpopulations of cells. We applied our method to a mouse preimplantation dataset and demonstrate that clustering the selected genes identifies biologically meaningful clusters. Furthermore, this feature selection makes it possible to overcome batch effects and cluster cells from five different datasets by their biological groups rather than by experimental origin.