RT Journal Article
SR Electronic
T1 Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 052225
DO 10.1101/052225
A1 Bo Wang
A1 Junjie Zhu
A1 Emma Pierson
A1 Daniele Ramazzotti
A1 Serafim Batzoglou
YR 2016
UL http://biorxiv.org/content/early/2016/06/09/052225.abstract
AB Single-cell RNA-seq technologies enable high throughput gene expression measurement of individual cells, and allow the discovery of heterogeneity within cell populations. Measurement of cell-to-cell gene expression similarity is critical to identification, visualization and analysis of cell populations. However, single-cell data introduce challenges to conventional measures of gene expression similarity because of the high level of noise, outliers and dropouts. Here, we propose a novel similarity-learning framework, SIMLR (single-cell interpretation via multi-kernel learning), which learns an appropriate distance metric from the data for dimension reduction, clustering and visualization. We show that SIMLR separates subpopulations more accurately in single-cell data sets than do existing dimension reduction methods. Additionally, SIMLR demonstrates high sensitivity and accuracy on high-throughput peripheral blood mononuclear cells (PBMC) data sets generated by the GemCode single-cell technology from 10x Genomics.