PT - JOURNAL ARTICLE
AU - Bo Wang
AU - Junjie Zhu
AU - Emma Pierson
AU - Serafim Batzoglou
TI - Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning
AID - 10.1101/052225
DP - 2016 Jan 01
TA - bioRxiv
PG - 052225
4099 - http://biorxiv.org/content/early/2016/05/09/052225.short
4100 - http://biorxiv.org/content/early/2016/05/09/052225.full
AB - Single-cell RNA-seq technologies enable high throughput gene expression measurement of individual cells, and allow the discovery of heterogeneity within cell populations. Measurement of cell-to-cell gene expression similarity is critical to identification, visualization and analysis of cell populations. However, single-cell data introduce challenges to conventional measures of gene expression similarity because of the high level of noise, outliers and dropouts. Here, we propose a novel similarity-learning framework, SIMLR (single-cell interpretation via multi-kernel learning), which learns an appropriate distance metric from the data for dimension reduction, clustering and visualization. We show that SIMLR separates subpopulations more accurately in single-cell data sets than do existing dimension reduction methods. Additionally, SIMLR demonstrates high sensitivity and accuracy on high-throughput peripheral blood mononuclear cells (PBMC) data sets generated by the GemCode single-cell technology from 10x Genomics.