Abstract
The increasing popularity of spatial transcriptomics has allowed researchers to analyze transcriptome data in the spatial context of the tissue sample. Various methods have been developed for detecting spatially variable (SV) genes, which show distinct spatial expression patterns. However, the accuracy of clustering cell types using such SV genes has not been thoroughly studied. In single-cell-resolution sequencing data, by contrast, clustering analysis is usually performed on highly variable (HV) genes. Here we investigate whether integrating SV genes and HV genes from spatial transcriptomics data can improve clustering performance beyond using SV genes alone. We evaluated six methods that integrate different features measured from the same samples: MOFA+, scVI, Seurat v4, CIMLR, SNF, and the straightforward concatenation approach. We applied these methods to 19 real datasets from three different spatial transcriptomics technologies (merFISH, SeqFISH+, and Visium) as well as 20 simulated datasets with varying spatial expression conditions. Our evaluations show that the performance of these integration methods depends largely on the spatial transcriptomics platform. Despite the variation among the results, MOFA+ and simple concatenation generally perform well across the different types of spatial transcriptomics platforms. This work shows that integrating quantitative and spatial marker genes in spatial transcriptomics data can improve clustering. It also provides practical guidance on the choice of computational methods to accomplish this goal.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Yijun Li, email: liyijun{at}umich.edu
Stefan Stanojevic, email: stanojes{at}med.umich.edu
Bing He, email: hbing{at}med.umich.edu
Zheng Jing, email: jingzhe{at}umich.edu
Qianhui Huang, email: qhhuang{at}umich.edu
Jian Kang, email: jiankang{at}umich.edu