Abstract
In the past decade, single-cell technologies have revolutionized our ability to study cellular heterogeneity. Spatial omics represents the next technological wave, granting spatial context to single-cell transcriptomes. Integrative analysis of transcriptomic and spatial information will greatly improve our ability to dissect tissue organization and intercellular communication. Here, we present SEDR, an unsupervised spatially embedded deep representation of both transcript and spatial information. The SEDR pipeline uses a deep autoencoder to construct a low-dimensional latent representation of gene expression, which is then simultaneously embedded with the corresponding spatial information through a variational graph autoencoder. SEDR was tested on 10x Genomics Visium spatial transcriptomics and Stereo-seq datasets, demonstrating its ability to create a better data representation that benefits various follow-up analysis tasks. In benchmarking tests, SEDR achieved better clustering accuracy than contemporary methods, and in conjunction with trajectory analysis, it correctly retraced the prenatal development of the human dorsolateral prefrontal cortex. We also found the SEDR representation to be well suited to batch integration. Finally, we used SEDR to characterize the intratumoral heterogeneity of human breast cancer. We identified regions with different immune microenvironments, ranging from pro-inflammatory to immune-suppressive areas with infiltrated tumor-associated macrophages (TAMs). Our analysis suggested a cancer cell dissemination trajectory from cells in a pre-metastatic state to invasive carcinoma.
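The two-stage idea described above (a gene-expression encoder followed by a graph-based variational encoder over spatial neighbours) can be illustrated with a minimal NumPy sketch. This is not the SEDR implementation: the weights are random and untrained, only the forward pass is shown, and the k-nearest-neighbour graph, the layer sizes, the single-layer encoders, and the final concatenation of the two embeddings are all assumptions made for illustration. The function names (`knn_adjacency`, `normalize_adjacency`, `encode`) are hypothetical.

```python
import numpy as np


def knn_adjacency(coords, k):
    """Symmetric k-nearest-neighbour adjacency built from spot coordinates."""
    n = coords.shape[0]
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a spot is not its own neighbour
    idx = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # symmetrize


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} used in graph convolutions."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def encode(expr, coords, k=6, gene_dim=32, graph_dim=8, seed=0):
    """Forward pass only, with random (untrained) weights.

    expr:   (n_spots, n_genes) expression matrix
    coords: (n_spots, 2) spatial coordinates
    """
    rng = np.random.default_rng(seed)
    w_gene = rng.normal(scale=0.1, size=(expr.shape[1], gene_dim))
    w_mu = rng.normal(scale=0.1, size=(gene_dim, graph_dim))
    w_logvar = rng.normal(scale=0.1, size=(gene_dim, graph_dim))

    # Stage 1: gene-expression encoder (one ReLU layer standing in for a deep one).
    z_gene = np.maximum(expr @ w_gene, 0.0)

    # Stage 2: graph convolution over spatial neighbours yields mean and log-variance.
    a_norm = normalize_adjacency(knn_adjacency(coords, k))
    mu = a_norm @ z_gene @ w_mu
    logvar = a_norm @ z_gene @ w_logvar

    # Reparameterization trick: sample the spatial embedding.
    z_graph = mu + rng.normal(size=mu.shape) * np.exp(0.5 * logvar)

    # Assumption: combine both embeddings by concatenation.
    return np.concatenate([z_gene, z_graph], axis=1)
```

For example, calling `encode` on a matrix of 20 spots by 50 genes with the default sizes returns a 20 by 40 array, which could then be passed to a clustering or trajectory method. In a trained model the weights would be learned by minimizing reconstruction and regularization losses, which this sketch omits.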
Competing Interest Statement
The authors have declared no competing interest.