Abstract
Computational cytometry methods are now frequently used in flow and mass cytometric data analyses. However, systematic, bias-free methods for assessing inter-sample variability have been lacking, which hampers efficient data mining from large sets of samples. Here, we devised a computational method termed LAVENDER (latent axes discovery from multiple cytometry samples with nonparametric divergence estimation and multidimensional scaling reconstruction). It measures the Jensen-Shannon distances between samples using k-nearest neighbor density estimation and reconstructs the samples in a new coordinate space, called the LAVENDER space. The axes of this space can then be compared against other omics measurements to obtain biological information. Application of LAVENDER to multidimensional flow cytometry datasets of 301 Japanese individuals immunized with a seasonal influenza vaccine revealed an axis related to the baseline immunological characteristics of each individual. This axis correlated with the proportion of plasma cells and with the neutrophil-to-lymphocyte ratio, a clinical marker of the systemic inflammatory response. The same method was also applicable to mass cytometry data, which include more molecular markers. These results demonstrate that LAVENDER is a useful tool for identifying critical heterogeneity among similar yet distinct single-cell datasets.
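The distance step described above can be illustrated with a minimal sketch. It is not the LAVENDER implementation: the function names (`knn_density`, `js_distance`, `distance_matrix`), the choice k = 5, the brute-force neighbor search, and the synthetic two-marker Gaussian "samples" are all assumptions made for illustration. The sketch estimates each density from the distance to the k-th nearest neighbor, plugs the estimates into the Jensen-Shannon divergence, and takes the square root to obtain a distance. The resulting matrix is the kind of input a multidimensional scaling routine would take to place samples in a low-dimensional space.

```python
import math
import random


def kth_neighbor_distance(point, data, k, skip_self=False):
    """Distance from `point` to its k-th nearest neighbor in `data` (brute force)."""
    dists = sorted(math.dist(point, other) for other in data)
    # If `point` is a member of `data`, index 0 holds its zero self-distance.
    return dists[k] if skip_self else dists[k - 1]


def knn_density(point, data, k, skip_self=False):
    """k-NN density estimate, up to the unit-ball volume constant.

    The constant is omitted because it cancels in the density ratios below.
    """
    dim = len(point)
    n = len(data) - 1 if skip_self else len(data)
    radius = max(kth_neighbor_distance(point, data, k, skip_self), 1e-12)
    return k / (n * radius ** dim)


def js_divergence(xs, ys, k=5):
    """Plug-in estimate of the Jensen-Shannon divergence (in nats)."""

    def kl_to_mixture(own, other):
        total = 0.0
        for pt in own:
            p = knn_density(pt, own, k, skip_self=True)
            q = knn_density(pt, other, k)
            total += math.log(p / (0.5 * (p + q)))
        return total / len(own)

    return 0.5 * kl_to_mixture(xs, ys) + 0.5 * kl_to_mixture(ys, xs)


def js_distance(xs, ys, k=5):
    """Square root of the divergence; noisy negative estimates are clipped to 0."""
    return math.sqrt(max(js_divergence(xs, ys, k), 0.0))


def distance_matrix(samples, k=5):
    """Symmetric matrix of pairwise Jensen-Shannon distances between samples."""
    n = len(samples)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = js_distance(samples[i], samples[j], k)
    return dist


def gaussian_sample(n_cells, mean, rng):
    """Hypothetical stand-in for one cytometry sample with two markers."""
    return [(rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)) for _ in range(n_cells)]


rng = random.Random(0)
samples = [
    gaussian_sample(150, 0.0, rng),  # two samples drawn from the same population
    gaussian_sample(150, 0.0, rng),
    gaussian_sample(150, 3.0, rng),  # one sample from a shifted population
]
D = distance_matrix(samples)
for row in D:
    print(["%.3f" % value for value in row])
```

In this toy setting the two samples drawn from the same population should lie closer to each other than either does to the shifted sample. A real analysis would likely replace the brute-force search with a spatial index, since the cost here grows quadratically with the number of cells.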