Abstract
Innovations in single-cell technologies have led to a flurry of datasets and computational tools to process and interpret them, including analyses of cell composition changes and transitions in cell states. The diffcyt workflow for differential discovery in cytometry data consists of several steps, including preprocessing, cell population identification and differential testing for an association with a binary or continuous covariate. However, survival time, a quantity commonly measured in clinical studies, often results in a censored covariate to which classical differential testing is inapplicable. To overcome this limitation, multiple methods to directly include censored covariates in differential abundance analysis were examined using simulation studies and a case study. Results show good error control and decent sensitivity for a subset of the methods. The tested methods are implemented in the R package censcyt as an extension of diffcyt and are available at https://github.com/retogerber/censcyt. Methods for the direct inclusion of a censored variable as a predictor in GLMMs are a valid alternative to classical survival analysis methods, such as the Cox proportional hazards model, while allowing for more flexibility in the differential analysis.
Competing Interest Statement
The authors have declared no competing interest.
Abbreviations
- scRNA-seq: single-cell RNA sequencing
- DA: differential abundance
- DS: differential state
- GLMM: generalized linear mixed model
- RB: raw bias
- CR: coverage rate
- CI: confidence interval
- RMSE: root mean squared error
- km: Kaplan-Meier imputation
- kme: Kaplan-Meier imputation with an exponential survival function tail
- rs: risk set imputation
- mrl: mean residual life imputation
- pmm: predictive mean matching
- cc: complete case analysis
- DM: Dirichlet-multinomial distribution
- TPR: true positive rate
- FDR: false discovery rate