TY - JOUR
T1 - DeepCyTOF: Automated Cell Classification of Mass Cytometry Data by Deep Learning and Domain Adaptation
JF - bioRxiv
DO - 10.1101/054411
SP - 054411
AU - Huamin Li
AU - Uri Shaham
AU - Kelly P. Stanton
AU - Yi Yao
AU - Ruth Montgomery
AU - Yuval Kluger
Y1 - 2016/01/01
UR - http://biorxiv.org/content/early/2016/06/14/054411.abstract
N2 - Mass cytometry or CyTOF is an emerging technology for high-dimensional multiparameter single cell analysis that overcomes many limitations of fluorescence-based flow cytometry. New methods for analyzing CyTOF data attempt to improve automation, scalability, performance, and interpretation of data generated in large studies. However, most current tools are less suitable for routine use where analysis must be standardized, reproducible, interpretable, and comparable. Assigning individual cells into discrete groups of cell types (gating) involves time-consuming sequential manual steps untenable for larger studies. The subjectivity of manual gating introduces variability into the data and impacts reproducibility and comparability of results, particularly in multi-center studies. The FlowCAP consortium was formed to address these issues and it aims to boost user confidence in the viability of automated gating methods. We introduce DeepCyTOF, a standardization approach for gating based on a multi-autoencoder neural network. DeepCyTOF requires labeled cells from only a single sample. It is based on domain adaptation principles and is a generalization of previous work that allows us to calibrate between a source domain distribution (reference sample) and multiple target domain distributions (target samples) in a supervised manner. We apply DeepCyTOF to two CyTOF datasets generated from primary immune blood cells: (i) 14 subjects with a history of infection with West Nile virus (WNV), and (ii) 34 healthy subjects of different ages.
Each blood sample was labeled with 42 antibody markers, 12 of which were used in our analysis, at baseline and under three different stimuli (PMA/ionomycin, tumor cell line K562, and infection with WNV). In each of these datasets we manually gated a single baseline reference sample to automatically gate the remaining uncalibrated samples. We show that DeepCyTOF cell classification is highly concordant with cell classification obtained by individual manual gating of each sample, with over 99% concordance. Additionally, we apply a stacked autoencoder, which is one of the building blocks of DeepCyTOF, to cytometry datasets used in the 4th challenge of the FlowCAP-I competition and demonstrate that it outperforms all gating methods introduced in this competition. We conclude that stacked autoencoders combined with a domain adaptation procedure offer a powerful computational approach for semi-automated gating of CyTOF and flow cytometry data.
ER -