Abstract
We show that deep convolutional neural networks combined with non-linear dimension reduction enable reconstructing biological processes based on raw image data. We demonstrate this by reconstructing the cell cycle of Jurkat cells and disease progression in diabetic retinopathy. In further analysis of Jurkat cells, we detect and separate a subpopulation of dead cells in an unsupervised manner and, in classifying discrete cell cycle stages, we reach a 6-fold reduction in error rate compared to a recent approach based on boosting on image features. In contrast to previous methods, deep-learning-based predictions are fast enough for on-the-fly analysis in an imaging flow cytometer.
Introduction
A major challenge and opportunity in biology is interpreting the increasing amount of information-rich and high-throughput single-cell data. Here, we focus on imaging data from fluorescence microscopy (Pepperkok and Ellenberg, 2006), in particular from imaging flow cytometry (IFC), which combines the fluorescence sensitivity and high-throughput capabilities of flow cytometry with single-cell imaging (Basiji et al., 2007). Imaging flow cytometry is unusually well-suited to deep learning as it provides very high sample numbers and image data from several channels, that is, high-dimensional, spatially correlated data. Deep learning is therefore capable of processing the dramatic increase in information content — compared to spatially integrated fluorescence intensity measurements as in conventional flow cytometry (Brown and Wittwer, 2000) — in IFC data. Also, IFC provides one image for each single cell, and hence does not require whole-image segmentation.
Deep learning enables improved data analysis for high-throughput microscopy as compared to traditional machine learning methods (Eliceiri et al., 2012; Blasi et al., 2016; Jones et al., 2009; Dao et al., 2016). This is mainly due to three general advantages of deep learning over traditional machine learning: there is no need for cumbersome preprocessing and manual feature definition, prediction accuracy is improved, and learned features can be visualized to uncover their biological meaning. In particular, we demonstrate that this enables reconstructing continuous biological processes, which has stimulated much research effort in the past years (Gut et al., 2015; Bendall et al., 2014; Trapnell et al., 2014; Haghverdi et al., 2016). Only one of the other recent works on deep learning in high-throughput microscopy discusses the visualization of network features (Pärnamaa and Parts, 2016), but none deal with continuous biological processes (Chen et al., 2016a; Kraus et al., 2016; Dürr and Sick, 2016; Kandaswamy et al., 2016; Pärnamaa and Parts, 2016).
When aiming at an understanding of a specific biological process, one often only has coarse-grained labels for a few qualitative stages, for instance, cell cycle or disease stages. While a continuous label could be efficiently used in a regression-based approach, qualitative labels are better used in a classification-based approach. In particular, if the ordering of the categorical labels at hand is not known, a regression-based approach will fail. Also, the detailed quantitative information necessary for a continuous label is usually only available if a phenomenon is already understood on a molecular level and markers that quantitatively characterize the phenomenon are available. While this is possible for cell cycle when carrying out elaborate experiments where such markers are measured (Gut et al., 2015; Blasi et al., 2016), in many other cases, this is too tedious, has severe side effects with unwanted influences on the phenomenon itself, or is simply not possible as markers for a specific phenomenon are not known. Therefore, we propose a general workflow that uses a deep convolutional neural network combined with classification and visualization based on non-linear dimension reduction (Fig. 1).
Materials and Methods
The primary dataset in this paper consists of raw IFC images of 32,266 asynchronously growing immortalized human T lymphocyte cells (Jurkat cells), which was previously analyzed using traditional machine learning (Blasi et al., 2016; Hennig et al., 2016). Images of these cells can be classified into seven different stages of cell cycle (Fig. 2), including phases of interphase (G1, S and G2) and phases of mitosis (Prophase, Metaphase, Anaphase and Telophase). In this dataset, ground truth is based on the inclusion of two fluorescent stains: propidium iodide (PI) to quantify each cell’s DNA content and the mitotic protein monoclonal #2 (MPM2) antibody to identify cells in mitotic phases. These stains allow each cell to be labeled through a combination of algorithmic segmentation, morphology analysis of the fluorescence channels, and user inspection (Blasi et al., 2016). Note that 97.78% of samples in the dataset belong to one of the interphase classes G1, S and G2. The strong class imbalance in the dataset is related to the fact that interphase lasts — when considering the actual length of the biological process — a much longer period of time than mitosis.
To substantiate the generality of our results, we study a second dataset that was collected with a technology other than IFC and is related to a biological process other than cell cycle: 30,000 publicly available images from the Diabetic Retinopathy Detection Challenge (2015). Diabetic retinopathy is the leading cause of blindness in the working-age population of the developed world. It is diagnosed by trained humans based on the presence of lesions visible in color fundus photographs of the retina and is classified into four disease states: “healthy”, “mild”, “medium”, and “severe”.
Recent advances in deep learning have shown that deep neural networks are able to learn powerful feature representations (Krizhevsky et al., 2012; Vincent et al., 2010; Szegedy et al., 2015; LeCun et al., 2015). We adapt the widely used “Inception” architecture (Szegedy et al., 2015) and, for the IFC data, optimize it to handle the relatively small input dimensions. The architecture consists of 13 three-layer “dual-path” modules (Suppl. Fig. S3), which process and aggregate visual information at an increasing scale. These 39 layers are followed by a standard convolution layer, a fully connected layer and the softmax classifier. Training this 42-layer deep network does not present any computational difficulty, as the first three layers consist of reduction dual-path modules (Suppl. Fig. S3b), which strongly reduce the original input dimensions prior to convolutions in the following normal dual-path modules. The number of kernels used in each layer increases towards the end, until 336 feature maps with size 8 × 8 are obtained. A final average pooling operation collapses the local resolution of these maps and generates the last 336-dimensional layer, which serves as an input for both classification and visualization.
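The dual-path idea can be sketched as a two-branch convolutional module whose branch outputs are concatenated along the channel axis; a “reduction” variant halves the spatial resolution via strided convolutions. The kernel sizes and channel widths below are illustrative assumptions, not the exact published configuration.

```python
import torch
import torch.nn as nn

class DualPathModule(nn.Module):
    """Illustrative two-branch ("dual-path") Inception-style module.

    One branch applies a 3x3 convolution, the other a 5x5 convolution;
    their feature maps are concatenated along the channel axis.
    Kernel sizes and widths are assumptions for illustration.
    """
    def __init__(self, in_ch, branch_ch, stride=1):
        super().__init__()
        self.path_a = nn.Sequential(
            nn.Conv2d(in_ch, branch_ch, 3, stride=stride, padding=1),
            nn.ReLU(inplace=True),
        )
        self.path_b = nn.Sequential(
            nn.Conv2d(in_ch, branch_ch, 5, stride=stride, padding=2),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        # Concatenate both paths' feature maps along the channel axis.
        return torch.cat([self.path_a(x), self.path_b(x)], dim=1)

# A "reduction" module (stride=2) halves the spatial resolution,
# mirroring the first three modules that shrink the input early.
module = DualPathModule(in_ch=2, branch_ch=16, stride=2)
out = module(torch.randn(1, 2, 64, 64))
print(out.shape)  # torch.Size([1, 32, 32, 32])
```

Stacking several stride-1 modules after a few such reduction modules yields the increasing channel count and shrinking spatial extent described above.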
This neural network operates directly on uniformly resized images. It is trained with labeled images using stochastic gradient descent with standard parameters (see Suppl. Notes). For the IFC data, we focus on the case in which only brightfield and darkfield channels are used as input for the network, during training, visualization and prediction. As stated before, this case is of high interest as fluorescent markers might affect the biological process under study or adequate markers are not known. We note, however, that technical imperfections in the IFC data capture might always lead to a minor amount of fluorescence signal, excited for a fluorescence channel, appearing in the darkfield and brightfield channels, a phenomenon known as “bleed-through” (see Suppl. Notes).
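The supervised training setup can be sketched as follows: two-channel (brightfield + darkfield) images of uniform size, a cross-entropy loss over the seven cell cycle classes, and plain SGD. The tiny stand-in model, random tensors, and hyperparameters are placeholders for illustration, not the published settings.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder model: a single conv layer, global average pooling, and a
# linear classifier over the seven cell cycle classes.
model = nn.Sequential(
    nn.Conv2d(2, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 7),
)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

images = torch.randn(16, 2, 64, 64)   # stand-in for resized IFC images
labels = torch.randint(0, 7, (16,))   # stand-in class labels

for _ in range(5):                    # a few SGD steps
    opt.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    opt.step()
```

In practice, the full 42-layer network replaces the placeholder model, and mini-batches are drawn from the labeled training set rather than generated randomly.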
Results
To show how learned features of the neural network can be used to visualize, organize and biologically interpret single-cell data, we study the activations in the last layer of the neural network (Donahue et al., 2013). We refer to this as studying the activation space representation of the data. The approach is motivated by the fact that the neural network strives to organize data in the last layer in a linearly separable way, given that it is directly followed by a softmax classifier. Distances from the separating hyperplanes in this space can be interpreted as similarities between cells in terms of the features extracted by the network. Cells with similar feature representations are close to each other and cells with different class assignments are far away from each other. As we will see, this gives a much more fine-grained notion of biological similarity than provided by the class labels used for labeling the training set. Evidently, this representation automatically generalizes to unseen data in the validation set.
The activation space of our network’s last layer has 336 dimensions and is much too high-dimensional to be accessible for human interpretation. We use non-linear dimension reduction to visualize the data in a lower dimensional space, in particular, t-distributed stochastic neighbor embedding (tSNE) (van der Maaten and Hinton, 2008) (see Suppl. Video).
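This visualization step can be sketched with scikit-learn’s tSNE implementation; here random values stand in for the real 336-dimensional last-layer activations.

```python
import numpy as np
from sklearn.manifold import TSNE

# Project high-dimensional last-layer activations (336-D in our network)
# down to 2-D for visualization. Random activations stand in for real
# network outputs in this sketch.
rng = np.random.default_rng(0)
activations = rng.normal(size=(200, 336))   # (cells, last-layer units)

embedding = TSNE(n_components=2, perplexity=30,
                 init="pca", random_state=0).fit_transform(activations)
print(embedding.shape)  # (200, 2)
```

Each row of `embedding` gives one cell’s 2-D coordinates, which can then be colored by class label or by a continuous quantity such as DNA content.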
Reconstructing cell cycle progression
In this visualization, we observe that the Jurkat cell data is organized in a long stretched cylinder along which cell cycle phases are ordered in the chronologically correct order (Fig. 3a). This is remarkable as the network has been provided with neither structure within the class labels nor the relation among classes. The learned features evidently allow reconstructing the continuous temporal progression from the raw IFC data, and by that define a continuous distance between the phenotypes of different cell cycle phases.
We separately visualized just those cells annotated as being in the interphase classes (G1, S, G2) (Fig. 3b) and colored them with the DNA content obtained from one of the fluorescent channels (PI). The DNA content reflects the continuous progression of cells in G1, S and G2 on a more fine-grained level. Its correspondence with the longitudinal direction of the cylinder found by tSNE demonstrates that the temporal order learned by the neural network is accurate even beyond the categorical class labels.
Detecting abnormal cells in an unsupervised manner
Both tSNE visualizations (Fig. 3a,b) produce a small, separate cluster highlighted with an arrow in Fig. 3b. This cluster is learned in an unsupervised way as cell cycle phase labels provide no information about it: it contains cells from all three interphase classes. While cells in the bulk have high circularity and well defined borders (Fig. 3c), cells in the small cluster are characterized by morphological abnormalities such as broken cell membranes and outgrowths, signifying dead cells (Fig. 3d).
Deep learning automatically performs segmentation
We interpret the data representation encoded in one of the trained intermediate layers of the neural network by inspecting its activation patterns using exemplary input data from several classes (Fig. 4). These activation patterns are the essential information transmitted through the network. They show the response of various kernels to their input. By inspecting the activation patterns, we gain insight into what the network is “focusing on” in order to organize data. We observe a strong response to features that arise from the cell border thickness (Fig. 4, map 1), to area-based features (Fig. 4, map 2), as well as cross-channel features. For example, map 4 in Fig. 4 shows a high response to the difference of information from the brightfield channel, as seen in map 2, and scatter intensities, as seen in map 3. A strong response of the neural network to area-based features as in map 2 could indicate that the network learned to perform a segmentation task.
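Extracting such intermediate activation maps can be sketched with a forward hook on the layer of interest; the toy model below is a placeholder for a trained network.

```python
import torch
import torch.nn as nn

# Placeholder for a trained network: two conv layers with ReLUs.
model = nn.Sequential(
    nn.Conv2d(2, 4, 3, padding=1), nn.ReLU(),
    nn.Conv2d(4, 8, 3, padding=1), nn.ReLU(),
)

captured = {}
def save_maps(module, inputs, output):
    # Store the layer's output feature maps for later inspection.
    captured["maps"] = output.detach()

# Register the hook on the layer of interest (here: the second conv).
model[2].register_forward_hook(save_maps)

model(torch.randn(1, 2, 32, 32))  # forward pass on an exemplary input
print(captured["maps"].shape)     # one 32x32 map per kernel
```

Plotting each channel of `captured["maps"]` as an image yields activation maps like those in Fig. 4, one per kernel.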
Deep learning outperforms boosting for cell cycle classification
We study the classification performance of deep learning on the validation data set shown in Fig. 3. We first focus on the case in which G1, S and G2 phases are considered as a single class. Using five-fold cross-validation on the 32,266 cells, we obtain an accuracy of 98.73% ± 0.16%. This amounts to a 6-fold reduction in error rate over the 92.35% accuracy obtained for the same task on the same data in prior work using boosting on features extracted via image analysis (Blasi et al., 2016). The confusion matrix obtained using boosting shows high true positive rates for the mitotic phases (Fig. 5a). For example, no cells in Anaphase and Telophase are wrongly classified, as indicated by the zeros in the off-diagonal entries of the two lower rows of the matrix (Fig. 5a). This means high sensitivity: most cells from mitotic phases are correctly classified as such. Still, this comes at the price of low precision: many cells from the interphase class are classified as mitotic phases, as indicated by the high numbers in the off-diagonal entries of the first row of the matrix (Fig. 5a). Deep learning, by contrast, achieves both high sensitivity and high precision, leading to an almost diagonal confusion matrix (Fig. 5b). Furthermore, deep learning classifies all seven cell cycle stages with an accuracy of 79.40% ± 0.77% (see Suppl. Notes and Suppl. Fig. S2).
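The error-rate comparison above follows directly from the two accuracies, and the confusion matrices can be computed with scikit-learn; the toy labels below are placeholders for the real validation-set labels.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Error rates from the reported accuracies: boosting 92.35%,
# deep learning 98.73%.
err_boosting = 1 - 0.9235
err_deep = 1 - 0.9873
print(round(err_boosting / err_deep, 1))   # roughly a 6-fold reduction

# Confusion-matrix sketch on toy labels; rows are true classes,
# columns are predicted classes.
y_true = np.array([0, 0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 0, 1, 1, 1, 2, 2, 0])
print(confusion_matrix(y_true, y_pred))
```

A nearly diagonal matrix, as obtained by deep learning (Fig. 5b), means both high sensitivity (large diagonal entries per row) and high precision (small off-diagonal entries per column).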
Reconstructing disease progression
Consider now the second dataset, which deals with diabetic retinopathy (DR), as described above. Having trained the neural network with four different qualitative disease states, “healthy”, “mild”, “medium”, and “severe”, we observe a reconstructed disease progression (Fig. 6) for 8000 samples in the validation dataset, that is, the four disease states are ordered along disease severity, even though the network has not been provided with the ordering information. Similar to the cell cycle example, the ordering ensures that only neighboring classes overlap, as visible from the tSNE plot (Fig. 6a). Hence, the confusion matrix (not shown) displays a similar close-to-tridiagonal structure as for the cell cycle (Fig. 5b and Suppl. Fig. S2a).
Discussion
The visualization of the data as encoded in the last layer of the network using tSNE demonstrates how deep learning overcomes a well known issue of traditional machine learning. When trained on a continuous biological process using discrete class labels, traditional machine learning often fails to resolve the continuum (Eliceiri et al., 2012). Reconstructing continuous biological processes is, however, possible in the context of so-called pseudotime algorithms (Bendall et al., 2014; Trapnell et al., 2014; Haghverdi et al., 2016). For the cell cycle, this has been demonstrated by Gut et al. (2015), but in a very different setting. These authors measured five stains that uniquely define the cell cycle and then applied a pseudotime algorithm (Bendall et al., 2014) within this five-dimensional space. This procedure is only possible if stains that correlate with a given process of interest are known, if they do not interact with the process, and if the elaborate experiments for measuring the intensity of these stains can be carried out. We, by contrast, use raw images directly, and the learned features of the neural network automatically constitute a feature space in which data is continuously organized. In the Suppl. Notes, we demonstrate that pseudotime algorithms fail at solving this much harder problem.
Deep learning is able to reconstruct continuous processes based on categorical labels because adjacent classes are morphologically more similar than classes that are temporally further separated. If this assumption does not hold, pseudotime algorithms also fail to reconstruct a process. This can be better understood when inspecting Fig. 6a, where we show the tSNE visualization of the validation set for the diabetic retinopathy (DR) data. Samples are organized in the correct order of progression through disease states, from healthy to severe DR. However, between the healthy cluster (green) and the mild DR cluster (orange), one observes an area of slightly reduced sampling density (dashed line). This should not be attributed to “less data points having been sampled in this region” but should be seen as a consequence of the fact that the overlap between the “healthy” stage and the “mild” stage is smaller than the overlap of the diseased stages among each other. If there were no overlap between “healthy” and “mild” stages, the tSNE would show a complete separation of the healthy cluster from the rest of the data. Such a behavior is typically observed if the underlying data is not sampled from a continuous process.
The unsupervised detection of a discrete cluster of abnormal cells for the Jurkat cell data indicates that the neural network learns the cluster of abnormal cells independently of the cell-cycle-label based training. The model is therefore not only capable of resolving a biological process, but generates features that are general enough to separate incorrectly labeled cells that do not belong to the process. None of the mentioned pseudotime algorithms is capable of this. This shows the ability of deep learning to find unknown phenotypes and processes without knowledge about features or labels. Also, detecting damaged cells is of high practical use. The Jurkat cell data set has been preprocessed using the IDEAS® analysis software to remove images of abnormal cells. In particular, out-of-focus cells were removed by gating on the gradient RMS feature and debris was removed by gating for circular objects with a large area. The discovery of a cluster of abnormal cells shows the limitations of this approach and provides a solution to it.
An advantage of using a neural network for cell classification in IFC is its speed. Traditional techniques rely on image segmentation and measurement, time-consuming processes limited to roughly 10 cells per second. Neural network predictions, by contrast, are extremely fast, as the main computation consists in parallelizable matrix multiplications (“forward propagations”), which can be performed using optimized numeric libraries. This yields a roughly 100-fold improvement in speed to about 1000 cells per second with a single GPU. Aside from much faster analysis of large cell populations, this opens the door to “sorting on the fly”: imaging flow cytometers currently do not allow physically sorting individual cells into separate receptacles based on measured parameters, due to these speed limitations.
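The speed argument rests on the forward pass being a chain of batched matrix multiplications, so a whole batch of cells is processed in one pass through optimized numeric libraries. A minimal sketch with illustrative layer sizes:

```python
import numpy as np

# A forward pass reduces to batched matrix multiplications: here a
# batch of 1024 cells, each described by 336 features, passes through
# one ReLU layer and a linear classifier. Layer sizes are illustrative.
rng = np.random.default_rng(0)
batch = rng.normal(size=(1024, 336))            # 1024 cells at once
W1 = rng.normal(size=(336, 128))
W2 = rng.normal(size=(128, 4))

hidden = np.maximum(batch @ W1, 0.0)            # ReLU layer
scores = hidden @ W2                            # class scores per cell
print(scores.shape)  # (1024, 4)
```

Because the same multiplications serve every cell in the batch, throughput scales with the batch size and the hardware’s matrix-multiplication performance, which is what a GPU accelerates.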
Conclusion
Given the compelling performance on reconstructing the cell cycle, we expect deep learning to be helpful for understanding a wide variety of biological processes involving continuous morphology changes. Examples include developmental stages of organisms and the progression of healthy states to disease states, situations that have often been non-ideally reduced to binary classification problems. Ignoring intrinsic heterogeneity has likely hindered a deeper insight into the mechanisms at work. Analysis as demonstrated here could reveal morphological signatures at much earlier stages than previously recognized.
Our results indicate that reconstructing biological processes is possible for a wide variety of image data, if enough samples are available. Although generally lower-throughput in terms of the number of cells processed, conventional microscopy is nevertheless still high-throughput and can usually provide higher resolution images than IFC. Furthermore, given that multi-spectral methods are advancing rapidly, imaging mass spectrometry is allowing dozens of labeled channels to be acquired (Bodenmiller et al., 2012; Angelo et al., 2014). Due to its basic structure and high flexibility, our deep learning framework can accommodate a large increase in the number of available channels.
Acknowledgments
F.A.W. acknowledges support by the Helmholtz Postdoc Programme, Initiative and Networking Fund of the Helmholtz Association. P.R. and A.E.C. acknowledge the support of the Biotechnology and Biological Sciences Research Council/ National Science Foundation under grant BB/N005163/1 and NSF DBI 1458626.