ABSTRACT
Myeloid cells are important sites of lytic and latent infection by human cytomegalovirus (CMV). We previously showed that only a small subset of myeloid cells differentiated from CD34+ hematopoietic stem cells is permissive to CMV replication, underscoring the heterogeneous nature of these populations. The exact identity of susceptible and resistant cell types, and the cellular features characterizing permissive cells, however, could not be dissected using averaging transcriptional analysis tools such as microarrays and, hence, remained enigmatic. Here, we profile the transcriptomes of ∼ 7,000 individual cells at day one post-infection using the 10X Genomics platform. We show that viral transcripts are detectable in the majority of the cells, suggesting that virion entry is unlikely to be the main target of cellular restriction mechanisms. We further show that viral replication occurs in a small but specific sub-group of cells transcriptionally related to, and likely derived from, a cluster of cells expressing markers of Colony Forming Unit – Granulocyte, Erythrocyte, Monocyte, Megakaryocyte (CFU-GEMM) oligopotent progenitors. Compared to the remainder of the population, CFU-GEMM cells are enriched in transcripts with functions in mitochondrial energy production, cell proliferation, RNA processing and protein synthesis, and express similar or higher levels of interferon-related genes. While expression levels of the former are maintained in infected cells, the latter are strongly down-regulated. We thus propose that the preferential infection of CFU-GEMM cells may be due to the presence of a pre-established pro-viral environment, requiring minimal optimization efforts from viral effectors, rather than to the absence of specific restriction factors. Together, these findings identify a potentially new population of myeloid cells susceptible to CMV replication, and provide a possible rationale for their preferential infection.
AUTHOR SUMMARY
Myeloid cells such as monocytes and dendritic cells are critical targets of CMV infection. To identify the cellular factors that confer susceptibility or resistance to infection, we profiled the transcriptomes of ∼ 7,000 single cells from a population of semi-permissive myeloid cells infected with CMV. We found that viral RNAs are detectable in the majority of the cells, but that marked expression of CMV lytic genes occurs in only a small subset of cells transcriptionally related to a cluster of CFU-GEMM progenitors that express similar amounts of transcripts encoding interferon-related anti-viral factors as the rest of the population but higher levels of transcripts encoding proteins required for energy, RNA, and protein production. We thus conclude that the preferential infection of CFU-GEMM cells might be due to the pre-existing presence of an intracellular environment conducive to infection onset, rather than to the absence of anti-viral factors restricting viral entry or initial gene expression. Together, these findings uncover a new type of myeloid cells potentially permissive to CMV infection, expand our understanding of the cellular requirements for successful initiation of CMV infection, and provide new pro- and anti-viral gene candidates for future analyses and therapeutic interventions.
INTRODUCTION
Infection by human cytomegalovirus (CMV) is common and usually asymptomatic in healthy individuals, but can be the source of serious disease in hosts with naïve or compromised immune functions such as fetuses, newborns, AIDS patients, and solid organ or bone marrow transplant recipients [1, 2]. CD34+ hematopoietic stem cells (HSC) and derived monocytes, macrophages and dendritic cells are important sites of CMV latency and reactivation, as well as of lytic infection in vivo (for recent reviews, see [3–6]). CMV interactions with these cells have thus been intensively studied, using a variety of different cell culture models [7–11].
We previously showed that exposure of cord blood CD34+ HSC to specific cytokines such as Flt3 ligand (FL) and transforming growth factor β1, known to instruct their differentiation into Langerhans cells [12, 13], gives rise to myeloid cell populations that are semi-permissive to CMV infection. While only 2-3% of non-activated cells obtained at the end of the differentiation period allowed expression of the viral immediate early 1 and 2 (IE1/IE2) proteins, which are essential for infection onset and progression, cell activation by exposure to granulocyte-macrophage colony-stimulating factor (GM-CSF), fetal bovine serum (FBS), CD40 ligand (CD40L) and lipopolysaccharide (LPS) partially released this initial block, raising the proportion of IE1/IE2+ cells by 5-10 fold [14–16]. Unexpectedly, however, non-activated cells produced higher yields per IE1/IE2+ cell than activated cells, suggesting that signaling by GM-CSF, FBS, CD40L and LPS may trigger the establishment of a second block to infection progress, acting after IE gene expression and negatively impacting viral progeny production. This second block is unlikely to be due to defects in progeny release, as non-activated and activated cells generated similar ratios of cell-free to cell-associated virus [16], but may instead depend on impairments in the assembly of viral replication compartments (Galinato and Hertel, unpublished).
Because of their ability to restrict infection progress at multiple steps of the viral replication cycle, activated myeloid cells represent an outstanding model to study the determinants of cellular susceptibility to CMV infection. Their intrinsic heterogeneity, however, thus far precluded the identification of cellular factors supporting or restricting infection using averaging gene expression analysis tools such as microarrays. Here, we took advantage of the most recent developments in single-cell RNA sequencing technologies to provide the first transcriptional profiling of CMV-infected, activated myeloid cells conducted at the single-cell level, and the first comparison of gene expression changes occurring in infected and bystander cells co-existing in the same population.
We show that: 1) more than half of the cells contain detectable viral transcripts at day one post-infection, with only a small minority (∼ 2%) displaying an expression pattern consistent with progression to lytic replication. This indicates that restrictions to viral entry may contribute to, but are not the main determinant of, resistance; 2) lytically-infected cells are transcriptionally related to a specific cluster of bystander cells with the hallmarks of CFU-GEMM, suggesting that this type of cells may be a previously unidentified target of CMV lytic infection; 3) compared to the remainder of the population, CFU-GEMM cells express similar or higher levels of IFN-related genes with anti-viral roles, which are strongly down-regulated in infected cells, indicating that CFU-GEMM cells are not defective in their ability to recognize and respond to CMV infection; 4) also compared to the remainder of the population, CFU-GEMM cells are enriched in transcripts encoding proteins involved in mitochondrial energy production, S-phase control, and RNA and protein production. Expression levels of these genes remain largely unchanged in lytically-infected cells, suggesting that the preferential infection of CFU-GEMM cells is likely due to the presence of a transcriptional landscape already optimized for viral replication, and requiring little conditioning effort from viral effectors, rather than to an intrinsic inability to recognize and respond to the presence of viral products.
Together, these data identify a new myeloid cell type potentially permissive to CMV replication, broaden our knowledge of the cellular determinants of susceptibility to infection, and reveal the identity of new pro- and anti-viral factors involved in regulating CMV tropism for myeloid cells.
RESULTS
CD34+ HSC-derived myeloid cell populations are semi-permissive to CMV infection
To identify cellular factors potentially involved in regulating the susceptibility of myeloid cells to CMV infection, we sought to analyze the transcriptome of permissive and non-permissive cell types derived from the differentiation of CD34+ HSC in vitro. To select a representative population, the CD34+ HSC isolated from the cord blood of twelve different donors were separately cultured in the presence of a cytokine cocktail known to promote the development of Langerhans-type dendritic cells [12, 13]. Differentiated cells were then activated by exposure to GM-CSF, FBS, CD40L and LPS, and infected with CMV strain TB40/E. Consistent with our previously published data [14–16], cell numbers did not increase over time (not shown), and only 3 ± 1.5 % of non-activated but 10 ± 5 % of activated cells expressed the viral IE1/IE2 proteins at day two post-infection (pi), notwithstanding the use of a multiplicity of infection (MOI) of ten pfu/cell, which is sufficient to infect the totality of permissive cell types such as fibroblasts (Fig 1A and B). Despite containing higher numbers of IE1/IE2+ cells at each time point, activated cell populations produced lower intracellular progeny amounts per IE1/IE2+ cell than non-activated cells (Fig 1C and D).
Activated cells differentiated from the CD34+ HSC of donor 113G (Fig 1, pink circles) were then selected as representative, and subjected to single-cell RNA sequencing at day one using the 10X Genomics Chromium platform [17]. Activated cells were chosen to ensure data collection from sufficient numbers of infected cells, and to facilitate the identification of potential cellular inhibitors of viral replication, whereas the day one time point was selected to allow sufficient time for infection to start, while limiting the extent of virus-induced changes to the cellular transcriptional landscape. A median of 2,305 genes and 10,627 transcripts were detected in the 6,837 cells profiled, and the total number of genes with at least one count in any cell was 20,899. After reduction by principal components analysis, data were visualized in two dimensions using the t-distributed stochastic neighbor embedding (t-SNE) algorithm [18], which displays cells with similar transcriptional profiles as nearby points, and cells with dissimilar transcriptional profiles as distant points with high probability (Fig 2A). Cells thus represented on t-SNE plots were then interrogated for specific gene transcripts using Loupe™ Cell Browser [19].
Viral transcripts are detected in the majority of the cells, but their presence is not associated with expression of specific cellular genes
Query of the t-SNE projection data for the presence of viral RNA revealed that 59% of the cells in the population contained at least one viral transcript (Fig 2A and B). RNAs mapping to the viral open reading frames (ORFs) UL4/UL5, US34, UL145 and UL16/17, and to the non-coding RNA2.7 and RNA1.2, were present in the largest number of cells, accounting for half of the CMV-transcript+ population. These viral RNA+ cells were dispersed throughout the entire population, suggesting that viral entry had occurred in the majority of the cells. However, to avoid introducing perturbations potentially affecting cellular transcription, non-penetrated viral particles were not enzymatically removed from the cell surface. Consequently, some of the detected transcripts may have originated from virions still attached to the outside of cells or from penetrated capsids that did not reach the nucleus. The specific viral RNAs that were detected in the largest proportion of the cells, however, are not amongst those reported to be packaged into virions [20–22]. Moreover, more than half of the 26 transcripts found in > 200 cells mapped to viral ORFs known to be expressed with immediate-early or early kinetics (not shown), suggesting that they were likely newly synthesized from the viral genome.
Staining of infected cells for the capsid-associated phosphoprotein pp150 [14, 23, 24] (Fig 2C) also revealed the presence of viral particles associated with 55 ± 15 % of activated cells and in 34 ± 9 % of non-activated cells at day one pi (Fig 2D). These results are consistent with our previously published data using CMV strain TB40-BAC4, a BAC-cloned variant of TB40/E [14], although, in contrast to TB40-BAC4, TB40/E virions remained visible on, or within, the cells until at least day four pi.
To identify cellular factors potentially involved in restricting viral entry, the gene expression profile of CMV-transcript+ cells was compared to that of CMV-transcript− cells. Only five cellular genes scored as differentially expressed between the two groups of cells with P < 0.05, but none was present in the totality, nor in the majority, of CMV-transcript− or CMV-transcript+ cells (S1 Dataset). The two genes expressed in the largest number of cells, RETN and TJP1 (for gene names, see S2 Table 1), were detected in only 303 and 124 cells, respectively, and were distributed in both populations: RETN was found in 153 CMV-transcript+ vs. 150 CMV-transcript− cells, and TJP1 in 104 CMV-transcript+ vs. 20 CMV-transcript− cells.
The extent of expression of genes encoding potential CMV entry receptors, such as EGFR [25–28], PDGFRA [29–31], THY1/CD90 [32, 33], the integrins αVβ3, α2β1, and α6β1 [34–36], and BSG [37] was also queried. EGFR, THY1/CD90, and integrins β3, α2, and α6 were either not expressed at all or were found in less than ten cells, while PDGFRA was expressed in only 356 cells, and then only at low levels. Integrin β1 and BSG, by contrast, were present in larger numbers of cells (2568 and 5810, respectively), but these did not preferentially segregate with the CMV-transcript+ group.
Together, these findings indicate that viral entry is unlikely to be the main roadblock restricting infection onset, that cells devoid of viral transcripts do not transcribe specific factor(s) restricting virion entry, and that cells containing viral RNAs do not selectively express genes encoding entry facilitators, including surface molecules reported to act as CMV entry receptors in other cell types.
Transcription of viral lytic genes proceeds in a small group of cells lacking expression of select cellular genes
Eleven genetic loci were identified as required for efficient CMV genome replication in transient co-transfection replication assays [38, 39]. These encode the transcriptional activators/regulators IE1, IE2, UL112/113, UL84 and IRS1/TRS1, the anti-apoptotic factors UL36-38, and six members of the viral DNA replication complex, i.e. the DNA polymerase UL54, the polymerase accessory factor UL44, the helicase UL105, the primase UL70, the primase associated factor UL102, and the single-stranded DNA binding protein UL57. To identify cells ostensibly progressing towards lytic replication, the population was queried for the presence of viral transcripts encoding each of these proteins.
A total of 278 cells, corresponding to ∼ 4% of the entire population and ∼ 7% of CMV-transcript+ cells were UL122+ (IE2) and/or UL123+ (IE1), and 42% of these expressed both. These proportions were in agreement with those obtained by immunofluorescence staining of infected cells from donor 113G harvested at day one pi (3.6% IE1/IE2+, data not shown).
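The percentages quoted above can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text; because the exact number of CMV-transcript+ cells is not reported, it is estimated here from the rounded 59% figure, which is an assumption.

```python
# Cell counts quoted in the text.
total_cells = 6837          # cells profiled
ie_positive = 278           # UL122+ and/or UL123+ cells
viral_rna_fraction = 0.59   # fraction of cells with at least one viral transcript

# Assumption: the exact CMV-transcript+ count is not stated, so it is
# estimated from the rounded 59% figure.
viral_rna_cells = round(total_cells * viral_rna_fraction)

pct_of_population = 100 * ie_positive / total_cells
pct_of_viral_rna_cells = 100 * ie_positive / viral_rna_cells

print(f"estimated CMV-transcript+ cells: {viral_rna_cells}")
print(f"IE+ share of population: {pct_of_population:.1f}%")
print(f"IE+ share of CMV-transcript+ cells: {pct_of_viral_rna_cells:.1f}%")
```

Both values round to the approximate figures given in the text (∼ 4% and ∼ 7%).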
Consistent with progression towards lytic replication (Fig 1D), UL122+/UL123+ cells were also found to express transcripts encoding UL112/113 (91% triple-positive), UL84 (77%), IRS1/TRS1 (59%), UL36 (84%), UL37 (13%), UL38 (83%), and three replication complex components, namely UL54 (64%), UL105 (70%), and UL102 (56%) (Fig 3A). By contrast, RNAs corresponding to UL44, UL70, and UL57 were not detected. Staining of infected cells confirmed that the UL84 protein was present in 37 ± 19% of activated, IE1/IE2+ cells at day two pi (Fig 3B and E). Interestingly, while the UL44 and UL57 proteins were also observed at day two pi (Fig 3C and D), the proportion of IE1/IE2+ cells co-expressing each of these polypeptides in activated cells was significantly lower than in non-activated cells (Fig 3E). This suggests that assembly of the viral DNA replication complex in activated cells might be impaired, perhaps on account of delayed or inefficient transcription of specific complex members.
The data from cells containing the above viral transcripts, plus several others, comprised a tight and well separated cluster of 138 points on the t-SNE projection, which we collectively named CMV+ (Fig 3A). Comparison of the transcriptional profile of CMV+ and CMV− cells identified 629 genes as being more highly expressed in CMV− cells, but none was associated with significant P values (< 0.05), nor was present in the totality of CMV− and absent in CMV+ cells. Sixty cellular genes had more than four-fold higher mean expression levels in CMV− than in CMV+ cells, with nine being present in more than 50% of CMV− cells but less than 50% of CMV+ cells (S3 Dataset and S4 Fig). Encouragingly, and as expected for virus-exposed cells, five of these nine genes encoded well-known type I interferon (IFN)-inducible proteins involved in mediating innate immune responses to viruses, i.e. MX1, OAS1, OAS2, IFIT3, and USP18. In line with activated myeloid cells, the majority of CMV− cells also expressed the TNF-α and LPS-inducible protein CYTIP [40], and the CD40 ligand-inducible costimulatory molecule CD80 [41], plus two genes, one coding for the orphan G protein-coupled receptor GPR157 and one for DUSP4, whose transcriptional regulation by viruses or other stimuli has not been assessed.
Together, these data indicate that although CMV transcripts are associated with the majority of cells, lytic infection proceeds in only a small sub-population containing lower amounts of a handful of genes, most of which encode known powerful antiviral proteins.
CMV+ cells are closely related to a specific sub-cluster of cells within the population
The direct comparison of CMV+ and CMV− cells failed to yield transcripts with statistically significant differences in expression levels between the two groups, suggesting that resistance to infection is unlikely to be conferred by a single set of “universal” restriction factors highly expressed in all CMV− cells but absent in CMV+ cells. Rather, we hypothesized that the CMV− population might be comprised of several different cell types, each resisting infection due to the expression of anti-viral genes, or to the lack of pro-viral factors, specific to each sub-group.
Cell clustering using the K-means algorithm indeed revealed the presence of multiple different sub-groups of cells within the population, each characterized by distinct transcriptional profiles (Fig 4A). To identify the cluster most closely related to CMV+ cells, the mean number of transcripts/cell for each gene in the CMV+ cluster was divided by the mean number of transcripts/cell for each gene in each of the other nine clusters, and the frequency distribution of all Log2 ratios was plotted. A nonlinear regression fit test using the least squares method revealed that all distributions could be described by the Gaussian function, and that the histogram with the mean value closest to zero (0.184), the smallest standard deviation (0.814), and the highest R2 value (0.995) belonged to the CMV+ versus cluster 6 comparison (Fig 4B). A Wilcoxon signed rank test also identified the Log2 CMV+/cluster 6 ratio distribution as the one whose median values differed the least from zero, suggesting that the transcriptional profiles of CMV+ and cluster 6 cells were the most similar to each other.
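The cluster comparison described above can be illustrated with a minimal sketch. It computes per-gene Log2 ratios between a reference cluster and each other cluster, then picks the cluster whose ratio distribution is centred closest to zero. It does not reproduce the Gaussian fit or the Wilcoxon test used in the analysis; the pseudocount, the function names, and the toy gene values are all hypothetical and serve only as an illustration.

```python
import math
import statistics

def log2_ratio_summary(reference, other, pseudocount=0.01):
    """Mean and SD of per-gene log2(reference / other) ratios.

    `reference` and `other` map gene name -> mean transcripts per cell.
    The pseudocount is an assumption (not from the text) that avoids
    division by zero for genes absent from one cluster.
    """
    ratios = [
        math.log2((reference[g] + pseudocount) / (other[g] + pseudocount))
        for g in reference
        if g in other
    ]
    return statistics.mean(ratios), statistics.stdev(ratios)

def closest_cluster(reference, clusters):
    """Return the cluster whose ratio distribution is centred nearest zero."""
    summaries = {
        name: log2_ratio_summary(reference, profile)
        for name, profile in clusters.items()
    }
    # Rank by absolute mean first, then by spread.
    best = min(summaries, key=lambda n: (abs(summaries[n][0]), summaries[n][1]))
    return best, summaries

# Toy, made-up profiles (hypothetical values for illustration only).
cmv_pos = {"geneA": 2.0, "geneB": 0.5, "geneC": 1.0}
clusters = {
    "cluster_1": {"geneA": 0.2, "geneB": 4.0, "geneC": 0.1},
    "cluster_6": {"geneA": 1.8, "geneB": 0.6, "geneC": 1.1},
}
best, summaries = closest_cluster(cmv_pos, clusters)
print(best, summaries[best])
```

With real data, each profile would hold the mean transcripts/cell for every gene in a cluster, and the mean and SD would correspond to the values reported for each histogram.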
To uncover the identity of cluster 6 cells, the genes most selectively expressed by this cluster relative to the rest of the population were identified using the 10X Genomics Cell Ranger software [42], and their expression range in vivo was assessed using publicly available gene expression databases and literature data. Sixty-nine genes were selected as being highly differentially expressed (Log2 cluster 6/rest of the cells fold change > 4, P values < 10⁻¹⁵, S5 Dataset). Seven of these (ELANE, PRTN3, AZU1, MPO, PRSS57, CTSG and RNASE2), coding for markers of neutrophil precursors [43, 44], were predominantly or exclusively expressed in a sub-group of 116 cells, which we designated “promyelocytes” (S6 Fig A). Four more genes (RETN, S100A8, S100A9 and S100A12), encoding proteins secreted by activated neutrophils under pro-inflammatory conditions [45, 46], were abundant in promyelocytes and in a separate group of ∼ 205 cells, designated “activated neutrophils” (S6 Fig B). The remaining 57 genes encoded mostly DNA replication and cell cycle regulators, and were present, either exclusively or overlapping with promyelocytes and CMV+ cells, in a third sub-group of 93 cells (S6 Fig C), designated “sub-cluster 3”.
In addition to separating cluster 6 into three sub-clusters, other groups of cells were identified based on their expression of known markers such as CD14 and CD68 (monocytes), CD207/langerin and CD1a (Langerhans cells), CD1b (monocyte-derived dendritic cells), and hemoglobins (erythrocytes) (Fig 4C). Of note, and in keeping with CD34+ HSC differentiation towards myeloid (rather than lymphoid) lineages, no T or B cell specific transcripts were found.
While none of the promyelocyte- and activated neutrophil-specific genes were also expressed by CMV+ cells, 18 (32%) of sub-cluster 3 marker genes were shared, some almost exclusively, with the CMV+ group. This suggested that sub-cluster 3 cells specifically might be related to the CMV+ cell cluster. To further verify this, the mean number of transcripts/cell for each gene in the CMV+ cluster was divided by the mean number of transcripts/cell for each gene in each of the other 13 clusters, and the frequency distribution of Log2 ratios was plotted. The histogram whose median value differed the least from zero did indeed correspond to the CMV+ versus sub-cluster 3 comparison (Fig 4D), confirming that the transcriptional profiles of these two groups are the most closely related.
Sub-cluster 3 is comprised of cells with CFU-GEMM hallmarks
To more precisely identify the cell type comprising sub-cluster 3, the list of 115 genes more abundantly (average transcript count > 0.3) and most differentially (Log2 fold change > 3, P < 0.0005) transcribed in these cells relative to all other clusters was compared to gene lists from two recently published single-cell analyses of human hematopoiesis [47, 48] (S7 Dataset). Seventy-two transcripts were found among the list of genes reported to be differentially expressed in 16 discrete bone marrow populations by Velten L. et al. [48], with the largest proportion falling within the “G2/M phase” (56%) and the “Immature myeloid progenitors with high cell cycle activity” (24%) categories. A total of 108 genes were also found among the transcripts classified as differentially expressed in seven human cord blood populations by Karamitros D. et al. [47], with the vast majority belonging to the “common myeloid progenitor” population (88%), followed by the megakaryocyte/erythroid progenitor compartment (6%). This suggested that sub-cluster 3 cells might consist of multipotent progenitors which, in contrast to HSC, are known to be highly proliferative and metabolically active [49, 50].
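The selection thresholds quoted above (average transcript count > 0.3, Log2 fold change > 3, P < 0.0005) amount to a simple filter over per-gene statistics. The following is a minimal sketch of that filter; the `GeneStat` record, the `select_markers` helper, and the example rows are hypothetical and do not reflect the actual Cell Ranger output format.

```python
from dataclasses import dataclass

@dataclass
class GeneStat:
    gene: str
    mean_count: float   # average transcript count in the cluster
    log2_fc: float      # Log2 fold change, cluster vs. all other cells
    p_value: float

def select_markers(stats, min_count=0.3, min_log2_fc=3.0, max_p=0.0005):
    """Keep genes passing all three thresholds quoted in the text."""
    return [
        s.gene
        for s in stats
        if s.mean_count > min_count
        and s.log2_fc > min_log2_fc
        and s.p_value < max_p
    ]

# Hypothetical rows for illustration only.
rows = [
    GeneStat("geneA", 1.2, 4.1, 1e-8),   # passes all thresholds
    GeneStat("geneB", 0.1, 5.0, 1e-9),   # abundance too low
    GeneStat("geneC", 0.9, 2.0, 1e-6),   # fold change too small
    GeneStat("geneD", 0.5, 3.5, 0.01),   # P value too high
]
markers = select_markers(rows)
print(markers)
```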
Within our population, half of the 115 abundantly/differentially expressed genes were almost exclusively associated with sub-cluster 3, followed by shared expression with promyelocytes, erythrocytes/megakaryocytes, activated neutrophils, and monocytes (S7 Dataset). Thirty-six of these genes were also present in CMV+ cells, with the majority being shared with the promyelocytes and erythrocytes/megakaryocytes clusters. Together, these data indicate that sub-cluster 3 is comprised of proliferating cells expressing erythroid, monocytic and granulocytic markers, which we surmised might represent CFU-GEMM oligopotent progenitors.
To further test this hypothesis, cells belonging to the cluster 7, erythro, mono, MDDC, CMV+, promyelo, act neut and sub-cluster 3 groups depicted in Fig 4C were ordered along trajectories corresponding to their inferred differentiation pathways using Monocle [51]. A trajectory with four main branches extending from a rooted center was generated (Fig 5A), and the identity of cells composing each of the eight groups was uncovered using Seurat [52] (Fig 5B and S8 Fig). Cells in group D, the root center, expressed the same key genes as sub-cluster 3 cells in Fig 4C, while its closest neighbors, group E, F, G and H, expressed markers typical of the monocytes, erythrocytes, promyelocytes and activated neutrophils clusters in Fig 4C, respectively (S8 Fig and S9 Dataset). The most isolated cluster of cells, group B, was related to CL7 in Fig 4C, and differentially expressed CD52 and FCER1A (S9 Dataset). Thus, the observed pseudotime distances and cluster organization strongly implicated group D as the most likely origin of erythrocytes, megakaryocytes, promyelocytes, neutrophils and monocytes, suggesting that cells comprising group D/sub-cluster 3 might indeed represent CFU-GEMM. As the CMV+ cluster was immediately adjacent to group D, we further conclude that CMV+ cells are directly related to, and possibly derived from, CFU-GEMM progenitors.
The majority of the genes characterizing CFU-GEMM cells are more highly expressed in this cluster than in the rest of the population, and are maintained to similar levels in CMV+ cells
To understand why cells in sub-cluster 3 (relabeled GEMM) in particular allowed infection to initiate, we sought to identify which set of genes and, consequently, which cellular functions, were most differentially regulated in GEMM and CMV+ cells with respect to the rest of the population. A set of 1989 genes was identified by the 10X Genomics Cell Ranger software as being more selectively expressed in the CMV+ and GEMM clusters relative to all other clusters. The majority of these genes (1361, or 68% for GEMM, and 1460, or 73% for CMV+ cells) were associated with positive Log2 fold change values, indicating that most of the GEMM-specific genes were more highly expressed in these cells than elsewhere, and that, as expected, infection was accompanied by a strong transcriptional up-regulation of cellular genes (S10 Dataset, sheet 1).
The differentially expressed genes were then partitioned between “synchronous” and “asynchronous”, depending on whether their transcription was similarly regulated in GEMM and CMV+ cells or not. Genes that were up-regulated in GEMM cells relative to the rest of the population, and that were expressed to similar levels or further up-regulated in CMV+ cells (total = 1077), as well as genes that were down-regulated in GEMM cells and that were expressed to similar levels or further down-regulated in CMV+ cells (total = 325) were considered synchronous, while genes that were up-regulated in GEMM but down-regulated at least two-fold in CMV+ cells, and vice-versa, were labeled asynchronous (total = 587). The majority (1402, 70 %) of the selected genes fell into the synchronous category. Of these, most were up-regulated in the GEMM cluster and expressed to similar levels in CMV+ cells, with only 28 genes being further induced in infected cells, suggesting that GEMM cells already contain large numbers of transcripts beneficial (or neutral) to infection (S10 Dataset, sheet 1). As levels of down-regulated genes were also mostly maintained without any further repression by infection, we hypothesized that GEMM cells might be preferentially infected because their transcriptional landscape requires the least amount of optimization by viral effectors.
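The partition rule described above can be written as a small classifier. This is a sketch under stated assumptions: the two-fold change in CMV+ cells is taken relative to GEMM cells (the text does not say which baseline was used), and the function name and its arguments are hypothetical. The final lines check that the category totals quoted in the text add up.

```python
import math

def classify_gene(log2_gemm_vs_rest, log2_cmv_vs_gemm, fold=2.0):
    """Label a gene 'synchronous' or 'asynchronous'.

    Assumption: the change in CMV+ cells is measured relative to GEMM
    cells. Genes are expected to be differentially expressed in GEMM,
    so a value of exactly zero is treated as 'down' for simplicity.
    """
    cutoff = math.log2(fold)
    if log2_gemm_vs_rest > 0:
        # Up in GEMM: asynchronous only if CMV+ drops at least `fold`-fold.
        return "asynchronous" if log2_cmv_vs_gemm <= -cutoff else "synchronous"
    # Down in GEMM: asynchronous only if CMV+ rises at least `fold`-fold.
    return "asynchronous" if log2_cmv_vs_gemm >= cutoff else "synchronous"

# Category totals quoted in the text.
synchronous_up, synchronous_down, asynchronous = 1077, 325, 587
total = synchronous_up + synchronous_down + asynchronous
synchronous_pct = 100 * (synchronous_up + synchronous_down) / total

print(classify_gene(2.0, 0.1))    # up in GEMM, maintained in CMV+
print(classify_gene(2.0, -1.5))   # up in GEMM, repressed in CMV+
print(total, f"{synchronous_pct:.1f}%")
```

The totals sum to the 1989 selected genes, with the synchronous share rounding to the 70% stated above.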
Expression of genes with functions in energy production, cell cycle control, RNA and protein metabolism is higher in both GEMM and CMV+ cells
To pinpoint the functional areas distinguishing GEMM cells from the remainder of the population, the most differentially expressed synchronous genes, and all of the asynchronous genes (1304 in total) were partitioned into 15 categories based on their encoded functions (S10 Dataset, sheet 2). The transcript abundance of each gene found in the CMV+ or GEMM clusters was then divided by the abundance in the rest of the cells (CMV/REST and GEMM/REST) or in GEMM cells (CMV/GEMM), and the distributions of the Log2 ratio values were plotted (Fig 6). As expected, only the GEMM/REST and CMV/REST, but not the CMV/GEMM ratio distributions of all genes were identified by the Wilcoxon signed rank test as having Log2 median values significantly different from zero (Fig 6A). Genes with roles in mitochondrial functions (Fig 6I), proliferation and cell cycle control (Fig 6J), RNA metabolism (Fig 6L) and protein processing (Fig 6K) were also more highly expressed in both GEMM and CMV+ cells, and were thus further scrutinized.
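The Wilcoxon signed rank test used above asks whether the median of a set of Log2 ratios differs from zero. The sketch below is a plain-Python illustration of the statistic, using a normal approximation without tie or continuity correction; it is not the software used for the analysis, and the example values are made up.

```python
import math

def wilcoxon_signed_rank(values):
    """Two-sided Wilcoxon signed-rank test of median == 0.

    Returns (W_plus, W_minus, p_value). The p-value uses the normal
    approximation with no tie or continuity correction, so it is only
    a rough guide for small samples.
    """
    nonzero = [v for v in values if v != 0]
    n = len(nonzero)
    if n == 0:
        raise ValueError("all values are zero")
    # Rank absolute values, averaging ranks across ties.
    order = sorted(range(n), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, v in zip(ranks, nonzero) if v > 0)
    w_minus = sum(r for r, v in zip(ranks, nonzero) if v < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, p_value

# Made-up Log2 ratios: one set shifted above zero, one centred on zero.
shifted = [0.5, 0.8, 1.1, 0.3, 0.9, 1.4, 0.7, 1.2, 0.6, 1.0]
centred = [-1.0, 1.0, -0.5, 0.5, -2.0, 2.0]
shifted_result = wilcoxon_signed_rank(shifted)
centred_result = wilcoxon_signed_rank(centred)
print(shifted_result)
print(centred_result)
```

In this toy example the shifted set plays the role of a GEMM/REST-like distribution, while the centred set behaves like the CMV/GEMM comparison, whose median is not distinguishable from zero.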
Mitochondria
Genes involved in ATP production, mitochondrial protein synthesis, and mitochondrial transport were consistently more abundant in GEMM cells than elsewhere (Fig 7A and C-E, blue lines), with their expression levels remaining largely unchanged in CMV+ cells (Fig 7A, and C-E, red lines). Among these, genes encoding members of the ATP synthase and NADH dehydrogenase complexes of the electron transfer chain were the most represented, together with genes encoding mitochondrial ribosomal proteins (S10 Dataset, sheet 3). This suggests that infection might preferentially start in GEMM cells due to the existence of an intracellular environment already geared toward high energy production, and hence capable of supporting the large metabolic requirements of viral replication. We did indeed previously observe a similarly strong up-regulation of genes with functions in oxidative phosphorylation and fatty acid β-oxidation in infected fibroblasts at late times pi [53], indicating that the enhancement of mitochondrial functions is a key feature of infection.
Proliferation/cell cycle
Consistent with the notion that multipotent progenitors are highly proliferative [49, 50], GEMM cells expressed higher levels of genes encoding S and M phase effectors than the rest of the population (Fig 7B and F-G, blue lines and S10 Dataset, sheet 4). CMV infection of fibroblasts was reported by us and others to repress expression of genes promoting entry into S phase, while simultaneously inducing expression of DNA synthesis effectors [53–56]. In keeping with these observations, CMV+ cells contained lower transcript amounts of genes promoting entry into S phase, such as CCNA2, CCND3, MKI67 and RB1, but higher transcript levels of genes encoding inhibitors of S phase progression, including BTG1, BTG3, CCNDBP1, CDKN1A (Cip1) and the HSC quiescence-promoting gene NDN [57], which was almost exclusively expressed in CMV+ cells (not shown). Transcription of DNA replication effectors was, by contrast, inconsistently induced. While expression of some genes, such as the catalytic subunit of the DNA polymerase delta (POLD2) and its interacting protein POLDIP2, RPA3 and the RPA complex nuclear importer RPAIN, was high, transcription of others such as PCNA, MCM3, MCM7, and FEN1 was reduced in CMV+ cells. We speculate that this mixed transcriptional regulation might be typical of the early phase of infection, when viral factors are still in the process of gaining control over cell proliferation, while at later times, when data from fibroblasts were collected [53], viral DNA synthesis is already fully established.
We previously reported that CMV infection induces the appearance of aberrant mitotic figures, supported by the induction of numerous genes involved in M phase progression [53]. Although this feature was shared by different CMV strains, it was by far more evident with the attenuated strain AD169 than with TB40/E [58]. Consistent with the TB40/E pattern, only a minority of the 63 genes with functions in mitosis were maintained at high levels in CMV+ cells, while the rest were down-regulated (Fig 7B and G), including the two main components of the mitosis-promoting factor, CDK1 and CCNB1, chromatin condensation agents (SMC2, SMC4, ZWINT and TOP2A), mitotic spindle assembly controllers (AURKB, BIRC5, PLK1, MAD2L1, and CENPF), components of the anaphase-promoting complex (CDC20 and PTTG1), and cytokinesis effectors (SEPT9, ARF6, and RAB11A).
Together, these data are consistent with a CMV-induced block in cell proliferation, aimed at curtailing usage of cellular resources for processes irrelevant to viral replication, such as mitosis, and steering others, such as those devoted to cellular genome replication, toward viral DNA production instead.
RNA metabolism
As expected for metabolically active cells, numerous genes involved in RNA processing, splicing and translation were more highly expressed in GEMM and in CMV+ cells than in the rest of the population (Fig 8A, C and D, blue and green lines and S10 Dataset, sheet 5). By contrast, expression of ∼ 70% of transcription-related genes was similar in GEMM and in the rest of the cells, but was up-regulated in CMV+ cells (Fig 8A and B, blue and red lines).
Particularly revealing of the strong impetus of infection toward stimulating cellular gene transcription on a broad scale was the induction of several RNA polymerase II subunits and elongation factors (Fig 8E), while among transcription factors, the most strongly up-regulated in CMV+ cells were the HOPX homeobox (CMV/GEMM ratio ∼14-fold), the proto-oncogene JUN (7-fold) with its heterodimerization partner BATF3 (3-fold), and the differentiation inhibitor ID2 (3-fold). Transcription of the other two JUN partners, FOS and JUNB, and of the regulators of hematopoietic cells differentiation IKZF1, SPI1/PU.1, and RUNX3 was instead reduced 2- to 9-fold (Fig 8F).
Thus, the early stages of infection appear to be associated with a sharp push towards increased production of RNA synthesis and processing effectors. This is likely required to support viral gene transcription in order to fine-tune viral control over a variety of cellular processes, including cell differentiation.
Protein metabolism
In keeping with the robust infection-associated stimulation of gene translation, expression of numerous protein chaperones and post-translational modifiers was also higher in both GEMM and CMV+ cells than in the rest of the population (Fig 9A and B-C, blue and green lines and S10 Dataset, sheet 6). Chaperone-assisted protein folding occurs via three main routes, the simplest one being via interactions with single HSP70 or HSP90 family members. Some polypeptides require the sequential binding of HSP70 and HSP90 instead, while others need the intervention of the chaperonin containing TCP1 complex (CCT) [59]. Both HSP70 coding transcripts, HSPA1A and HSPA1B, and their co-chaperone DNAJB6 were expressed at lower levels in GEMM cells than in the rest of the population, and were up-regulated in infected cells. The adaptor protein STIP1, which coordinates protein transfer from HSP70 to HSP90, the inducible (HSP90AA1) and constitutive (HSP90AB1) HSP90 isoforms, and all eight subunits of the CCT complex were expressed at higher levels in both GEMM and CMV+ cells. A similar pattern of regulation was observed for calnexin (CANX) and calreticulin (CALR), and for seven out of eleven members of the large endoplasmic reticulum (ER)-localized multiprotein complex (HSPA5, DNAJB11, HSP90B1, PPIB, PDIA6, SDF2L1 and ERP29), which, together, comprise the ER protein quality control system [60, 61] (Fig 9E).
Levels of numerous genes with roles in protein degradation were also slightly higher in GEMM and CMV+ cells (Fig 9A and D, blue and green lines). Of particular interest was the up-regulation of 17 subunits (out of 33) of the proteasome (Fig 9F). Protein degradation may benefit the virus by removing unwanted cellular polypeptides and damaged or misfolded proteins, while simultaneously enhancing amino acid availability. An essential role of the proteasome, however, is to produce antigenic peptides suitable for presentation on major histocompatibility complex (MHC) class I molecules, an activity extremely detrimental to virus spread. In the immunoproteasome, the proteolytic subunits PSMB5, 6 and 7 are replaced with PSMB8, 9 and 10. Very intriguingly, and consistent with data from infected fibroblasts [62], expression of these latter subunits was down-regulated in CMV+ cells (Fig 9F).
In addition to curtailing the ability of the immunoproteasome to produce antigenic peptides, MHC class I activities were also negatively impacted by the strong transcriptional induction of APLP2, an enhancer of MHC class I internalization and turnover [63, 64]. Rather intriguingly, transcript levels of genes encoding the three main MHC class I molecules, HLA-A, -B, and -C, and of their binding partner B2M, as well as of the three main MHC class II isotypes, HLA-DR, -DQ, and -DP and the invariant chain CD74 were already ∼ 2.5-fold lower in GEMM cells than in the rest of the population and were not further reduced in CMV+ cells (Fig 9G). By contrast, expression of HLA-DMA and HLA-DMB, which assist in the binding of high affinity antigenic peptides into MHC class II [65], was repressed, while transcription of HLA-DOA and HLA-DOB, which increase tolerance to self-peptides [65], was increased (Fig 9G). Together, these data underscore the strong effects of infection on fine-tuning the cellular protein “portfolio” to match the virus’ needs, and highlight the exquisite selectivity of viral effectors in modulating the expression of specific cellular proteins in order to protect infected cells from detection and elimination by the host immune system.
Expression of genes with functions in IFN-mediated antiviral defenses is similar in GEMM and in the rest of the cells, and is strongly down-regulated in CMV+ cells
Akin to genes belonging to categories of apoptosis, immune, lipids, soluble factors/receptors/signaling and vesicles (Fig 6C, G, H, M and N, blue line), transcript levels of IFN-related genes were overall similar in GEMM cells and in the rest of the population (Fig 6F, blue line). Very excitingly, however, this category contained the most strongly down-regulated genes of all in CMV+ cells (median Log2 CMV/GEMM ratio value of – 1.9, P < 0.0001, Fig 6F, red line).
Compared to the rest of the population, GEMM cells contained higher levels (median ratio, ∼ 1.5-fold) of transcripts encoding sensors of viral double-stranded DNA and RNA, such as IFI16 [66], HMGB1 [67], DDX58/RIG-I [68], IFIH1/MDA5 [69] and EIF2AK2/PKR [70], of signaling mediators like STAT1, and of transcriptional activators such as IRF3, IRF7 and IRF8 [71], but lower levels (median ratio, ∼ 4-fold) of negative regulators of IFN production and signaling such as IRF2 [72], IRF4 [73], TRAFD1 [74] and SOCS1 [75]. Expression of IFN effectors including IFIT1, IFIT2 and IFIT3, which recognize and prevent translation of virally produced triphosphorylated RNA molecules [76], IFITM1, IFITM2 and IFITM3, which block infection at multiple steps including entry [77], ISG15 and its conjugating (HERC5) and de-conjugating (USP18) enzymes, which disrupt the activity of viral proteins by ISGylation [78], as well as of known (MX1, MX2, OAS1, OAS2, OAS3 and OASL) [79, 80], or suspected anti-viral proteins such as viperin [81], SAMHD1 [82] and ISG20 [83] were instead similarly abundant in GEMM and the rest of the cells (median ratio, ∼ 1.1-fold) (Fig 10, GEMM/REST column).
Together, these data suggest that GEMM cells are not defective in their ability to detect, respond to, and potentially antagonize viral infection. Rather, GEMM cells appear to be as responsive as, or even more responsive than, the rest of the population, indicating that the lack of appropriate cellular defenses is unlikely to be the main reason for their preferential infection. Interestingly, and similar to the situation with MHC class I and II genes, transcriptional modulation of these genes in CMV+ cells appeared to be selective: while mRNA levels of most IFN antiviral effectors were powerfully reduced, transcription of negative regulators was enhanced, with the notable exception of the adaptor protein TMEM173/STING [84–87] and the TBK1 activator OPTN [88], which are both involved in IFN production following CMV DNA detection by MB21D1/cGAS [89]. Taken together, these findings provide support to our theory whereby infection preferentially begins in GEMM cells due to their higher metabolic, proliferative, and RNA and protein synthesis rates, rather than to impairments in their capacity to mount strong cellular defenses.
DISCUSSION
In previous studies, we showed that activated and non-activated myeloid cells differentiated from CD34+ HSC are semi-permissive to CMV infection [9, 14–16]. As such, they constitute a useful model to identify the cellular determinants of viral tropism.
Based on our current and previous data, resistance to infection appears to be multilayered, affecting multiple sequential steps in the viral life cycle, and progressively narrowing the proportion of cells capable of supporting the full viral replication cycle. While in homogeneous permissive cell populations such as fibroblasts the probability for a cell to remain free of viral particles at an MOI of ten is null, our data show that at day one pi ∼ 40% of activated myeloid cells do not contain any viral RNAs (Fig 2). However, of the CMV-transcript+ cells, only ∼ 7% express the UL122 and UL123 ORFs necessary for infection onset, and of these, about half contain additional transcripts needed for CMV genome replication. While the proportion of cells progressing toward the replicative stage may increase over time, the number of IE1/IE2+ cells remained unchanged from day two pi onwards (Fig 1), suggesting that the cellular barriers restricting infection onset are never overcome. Despite the presence of donor-dependent variation (expected not only for primary cells but especially for hematopoietic cell types), activated cells consistently contained larger proportions of IE1/IE2+ cells but produced less progeny (Fig 1). We used these less permissive cells as a tool to identify the cellular pathways involved in inhibiting (or promoting) infection.
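The figures quoted in this paragraph can be made concrete with a short calculation. The sketch below is illustrative only: it assumes that infectious particles distribute over cells according to a Poisson law (a common simplification, not something measured in this study), and it uses 7,000 cells as an example population size. The fractions are the approximate values quoted above.

```python
import math


def fraction_without_particles(moi: float) -> float:
    """Poisson probability that a cell receives zero infectious particles.

    Assumption: particles land on cells independently and every cell is
    equally accessible, as in a homogeneous permissive monolayer.
    """
    return math.exp(-moi)


def infection_funnel(total_cells, rna_pos_frac, ie_frac_of_rna_pos, repl_frac_of_ie):
    """Apply the sequential restriction steps and return cell counts per step."""
    rna_positive = total_cells * rna_pos_frac
    ie_positive = rna_positive * ie_frac_of_rna_pos
    replicating = ie_positive * repl_frac_of_ie
    return rna_positive, ie_positive, replicating


# Expected fraction of cells with no particle at an MOI of ten (about 4.5e-05).
p_zero = fraction_without_particles(10)

# Approximate fractions from the text: ~60% viral RNA+, ~7% of those
# UL122/UL123+, and about half of those with replication-related transcripts.
rna_pos, ie_pos, repl = infection_funnel(7000, 0.60, 0.07, 0.5)

print(f"P(no particle | MOI 10) = {p_zero:.2e}")
print(f"viral RNA+ cells: {rna_pos:.0f}")
print(f"UL122/UL123+ cells: {ie_pos:.0f}")
print(f"cells with replication transcripts: {repl:.0f}")
```

Under these assumptions virtually every cell should encounter a particle, so the ∼ 40% of viral RNA-negative cells cannot be explained by chance alone, while the funnel leaves only about 2% of the total population on a path toward genome replication.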
Although the absence of viral RNAs in ∼ 40% of activated cells may depend, at least in part, on timing and detection limits, it is also possible for this portion of the population to be more resistant to viral entry due to the presence of specific restriction factors and/or the absence of entry facilitators. However, no specific cellular genes were identified as being selectively transcribed in viral RNA+ or RNA− cells. Transcripts coding for proteins currently known to support virion entry were also either absent (EGFR, THY1/CD90 and ITGB3) or found in only a minute proportion of cells (ITGAV, ITGA2, ITGA6 and PDGFRA), while more abundant levels of transcripts coding for BSG and ITGB1 did not selectively partition with viral RNA+ cells. Because our myeloid cell cultures are highly heterogeneous (Fig 4), preferential infection of select sub-groups may still have been facilitated by the expression of specific genes. BSG, for instance, was present in 99% of GEMM cells but in only 60% of cluster 7 cells. Conversely, viral RNA− cells may have resisted infection owing to the expression of subset-specific molecules. However, the fact that no “universal” entry resistance/enabling gene(s), expressed by all viral transcript+/− cells, could be identified implies that such gene(s) may not exist. This is in contrast to other cell types such as endothelial and epithelial cells, whose infection instead depends on the expression of surface molecules (such as BSG), acting as receptors for specific glycoprotein complexes present on the virion’s surface [37].
Only a small fraction (∼ 3%) of CMV-transcript+ cells expressed multiple viral ORFs at high levels. Presumably, these represent cells that will progress toward lytic replication, as corroborated by the presence of early viral proteins in a similar proportion of cells (Fig 1 and Fig 3). Thus, we wondered about the fate of infection in the remainder of the cells, which contain lower amounts of viral transcripts, and no detectable viral proteins. Intriguingly, a similar scenario was recently encountered following single-cell RNA sequencing of TB40/E-infected CD14+ and CD34+ cells [90]. Elevated levels of viral transcripts were observed in just ∼ 2% of monocytes, while the rest of the population, which contained lower amounts of a wide range of viral transcripts, were interpreted as potentially being latently infected. This led us to speculate that the remaining CMV-transcript+ cells in our population might be either latently infected or on a path toward latency. While this hypothesis requires additional testing, it remains a thrilling possibility, especially in view of recently presented evidence supporting the potential association of viral latency with quantitative rather than qualitative changes in viral gene expression [90]. Alternatively, it is of course possible for viral transcripts to simply be detected and eliminated by cellular defense mechanisms, producing an abortive infection.
Although expression of CD207/langerin and CD1a was observed in a number of cells within the population, Langerhans cells did not appear to be the main source of infected cells, at least at day one pi. Rather, multiple lines of evidence indicate that CMV+ cells derive from a cluster with the hallmarks of GEMM colony forming units, albeit devoid of transcripts coding for some of the markers traditionally used to describe this population, i.e. CD34, CD38, and CD123. As none of the genes we and others [47, 48] found to be selectively expressed by these cells encode surface molecules, their isolation from either in vitro differentiated myeloid populations or hematopoietic tissues is particularly challenging. Consequently, we do not currently have direct evidence that this specific cell type can support CMV lytic infection in vivo.
Very recent data from single-cell RNA-seq analyses of hematopoietic processes have revealed that lineage development is a continuous process, more usefully depicted by Waddington’s landscapes [91], than by more rigid cell differentiation trees. In this emergent scenario, CD34+ HSC are visualized as beads rolling along a surface stretching from a higher to a lower point in space, and containing ridges and valleys. These ridges, corresponding to barriers separating individual lineages, are smaller near the top and become increasingly higher towards the bottom as expression of fate mediators progresses in each cell. Once ridges become too high cells can no longer change their identity and terminal lineages are established [47, 48, 92, 93]. We believe that the permissive cell type we identified in this study corresponds to a mid-point along this surface, characterized by the loss of pluripotency, but not yet enclosed by the high ridges separating granulocytes, monocytes, erythrocytes and megakaryocytes from each other. While this cell type is likely to exist in vivo, it may have been missed in previous studies of CMV tropism due to its rarity, and/or to the lack of specific surface markers.
An interesting question in this regard is: when did these cells arise during CD34+ HSC differentiation, and what factors influence this process? The CD34+ cells we employed in this study were isolated from cord blood and were amplified for 8-10 days in the presence of FL, stem cell factor (SCF), and thrombopoietin (TPO) before differentiation. Others have shown that GEMM cell numbers increase by 850-fold during culture of CD34+ cord blood cells in the presence of FL and TPO for 15 weeks [94], while CD34+ HSC from peripheral blood produce lower total cell numbers and colony forming units than CD34+ HSC from cord blood after three weeks of exposure to FL plus TPO or FL plus TPO plus SCF [95]. Rather interestingly, HSC expansion was also associated with the rapid loss of the CD34 marker [95]. These data suggest that cord blood-derived HSC might have a stronger propensity to develop into GEMM cell-containing populations than cell populations isolated from peripheral blood or bone marrow. FL, SCF, and TPO promote self-renewal of CD34+ cells and have been used to expand cord blood HSC in vitro for therapeutic intervention [94–98]. While all three cytokines stimulate HSC division, TPO also drives megakaryocyte development [99], and FL, which steers hematopoiesis toward the lympho-myeloid lineage at the expense of erythrocytes/megakaryocytes, is essential for the generation of dendritic cells [100]. Thus, in addition to stimulating HSC proliferation, these cytokines may have provided the very first “ridges”, nudging HSC differentiation toward GEMM cells. Intriguingly, we were able to detect the presence of progeny virus in the culture supernatant of amplified (but not of non-amplified) CD34+ cells exposed to TB40/E, albeit with low frequencies (not shown). This led us to wonder if, perhaps, GEMM cells might be present in amplified HSC cultures even before exposure to the differentiation cocktail. 
Thus, cell culture conditions are critical when studying HSC in conjunction with CMV. CD34+ cells are extremely plastic, and can clearly give rise to clustered sub-populations of myeloid cells, some of which are permissive to lytic infection. Being a minority in the population, these clusters can easily escape detection and may introduce unwanted and unnoticed “lytic noise” [90] in studies of viral latency.
While the reason for the preferential infection of CMV-transcript+ cells remains unclear, our data provide a plausible rationale for initiation of lytic infection in GEMM cells; i.e., their higher expression of multiple gene products involved in energy, RNA, and protein production, as well as in cell cycle control, which likely create an intracellular environment particularly conducive to infection onset by lowering the amount of energy required from viral effectors to steer cellular processes away from cell needs and toward viral replication.
We initially reported the up-regulation of numerous genes with functions in mitochondrial oxidative phosphorylation, fatty acid β-oxidation, and malate-aspartate, ATP, and citrate transport systems in infected fibroblasts at late times pi [53]. Our findings were subsequently confirmed and expanded by a number of studies in fibroblasts and other cell types [101–104]. To our knowledge, the current work is the first report showing that a strong transcriptional induction of this type of genes also occurs in myeloid cells and at early times post-entry (Fig 7), making it a hallmark of CMV infection in different cell types and a requirement for successful viral replication.
The cell cycle is also a very well-known target during viral infection. Multiple studies have shown that CMV infection drives host cells into the G1/S phase to shunt cellular resources required for DNA synthesis and repair toward viral rather than cellular genome replication. Manipulation of cell cycle functions occurs at multiple levels, including protein transcription, translation, stability, posttranslational modification, and subcellular localization [105]. We previously showed that CMV infection is associated with a very strong positive impact on the expression of multiple S phase, M phase, and DNA activity regulators in fibroblasts, leading to the appearance of aberrant mitotic figures, which we called pseudomitosis, at late times pi [53, 58]. Here, we found that expression of genes involved in S phase control was higher in GEMM cells and remained high in CMV+ cells, whereas transcription of M phase regulators was reduced (Fig 7). While fibroblasts were infected at confluency (when the majority of the cells are in G0/G1), GEMM cells were likely actively proliferating at the moment of contact with CMV. We thus believe that our new data highlight the exquisite ability of infection to fine tune its impact on gene transcription according to the conditions of the cell at the time of entry, in order to reach optimal expression levels of specific genes useful to viral replication. Although entry into mitosis is clearly detrimental to viral replication [106], the presence of select M phase proteins may be needed to perform specific tasks, such as viral genome compaction, disentangling, or transport. To reach an ideal protein concentration these genes may thus need to be transcriptionally upregulated in quiescent fibroblasts, whereas downregulation may prevail when cells are already actively cycling.
High expression levels of genes involved in RNA processing, splicing, and translation were not unexpected for metabolically active cells such as GEMM, and neither was it surprising that they were maintained in CMV+ cells (Fig 8). Indeed, a similar scenario was observed by us [53] and others [107] in infected fibroblasts. Again, our data validate and expand these findings to include myeloid cells, clearly marking these metabolic processes as pivotal for successful infection.
Expression of several genes with essential roles in hematopoietic development was also altered in CMV+ cells (Fig 8). These include HOPX, which regulates primitive hematopoiesis [108], BATF3, vital for the development of conventional cross-presenting CD8α+ dendritic cells, ID2, whose expression in CD34+ HSC inhibits the development of dendritic cell precursors [109], RUNX3, whose depletion leads to defects in the proliferation and differentiation of activated cytotoxic CD8+ T cells, helper Th1 cells and NK cells, and to the disappearance of skin Langerhans cells [110], IKZF1, essential for normal lymphopoiesis and for myeloid, megakaryocyte and erythroid differentiation [111], and SPI1/PU.1, which is critical for the generation of all hematopoietic lineages [112]. If dysregulated expression of these genes also occurs in infected progenitors in vivo, it may powerfully affect the development and functions of multiple arms of the hematopoietic system, providing potential new culprits for the infection-associated problems ensuing congenital infection and hematopoietic stem cell transplantation.
Finally, we observed a much stronger negative impact of infection on the expression of genes with functions in the production and responses to IFN than previously reported, accompanied by the induction of a very small, and possibly selected, number of genes. To our knowledge this is the first analysis of transcriptional responses to CMV infection in myeloid cells conducted at the single-cell level and, hence, capable of comparing gene expression levels in lytically infected cells to those in bystander cells co-existing within the same population. Our data thus provide a new perspective on how host defenses are raised and subsequently offset by the virus, in contrast to previous analyses that compared mean gene expression levels in CMV-infected samples to those in separate mock-infected cells [113–117]. Aside from the detection method, differences may also depend on the time pi, the cell type, and the strain of virus used. Infection recognition was shown to occur very rapidly in monocytes and fibroblasts, leading to the activation of the transcription factors IRF3 and NF-κB, and to the implementation of the IFN transcriptional program within 4-8 hours [113, 114, 116, 118–122]. Structural components of the virion, such as the tegument proteins pp65 and/or pp71 [113, 114, 123, 124], as well as viral immediate-early and early proteins [125, 126] then cooperate to blunt these responses via multiple mechanisms, including inactivation of the double-stranded DNA sensor MB21D1/cGAS, blockage of the STING-TBK1-IRF3 complex assembly, inhibition of NF-κB binding to DNA, and degradation of the signal transduction molecules JAK1 and STAT2 [123, 124, 127–132]. IFN-related genes that are highly transcribed at 4-8 hours pi may thus be downregulated at 24 hours pi, or upon full implementation of viral countermeasures.
The importance of viral anti-IFN defenses is indeed underscored by the fact that five of the nine genes more abundantly expressed in CMV− cells encode IFN-induced antiviral proteins (MX1, OAS1, OAS2, IFIT3, and USP18, Supplementary Fig 1), suggesting that effective downregulation of their expression could not be achieved in the absence of specific viral gene products.
Basal expression of sensors, signal transducers, and IFN-inducible genes as well as the speed and strength whereby antiviral responses are mounted can also be cell type dependent [133]. IRF3 and IRF7, for instance, were shown to be required for IFN-β induction in response to West Nile virus infection in murine fibroblasts but not in macrophages and dendritic cells, implying that detection of viral components proceeds via different pathways in these cell types [134]. Myeloid cell responses to CMV infection may thus differ from those of fibroblasts, while in latently-infected monocytes, IFN-related gene expression may remain high due to the absence of viral lytic proteins.
Finally, cell responses may also be affected by the virus strain, as virion content of pp65, reported to block IFN induction at very early times post-entry [113, 135], was shown to vary dramatically in different CMV strains [136], while STAT2 degradation was observed to occur in fibroblasts infected with CMV clinical isolates or with strain AD169, but not strain Towne [128]. Altogether, our data broaden the number of IFN-related genes susceptible to transcriptional regulation by CMV to include effectors with currently no known role in CMV infection inhibition, which may thus represent new host encoded anti- or pro-viral proteins.
In summary, our data provide evidence in favor of the existence of a new type of myeloid cells potentially permissive to CMV lytic infection, offer a reasonable theory regarding their preferential infection over other cell types present in the same population, substantially expand our understanding of the cellular determinants of CMV tropism for myeloid and other types of cells, and provide new candidate pro- and anti-viral molecules for future studies and potential therapeutic interventions.
MATERIALS AND METHODS
Cells and virus
Umbilical cord blood CD34+ HSC were purchased from STEMCELL Technologies Inc, Vancouver, Canada and pre-amplified in α-Minimum Essential Medium (Thermo Fisher Scientific, Waltham, MA) supplemented with 20% heat-inactivated FBS (Gibco, Fisher Scientific, Waltham, MA), 375 ng/ml of FL, 50 ng/ml of SCF and 50 ng/ml of TPO for 8-10 days at a density of 1 × 10⁴ cells/well in 48-well tissue culture plates. Cells were then differentiated in serum-free X-VIVO 15 medium (Lonza/BioWhittaker, Allendale, NJ) supplemented with 1,500 IU/ml of GM-CSF (Leukine Sargramostim), 150 ng/ml of FL, 10 ng/ml of SCF, 2.5 ng/ml of tumor necrosis factor-α, and 0.5 ng/ml of transforming growth factor β1 for eight days at a density of 1 × 10⁵ cells/well in 48-well plates. Activation of differentiated cells was then induced by exposure to X-VIVO 15 medium containing 10% standard FBS (US origin, Gibco, Fisher Scientific, Waltham, MA), 1,500 IU/ml of GM-CSF, 200 ng/ml of CD40L, and 500 ng/ml of LPS (Sigma-Aldrich, St. Louis, MO) for two days at a density of 1 × 10⁵ cells/well in 48-well plates. All cytokines were from Peprotech, Rocky Hill, NJ. Human foreskin fibroblasts were propagated in Dulbecco’s Modified Eagle Medium (Corning Cellgro, UCSF CCF, San Francisco, CA) supplemented with 10% fetal clone serum III, 100 U/ml penicillin, 100 µg/ml streptomycin, 4 mM HEPES (all from HyClone, Fisher Scientific, Pittsburgh, PA), and 1 mM sodium pyruvate (Corning Cellgro, UCSF CCF, San Francisco, CA). CMV strain TB40/E, a gift from C. Sinzger (University of Ulm, Ulm, Germany), was propagated on fibroblasts and purified by ultracentrifugation as previously described [53].
Myeloid cell infection
Differentiated myeloid cell populations were exposed to TB40/E at a calculated MOI of ten for four hours, washed twice and further cultured for ten days. Cells were harvested on days 2, 4, 6, 8, and 10 pi, counted, and used in immunofluorescence staining analyses and titration assays.
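The inoculum needed to reach a calculated MOI follows directly from the cell count and the titer of the virus stock. The sketch below is illustrative only: the stock titer and the number of cells per well at the time of infection are hypothetical values, not taken from this study.

```python
def inoculum_volume_ml(cells_per_well: float, moi: float, titer_ifu_per_ml: float) -> float:
    """Volume of virus stock (ml) that delivers `moi` infectious units per cell."""
    if titer_ifu_per_ml <= 0:
        raise ValueError("titer must be positive")
    infectious_units_needed = cells_per_well * moi
    return infectious_units_needed / titer_ifu_per_ml


# Hypothetical inputs: 1e5 cells per well and a stock of 1e8 infectious units/ml.
volume = inoculum_volume_ml(cells_per_well=1e5, moi=10, titer_ifu_per_ml=1e8)
print(f"{volume * 1000:.1f} µl of stock per well")
```

Because the MOI is calculated from a titer measured on fibroblasts, the number of particles that actually enter each myeloid cell may differ from the nominal value.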
Immunofluorescence staining analyses
Cell staining was performed as previously described [16]. Briefly, cytospin preparations of myeloid cells were fixed in 1.5% formaldehyde for 30 min, permeabilized in 0.5% Triton-X 100 for 20 min, and blocked in 40% FBS/40% goat serum for 30 min before incubation with antibodies directed against the viral proteins IE1/IE2 (MAb810, 1:600, or AF488 MAB810X, 1:200, Millipore, Temecula, CA), UL32 (pp150, 1:400, a kind gift from Bill Britt, University of Alabama, Birmingham), UL84 (1:500, Virusys, Taneytown, MD), UL44 (1:200, Virusys, Taneytown, MD), or UL57 (1:100, Virusys, Taneytown, MD) for one hour, followed by secondary antibodies conjugated to Alexa-Fluor 488 or Alexa-Fluor 594 (1:200, Invitrogen, Carlsbad, CA, and Jackson Immunoresearch, West Grove, PA) for another hour. Nuclei were labeled with Hoechst 33342 (0.2 mg/ml; Molecular Probes, Eugene, OR) for three min. Samples were viewed using a Nikon Eclipse E600 fluorescence microscope equipped with Ocular imaging software.
Virus titrations
Cell-associated virus was released from pelleted myeloid cells by sonication for ∼ 5-10 seconds on ice using a Branson Ultrasonics Sonifier 150 and incubated with fibroblasts for one hour. After 24 hours infected fibroblasts were stained for IE1/IE2 expression.
Statistical analysis
All data were analyzed using Prism 7 (GraphPad Software). Unpaired t-tests were used to compare data from non-activated and activated cells in Fig 3. Differences were considered significant at P < 0.05. The Wilcoxon signed-rank test was used to compare median ratio values from data distributions with a hypothetical median of zero.
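For readers unfamiliar with the one-sample Wilcoxon signed-rank test, the following minimal sketch shows how a set of Log2 ratios can be compared with a hypothetical median of zero. It is not the Prism implementation: it enumerates the exact null distribution, which is only practical for small samples, and the input values are made up for illustration.

```python
from itertools import product


def wilcoxon_signed_rank_exact(values, hypothetical_median=0.0):
    """Two-sided exact Wilcoxon signed-rank test against a hypothetical median.

    Values equal to the hypothetical median are dropped, and tied absolute
    differences share their average rank. Returns (statistic, p_value), where
    the statistic is the smaller of the positive and negative rank sums.
    """
    diffs = [v - hypothetical_median for v in values if v != hypothetical_median]
    n = len(diffs)

    # Rank the absolute differences, averaging ranks within tie groups.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    statistic = min(w_plus, total - w_plus)

    # Under the null hypothesis every sign pattern is equally likely.
    as_extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= statistic:
            as_extreme += 1
    return statistic, as_extreme / 2 ** n


# Hypothetical Log2 CMV/GEMM ratios for eight genes (illustrative values only).
log2_ratios = [-2.4, -1.9, -3.1, -0.8, -1.5, -2.2, -1.1, -2.7]
statistic, p_value = wilcoxon_signed_rank_exact(log2_ratios)
print(statistic, p_value)
```

With all eight ratios below zero, only the two most extreme sign patterns are as extreme as the observed one, so the exact two-sided P value is 2/256. For the large gene sets analyzed here, statistical packages typically switch to a normal approximation instead of full enumeration.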
Single-cell RNA-seq generation and analysis
Activated myeloid cells differentiated from the CD34+ HSC of a representative donor (113G) were infected with TB40/E at an MOI of 10, washed twice, and further incubated for 24 hours. Cells were then processed through the Chromium Single-cell 3′ v2 Library Kit (10X Genomics) by the Genetic Resources Core Facility Cell Center and BioRepository, Johns Hopkins University, Baltimore, MD. Briefly, 10,000 cells were loaded onto a single channel of the 10X Chromium Controller. Messenger RNA from approximately 7,000 cells captured and lysed within nanoliter-sized gel beads in emulsion was then reverse transcribed and barcoded using polyA primers with unique molecular identifier sequences before being pooled, amplified, and used for library preparation. The library was then sequenced in two lanes of an Illumina HiSeq 2500 Rapid Flowcell system. Demultiplexing of the bcl file into a FASTQ file was performed using Cell Ranger mkfastq software, and alignments to human (hg19) or TB40/E (NCBI EF999921.1) genome reference sequences were performed using STAR [137]. Dimensionality reduction of data was performed by principal component analysis using N = 10 principal components, and reduced data were visualized in two dimensions using the t-SNE nonlinear dimensionality reduction method [18]. Clustering for expression similarity was performed using both graph-based and K-means (with K = 10 clusters) methods by Cell Ranger [42]. Clusters and differential expression analyses generated by Cell Ranger were then visualized using Loupe™ Cell Browser [19]. For each gene in each cluster, three values were computed and reported in supplemental datasets: 1) the mean number of unique molecular identifier counts; 2) the log2 fold-change of each gene’s expression in cluster x relative to other clusters; and 3) the p-value denoting significance of each gene’s expression in cluster x relative to other clusters, adjusted to account for the number of hypotheses (i.e., genes) being tested.
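The per-gene values listed above can be illustrated with a simplified calculation. The sketch below is not Cell Ranger's algorithm: the pseudocount used in the fold-change and the choice of the Benjamini-Hochberg procedure for the multiple-testing adjustment are assumptions made for illustration, and the gene names, counts, and raw p-values are made up.

```python
import math


def log2_fold_change(mean_in_cluster, mean_in_others, pseudocount=1.0):
    """Log2 ratio of mean UMI counts; the pseudocount avoids division by zero."""
    return math.log2((mean_in_cluster + pseudocount) / (mean_in_others + pseudocount))


def benjamini_hochberg(p_values):
    """Adjust p-values for the number of hypotheses (Benjamini-Hochberg)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical genes: (mean UMI in cluster x, mean UMI in other clusters, raw p).
genes = {
    "GENE_A": (3.0, 0.0, 0.005),
    "GENE_B": (1.2, 1.0, 0.04),
    "GENE_C": (0.4, 2.5, 0.01),
    "GENE_D": (5.0, 4.0, 0.03),
}
adjusted = benjamini_hochberg([raw_p for _, _, raw_p in genes.values()])
for (name, (in_x, in_rest, _)), adj_p in zip(genes.items(), adjusted):
    print(name, round(log2_fold_change(in_x, in_rest), 2), round(adj_p, 3))
```

A positive log2 fold-change marks a gene as enriched in cluster x, and the adjusted p-value controls the expected proportion of false discoveries across all genes tested.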
Monocle clustering and single cell ordering in pseudotime
Cells belonging to the cluster 7, erythro, mono, MDDC, CMV+, promyelo, act neut, and sub-cluster 3 groups depicted in Fig 4C were used for pseudotime analysis. Gene-cell matrices produced by Cell Ranger were loaded into R with cellrangerRkit (https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/rkit) and pseudo-temporal assignment was performed with Monocle version 2.99.0 [39] using N = 5 principal components. Marker genes were found using Seurat’s FindAllMarkers function [52], and groups were identified based on the expression of gene markers from Fig 4C and S6 Fig. The root of the tree was manually selected using orderCells from Monocle, defined by the point of origin of the majority of the branches.
Data availability
All single-cell data files are deposited in Gene Expression Omnibus under accession number GSExx.
ACKNOWLEDGMENTS
We thank Melissa Olson, Director, Genetics Research Core Facility and Biorepository & Cell Center and David Mohr, Director, High Throughput Sequencing at the Johns Hopkins Medical Institutes, Baltimore MD for technical assistance with single-cell RNA-seq analyses using the 10X genomics Chromium platform. We are also grateful to Aharon Nachshon, Weizmann Institute of Science, for sharing the annotated reference for the TB40/E transcription units, to Christian Sinzger for the kind gift of CMV strain TB40/E, and to Bill Britt for providing us with the anti-pp150 antibodies.
REFERENCES
- 1.
- 2.
- 3.
- 4.
- 5.
- 6.
- 7.
- 8.
- 9.
- 10.
- 11.
- 12.
- 13.
- 14.
- 15.
- 16.
- 17.
- 18.
- 19.
- 20.
- 21.
- 22.
- 23.
- 24.
- 25.
- 26.
- 27.
- 28.
- 29.
- 30.
- 31.
- 32.
- 33.
- 34.
- 35.
- 36.
- 37.
- 38.
- 39.
- 40.
- 41.
- 42.
- 43.
- 44.
- 45.
- 46.
- 47.
- 48.
- 49.
- 50.
- 51.
- 52.
- 53.
- 54.
- 55.
- 56.
- 57.
- 58.
- 59.
- 60.
- 61.
- 62.
- 63.
- 64.
- 65.
- 66.
- 67.
- 68.
- 69.
- 70.
- 71.
- 72.
- 73.
- 74.
- 75.
- 76.
- 77.
- 78.
- 79.
- 80.
- 81.
- 82.
- 83.
- 84.
- 85.
- 86.
- 87.
- 88.
- 89.
- 90.
- 91.
- 92.
- 93.
- 94.
- 95.
- 96.
- 97.
- 98.
- 99.
- 100.
- 101.
- 102.
- 103.
- 104.
- 105.
- 106.
- 107.
- 108.
- 109.
- 110.
- 111.
- 112.
- 113.
- 114.
- 115.
- 116.
- 117.
- 118.
- 119.
- 120.
- 121.
- 122.
- 123.
- 124.
- 125.
- 126.
- 127.
- 128.
- 129.
- 130.
- 131.
- 132.
- 133.
- 134.
- 135.
- 136.
- 137.