Abstract
Embryonic stem (ES) cells represent a popular model system for investigating development, tissue regeneration and repair. Although much is known about the molecular mechanisms that regulate the balance between self-renewal and lineage commitment in ES cells, the spatiotemporal integration of responsive signalling pathways with core transcriptional regulatory networks is complex and only partially understood. Moreover, measurements made on populations of cells reveal only average properties of the underlying regulatory networks, obscuring their fine detail. Here, we discuss the reconstruction of regulatory networks in individual cells using novel single cell transcriptomics and proteomics, in order to expand our understanding of the molecular basis of pluripotency, including the role of cell-cell variability within ES cell populations, and ways in which networks may be controlled in order to reliably manipulate cell behaviour.
1 Introduction
Our understanding of pluripotency has grown tremendously since pioneering studies described the derivation and in vitro culture of pluripotent stem cells (PSCs) [1–8]. It is now well known that PSCs display two characteristic features: 1) indefinite self-renewal in vitro, and 2) tri-lineage commitment to ectoderm, endoderm and mesoderm, once released from the self-renewing regime. Knowledge of both these properties has been predominantly generated from aggregates of cellular material, and therefore represents the average behaviour of hundreds or thousands of cells. Nevertheless, from a wealth of regulatory relationships, a limited set of core transcription factors has been inferred and validated, resulting in the construction of a now reliable regulatory model for the pluripotent state [9]. However, more recent measurements at the single cell level have highlighted the presence of significant biological variability and heterogeneity among clonal PSC populations [10], suggesting that subtle cell-cell variations in network configurations may have an important role in regulating pluripotency [11–15]. These results stress the importance of reconstructing regulatory networks at the individual cell level in order to uncover the refined mechanisms that balance self-renewal and differentiation in vitro and cellular propensities for different developmental states. In this article, we sketch out our current understanding of the integrated regulatory network (IRN) that controls the transient developmental state of pluripotency and discuss the ways in which more refined single cell regulatory networks are enhancing our understanding of PSC states and the ways in which PSCs balance self-renewal and lineage commitment at the individual cell level.
2 Combinatorial control of pluripotency by regulatory networks
The pluripotent state in mouse and human cells is regulated by a number of integrated regulatory networks (previously reviewed [16–20]), including transcriptional [21], epigenetic [22], signalling [23] and metabolic [24] sub-networks (represented schematically in Figure 1). In the presence of defined extrinsic stimuli [25,26], the pluripotent state is maintained by a cell-intrinsic set of transcription factors (TFs) that constitute a self-sustaining gene regulatory network (GRN) that is rich in feedback [27]. At the centre of this GRN lies a core network of TFs composed of Oct4, Sox2 and Nanog, with significant support from secondary factors such as Klf4, Myc and Lin28 [28]. Combinations of these TFs were originally found to revert the identity of terminally differentiated somatic cells towards the pluripotent identity [6–8]; however, subsets of these factors are also sufficient to reconstitute pluripotency in somatic cells [29,30]. The members of this core GRN interact with a range of auxiliary transcription factors [31–36], which collectively control transcription of a large number of genes either directly, by binding to gene promoters [21,33,37,38], or indirectly, by mediating the effects of epigenetic remodelling complexes [20,39,40]. These complexes help maintain pluripotency by producing a permissive chromatin state that allows for widespread nonspecific transcription [41], in which important developmental genes are sporadically expressed at low levels, yet remain poised for robust expression under the appropriate differentiation cues [42–44]. To buffer this “noisy” environment, a network of microRNAs [45–48] and ribosome-specific mechanisms [49] ensures that appropriate protein levels are robustly maintained. In addition to these cell-intrinsic regulatory mechanisms, a layer of signalling pathways integrates cell-extrinsic information into the central pluripotency GRN.
While the core transcriptional circuitry is broadly similar in mouse and human cells [50], mouse embryonic stem cells (mESCs), mouse epiblast stem cells (mEpiSCs) and human embryonic stem cells (hESCs) display marked differences in their dependence on extrinsic signalling factors. In mESCs, Lif/Stat signalling [51,52], Bmp [53] and canonical Wnt [25] promote self-renewal, while Fgf/Erk signalling disrupts pluripotency [25,54–56]. In contrast, hESCs and mEpiSCs require Activin and Fgf [57,58] signalling for self-renewal, and cells in this “primed” pluripotent state undergo differentiation when exposed to Bmp [58], while Lif/Stat signalling has no measurable effect on their self-renewal in vitro [59]. Importantly, the flow of information between signalling and transcriptional regulatory networks is not one-way: signalling networks relay external environmental information to the core GRN, while the core GRN affects the expression of the pathway components themselves, or of key miRNAs that in turn regulate signalling pathway components [60]. Collectively, these reports indicate that pluripotency is regulated by mechanisms that act at both the transcriptional and translational levels and involve layers of combinatorial regulatory control, including complex feedback relationships between transcriptional, epigenetic and signalling mechanisms. However, while this model has been tremendously successful, much of this information has been inferred from bulk properties of large ensembles of cells. Within individual cells, regulatory networks may adopt a variety of different states and may deviate dramatically from this ensemble model. Thus, a better understanding of how cell-cell variation in network structure affects cell population function is now needed.
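As a minimal illustration of how an ensemble average can misrepresent individual cells, the following Python sketch simulates a hypothetical clonal population in which a single factor is expressed in one of two states. The expression values, the subpopulation fraction and the function name are assumptions chosen purely for illustration and do not correspond to measured data.

```python
import random
import statistics


def simulate_population(n_cells=1000, frac_high=0.5, seed=0):
    """Return synthetic expression levels for one factor (arbitrary units).

    Hypothetical model: each cell is either in a 'high' state (mean 10)
    or a 'low' state (mean 2), with unit standard deviation in both.
    """
    rng = random.Random(seed)
    levels = []
    for _ in range(n_cells):
        if rng.random() < frac_high:
            levels.append(rng.gauss(10.0, 1.0))  # 'high' subpopulation
        else:
            levels.append(rng.gauss(2.0, 1.0))   # 'low' subpopulation
    return levels


levels = simulate_population()

# What a bulk assay would report: a single population average.
bulk_mean = statistics.mean(levels)

# Fraction of individual cells whose level lies within 1 unit of that average.
near_mean = sum(abs(x - bulk_mean) < 1.0 for x in levels) / len(levels)

print(f"bulk mean: {bulk_mean:.2f}")
print(f"fraction of cells near the bulk mean: {near_mean:.3f}")
```

In this toy population the bulk mean lies between the two states, so very few individual cells have an expression level close to the value that a bulk assay would report.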
3 Regulatory networks at the single cell level
In contrast to ensemble networks from bulk cell material, single cell measurements of co-expression patterns are able to reveal a more nuanced picture of regulatory networks within cell populations [61]. Traditionally, flow cytometry (FC) has been used to quantify co-expression patterns in individual cells. However, FC methods are intrinsically limited in the number of factors that can be co-assessed (currently up to about 18), mainly due to the spectral resolution of fluorescently labelled antibodies [62]. These limitations are gradually being overcome, for instance using new methods such as Cytometry by Time of Flight (CyTOF), which, at present, is able to quantify co-expression of up to approximately 50 different proteins in individual cells via immuno-labelling with elemental isotopes [62–64]. Similarly, complementary nucleic-acid-based techniques such as RNA-FISH [65,66], qRT-PCR [67] and RNA-seq [68] are also able to assess multi-dimensional transcript expression patterns at single cell resolution. These emerging single cell technologies are now enabling broad expression profiles across a large number of cells to be obtained [69], from which single cell regulatory networks can be inferred [70]. Table 1 summarizes some notable uses of these methods to profile PSCs.
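To make the idea of deriving network structure from single cell co-expression data more concrete, the sketch below builds a simple co-expression graph by thresholding pairwise Pearson correlations. This is only one elementary stand-in for network inference and is not claimed to be the approach used in the cited studies; the gene labels, the synthetic data and the threshold of 0.5 are all assumptions made for illustration.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def synthetic_cells(n_cells=200, seed=1):
    """Synthetic per-cell expression values for four hypothetical genes."""
    rng = random.Random(seed)
    data = {"gene_a": [], "gene_b": [], "gene_c": [], "gene_d": []}
    for _ in range(n_cells):
        a = rng.gauss(0.0, 1.0)
        data["gene_a"].append(a)
        data["gene_b"].append(a + rng.gauss(0.0, 0.3))   # co-varies with gene_a
        data["gene_c"].append(-a + rng.gauss(0.0, 0.3))  # anti-correlated with gene_a
        data["gene_d"].append(rng.gauss(0.0, 1.0))       # independent of the others
    return data


def coexpression_edges(data, threshold=0.5):
    """Return {(gene1, gene2): r} for gene pairs with |r| >= threshold."""
    edges = {}
    for g1, g2 in combinations(sorted(data), 2):
        r = pearson(data[g1], data[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r
    return edges


edges = coexpression_edges(synthetic_cells())
for (g1, g2), r in edges.items():
    print(f"{g1} -- {g2}: r = {r:+.2f}")
```

Correlation alone does not establish regulation or its direction, so a graph of this kind is best read as a list of candidate relationships to be tested with more specialised inference methods.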
Using conventional methods such as FC, evidence of heterogeneity in clonal cell populations has accumulated for individual factors of the integrated pluripotency regulatory network [14,71–75], and it is generally agreed that variable quantities of mRNA and protein can be present within individual cells of a clonal cell population, for instance due to noise inherent to transcription and translation [76,77]. For important members of the IRN, such variation can strongly affect the efficiency with which cell-extrinsic stimuli are processed by individual cells and thereby drive widespread differences in multivariate expression patterns within clonal populations (for example amongst the descendants of individual PSCs) [15]. Moreover, variability in expression of central regulatory factors can also affect network structure by differentially activating particular sub-networks of the IRN within individual cells [78]. Such structural changes have been shown to have an important role in regulating cell-cell variability in PSC fate changes, for example upon Nanog withdrawal [27,79]. Ultimately, differential expression of important sub-networks may impose constraints on cellular signal processing, which in turn restrict the degrees of freedom of cell-fate decisions. Although not yet fully demonstrated, it is likely that similar mechanisms, involving subtle variations in network structure between individual cells, are important for regulating the collective dynamics of PSC populations [80].
4 Plasticity of pluripotency networks in development and reprogramming
Regulatory networks controlling a particular cellular identity undergo dramatic changes as cells progress from one developmental state to another [81,82]. Such changes can be exploited to classify cellular identities based on properties of their underlying regulatory network [83,84]. Four instances of cellular identity changes are of particular interest with respect to understanding cell-cell variability and network plasticity in pluripotency: three of these instances are associated with the native developmental programmes, starting with blastulation (the origin of pluripotency) and proceeding through gastrulation (establishment of different pluripotent states, followed by exit from pluripotency); while the fourth is related to the reverse process of establishing the pluripotent regulatory network in somatic cells during cellular reprogramming. In all four cases, cell-cell variation in the expression of regulatory network components has an important role.
A Network plasticity during blastulation
During pre-implantation development, the inner cell mass (ICM) forms two strata, the epiblast (EPI) and primitive endoderm (PE), in preparation for the formation of the embryo proper from the EPI. Initially, mosaic expression patterns of central regulatory factors for EPI (Nanog) and PE (Gata6) emerge seemingly at random [85]. The apparent spatial randomness of this process suggests that cell-intrinsic stochastic mechanisms are responsible for the initial EPI-PE stratification process [86], while sub-networks centred on Nanog or Gata6 subsequently reinforce these initial stochastic variations before cell re-arrangements, coordinated through juxtacrine signalling, lead to tissue organisation into the two strata [87,88]. However, other evidence suggests that the mosaic expression of Nanog and Gata6 is preceded by asymmetric cell division leading to an unequal distribution of Fgf-signalling components [89–91]. The reconstruction of single-cell regulatory networks could be instrumental in reconciling the two models by inferring the logical sequence of events from unbiased single cell expression data.
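The reinforcement of small initial differences can be illustrated with a generic two-gene "toggle switch" model. The sketch below assumes, for illustration only, that the Nanog-centred and Gata6-centred sub-networks repress one another with a Hill-type dependence; the mutual-repression wiring, the rate constants, the initial noise level and the "EPI-like"/"PE-like" labels are hypothetical choices rather than measured properties of the embryo.

```python
import random


def repression(level, max_rate=4.0, hill=2):
    """Production rate of a gene repressed by a factor present at `level`."""
    return max_rate / (1.0 + level ** hill)


def simulate_cell(nanog, gata6, t_end=100.0, dt=0.05):
    """Forward-Euler integration of the mutual-repression model.

    d(nanog)/dt = repression(gata6) - nanog
    d(gata6)/dt = repression(nanog) - gata6
    """
    for _ in range(int(t_end / dt)):
        d_nanog = repression(gata6) - nanog
        d_gata6 = repression(nanog) - gata6
        nanog += dt * d_nanog
        gata6 += dt * d_gata6
    return nanog, gata6


def simulate_population(n_cells=100, seed=2):
    """Cells start near a common state with small random differences."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_cells):
        n0 = 1.0 + rng.gauss(0.0, 0.05)
        g0 = 1.0 + rng.gauss(0.0, 0.05)
        n_end, g_end = simulate_cell(n0, g0)
        fate = "EPI-like" if n_end > g_end else "PE-like"
        results.append((n0, g0, fate))
    return results


population = simulate_population()
n_epi = sum(fate == "EPI-like" for _, _, fate in population)
print(f"EPI-like: {n_epi}, PE-like: {len(population) - n_epi}")
```

In this toy model the eventual fate of each cell is determined by which factor happened to be slightly more abundant at the outset, giving a mosaic of the two states across the population. A model in which asymmetric division biases the initial conditions could be explored in the same framework by changing how the starting values are drawn.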
B Network plasticity during the naive pluripotent to primed pluripotent transition
Two pluripotent states exist that display distinct differences in their IRN [92]: a naïve pluripotent state present in the EPI of the pre-implantation embryo, from which mESCs are derived; and a primed state, characteristic of the late stage of the EPI in the post-implantation embryo (the egg cylinder in mice), from which mEpiSCs are derived [26]. While the transition from the naïve state to the primed state corresponds to the natural developmental progression in the embryo, the primed state can be artificially reverted to the naïve state in vitro only through ectopic expression of Klf4 in mEpiSCs [93], or Nanog and Klf2 in hPSCs [94]. These observations reveal a remarkable property: only a few key nodes are necessary to alter the processing logic of the IRN towards accepting contrasting extrinsic signalling inputs (i.e. LIF and BMP in mESCs versus Activin and Fgf/Erk in mEpiSCs) in order to arrive at the same outcome: self-renewal. The precise rewiring of signalling pathways into the GRN, characteristic of the alternative pluripotent states, may be inferred from single-cell expression data.
C Decay of the IRN: From pluripotency to lineage commitment
In development, the transient pluripotent state ceases with the formation of the germ layers during gastrulation. The spatial organisation of the peri-implantation embryo contains various localised sources of signalling molecules [95] and it has been demonstrated that such localised extrinsic signals can cause asymmetric division, leading to two daughter cells with different sets of active signalling networks and GRN components [96]. If key GRN components such as Nanog are lost (with accompanying changes in transcription factor binding [97] and chromatin reorganisation [98]), then the self-sustaining properties of the core GRN in one daughter may be compromised, leading to destabilisation of the pluripotent state and spontaneous differentiation [27]. Thus, changes in expression of key factors subsequent to cell division can lead to divergent fates in paired daughter cells, via reorganization of intracellular regulatory networks (see Figure 2).
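The way in which loss of a single component can collapse a self-sustaining circuit can be sketched with a toy Boolean network. The update rules below are invented for illustration: they use the names of three core factors from the text, but the wiring is hypothetical and is not the network inferred in any of the cited studies.

```python
def step(state, knockout=None):
    """One synchronous update of a hypothetical three-node Boolean circuit."""
    new_state = {
        "Oct4": state["Nanog"] and state["Sox2"],   # assumed rule
        "Sox2": state["Oct4"],                      # assumed rule
        "Nanog": state["Oct4"] and state["Sox2"],   # assumed rule
    }
    if knockout is not None:
        new_state[knockout] = False  # the lost factor stays off
    return new_state


def run_to_fixed_point(state, knockout=None, max_steps=20):
    """Iterate until the state stops changing (or max_steps is reached)."""
    state = dict(state)
    if knockout is not None:
        state[knockout] = False
    for _ in range(max_steps):
        next_state = step(state, knockout)
        if next_state == state:
            return state
        state = next_state
    return state  # no fixed point found within max_steps


all_on = {"Oct4": True, "Sox2": True, "Nanog": True}

# Daughter cell that retains every component: the circuit sustains itself.
print(run_to_fixed_point(all_on))

# Daughter cell that has lost Nanog: the remaining factors switch off in turn.
print(run_to_fixed_point(all_on, knockout="Nanog"))
```

Under these assumed rules the intact circuit remains in the all-on state, whereas removing Nanog causes the other two factors to switch off within two updates, mirroring the divergent fates of paired daughter cells described above.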
D Induced pluripotency
The most dramatic changes to the IRN occur during reprogramming in vitro [99]. The principle of cellular reprogramming is to establish pluripotency in somatic cells by transiently inducing the activity of key parts of the self-sustaining pluripotency network, for instance by ectopic overexpression of core factors in direct reprogramming [6–8]. The more components a somatic cell IRN and the pluripotent IRN have in common, the fewer factors are required for this identity remodelling [29,30]. Moreover, redundancies in the pluripotency IRN allow the replacement of individual factors without affecting the final cell identity [100]. Full reprogramming typically takes a number of weeks, and along the way to the pluripotent state, cells transition through a range of intermediate signalling [101] and chromatin states [102,103]. Although the specific trajectory taken depends upon the cell's initial identity, it has been shown, using a drug-inducible system for comprehensive dedifferentiation [104], that the progeny of the majority of cell types are able to undergo an identity conversion towards pluripotency [105]. Early reports indicated that reprogramming is a stochastic process [105–107]. However, when overexpression of key factors is supplemented with additional knockdown of Mbd3, deterministic reprogramming with synchronised emergence of pluripotent colonies has been observed [108], suggesting that reprogramming progresses through a fixed sequence of remodelling events. Thus, the balance between stochastic and deterministic mechanisms has yet to be fully elucidated. Single cell expression data addressing the sequence of reprogramming events have very recently become available [109–111]. Using these data and similar experiments to study the corresponding topological changes to the IRN that occur during this transition will likely inform important questions surrounding the nature of reprogramming.
5 Controllability of single cell networks
In summary, many efforts have been made to decipher the components within the IRN that control average cell behaviour [112–114]. The structure of these ensemble networks can explain this population-level behaviour [9]. However, due to variability in expression levels and, necessarily, cell-to-cell differences in the IRN, the response to a given stimulus will vary greatly among individual cells, leading, for instance, to impurities in the resulting cell population following ‘directed’ differentiation or to incomplete cellular reprogramming. An intuitive strategy for developing better experimental protocols, and for reducing the undesirable by-products that emerge from these processes, is to identify important driver nodes. These driver nodes are the set of nodes that must be manipulated in order to control a system completely [115], for instance to steer the IRN of a particular cell state into the desired alternative state. In the absence of a full understanding of the structure and dynamics of the IRN this is a challenging task, although recent developments in the theory of controllability of networks may help [116]. One strategy to address this problem is the computational inference of regulatory networks from single cell data, for which a number of methods are available [117,118]. Individual studies that have already started to adopt similar strategies have employed Boolean networks to reconstruct the pluripotent IRN from single cell qPCR data [79], or Bayesian network inference to extract the sequence of events during reprogramming [109]. We predict that such methods, which combine high-throughput single cell profiling with advanced network analysis routines, will lead to a more complete understanding of pluripotency in general and the development of better protocols for stem cell maintenance, differentiation and reprogramming in particular.
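One way to make the notion of driver nodes concrete is the structural controllability framework, in which the minimum number of driver nodes of a directed network equals the number of nodes left unmatched by a maximum matching of its edges (or one, if every node is matched). The sketch below assumes that this framework is an appropriate reading of the driver-node concept used here; the toy edge lists and node labels are hypothetical and are not taken from any inferred pluripotency network.

```python
def maximum_matching(nodes, edges):
    """Maximum matching of a directed graph via augmenting paths.

    An edge (u, v) means 'u regulates v'. A matching is a set of edges in
    which no two edges share a source and no two edges share a target.
    Returns a dict mapping each matched target to its matching source.
    """
    successors = {u: [] for u in nodes}
    for u, v in edges:
        successors[u].append(v)

    matched_by = {}  # target node -> source node of its matching edge

    def try_augment(u, seen):
        for v in successors[u]:
            if v in seen:
                continue
            seen.add(v)
            # Use v if it is free, or if its current source can be re-routed.
            if v not in matched_by or try_augment(matched_by[v], seen):
                matched_by[v] = u
                return True
        return False

    for u in nodes:
        try_augment(u, set())
    return matched_by


def driver_nodes(nodes, edges):
    """Return (minimum number of driver nodes, list of unmatched nodes)."""
    matched_by = maximum_matching(nodes, edges)
    unmatched = [v for v in nodes if v not in matched_by]
    # If every node is matched, a single input still has to be applied.
    return max(len(unmatched), 1), unmatched


# Hypothetical toy topologies.
chain = driver_nodes(["A", "B", "C"], [("A", "B"), ("B", "C")])
star = driver_nodes(["A", "B", "C"], [("A", "B"), ("A", "C")])
cycle = driver_nodes(["A", "B", "C"], [("A", "B"), ("B", "C"), ("C", "A")])

print("chain:", chain)
print("star:", star)
print("cycle:", cycle)
```

In these examples a linear cascade can be steered from its top node alone, whereas a hub that regulates two targets independently needs a second input, which hints at why network topology, and hence cell-to-cell differences in topology, would matter for control strategies.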
Conflict of Interest Statement
The authors have declared no conflict of interest.
Acknowledgements
6 References
- [1].
- [2].
- [3].
- [4].
- [5].
- [6].
- [7].
- [8].
- [9].
- [10].
- [11].
- [12].
- [13].
- [14].
- [15].
- [16].
- [17].
- [18].
- [19].
- [20].
- [21].
- [22].
- [23].
- [24].
- [25].
- [26].
- [27].
- [28].
- [29].
- [30].
- [31].
- [32].
- [33].
- [34].
- [35].
- [36].
- [37].
- [38].
- [39].
- [40].
- [41].
- [42].
- [43].
- [44].
- [45].
- [46].
- [47].
- [48].
- [49].
- [50].
- [51].
- [52].
- [53].
- [54].
- [55].
- [56].
- [57].
- [58].
- [59].
- [60].
- [61].
- [62].
- [63].
- [64].
- [65].
- [66].
- [67].
- [68].
- [69].
- [70].
- [71].
- [72].
- [73].
- [74].
- [75].
- [76].
- [77].
- [78].
- [79].
- [80].
- [81].
- [82].
- [83].
- [84].
- [85].
- [86].
- [87].
- [88].
- [89].
- [90].
- [91].
- [92].
- [93].
- [94].
- [95].
- [96].
- [97].
- [98].
- [99].
- [100].
- [101].
- [102].
- [103].
- [104].
- [105].
- [106].
- [107].
- [108].
- [109].
- [110].
- [111].
- [112].
- [113].
- [114].
- [115].
- [116].
- [117].
- [118].
- [119].
- [120].
- [121].