ABSTRACT
Rapid mutation fuels the evolution of many cancers and pathogens. Much of the ensuing genetic variation is detrimental, but cells can survive by limiting the cost of accumulating mutation burden. We investigated this behavior by propagating hypermutating yeast lineages to create independent populations harboring thousands of distinct genetic variants. Mutation rate and spectrum remained unchanged throughout the experiment, yet lesions that arose early were more deleterious than those that arose later. Although the lineages shared no mutations in common, each mounted a similar transcriptional response to mutation burden. The proteins involved in this response formed a highly connected network that has not previously been identified. Inhibiting this response increased the cost of accumulated mutations, selectively killing highly mutated cells. A similar gene expression program exists in hypermutating human cancers and is linked to survival. Our data thus define a conserved stress response that buffers the cost of accumulating genetic lesions and further suggest that this network could be targeted therapeutically.
INTRODUCTION
A multitude of DNA repair factors promote faithful transmission of genetic information from one generation to the next (Friedberg et al., 2014). However, in response to oncogenic transformation, infection, and environmental stress, many cells become hypermutators (Alexandrov et al., 2013; Bielas et al., 2006; Cairns and Foster, 1991; Galhardo et al., 2007a, b; Lawrence et al., 2013; Loeb, 2016; Oliver et al., 2000). Human cancers (e.g., melanoma, lung cancer, and colorectal cancer; Lawrence et al., 2013) often have mutation rates that are three orders of magnitude higher than those of normal tissue. Although mutations can occasionally be adaptive, theoretical and experimental studies in a wide range of organisms suggest that they are on average slightly deleterious (Bloom et al., 2004; Eyre-Walker and Keightley, 2007; Firnberg et al., 2014; Pakula and Sauer, 1989; Tokuriki and Tawfik, 2009). Therefore, although hypermutation can fuel rapid evolution, it is also associated with fitness costs. In the simplest scenarios, random mutations can disrupt catalysis by an enzyme, decrease the strength of protein–protein interactions, or perturb baseline gene expression levels. In addition, empirical studies have shown that the majority of mutations in coding regions destabilize protein structure (Fersht, 1998), a property that may have slowed evolution of highly expressed genes (Drummond et al., 2005; Drummond and Wilke, 2008). Thus, as mutations accumulate, fitness is expected to decline. Synergistic epistasis among accumulating mutations could in principle accelerate this effect (de Visser et al., 2011; Elena and Lenski, 1997).
Despite the toxic consequences of mutation, many cells, ranging from pathogenic bacteria and fungi (Fares et al., 2002b; Maisnier-Patin et al., 2005; Oliver et al., 2000) to human tumor cells (Andor et al., 2017; Lawrence et al., 2013) can survive considerable mutation burdens. High mutation rates arising from loss of DNA mismatch repair genes occur in >20% of pathogenic Pseudomonas aeruginosa colonizing the lungs of cystic fibrosis patients (Oliver et al., 2000). Extreme mutators can also be found among Helicobacter pylori and Neisseria meningitidis isolated from human patients (Hall and Henderson-Begg, 2006). In humans, mutations in DNA mismatch repair genes, including MSH2 and MSH6, cause Lynch Syndrome, a disease characterized by an elevated risk of colorectal and other cancers (Lynch et al., 2015; Sijmons and Hofstra, 2016). Many human cancers that do not arise from hereditary mutations in DNA mismatch repair genes also exhibit elevated mutation burden (Campbell et al., 2017; Lawrence et al., 2013; Yousif et al., 2018).
Each of these systems has a mutation rate that approaches the theoretical limits for a ‘error catastrophe,’ in which the fidelity of information transfer from one generation to the next is too low to support viability (Eigen, 2002; McFarland et al., 2017; Sole and Deisboeck, 2004). As these cells accumulate more mutations, however, they do not die, but instead remain capable of surviving, proliferating, and evolving new traits with devastating clinical consequences (e.g., resistance to antibiotics or chemotherapeutics). In bacteria, the capacity of cells to buffer the cost of mutations has been linked to the protein homeostasis network (Aguilar-Rodriguez et al., 2016; Fares et al., 2002b; Maisnier-Patin et al., 2005; Moran, 1996; Sabater-Munoz et al., 2015). It remains unclear, however, whether analogous pathways exist in eukaryotes to dampen the effects of deleterious mutations.
One potential modulator of mutation cost in eukaryotes, heat shock protein 90 (Hsp90), has been proposed to influence the phenotypic manifestation of natural genetic variation (Burga et al., 2011; Cowen and Lindquist, 2005; Jarosz and Lindquist, 2010; Queitsch et al., 2002; Rohner et al., 2013; Rutherford and Lindquist, 1998). Hsp90 is a highly conserved molecular chaperone that functions with dozens of co-chaperones (Taipale et al., 2010; Taipale et al., 2014) to fold hundreds of client proteins, most of which are key regulators of growth and development. From yeast to humans, Hsp90 can strongly influence the phenotypic effects of genetic and epigenetic variation that naturally arises within populations (Burga et al., 2011; Cowen and Lindquist, 2005; Jarosz, 2016; Jarosz and Lindquist, 2010; Karras et al., 2017; Queitsch et al., 2002; Rohner et al., 2013; Rutherford and Lindquist, 1998; Sangster et al., 2004). Although Hsp90 has been shown to enhance phenotypes derived from some recently accumulated genetic variants (Geiler-Samerotte et al., 2016; Mason et al., 2018), the full effects of this chaperone on recently accumulated mutations have yet to be fully characterized.
In this study, we harnessed the power of a simple model eukaryote, Saccharomyces cerevisiae, in which a wealth of genomic tools enabled us to investigate the interplay between de novo mutations and stress response pathways. We created obligate hypermutator lineages by disrupting DNA polymerase proofreading and/or mismatch repair, mimicking the situation in Lynch Syndrome patients and many hypermutating tumors. We then evolved the lineages in parallel using a mutation accumulation (MA) framework that mirrors the population bottlenecks that characterize infectious pathogens, solid tumors, and metastatic or refractory disease. These experiments revealed a strong bifurcation in the cost of accumulating mutations: lesions that arose in early generations had a much stronger impact on fitness than those that arose in later generations. Although no single mutation was shared among all lineages, mRNA sequencing revealed a shared response to mutation burden, which we term the Eukaryotic Mutation Burden Response (EMBR). EMBR is distinct from previously characterized stress responses and its components form a highly coherent network of protein–protein interactions in yeast, as do their homologs in humans. Targeting the EMBR network revealed that highly mutated yeast and colon cancer cells depend on EMBR gene functions to buffer the cost of accumulating mutations, representing an addiction that could be exploited therapeutically.
RESULTS
A sequenced collection of mutation accumulation lineages
Cancer cells on the brink of error catastrophe are apparently able to buffer the fitness consequences of accumulating mutation burden, but the molecular origins of this capacity remain unclear. We explored this question using mutation accumulation (MA) experiments, which have served as valuable tools in studies of a wide range of biological processes, from adaptive fitness trajectories in evolution to DNA polymerase usage at the replication fork (Denver et al., 2009; Huang et al., 2016; Lujan et al., 2014; Ossowski et al., 2010; Uchimura et al., 2015; Zhu et al., 2014b). By propagating independent lineages through single-cell bottlenecks, we mimicked the genetic drift that can occur in the proliferation of solid tumors and in metastatic events (Barrick and Lenski, 2013; Sun et al., 2017).
Specifically, we constructed de novo MA lineages in the budding yeast S. cerevisiae (Figure 1A). Due to its concise genome and faithful recapitulation of fundamental eukaryotic biology, S. cerevisiae has been widely used to model the fundamental biology of cancer and stress responses (Hartwell, 2004; Khurana and Lindquist, 2010). To reproduce mutation rates that correspond to human tumors, we made use of extensive data about patterns of mutagenesis amassed in both S. cerevisiae and human patients (Figure 1A-B; Lawrence et al., 2013; Lujan et al., 2014; Roberts and Gordenin, 2014; Serero et al., 2014; Supek and Lehner, 2015; Wielgoss et al., 2013). We constructed mutator yeast strains by deleting the mismatch repair gene MSH6, and generated hypermutator strains by combining the msh6Δ allele with a point mutation in DNA polymerase δ that reduces its proofreading activity (pol3-L612M). Combining these alleles amplifies mutation frequency while preserving the mutation spectrum inherent to loss of MSH6 function (base pair substitutions; Lujan et al., 2014; Nick McElhinny et al., 2008). Indeed, mutation or misregulation of human MSH6 co-occurs with mutation in pol δ in patients with Lynch syndrome (Jansen et al., 2016). Sequencing of multiple mutation accumulation clones (see below) revealed that the msh6Δ mutator exhibited a 47-fold increase in mutation rate (~0.08 mutations per cell division) whereas the pol3-L612M msh6Δ double mutant exhibited a ~1,100-fold increase in mutation rate (~1.5 mutations per cell division; Figure 1B). Similar values have previously been reported using canavanine mutagenesis assays (Herr et al., 2011). Hereafter, we refer to these strains as “mutator” (msh6Δ) and “hypermutator” (pol3-L612M msh6Δ). This range encompasses the relative mutation rates in rapidly mutating human tumors (e.g., melanomas, glioblastomas, and colorectal cancers (Lawrence et al., 2013); Figure 1B; see SI for additional discussion).
The mutations in these lineages are random
We passaged sixteen parallel lineages of each MA genotype for over 1,000 generations to create independent descendants with distinct mutation spectra (Figure 1A). To promote random mutational trajectories and minimize the effects of selection, we employed single-cell bottlenecks, picking average-sized colonies every ~25 generations. Whole-genome sequencing (Table S1) established that the lineages accumulated around 100 (for mutators) or 2,000 (for hypermutators) unique mutations over the course of passaging. Lineages from the same generation did not share more mutations than expected by chance (Figure 1C-D), in stark contrast to the stereotyped parallelism observed in chemostat and other evolution experiments in large population sizes (Hope et al., 2017; Venkataram et al., 2016).
Several lines of evidence demonstrate that our MA experiment preserved a wide range of naturally occurring mutations, with the exception of lethal or near-lethal mutations: 1) In whole-genome sequencing data from multiple parallel lineages after 500 and 1,250 generations of propagation, we observed no systematic bias in the physical locations in which mutations arose; the number of mutations in 100 kilobase windows conformed closely to random expectation (Figure 1C); 2) Estimates of mutation rate from whole-genome sequencing data established that the hypermutator phenotype was maintained throughout passaging (Figure 1 – figure supplement 1A, Table S1); 3) The mutation spectrum was dominated by base substitutions and enriched for transitions over transversions both early (passages 1 – 20) and late (passages 20 – 50) in the experiment, also consistent with the neutral expectation (Figure 1 – figure supplement 1B; Lujan et al., 2014); 4) Mutations accumulated independently of the local sequence context (Figure 1 – figure supplement 1C); 5) Given the number of cell divisions that occurred during the experiment and the elevated mutation rate, each lineage could have explored mutations covering virtually every base pair in the genome (see SI for further discussion). If specific mutations were selected for their fitness benefit, they would be expected to appear in multiple independent lineages, as is the case with cancer driver mutations (Sidow and Spies, 2015). Very few individual mutations occurred in more than one independent lineage, and none were shared in more than two lineages (Figure 1D); 6) Several genes were mutated (at different locations) in three lineages, but fewer genes with missense mutations were shared and no genes with severe mutations (stop codon gains or frameshifts) were shared (Figure 1 – figure supplement 1D). The number of mutations in a given ORF is primarily driven by gene length as expected from minimal selection (Figure 1 – figure supplement 1E); 7) Deleterious mutations predicted by the SIFT algorithm (Kumar et al., 2009) occurred at a constant rate throughout passaging (Figure 1 – figure supplement 1F). We further compared the fraction of deleterious variants in our strains relative to natural genetic variants from sequenced wild yeast strains that have experienced extensive selective pressure (Bergstrom et al., 2014; Liti et al., 2009). The fraction of predicted deleterious variants in our lineages was much larger than the fraction in natural variants from wild yeasts (Figure 1 – figure supplement 1G). We therefore conclude that the MA lines were largely free from selective pressures that could bias their evolution toward convergent genotypes. Taken together, these results support our use of these MA lines as a model for investigating the accumulation of random mutations and the resulting mutation burden.
The impact of accumulated mutations on fitness
To assess the phenotypic impact of accumulated mutation burden, we measured the doubling times of the various mutated lineages relative to their ancestors. As a critical control, we passaged eight independent lineages of wild-type cells for the same number of generations and under the same conditions. These lineages exhibited little change in genotype (they accumulated one or zero mutations) or fitness over the entire experiment when grown on glucose or non-fermentable glycerol (Figure 2A, Figure 2 – figure supplement 1A, Table S1), providing further evidence that our propagation scheme did not impose undue selection. Likewise, in the mutator lineages (msh6Δ) we observed no fitness decline in multiple growth environments (Figure 2A, Figure 2 – figure supplement 1A).
The fitness of the hypermutator lines (pol3-L612M msh6Δ) did decline. Moreover, the cost of each successive mutation decreased over the course of the experiment (Figure 2B, Figure 2 – figure supplement 1B), in striking contrast to the expectation had the effects of the accumulating mutations been independent (i.e., additive). In a generic regression model , a β value of 1 would correspond to a linear, purely additive model (Lenski et al., 1999; Maisnier-Patin et al., 2005). However, our data were best fit by β = 0.460 ± 0.106 (for growth on glucose; p < 0.0001), and β = 0.575 ± 0.119 (for growth on glycerol; p < 0.0001 by extra-sum-of-squares F-test). These values are indicative of antagonistic epistasis (Figure 2B, Figure 2 – figure supplement 1B). That is, the apparent fitness cost per mutation was much higher for mutations that arose early in these lineages than for those that arose later (Figure 2C, Figure 2 – figure supplement 1C). This effect is remarkable considering that there was no discernable difference between the types of mutations that accumulated in the earliest and latest passages.
A simple explanation for such apparent epistasis would be an inability to observe colonies with fitness below a certain threshold (because they would not grow to a sufficient size before the next round of propagation). This would inadvertently select for progressively less deleterious mutations over the course of the experiment. However, populations with much lower fitness than we observed are capable of growing into sizeable colonies within our propagation schema (see SI for additional discussion). As described below, we could also measure even slower growth of the MA strains we generated after chemical inhibition (Figure 3B). Therefore, we conclude that the fitness trajectory of accumulating mutations is indeed characterized by apparent antagonistic epistasis.
In an evolution experiment where selection had dominated, the likeliest origin of antagonistic epistasis would be genetic suppression. That is, the fitness costs of an ancestral mutation could be compensated by newly arising variants. However, in our schema the variants accumulated randomly. Thus, for genetic suppression to entirely explain the effect, the number of antagonistic genetic interactions would need to be greater than the number of synergistic genetic interactions. However, data from systematic double deletion libraries have revealed significantly higher numbers of synergistic interactions than antagonistic interactions (~1.5-fold; Costanzo et al., 2016). The effect sizes of synergistic interactions also tend to be larger in magnitude than those of antagonistic interactions. We therefore searched for a cellular response that might buffer the cost of mutation load.
Protein folding is a key contributor to the cost of accumulating mutations
The antagonistic epistasis we observed was strikingly reminiscent of mutation accumulation experiments in Salmonella typhimurium (Maisnier-Patin et al., 2005) and Escherichia coli (Fares et al., 2002b). In these systems, upregulation of the bacterial protein chaperone GroEL buffers the fitness cost of accumulating mutation burden. Hence, we investigated the influence of molecular chaperones. The eukaryotic GroEL homolog Hsp60 has an exclusively mitochondrial function (Zeilstra-Ryalls et al., 1991), but the Hsp90 chaperone can exert a strong influence on the capacity of natural genetic variants to produce phenotypes (Burga et al., 2011; Cowen and Lindquist, 2005; Jarosz and Lindquist, 2010; Queitsch et al., 2002; Rohner et al., 2013; Rutherford and Lindquist, 1998). Whether this chaperone buffers the phenotypic outcome of de novo mutations remains controversial (Geiler-Samerotte et al., 2016; Mason et al., 2018). Therefore, we asked whether the behavior of our MA lineages would be influenced by inhibition of Hsp90.
We first exposed the mutated lineages to radicicol, a potent inhibitor of Hsp90 function. The mutated cells were much more sensitive to the drug than their unmutated ancestors (Figure 3A). Indeed, the fitness cost per mutation was amplified by this treatment. That is, the alpha value of this generic regression model decreased (untreated α = 0.006 ± 0.004, treated α = 0.010 ± 0.007; p < 0.05). Hsp90 inhibition also resulted in a more additive relationship between accumulating mutations and fitness (untreated β = 0.699 ± 0.096, treated β = 0.749 ± 0.095; Figure 3B-C). When α was constrained to 0.01 (the fitness cost per mutation in the presence of Hsp90 inhibitor), the β value of the untreated fit decreased further to 0.612. These data suggest that the apparent antagonistic epistasis we observed is in part driven by an active cellular response involving Hsp90. Importantly, both mutated and ancestral cells were equally sensitive to other, unrelated stressors such as hydrogen peroxide and the antifungal fluconazole (Figure 3D-E). Because no mutation was shared by all of the lineages, we conclude that Hsp90 has a broad impact on the phenotypic outcome of multiple new mutations.
To assess Hsp90 activity in the MA lineages, we measured the activity of the Hsp90-dependent tyrosine kinase v-Src from the transforming Rous Sarcoma Virus (Xu and Lindquist, 1993; Zabinsky et al., 2018). This widely used assay relies on the fact that v-Src kinase inhibits yeast growth. However, because v-Src is an obligate Hsp90 client, its toxicity requires abundant and available Hsp90 to chaperone the folding of the kinase (Brugge et al., 1987; Falsone et al., 2004; Xu and Lindquist, 1993). We reasoned that if Hsp90 activity were consumed by demand from mutated proteins in the MA lineages, the toxic effect of v-Src would be mitigated. Indeed, v-Src toxicity was markedly reduced in the MA lineages compared to ancestral lines (Figure 3E– F). Collectively, these data suggest that the Hsp90 chaperone plays a central role in buffering the costs of accumulating mutations.
The lineages mount a shared stress response
Each MA lineage harbored a unique set of mutations (Figure 1E, Figure 1 – figure supplement 1D) but all lineages we examined shared a physiological response. This led us to investigate whether independent lineages might engage a common gene expression response to mutation burden. To address this question, we performed mRNA-sequencing (RNA-seq) to examine gene expression across multiple hypermutator lineages, choosing time points corresponding to both the ‘high-cost’ and ‘low-cost’ phases of their fitness trajectories. We first tested whether continuous passage alone had an impact on gene expression, examining the transcriptome of wild-type lineages passaged for 1,250 generations. Gene expression patterns in these cells were virtually unchanged relative to the wild-type ancestor, and the eight genes that were differentially expressed were not enriched for any shared function or protein–protein interactions (Table S2). Thus, passaging itself had a minimal impact on gene expression.
Increasing mutation burden in the hypermutator lineages, by contrast, was associated with marked changes in gene expression. As expected, transcriptional profiles diverged with increasing genetic distance between each passaged MA lineage and its ancestor (represented by gray arrow in Figure 4A, and gray points in Figure 4B). We also compared sister lineages, whose genetic distance from each other was twice that of a single lineage to its own ancestor (orange arrows and points). In these cases, the increased genetic distance was not associated with a concomitant transcriptional divergence (Figure 4B). Thus, despite having accumulated completely different sets of mutations, the congruent gene expression profiles of the highly mutated lineages point to a shared response to mutation burden.
Further examination of our data revealed shared changes in gene expression among the independent lineages, which we term the Eukaryotic Mutation Burden Response (EMBR). After ~1,250 generations in the hypermutator lineages, expression of this cohort of 360 genes (FDR Q < 0.05, Figure 4C-D, Table S2-7) changed robustly relative to the ancestor. EMBR was not dominated by the effects of one individual lineage, nor was the response simply due to the slower growth rate of lineages with high mutation burden, as we observed only a slight enrichment in genes whose expression is correlated with growth rate (Figure 4E; Brauer et al., 2008). We therefore conclude that, despite the diversity of genetic lesions they accumulated, the hypermutating lineages each mounted a similar stress response.
EMBR is distinct from previously reported stress responses
To further characterize the transcriptional response to mutation burden, we analyzed the shared pathways and functions of differentially expressed genes. The only Gene Ontology (GO) term (Szklarczyk et al., 2015) enriched among EMBR genes that increased in expression was ‘protein refolding,’ consistent with our prior observations of sensitivity to Hsp90 chaperone inhibition. However, this enrichment was very weak (FDR = 0.00295), and largely centered on Hsp90. Down regulated genes showed even less enrichment; ‘regulation of transcription from RNA polymerase II promoter’ was the only GO term that emerged (FDR = 0.0117).
We next compared these gene expression changes to other stress conditions that have been transcriptionally profiled in yeast. EMBR was not enriched for environmental stress response genes (Gasch et al., 2000) that characterize many such gene expression profiles (Figure 4F) or the DNA damage response to MMS, bleomycin, and cisplatin (Caba et al., 2005; Figure 4G). Although direct comparisons of gene expression datasets can be complicated by batch effects, a principal component analysis grouped EMBR as a separate response from well-studied responses to heat shock, unfolded protein, osmotic stress, oxidative stress, and DNA damage (Figure 4H). Comparison to all of the 2,400 gene expression profiles that have previously been reported in yeast (Hibbs et al., 2007) did not reveal stronger overlaps with other known stress responses.
To examine the network connectivity amongst proteins involved in EMBR, we took advantage of the protein–protein interaction networks that have been systematically mapped in S. cerevisiae (Costanzo et al., 2010; Szklarczyk et al., 2015). In contrast to the weak GO term enrichment, we observed an extremely strong enrichment for protein–protein interactions among EMBR components (Figure 5A; p < 10−16 for components that increase with mutation burden; p < 10−7 for those that decrease with mutation burden; p < 10−15 for all components). This high degree of connectivity suggests that EMBR is a concerted physiological response.
We next sought to identify factors that might coordinate EMBR. We searched for enrichment of transcription factor binding sites among the promoters of EMBR genes using two complementary algorithms: Pscan (Zambelli et al., 2009) and oPOSSUM (Kwon et al., 2012). Among the genes that were upregulated, binding sites for four transcription factors were clearly enriched: Ume6, Gcn4, Uga3, and Gis1 (Figure 5B). By contrast, the targets of transcription factors critical for mounting other stress responses, such as Hsf1 or Msn2/4, were not enriched in the mutated lineages (p = 0.5 by Pscan).
Ume6 represses meiosis-specific promoters by binding to upstream repressive sequences and recruiting the histone deacetylase complex Rpd3–Sin3 (Hahn and Young, 2011). Ume6 is also part of a signaling cascade that regulates autophagy via repression of ATG8 (Bartholomew et al., 2012) and its repression and activation by Sin3 and Ime3 is analogous to the stepwise activation of human Myc/Max/Mad axis (Washburn and Esposito, 2001). Gcn4 is a central regulator of metabolism, notably amino acid biosynthesis, and several stress responses (Natarajan et al., 2001). Uga3 is a transcriptional activator for γ-aminobutyrate (GABA)-responsive genes (Andre, 1990). Gis1 is a histone demethylase and transcription factor that regulates genes during nutrient limitation (Tu et al., 2007) and hypoxia (Dastidar et al., 2012). We also compared EMBR to gene sets that were differentially regulated in strains in which transcription factor genes had been deleted, using the SPELL database (Hibbs et al., 2007). This analysis revealed overlap between EMBR and two mutant datasets, gcn4∆ and bas1∆ (Fendt et al., 2010). Both of these genes were also identified by Pscan and oPOSSUM (highlighted in red in Figure 5B). Bas1 regulates the expression of enzymes in the histidine, purine, and pyrimidine biosynthetic pathways (Arndt et al., 1987; Daignan-Fornier and Fink, 1992; Denis et al., 1998; Denis and Daignan-Fornier, 1998) and is homologous to the myb proto-oncogene family (Tice-Baldwin et al., 1989).
The overlap of genes in EMBR with those that are differentially regulated in gcn4∆ and bas1∆ mutants suggests that these transcription factors are likely important regulators of EMBR. In total, 33% of EMBR genes are experimentally validated targets of at least one of these five transcription factors (Cameroni et al., 2004; Cherry et al., 2012), whereas the validated targets of all five compose 22% of all genes in the genome. We reasoned that if the transcription factors acted predominantly as an EMBR activator, its deletion should reduce expression of upregulated EMBR genes and conversely if a transcription factor acted predominantly as an EMBR repressor, its deletion should increase expression of upregulated EMBR genes. Genes that were downregulated in the gcn4∆ mutant were significantly enriched in upregulated EMBR genes, suggesting Gcn4 primarily functions as an EMBR activator (Figure 5C; p < 0.01, Fisher’s exact test). Genes upregulated in the ume6∆ and bas1∆ mutants were enriched in upregulated EMBR genes suggesting Ume6 and Bas1 function as EMBR repressors (Figure 5C; p < 0.05, Fisher’s exact test). Very few EMBR genes changed expression in the uga3∆ mutant (Hu et al., 2007). Understanding the full role of these transcription factors in EMBR activation and repression, and the regulation of the remaining 67% of EMBR genes, stands as a goalpost for future studies. However, the strong connectivity that we observe among EMBR genes led us to investigate whether the response had adaptive value.
EMBR components are required for the survival of mutated cells
To determine the importance of EMBR genes in buffering mutation burden, we perturbed the activity of several proteins that were: i) upregulated with accumulating mutations; ii) engaged in several physical and genetic interactions with other genes involved in the same biological processes; iii) amenable to pharmacological inhibition; and iv) homologous to well-described human proteins. First, we observed upregulation of several autophagy-related genes that are crucial for phagophore initiation (e.g., ATG1, ATG2, ATG13), expansion of the autophagosome (ATG7), and fusion of the autophagosome with the vacuole (VAM6, MON1). Autophagy not only serves as a way to recycle macromolecules in the face of nutrient limitation, but also as a mechanism to clear damaged organelles and misfolded protein aggregates. We wondered whether the mutation burden, and consequent protein misfolding, in our mutated cells might require autophagy to clear. To investigate, we exposed the MA lineages to bafilomycin, a well-characterized inhibitor of autophagy (Yoshimori et al., 1991). Indeed, sensitivity to this compound increased with the mutation burden (Figure 6A-B), establishing the protective effect of autophagy for mutation burden.
We also observed upregulation of the inosine-5’-monophosphate (IMP) dehydrogenases, IMD1/2. IMD2 is an important but non-essential gene in yeast; it catalyzes the rate-limiting step in GTP synthesis and is regulated by Ume6. Heterozygous mutations in one of its human homologs, IMPDH1, are associated with retinitis pigmentosa (McKusick, 2007), and expression of its other human homolog, IMPDH2, is correlated with poor prognosis of nasopharyngeal carcinoma (Xu et al., 2017), subtypes of which have high mutation load (Zhang et al., 2017). The encoded protein can be inhibited in vivo by mycophenolic acid (Fleming et al., 1996), which is currently used to limit transplant rejection and autoimmune disease and has recently been introduced as an anti-cancer agent in clinical trials (Chen and Pankiewicz, 2007). The requirement for IMPDH2 in cancer is not completely understood (Xu et al., 2017). The mutated lineages were much more sensitive to treatment with this inhibitor than ancestral controls (Figure 6C-D), establishing the importance of this EMBR component for withstanding accumulating mutation burden.
We further observed upregulation of several arginine biosynthesis genes: ARG1, ARG2, ARG5/6. To test if the mutated lineages became unusually dependent on arginine for proliferation, we decreased levels of arginine in the growth medium and measured the effect on growth. Whereas the ancestral strain was tolerant to arginine removal, the burdened lineages were strikingly sensitive to decreased arginine availability (Figure 6E). The precise mechanisms by which these and other factors limit the cost of mutation burden demand future study. However, these data clearly establish that EMBR components can be critical for the survival of hypermutating cells.
Conservation of EMBR in humans
We wondered whether a response resembling EMBR might be conserved in humans. To investigate, we identified 568 human homologs of the S. cerevisiae EMBR genes through deltaBLAST searches (Boratyn et al., 2012). The human homologs of EMBR genes (hEMBR) form a highly enriched protein–protein interaction network (Figure 7A; p < 1.1 × 10−16, hypergeometric test using physical and genetic interactions from BIOGRID; www.thebiogrid.org). However, because they are involved in deeply rooted biological processes, random sets of genes conserved from yeast to humans also encode proteins that are enriched in interactions. To test whether hEMBR was enriched in protein–protein interactions beyond what would be expected for conserved proteins, we sampled the protein–protein interactions of 500 random sets of proteins conserved from yeast to humans to define a null distribution. This bootstrapping analysis revealed that hEMBR was significantly enriched in protein–protein interactions relative to random expectation (P = 7.0×10−4), suggesting that the underlying structure of EMBR is conserved in humans.
If hEMBR were important for cancers with a high mutation burden, we would predict that cell lines derived from such tumors would be sensitive to inhibition of hEMBR genes. We used an inhibitor of Hsp90, geldanamycin, and an inhibitor of IMPDH2, mycophenolic acid, to investigate. We used two colon cancer cell lines that are confirmed mutators with thousands of variants measured, HCT-116 and HCT-15 (Barretina et al., 2012; Vilar and Gruber, 2010), and a non-mutator colon cancer cell line SW480 (Barretina et al., 2012). Indeed, HCT-116 and HCT-15 were far more sensitive to both inhibitors than SW480 was (~5-fold for geldanamycin, ~10-fold for mycophenolic acid; Figure 7B). These same cell lines are equivalently sensitive to multiple other drugs (Iorio et al., 2016). This result is further supported by anti-cancer activity of mycophenolic acid and another IMP-dehydrogenase inhibitor, Tiazofurin (Chen and Pankiewicz, 2007; Malek et al., 2004).
Can yeast EMBR further inform our understanding of the response to mutation burden in human cancer cells? Autophagy was important for yeast cells to withstand mutation burden, and this mechanism also provides a promising target for cancer therapies (Levy et al., 2017). Arginine was also important for yeast cells to withstand mutation burden, and the human homolog of the EMBR gene ARG1, ASS1, is activated in colon cancers, where it is thought to contribute to pathogenicity (Bateman et al., 2017). Tumors have varying mutation burdens, as well as diverse genotypes and mechanisms of oncogenesis. They thus constitute an incredibly complex dataset, with many confounding variables. Although these analyses are fundamentally under-powered and are limited to genes conserved from yeast to humans, we nevertheless searched for an hEMBR signature within curated datasets from The Cancer Genome Atlas (TCGA). Colon adenocarcinomas exhibited a bimodal distribution of mutation burdens, allowing us to separate tumors with high and low mutation burdens into distinct groups (Figure 7C). To narrow our scope and identify which hEMBR components drive the largest biological differences among colon cancer tumors, we performed a principal component analysis on the transcriptomes of the tumor samples. High mutation burden tumors roughly clustered together along the first two principal components (Figure 7D). We then asked what hEMBR factors contributed the most to the distinction between high and low mutation burden in the principal component analysis. We found HSP90AA1 (a human homolog of Hsp90), ASS1 (the human homolog of ARG1), IMPDH2 (the human homolog of IMD2), and HSPD1 (a human homolog of Hsp60; Figure 7 – figure supplement 1).
We next asked whether the expression of homologs of the EMBR regulators we identified were associated with survival of patients with tumors harboring high and low mutation burden. We hypothesized that activation of hEMBR due to differential expression of these hEMBR regulators would be beneficial for cancer cells with high mutation burdens but detrimental to patient survival. GCN4 is homologous to the transcriptional activator ATF4 (Murguia and Serrano, 2012), and BAS1 is homologous to MYB oncogene (Tice-Baldwin et al., 1989). Indeed, although there was no distinction between survival curves in low mutation burden cancers regarding ATF4 expression (Figure 7E), in high mutation burden cancers, high ATF4 was associated with significantly decreased survival (Figure 7E). This is consistent with Gcn4 functioning to activate EMBR in yeast. ATF4 is known to promote survival under stress while carefully balancing induction of apoptosis (Wortel et al., 2017). Based on these observations we propose that an additional role could be activation of hEMBR in high mutation burden cancers. We next performed a similar analysis for the BAS1 homolog MYB. There was no distinction between survival curves for patients with low mutation burden cancers with respect to MYB expression (Figure 7E). But in cancers with high mutation burdens, high MYB expression was associated with increased survival (Figure 7E). This is consistent with Bas1 functioning to repress EMBR in yeast. MYB mostly acts as a transcriptional activator balancing cell differentiation, is linked to several leukemias, and can be activated in both colon and breast cancers (Ramsay and Gonda, 2008). Our results suggest that regulation of hEMBR by both ATF4 and MYB could be protective against high mutation burden in cancers and thus lead to poor patient prognosis.
Although HSP90 expression has been implicated in cancer survival in several contexts (Jaeger and Whitesell, 2019; Whitesell and Lindquist, 2005; Whitesell et al., 2014), the ability to compare expression to patient survival data is limited by the low number of patients, particularly when stratifying samples. Despite these limitations we observed a striking correlation in colon cancer patient data (Chandrashekar et al., 2017) between increased HSP90 expression and decreased patient survival that was only evident in cancers with high mutation burden (Figure 7E). We thus propose that mutational buffering could offer an explanation for the strong correlation of HSP90 expression and tumor growth (Ciocca and Calderwood, 2005; Jaeger and Whitesell, 2019; Santagata et al., 2011). The other hEMBR genes that we identified above (Figure 7 – figure supplement 1) appear to show a similar trend in our analysis (Figure 7E). Further study will be required to fully evaluate the effects of hEMBR genes on how tumors tolerate high mutation burdens and how this impacts patient survival. Nonetheless, these analyses suggest that hEMBR may have an important role in hypermutating human cancers.
DISCUSSION
Many pathogenic bacteria and cancer cells exhibit elevated mutation rates, which can confer a wide variety of benefits such as drug resistance and rapid escape from immune surveillance (Davoli et al., 2017; Oliver et al., 2000). However, most mutations are predicted to be deleterious. Accordingly, to optimize fitness, such hypermutator phenotypes should be transient and revert once a beneficial mutation has been assimilated (Sniegowski et al., 1997; Sniegowski and Murphy, 2006). Indeed, this behavior has recently been observed in experimental evolution in E. coli (Swings et al., 2017). However, in a surprisingly large number of biological scenarios, including many human cancers, hypermutator phenotypes are stable, and cells remain capable of surviving, proliferating, and evolving new traits without succumbing to error catastrophe. Our observation that eukaryotic cells possess an intrinsic robustness against accumulating mutations provides a potential resolution for this paradox.
By generating a set of mutation accumulation (MA) lineages with defined mutation rate and spectrum, we were able to systematically examine the relationship between fitness and mutation burden. Rapid acquisition of non-overlapping sets of mutations led to a consistent, non-linear relationship between mutation burden and fitness: mutations acquired early were on average costlier than mutations acquired later. Remarkably, this behavior was consistent across lineages, despite the fact that no single mutation was universal. Because the MA lineages maintained their hypermutator phenotypes over the course of the experiment, we conclude that the cells induced a protective response that limited the fitness cost of mutations that accumulated later in the buffered phase of the experiment.
Similar behavior has been observed in bacteria, in which buffering is mediated by the chaperonin GroEL in several species. For example, GroEL suppresses the phenotypes of temperature-sensitive mutations in the replication protein DnaA (Jenkins et al., 1986) and some phage proteins (Van Dyk et al., 1989). Highly mutated S. typhimurium express high levels of GroEL, and artificial overexpression of the chaperonin further improves fitness (Maisnier-Patin et al., 2005). Bacteria that naturally sustain high mutation burdens, such as the aphid endosymbiont Buchnera, also strongly express GroEL, and this has been linked to their capacity to withstand the extremely high levels of genetic drift caused by their small effective population size (Fares et al., 2002a). However, because the eukaryotic GroEL homolog Hsp60 has limited mitochondrial function, it has remained unclear whether and how buffering might be mediated in eukaryotes.
The stress response that we report here differs from any that have previously been described. A key component of EMBR is the Hsp90 chaperone and several of its accessory co-chaperones. Data from flies (Rutherford and Lindquist, 1998), plants (Queitsch et al., 2002; Sangster et al., 2007; Sangster et al., 2008), yeast (Cowen and Lindquist, 2005; Jarosz and Lindquist, 2010), worms (Burga et al., 2011), Mexican cavefish (Rohner et al., 2013), and most recently humans (Karras et al., 2017) suggests that this chaperone plays a key role in the phenotypic manifestation of genetic variation. In mechanistic terms, Hsp90 can assist in the folding of unstable gain-of-function protein variants, thereby potentiating their immediate phenotypic effect. It can also buffer the impact of other genetic variants, silencing their phenotypic impact. Here we found that inhibiting Hsp90 amplified the fitness defects later in the MA lineages. Our data thus establish that Hsp90 can buffer the cost of de novo mutations. A prior study that did not reach the same conclusion used a much smaller number of mutations (roughly four mutations per lineage in 94 separate lineages; Geiler-Samerotte et al., 2016). In that regime, the biochemical effects of specific mutations themselves might have dominated the fitness measurements. In our experiment, in which a larger number of polymorphisms were examined, the effects of Hsp90 were strongest when mutation burden was sufficient to affect a much larger swath of the proteome.
We identified hundreds of other EMBR components. These were strongly enriched for protein–protein interactions. By contrast, gene ontology enrichments were sparse, and surprisingly weak where they existed. Thus, EMBR is very different from other stress responses that have previously been reported. The strong enrichment for interacting proteins, and the fact that inhibition of EMBR components impaired the fitness of highly mutated cells, suggest that this stress response is likely to be adaptive.
In pathogens hypermutator phenotypes can facilitate antibiotic resistance, immune evasion, or morphological innovation such as filamentous growth. Although the clinical use of antibiotics is a relatively recent phenomenon, host–pathogen conflict is ancient. Conservation of a response that allows pathogens to adopt hypermutator states with minimal fitness cost could be beneficial for evolution over short and long timescales. Indeed, many microbial pathogens, and even wild fungal populations, frequently adopt a hypermutator state (Bui et al., 2017; Guo et al., 2018; Raghavan et al., 2018). The alleles that govern this behavior have been retained across a considerable evolutionary distance (Bui et al., 2017), suggesting that it confers an adaptive advantage.
Hypermutating cancers are evolutionary experts, capable of buffering the cost of accumulating mutations while simultaneously harnessing the potential of this raw genetic material to rapidly evolve new traits. This evolutionary virtuosity can have devastating consequences for human health, such as the emergence of resistance to chemotherapies. Here, we show that eukaryotic cells can broadly limit the cost of accumulating mutations by mounting a conserved stress response that we term EMBR. We posit that highly mutated cells are addicted to EMBR, analogous to oncogene addiction but in this case driven by a gene expression network rather than a single driver mutation. Our data suggest that this addiction may expose multiple Achilles’ heels with potential for therapeutic intervention.
METHODS
Strain construction and propagation
BY4741 haploid yeast were transformed to replace the MSH6 gene with an antibiotic resistance marker (hygromycin) or to replace the POL3 gene with a variant that lacks proofreading activity, pol3-L612M (Nick McElhinny et al., 2007). Double mutants were constructed by performing both types of transformations in series, pol3-L612M followed by msh6∆. Transformants were grown up from single colonies and frozen as passage 0. We constructed our accelerated mutation accumulation (MA) lines in haploid yeast so that we could study the effects of all polymorphisms, including recessive alleles.
Passage 0 strains (parent BY, msh6∆, pol3-L612M, and double mutant pol3-L612M msh6∆) were streaked on both YPD (yeast extract-peptone-dextrose medium) and YPG (yeast extract-peptone-glycerol medium) plates and grown at 25°C for at least 96 hours. Individual colonies of average size were selected, resuspended in a 96-well plate, and then streaked onto a fresh YPD or YPG plate for another round of growth (passage 1). Because S. cerevisiae can lose their ability to respire on fermentable carbon sources, we propagated one set of eight lineages on media containing glucose as a carbon source, and another set of eight lineages on media containing glycerol as a carbon source, so that potential loss of respiration would not confound our findings. At odd-numbered passages, small volumes of the stock plates were then transferred to fresh liquid YPD, and growth in solution was measured every ~5 minutes for 84 hours. At the end of this period, the cultures were diluted and spotted onto YPD, YPG, phosphonoacetic acid (Li et al., 2005), and canavanine plates to assess general growth, growth on glycerol (YPD strains only), maintenance of the pol3-L612M variant (which confers resistance to PPA), and mutation rate. Any loss-of-function mutation in the CAN1 arginine permease gene gives rise to canavanine resistance, providing a readout of mutagenesis. This process was repeated for 50 passages. During the course of the experiment on YPD, one hypermutator lineage lost its mutator phenotype, one went extinct. During the course of the experiment on YPG, one hypermutator lineage lost its mutator phenotype early on, one lost its mutator phenotype halfway through, and three went extinct. The frequency of these events is consistent with expectation from other mutation accumulation experiments in bacteria (Aguilar-Rodriguez et al., 2016). To study the biological consequences of cells surviving despite mutation burden, we focused our phenotypic analysis on lineages that survived for all 50 passages, or approximately 1,250 generations, while maintaining the mutator phenotype.
Growth measurements
At odd-numbered passages, surviving strains were inoculated in liquid culture to measure OD600 on a Synergy Multi-Mode Microplate Reader. Doubling time was extrapolated using a Bayesian curve-fitting model and converted to fitness (w) relative to the first passage on the same media: . WT = wild-type, MUT = hypermutator.
Drug treatment
Hyperutator and wildtype yeast strains were grown to saturation in YPD media in 96-well plates. Dilutions of drugs obtained from Selleckchem, MicroSource, and Sigma were prepared in DMSO and added to YPD; 10-100µM radicicol; 4.4mM H2O2; 10µM fluconazole; 12.5µM bafilomycin, 1mM mycophenolic acid. Media lacking arginine and uracil was obtained from Sunrise Science.
Yeast were diluted, and OD measurements were taken using a Synergy Multi-Mode Microplate Reader (Biotek). Mean and standard error were calculated for three technical replicates of each lineage, and then across independent mutated lineages. Growth rates were compared to extract a dose-response curve and inhibitory concentrations.
Sequencing, variant calling, and gene expression analysis
Variant calls were made using the GATK pipeline and annotations were made with SnpEff (Cingolani et al., 2012; McKenna et al., 2010). Only variants called in multiple biological replicates were considered for comparison between lineages (Fig 1E). Excluding variants present in the shared ancestor, variant calls unique to each lineage, shared between any two lineages, or shared across lineages were counted. The predicted phenotypic impact of these variants was characterized using SIFT (Kumar et al., 2009). Genes in which variants occurred within ORFs were also categorized as essential (as defined by the Stanford Deletion Project) or non-essential.
For gene expression analysis, mutator and wildtype strains were collected during log-phase growth. Pellets were flash frozen and stored at −80°C. Yeast were lysed, RNA extracted, and RNA quality was assessed on a Bioanalyzer. A sequencing library was prepared using the TruSeq RNA Library Prep Kit v2. Reads were filtered based on quality scores and aligned to the genome using Bowtie2 (Langmead and Salzberg, 2012) with a current reference genome (Engel et al., 2014). HTSeq was used to quantify counts (Anders et al., 2015). Normalization, batch effect correction, and differential expression testing was performed using the DESeq2 software suite (Love et al., 2014). For downstream graphing purposes, limma was used for batch effect correction. Cluster 3.0 was used for k-means clustering of gene expression data after row normalization (Figure 4C-D; de Hoon et al., 2004). Motifs and lists of EMBR genes containing transcription factor motifs were obtained from JASPAR (Khan et al., 2018). Gene expression data from deletions were collected from Hu et al. and Cameroni et al. (Cameroni et al., 2004; Hu et al., 2007).
v-Src assay
The plasmid carrying the v-Src gene for expression in yeast, Yep-src (Brugge et al., 1987), was a gift from the Lindquist laboratory. Hypermutator and wild-type yeast were transformed via electroporation, and the plasmid was maintained by growth on SD-Ura. Expression of v-Src was induced by diluting saturated yeast (grown in 2% raffinose media) into 2% galactose media. Growth was monitored by measuring OD600.
Principal component analysis
Yeast expression datasets for environmental stress responses, DNA damage responses, and the unfolded protein response (Fry et al., 2003; Gasch et al., 2001; Gasch et al., 2000; Travers et al., 2000; Travesa et al., 2012) were obtained from the SPELL database (spell.yeastgenome.org; Hibbs et al., 2007). RNA expression data from the mutation accumulation lines were normalized against those of their un-passaged parent and log2-normalized. Expression data from passaged WT lines were normalized to their WT parent. All other expression data were normalized to their internal untreated controls. These expression data were collectively visualized through principal component analysis (PCA), a technique that reduces high-dimensional expression data (Jolliffe and Cadima, 2016).
EMBR human homolog protein–protein interaction enrichment
Human homologs of yeast EMBR genes were identified through deltaBLAST searches using default settings. Only hits with e-values ≤ 1 × 10−10 were collected as homologs. Hits were then mapped to unique Entrez IDs. A total of 568 human homologs of yeast EMBR genes (hEMBR) were identified. The protein–protein interactions of the hEMBR network were mapped using the BIOGRID database. The protein–protein interactions of 500 equally sized sets of random human genes and 500 equally sized random sets of annotated human homologs of all yeast genes (ENSEMBL) were also mapped. A hypergeometric test was computed for the protein–protein interactions in the hEMBR set and the median interactions in the random human gene and human homolog gene sets. To compute the hypergeometric test, key parameters were determined by mapping the total interactions between all annotated (HUGO) human genes in BIOGRID. The amount of total possible interactions was defined as (n*(n-1))/2 for n nodes in an undirected network. The protein–protein interactions of the 500 random sets of human homologs were fit to a gamma distribution. From this null distribution, 10,000 random samples were drawn to compute a p-value against the null hypothesis that the number of protein–protein interactions in the hEMBR is not larger than expected from the null distribution.
Cell lines
Colon cancer cell lines HCT-116, HCT-15, and SW480 were purchased from ATCC and cultured in RPMI supplemented with 10% FBS or recommended medium. Cell viability was measured after 48 hours of growth in drug or DMSO control with PrestoBlue (Thermo Fisher Scientific). Background was subtracted from OD measurements, outliers removed, and normalized to untreated controls. Each condition was repeated in triplicate. IC50 values were calculated using a nonlinear regression (GraphPad Prism v7).
EMBR signature in Colon Cancer
Whole exome sequencing, RNA sequencing, and patient clinical data were retrieved from TCGA (Cancer Genome Atlas, 2012) and assembled using the TCGA-Assembler R package (Zhu et al., 2014a). Tumor mutation burden of each sample was defined as the sum of total variant calls (relative to normal tissue) for each sample from whole exome sequencing and inferring mutations per 100kb based on the approximate length of genomic sequence captured by the exome sequencing preparations used, ~44Mbs, and assuming equal distribution of mutations across the entire genome. A histogram of tumor mutation burdens for the colon adenocarcinoma (COAD) dataset revealed a bimodal distribution with low mutation burden (< 2.55 mutations/100kb) and high mutation burden cohorts (> 2.55 mutations/100kb). For comparison with EMBR, only genes with annotated yeast homologs were kept for subsequent analysis. Expression levels of a given gene were defined by the scaled estimate of counts as reported in TCGA. A principal component analysis was performed on the expression data across all tumor samples to identify any clustering among the high mutation burden cohort. The norm vectors of the coefficients for principal components 1 and 2 (explaining 52.5% of the variance) were rank ordered and the top 8 EMBR homologs were chosen for further analysis on their impact on clinical outcomes. For the top 8 EMBR homologs, expression levels across the high mutation burden tumor samples, or across low mutation burden tumor samples in a parallel analysis, were grouped into low (less than median expression of the given gene in the dataset) and high expression (greater than median expression of the given gene in the dataset) cohorts. Kaplan-Meier survival plots for the high and low expression cohorts were created and compared using the Log-rank (Mantel-Cox) test.
AUTHOR CONTRIBUTIONS
RAZ, JM, RS, MKZ, and DFJ designed the research; RAZ, JM, RS, MKZ, and TRS performed the research and analyzed the data. MKZ passaged the mutator yeast lineages. RAZ, JM, RS, and DFJ wrote the paper. DFJ supervised all aspects of the work.
ACKNOWLEDGEMENTS
We thank K. Cimprich, A. Fire, S. Lujan, and members of the Jarosz laboratory for materials, helpful discussions, and comments on the manuscript. This work was supported by an NIH Pathway to Independence Award (R00-GM098600), a Searle Scholar Award (14-SSP-210), and a Kimmel Scholar Award (SKF-15-154) to DFJ. DFJ is also a Science and Engineering Fellow of the David and Lucile Packard Foundation. RS was supported by a Stanford Graduate Fellowship and a Gerald J. Lieberman Fellowship. RAZ and MKZ were supported by a Stanford Dean’s Postdoctoral Fellowship. MKZ was also supported by a postdoctoral fellowship from the Walter and Idun Berry Family foundation. JM was supported by NIH training grant T32HG000044 and TS was supported by NIH training grant T32-CA009302. Results shown here are in part based upon data generated by the TCGA Research Network: http://cancergenome.nih.gov/.