ABSTRACT
With two genomes in the same organism, interspecific hybrids have unique opportunities and costs. In both plants and yeasts, wild, pathogenic, and domesticated hybrids may eliminate portions of one parental genome, a phenomenon known as loss of heterozygosity (LOH). Laboratory evolution of hybrid yeast recapitulates these results, with LOH occurring in just a few hundred generations of propagation. In this study, we systematically looked for alleles that are beneficial when lost in order to determine how prevalent this mode of adaptation may be, and to determine candidate loci that might underlie the benefits of larger-scale chromosome rearrangements. These aims were accomplished by mating Saccharomyces uvarum with the S. cerevisiae deletion collection to create hybrids, such that each nonessential S. cerevisiae allele is deleted. Competitive fitness assays of these pooled, barcoded, hemizygous strains, and accompanying controls, revealed a large number of loci for which LOH is beneficial. We found that the fitness effects of hemizygosity are dependent on the species context, the selective environment, and the species origin of the deleted allele. Further, we found that hybrids have a larger distribution of fitness consequences vs. matched S. cerevisiae hemizygous diploids. Our results suggest that LOH can be a successful strategy for adaptation of hybrids to new environments, and we identify candidate loci that drive the chromosomal rearrangements observed in evolution of yeast hybrids.
INTRODUCTION
Hybrid organisms are common in nature, particularly in fungi and plants, where an estimated 4% of flowering plants and 7% of ferns are hybrids (Otto and Whitton 2000). Even the human genome is now recognized to contain substantial introgressions – remnants of ancient hybridization – that are thought to be adaptive (Huerta-Sanchez et al. 2014; Dannemann et al. 2016; Gittelman et al. 2016; Racimo et al. 2017). Hybrids have also been created via artificial selection in agriculture, industry, and the laboratory. For example, wheat, a pillar of civilization, is a hybrid of three grass species (Brenchley et al. 2012). Hybridization and introgression are also abundant in budding yeast (reviewed in Morales and Dujon 2012), where hybrids have been found to possess adaptive advantages over their parental species (e.g., Stelkens et al. 2014), show desirable properties as industrial organisms (e.g., Mertens et al. 2015; Peris et al. 2017) and contribute to the emergence of fungal pathogens (Morales and Dujon 2012; Pryszcz et al. 2015; Schroder et al. 2016; Mixao and Gabaldon 2018). A whole genome duplication ancestral to Saccharomyces yeasts – a defining characteristic of the clade – has been recognized as a hybridization event (Marcet-Houben and Gabaldon 2015). Because Saccharomyces has relatively weak prezygotic barriers to speciation (Maclean and Greig 2008; Murphy and Zeyl 2012), the genus is particularly rife with hybridization, including between species that diverged as long as 20 million years ago (~80% amino acid and nucleotide identity) yet remain capable of intermating (Martini and Martini 1987; Naumova et al. 2005; Dunn and Sherlock 2008; Muller and McCusker 2009; Libkind et al. 2011; Nguyen et al. 2011; Almeida et al. 2014; Perez-Traves et al. 2014). Two common yeasts that originated as hybrids between S. cerevisiae and cryotolerant species have even received designation as hybrid species: the wine yeast S. bayanus, a triple hybrid between S. cerevisiae, S. uvarum, and S. eubayanus (Gonzalez et al. 2006; Sipiczki 2008); and the lager yeast S. pastorianus, which is a hybrid between S. cerevisiae and S. eubayanus (Martini and Martini 1987; Dunn and Sherlock 2008; Libkind et al. 2011; Nguyen et al. 2011; Perez-Traves et al. 2014). These species highlight the observation that fermentation environments are particularly rich in hybrids, spanning genera including Saccharomyces, Dekkera, and Pichia (Borneman et al. 2014; Smukowski Heil et al. 2018a).
Similar to plant hybrids (reviewed in Chester et al. 2010), yeast hybrids can shed large portions of their genomes from one or both species during evolution (Otto and Whitton 2000; Sun and Xu 2009; Chester et al. 2010; Csoma et al. 2010; Louis et al. 2012; Peris et al. 2012; Pryszcz et al. 2015; Chen et al. 2017; Emery et al. 2018). Resolution of the ancestral whole genome duplication in Saccharomyces involved loss of the majority of duplicated genes, in a process that began shortly after the initial hybridization event (Scannell et al. 2007). These large-scale changes in genome structure and content have been recapitulated in part in the laboratory, demonstrating the rapidity with which these changes can occur and confirming their potential to contribute to adaptation. For example, experimental evolution of yeast hybrids under a number of selective conditions found genome rearrangements after only a few hundred generations (Kunicka-Styczynska and Rajkowska 2011; Piotrowski et al. 2012; Sanchez et al. 2017a; Smukowski Heil et al. 2017). The genome regions affected depend on the selective pressure used, and observed events include whole chromosome aneuploidy, both focal and chromosome arm amplifications, translocations, gene fusions, and LOH. Our previous work demonstrated that LOH can result from selection on one species' allele and loss of the other. Using a candidate gene approach, we identified a single gene (PHO84) whose allelic differences explained the majority of the fitness benefit in evolved populations relative to their ancestor. However, additional, as yet unidentified driver genes must exist to fully account for the evolved strains’ fitness improvements, and many observed LOH regions remain unexplained. More broadly, the degree to which LOH is a product of genetic drift versus selection is not yet clear.
To further complicate matters, improved fitness caused by LOH could have a number of possible explanations, including selecting for the better species’ alleles, uncovering beneficial recessive alleles, and/or resolving hybrid incompatibilities. Studying LOH in hybrid yeast allows a systematic approach that facilitates insights into these phenomena and provides a foundation to guide further investigation into hybrid biology in other, less tractable, contexts.
Such systematic approaches have been made possible via the creation of genome-scale deletion collections, including a near-comprehensive set of diploid S. cerevisiae strains hemizygous for every gene (Giaever et al. 2002; Deutschbauer and Davis 2005). These strains were created such that each carries a unique DNA barcode, facilitating pooled assays for competitive growth in a variety of conditions. Many studies have illustrated that heterozygous deletions can cause fitness defects (“haploinsufficiency”), and a smaller number have also found fitness increases (“haploproficiency”) (Delneri et al. 2008; Pir et al. 2012; Ohnuki and Ohya 2018). Previously, in order to determine driver mutations, our lab identified haploinsufficient and haploproficient loci in the deletion collection in environments matching laboratory evolution studies (Payen et al. 2016). However, since these loci were identified in S. cerevisiae diploids, the degree to which they explain the prevalence of, and genetic drivers for, hemizygosity in hybrids is unclear. There is reason to believe that loci important for hybrid adaptation are likely to differ from those important in purebred diploids. For example, Herbst et al. (2017) found that in S. paradoxus x S. cerevisiae hybrids, hundreds of allelic deletions decreased the growth rate of hybrids but not of S. cerevisiae diploids.
In order to understand hybrid LOH, in this study we utilized two divergent species: S. cerevisiae and S. uvarum. We previously evolved hybrids and diploids of these species in nutrient-limited chemostat culture (Gresham et al. 2008; Sanchez et al. 2017a; Smukowski Heil et al. 2017; 2018b). We created thousands of S. cerevisiae x S. uvarum hybrid yeast strains by mating the S. cerevisiae nonessential deletion collection to WT S. uvarum. These collections of hybrid yeast, along with control populations, were assayed for competitive fitness in three nutrient-limited environments matched to our previous evolution conditions. We find that haploproficiency is common, and more loci are haploproficient in hybrids than in S. cerevisiae diploids, giving hybrids more opportunities to adapt via this mechanism. Specific haploproficient loci rarely overlap between S. cerevisiae diploids and interspecific hybrids, indicating that simple gene dosage changes are unlikely to explain their adaptive benefit, and/or that dosage sensitivity is strongly dependent on species context. Furthermore, the specific loci are largely private to single selection environments, in agreement with our previous experimental evolution findings that repeated occurrence of LOH at specific regions was also condition-specific. Finally, fitness effects are allele-specific – fitness consequences of deletion of the S. cerevisiae allele in the hybrid had no correlation with the fitness consequences of deletion of the S. uvarum allele. Again, these results are consistent with prior observations of species preference for LOH in evolved hybrids. They argue that if relief of genetic incompatibilities is a relevant mechanistic explanation for adaptation, then such incompatibilities must be acting in an allele-specific manner.
Our study demonstrates that hybrids offer a unique fitness landscape with potentially more beneficial mutations, which may contribute to their unique ability to adapt, and it provides attractive candidate genes for future study.
RESULTS
We sought to discover the genome-wide fitness effects of hemizygosity in hybrid Saccharomyces in three nutrient-limited conditions that correspond with those previously used for experimental evolution. To this end, we mated S. uvarum to the S. cerevisiae haploid deletion collection, creating thousands of hybrid yeast strains, each with an S. cerevisiae allele deleted. For comparison, we used a matched collection of S. cerevisiae hemizygous deletion strains. Additionally, we created control collections of thousands of WT S. cerevisiae and WT hybrid strains that contain unique DNA barcodes but are otherwise isogenic. (All collections are described in Table 1.) These control libraries allowed us to empirically measure technical and biological variation in our strain construction, growth, and sequencing pipeline. All strains were assayed for relative fitness via pooled competitive growth for 25 generations in glucose-, phosphate-, and sulfate-limited chemostat culture followed by barcode sequencing. The barcode counts track strain abundance over time, allowing us to derive competitive fitness scores (see Materials and Methods, Supplemental Table S1). Each experiment was performed in biological replicate (Supplemental Fig. S1). We confirmed that this pooled approach accurately reflects strain fitness by comparing the results to pairwise competitions of individual deletion strains vs. a GFP-marked WT competitor (Supplemental Fig. S2, Supplemental Table S2).
Fitness effects of hemizygosity in S. cerevisiae diploids
Both collections of control WT strains have narrow fitness distributions around neutrality, with 98% of the S. cerevisiae controls falling between fitness values of 0.047 and -0.040, and 98% of hybrid controls between 0.046 and -0.032, across all experiments (Supplemental Fig. S3). We used these empirical 1% cutoffs to determine significant increases and decreases in fitness of the deletion strains. Out of a total of 6,003 possible deletion strains, we identified 4,806 strains by barcode sequencing in the glucose-limited competition, 4,855 strains in phosphate limitation, and 4,901 strains in sulfate limitation. Compared to the WT control distribution, hemizygous gene deletions in S. cerevisiae caused a broader distribution of fitness effects. The null expectation for 1% cutoffs would be 48, 49, and 49 outliers in each direction for glucose, phosphate, and sulfate limitations, respectively. We observe significantly more deletion strains with fitness values beyond our cutoffs in several conditions: 308 haploinsufficient genes and 64 haploproficient genes in glucose limitation (p<2.2×10^-16, p=0.19); 163 and 5 in phosphate limitation (p=4.3×10^-15, p=4.4×10^-9, the latter being fewer than expected); and 58 and 113 in sulfate limitation (p=0.44, p=6×10^-7; Comparison of Two Population Proportions performed in R, with Yates continuity correction). Thus we conclude that in the S. cerevisiae hemizygous collection, there are more deletions that cause extreme fitness effects than we would expect by chance, consistent with our previous results (Payen et al. 2016).
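The outlier analysis described above can be illustrated with a short sketch. This is not the authors' pipeline (the reported statistics were computed in R). It derives empirical cutoffs from a control fitness distribution, computes the null expectation of outliers at a 1% tail, and runs a chi-square test on two proportions with Yates continuity correction. The function names, the nearest-rank percentile rule, and the choice to compare the observed outlier count against the expected count out of the same number of strains are assumptions made for illustration, so the resulting p-values need not match the reported ones.

```python
import math

def empirical_cutoffs(control_fitness, tail=0.01):
    """Return (lower, upper) cutoffs leaving `tail` of controls in each direction.

    Nearest-rank rule; an illustrative assumption, not the paper's exact method.
    """
    values = sorted(control_fitness)
    n = len(values)
    k = int(math.floor(tail * n))  # number of controls allowed in each tail
    return values[k], values[n - 1 - k]

def expected_outliers(n_strains, tail=0.01):
    """Null expectation of outliers in one direction for a given tail fraction."""
    return round(tail * n_strains)

def two_proportion_yates(x1, n1, x2, n2):
    """Chi-square test (1 df) on a 2x2 table with Yates continuity correction."""
    a, b, c, d = x1, n1 - x1, x2, n2 - x2
    total = n1 + n2
    diff = max(abs(a * d - b * c) - total / 2.0, 0.0)
    chi2 = total * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Strain counts identified in each S. cerevisiae competition (from the text).
identified = {"glucose": 4806, "phosphate": 4855, "sulfate": 4901}
null_counts = {cond: expected_outliers(n) for cond, n in identified.items()}

# Example: 58 haploinsufficient strains in sulfate limitation vs. the null count.
chi2, p = two_proportion_yates(58, identified["sulfate"],
                               null_counts["sulfate"], identified["sulfate"])
```

Under these assumptions, the null counts come out as 48, 49, and 49, matching the expectations quoted in the text.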
Fitness effects of LOH are more extreme in hybrids
We applied this analysis to the hybrid deletion strains. Out of 4,828 possible deletion strains (a lower number than above because only nonessential S. cerevisiae gene deletions can be used), we identified 3,195 deletion strains in sulfate limitation, 3,179 in phosphate limitation, and 2,955 in glucose limitation. Under the null expectation of 1%, we would expect 32 outliers in each direction in sulfate-limited culture, 32 in phosphate-limited, and 30 in glucose-limited. However, in the hybrids we observed 308 haploinsufficient genes and 63 haploproficient genes in glucose limitation (p<2.2×10^-16, p=0.0008); 919 and 453 in phosphate limitation (p<2.2×10^-16, p<2.2×10^-16); and 216 and 17 in sulfate limitation (p<2.2×10^-16, p>0.05; Comparison of Two Population Proportions performed in R, with Yates continuity correction). The differences between nutrient limitations are illustrated in the different shapes of the distributions (Supplemental Figure S3). In phosphate limitation, the highest fitness strains had risen to an abundance >1.5% of the population by the final time point, over two orders of magnitude above their initial frequency.
Hybrid deletion mutants had a significantly broader range of fitness values than deletions of the same loci in the S. cerevisiae context (Supplemental Figure S3; Levene test p=1.9×10^-12, p<2.2×10^-16, p=8.6×10^-6, for glucose, phosphate, and sulfate limitations respectively), suggesting that loss of one allele in hybrids leads to more extreme fitness outcomes, in both directions. We next compared genes that had significant fitness effects in hybrids to those in the S. cerevisiae diploids. Although there were some genes that were consistent between genetic backgrounds, correlation was low, and some gene deletions even had inverse effects (Fig. 1). Consistent with these findings, the genes identified in the hybrid and S. cerevisiae diploid datasets had different GO enrichments (Supplemental Tables S3 and S4).
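To make the variance comparison concrete, the sketch below computes a Levene-type W statistic for two groups of fitness values. It is only an illustration: the fitness values are hypothetical, centering on the group median (the Brown-Forsythe variant) is an assumption, and the p-value is omitted because it requires an F distribution that the Python standard library does not provide.

```python
from statistics import mean, median

def levene_w(groups, center=median):
    """Levene-type W statistic for equality of variances across groups.

    Each observation is replaced by its absolute deviation from the group
    center, then a one-way ANOVA F ratio is computed on those deviations.
    """
    z = [[abs(x - center(g)) for x in g] for g in groups]
    n_total = sum(len(g) for g in z)
    k = len(z)
    grand_mean = mean(v for g in z for v in g)
    group_means = [mean(g) for g in z]
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(z, group_means))
    within = sum((v - m) ** 2 for g, m in zip(z, group_means) for v in g)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical fitness values for the same loci in two genetic backgrounds.
diploid = [-0.03, -0.01, 0.00, 0.01, 0.02, 0.03]
hybrid = [-0.30, -0.10, 0.00, 0.05, 0.15, 0.40]
w = levene_w([diploid, hybrid])
```

A larger W indicates a greater difference in spread between the groups; under the null hypothesis W follows an F distribution with (k - 1, N - k) degrees of freedom.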
Highlighting the effect of genetic background, many of the haploinsufficient alleles in the purebred genetic context are alleviated in the hybrid context, defined as an increase in fitness of at least 0.04, which is the 95% cutoff for the purebred collection. In glucose, phosphate, and sulfate limitations there are 93, 54 and 44 such alleviations of haploinsufficiency, respectively. This represents an alleviation rate of 30% and 34% in phosphate- and glucose-limited media, and 76% alleviation in sulfate-limited media. GO enrichments for these alleviated gene deletions include cytosolic ribosomal subunit and ubiquinone metabolic process in glucose limitation; retrograde transport, endosome to Golgi, and large ribosomal subunit for phosphate limitation; and the core mediator complex in sulfate limitation (all p-values<0.05 with a Bonferroni step-down correction).
Fitness effects of LOH are condition-specific
We next looked across environments to determine whether hemizygosity caused similar fitness effects in all conditions, or whether effects were condition-specific. We calculated the variance in fitness values across conditions of 2,775 gene deletions present in all 6 mass competitions (hybrid and purebred deletion collections competed in the 3 nutrient limitations). Hybrid deletion strains had significantly larger variance between conditions relative to their purebred counterparts (t-test p<2.2×10^-16). In the hybrid genetic context, 92 deletion strains showed antagonistic pleiotropy, that is, low fitness in one condition and high fitness in another. These genes were enriched for GO terms gene expression and RNA metabolic process (p=2.5×10^-6 and p=4.3×10^-6), suggesting that differential expression may contribute to this pleiotropic phenotype.
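The per-gene comparison across conditions can be sketched as follows. The gene names and fitness values are hypothetical. Using the sample variance, and using the hybrid control cutoffs quoted earlier (-0.032 and 0.046) as the thresholds for "low" and "high" fitness, are assumptions rather than the authors' stated procedure.

```python
from statistics import variance

# Hybrid WT control cutoffs quoted in the Results; reusing them as the
# thresholds for low/high fitness is an assumption made for this sketch.
LOW_CUTOFF, HIGH_CUTOFF = -0.032, 0.046

def summarize_gene(fitness_by_condition, low=LOW_CUTOFF, high=HIGH_CUTOFF):
    """Return across-condition variance and an antagonistic pleiotropy flag."""
    values = list(fitness_by_condition.values())
    return {
        "variance": variance(values),  # sample variance across conditions
        "antagonistic": min(values) < low and max(values) > high,
    }

# Hypothetical deletion strains measured in the three nutrient limitations.
strains = {
    "geneA": {"glucose": -0.10, "phosphate": 0.20, "sulfate": 0.00},
    "geneB": {"glucose": 0.01, "phosphate": 0.02, "sulfate": -0.01},
}
summary = {gene: summarize_gene(f) for gene, f in strains.items()}
```

In this toy example, geneA is flagged as antagonistically pleiotropic (deleterious in glucose limitation, beneficial in phosphate limitation), while geneB is near neutral everywhere.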
One consequence of these patterns is that fitness in one nutrient limitation did not predict fitness in the others (Fig. 2). No loci showed consistent fitness differences across all three environments and in both genetic backgrounds. However, 89 deletions caused fitness deficits in two of the media, with no effect in the third (Supplemental Table S5), and 22 genes were haploproficient in two media and neutral in the third (Table 2). These genes are of particular interest because they may allow hybrid strains to adapt to multiple or heterogeneous environments. We previously found mutations in one of these genes, MHR1, in two phosphate-limited evolved populations (Smukowski Heil et al. 2017), showing the efficacy of this approach in finding potential driver mutations.
Fitness effects of LOH are allele-specific
In our previous work, we found that loss of heterozygosity at the PHO84 locus was beneficial when either allele was lost, i.e., heterozygosity itself had a cost (Smukowski Heil et al. 2017). To determine whether this phenomenon is widespread in our genome-scale dataset, we performed reciprocal hemizygosity analyses (Steinmetz et al. 2002). We deleted 11 S. uvarum genes that represented a broad range of fitness values (Supplemental Table S6), and mated these strains to WT S. cerevisiae, creating a set of deletion strains reciprocal to those in our original experiment. We then competed each strain against a WT hybrid labeled with GFP in the indicated nutrient limitation. With one exception (TPK3), the fitness values of these experiments were uncorrelated with those obtained with the corresponding S. cerevisiae allele deleted (Supplemental Table S6; Fig. 3).
Candidate genes driving LOH in experimental evolution
Together, these results show that beneficial mutations in hybrids cannot be predicted on the basis of screens performed in S. cerevisiae alone. LOH events may be tens of kilobases long and include hundreds of genes, making it impractical to use single gene approaches to understand such events. Instead, we used this genome-wide hybrid screen to identify candidate beneficial mutations implicated in LOH events from evolved strains (Smukowski Heil et al. 2017). Six out of 16 hybrid strains contained a total of nine LOH regions, and four of these regions eliminated the S. cerevisiae portion of the genome. These strains have fitness benefits compared to the fully heterozygous ancestor strain. One strain containing two LOH regions was evolved in sulfate limitation, where adaptation is largely dominated by amplification of the SUL1 sulfate transporter gene (Brewer et al. 2015; Sanchez et al. 2017a). In this strain, we did not identify any single candidate gene deletions that had fitness benefits above our control strain cutoff of 0.046 in either region, consistent with SUL1 amplification, rather than LOH, being the main driver of adaptation in this condition. However, in phosphate limitation, we found candidate driver mutations for the regions deleted in both evolved strains (Table 3). Similar to our results for the whole dataset, none of these candidate gene deletions were beneficial in other nutrient limitations in the hybrid context, or in any nutrient limitation in the S. cerevisiae diploid. They also spanned a diverse variety of biological processes, and included a gene of unknown function.
DISCUSSION
LOH is prevalent in hybrid genomes across taxa. In our previous work, we observed these events arising quickly in interspecific yeast hybrids during only a few hundred generations of laboratory selection (Smukowski Heil et al. 2017; 2018b). In this study, we found specific gene deletions within these regions that might contribute to the fitness benefits enjoyed by these evolved strains, and we broadened our analysis to the whole genome to determine how hemizygous deletions behave more generally. We find that hybrids are more likely than purebred diploids to benefit from hemizygous deletions (but also to suffer fitness penalties). However, these benefits are complex – hemizygous deletions are largely condition-, allele-, and species-specific. Our results suggest that LOH may be an attractive means by which hybrids can adapt to strong, narrow selection pressures, but at the cost of reduced fitness in alternate environments. Industrial and fermentation environments, where yeast hybrids are successful, might provide exactly such a scenario (Hittinger 2013; Krogerus et al. 2017; Krogerus et al. 2018).
Though we have focused here on beneficial gene deletions, we note that other groups have used similar datasets to look at mutations that decrease fitness (Herbst et al. 2017; Weiss et al. 2018). We largely confirm the patterns observed in S. paradoxus x S. cerevisiae hybrids, though the different conditions utilized across studies make direct comparisons complicated.
We note several points for improvement and further study. First, while we did observe some significant gene ontology enrichments among gene deletions with shared fitness characteristics (for example differences in ribosomal structure/function and expression differences may contribute to differential fitness in various genetic and environmental contexts, respectively), the lack of strong biological process enrichments provided no simple rationale for the molecular explanations underlying these effects. Combining our data with other results collected from hybrids, such as protein-protein interactions (Chretien et al. 2018) may help provide such explanations. Expanding our study to additional genes would also be desirable. Due to the method we used to generate the hybrid deletion strains, we were limited to genes that are nonessential in S. cerevisiae. Though we deleted a small number of S. uvarum genes to compare with the orthologous S. cerevisiae allele deletions, we did not explore S. uvarum genes genome wide. We hope to remedy this in the future using our recently created S. uvarum insertional mutagenesis library (Sanchez et al. 2017b). In vivo transposition is another potential approach that has recently been applied to S. paradoxus x S. cerevisiae hybrids (Weiss et al. 2018). Both these approaches even have some advantages over the S. cerevisiae deletion collection, which has potential problems with suppressor mutations, mutation accumulation, and aneuploidy (Hughes et al. 2000; Teng et al. 2013; van Leeuwen et al. 2016).
Finally, we have not been able to determine how multiple genes within a segment combine to generate their full fitness consequence, a topic that has bedeviled the aneuploidy field more broadly (Solimini et al. 2012; Davoli et al. 2013; Sunshine et al. 2015; Dodgson et al. 2016; Iyer et al. 2018). Comparison of the fitness values of the evolved hybrids with the fitness of single gene deletions revealed cases where a simple additive model either under- or overestimated the evolved strain fitness (analysis not shown; data in Table 2 and in Smukowski Heil et al. 2017). However, the evolved strains are an imperfect basis for comparison since they contain other mutations in addition to the hemizygous region. Alternative selection methods for recovering high fitness strains with a minimal number of additional mutations present one possible approach (Bellon et al. 2018). An even better approach might be to create hybrids with segmental monosomy and test their fitness directly. Such strains could be engineered using chromosome fragmentation vectors or recombinase-based approaches such as the Sc2.0 shuffle system (Morrow et al. 1997; Dymond et al. 2011; Shen et al. 2018).
METHODS
Strains and collections
The S. cerevisiae strain collections are described in Payen et al. (2016).
All S. uvarum strains are derived from the reference strain CBS 7001, sometimes identified as S. bayanus var uvarum.
Hybrid collections were made by spreading 200 μl mid log S. uvarum lys2 MATα on solid YPD omni plates then spotting the appropriate haploid MATa collection (deletion or barcoder) using a pinner with 96 arrayed pins. After overnight growth, colonies were transferred to selective solid minimal media to ensure only hybrid growth. These plates were then transferred to liquid selective media and pooled.
S. uvarum deletion strains were provided by Sarah Bissonnette and Jasper Rine (UC Berkeley) (Sanchez et al. 2017b).
Fitness assays and barcode sequencing
All collections were grown as pools in duplicate in three different chemostat media: glucose limited, phosphate limited, and sulfate limited. All S. cerevisiae competition experiments were taken from Payen et al. (2016), where the protocol is described in detail. Briefly, pools were inoculated into 240 ml chemostats and grown for 24 hours, after which peristaltic pumps were turned on at a dilution rate of ~0.17 volumes per hour (Payen et al. 2016). Samples were taken from chemostats twice a day. From these samples, the unique DNA barcodes from each collection were PCR amplified, with each time point having a unique Illumina adapter incorporated during PCR amplification. The barcodes were then sequenced on an Illumina Genome Analyzer IIx. The fitness of each strain was calculated from the natural log of the change in its proportional barcode frequency over 25 generations. We required a minimum of 100 barcode counts for a strain to be considered identified. For the strains used in the reciprocal hemizygosity analysis, this minimum was reduced to 44 counts for comparison to the mass competitions, because manual curation of these strains ensured no false positives. Many DNA barcode sequences for the WT barcoded collection could be determined only by examining overrepresented sequences in the sequencing data, and these were used for analysis.
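As a rough illustration of how barcode counts translate into fitness, the sketch below computes a per-generation log change in proportional barcode frequency between two time points. The strain names and counts are hypothetical. Dividing by the number of generations and applying the 100-count minimum to the first time point are assumptions made for this example; the full protocol is described in Payen et al. (2016) and may differ in detail.

```python
import math

MIN_COUNTS = 100  # minimum barcode counts required for a strain

def barcode_fitness(counts_start, counts_end, generations=25,
                    min_counts=MIN_COUNTS):
    """Per-generation ln change in proportional barcode frequency.

    Strains below `min_counts` at the start, or absent at the end,
    are skipped (an assumption about how the threshold is applied).
    """
    total_start = sum(counts_start.values())
    total_end = sum(counts_end.values())
    fitness = {}
    for strain, c0 in counts_start.items():
        c1 = counts_end.get(strain, 0)
        if c0 < min_counts or c1 == 0:
            continue
        freq_start = c0 / total_start
        freq_end = c1 / total_end
        fitness[strain] = math.log(freq_end / freq_start) / generations
    return fitness

# Hypothetical barcode counts at the first and last time points.
start = {"del_a": 1000, "del_b": 1000, "del_c": 50}
end = {"del_a": 2000, "del_b": 500, "del_c": 50}
scores = barcode_fitness(start, end)
```

Here del_a increases in frequency (positive score), del_b decreases (negative score), and del_c is excluded because it falls below the count threshold.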
Individual competition experiments were done in the respective media in 20 mL chemostats, with each strain competed against a single WT clone with a GFP label. Relative strain abundance was monitored using a BD Accuri C6 flow cytometer. Fitness was determined as the slope of the linear regression of ln(dark cells/GFP cells) on generations.
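The regression used for pairwise competitions can be sketched with an ordinary least-squares slope. The cell counts below are hypothetical, and the function name is illustrative rather than part of any published pipeline.

```python
import math

def competition_fitness(generations, dark_cells, gfp_cells):
    """Least-squares slope of ln(dark/GFP) regressed on generations."""
    y = [math.log(d / g) for d, g in zip(dark_cells, gfp_cells)]
    n = len(generations)
    mean_x = sum(generations) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in generations)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(generations, y))
    return sxy / sxx

# Hypothetical flow cytometry counts: the unlabeled (dark) strain gains
# 5% per generation relative to the GFP-labeled WT competitor.
gens = [0, 5, 10, 15]
gfp = [1000.0] * len(gens)
dark = [1000.0 * math.exp(0.05 * g) for g in gens]
slope = competition_fitness(gens, dark, gfp)
```

A positive slope means the unlabeled strain outcompetes the GFP-marked reference; a negative slope means it is less fit.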
GO Enrichments in the Dataset
GO enrichments were determined using the ClueGO application in Cytoscape (Bindea et al. 2009), and using the total strains identified in our experiments as the background population. Outliers were determined using a 1% cutoff in each direction based on the WT barcoded collection. All ontologies were corrected for multiple comparisons with a Bonferroni step down analysis.
Statistics
Statistical measures unless otherwise stated were performed in R. The statistics used are stated in the results adjacent to p-values.
DATA ACCESS
Sequencing data are available via BioProject Accession PRJNA283983.
DISCLOSURE DECLARATION
The authors declare no conflicts of interest.
ACKNOWLEDGEMENTS
Thanks to Caitlin Connelly for her contributions to the initiation of this project. Thanks to Corey Nislow, Jasper Rine, and Sarah Bissonnette for sharing strains.