Abstract
Experimental evolution is often highly repeatable, but the underlying causes are generally unknown, which prevents extension of evolutionary forecasts to related species. Data on adaptive phenotypes, mutation rates and targets from the Pseudomonas fluorescens SBW25 Wrinkly Spreader system combined with mathematical models of the genotype-to-phenotype map allowed evolutionary forecasts to be made for several related Pseudomonas species. Predicted outcomes of experimental evolution in terms of phenotype, types of mutations, relative rates of pathways and mutational targets were then tested in Pseudomonas protegens Pf-5. As predicted, most mutations were found in three specific regulatory pathways resulting in increased production of Pel exopolysaccharide. Mutations were, as predicted, mainly found to disrupt negative regulation with a smaller number in upstream promoter regions. Mutated regions in proteins could also be predicted, but most mutations were not identical to those previously found. This study demonstrates the potential of short-term evolutionary forecasting in experimental populations.
Impact statement Conservation of genotype-to-phenotype maps allows successful prediction of short-term evolution in P. protegens Pf-5 and lays the foundation for evolutionary forecasting in other Pseudomonas.
Introduction
An increasing number of experimental evolution studies, primarily using microbes, have provided insights into many fundamental questions in evolutionary biology including the repeatability of evolutionary processes (Barrick and Lenski 2013; Jerison and Desai 2015; Long, et al. 2015; Orgogozo 2015; Blount, et al. 2018). Given the ability to control environmental conditions as well as population size and the use of a single asexual organism, such studies could provide an ideal test of our ability to predict evolutionary outcomes in simplified model systems. High repeatability at both the phenotypic and genetic levels has been observed in a large number of experimental evolution studies (Wichman, et al. 1999; Conrad, et al. 2009; Lee and Marx 2012; Tenaillon, et al. 2012; Barrick and Lenski 2013; Ferguson, et al. 2013; Herron and Doebeli 2013; Blank, et al. 2014; McElroy, et al. 2014; Fraebel, et al. 2017; Kram, et al. 2017; Blount, et al. 2018; Knöppel, et al. 2018), but it has become clear that high repeatability alone is not sufficient for testing evolutionary predictability beyond the prediction that under identical conditions the same evolutionary outcome is likely. The difficulties of moving from repeatability to predictability are largely a result of the lack of knowledge of the genotype-phenotype-fitness map (Figure 1).
There are several problems that need to be solved to develop a model system for true testing of predictive ability. In some cases adaptive mutations are highly strain specific, so that, for example, adaptation of different strains to a specific environment will produce different results. This is sometimes due to the long history of sub-culturing under laboratory conditions combined with rounds of mutagenesis that has caused, for example, many E. coli and Salmonella strains to accumulate diverse mutations, some of which are rapidly compensated by secondary mutations restoring fitness (Barrick, et al. 2009; Tenaillon, et al. 2012; Knöppel, et al. 2018). Thus, in many cases conclusions from one strain cannot be extended to another because of differences in their genotype-to-phenotype maps (Figure 1C).
Another problem for testing predictability is that in many cases it is not possible to design an experiment where one specific selective pressure is dominant. For example, experiments with intended adaptation to high temperature (Tenaillon, et al. 2012) or freeze-thaw-growth cycles (Sleight, et al. 2008) result in similar mutations in uspA, which may indicate adaptation to the medium used (Knöppel, et al. 2018) or generally stressful conditions. Relatively minor changes in environmental conditions can also result in divergent mutational patterns (Deatherage, et al. 2017). This means that the range of possible adaptive phenotypes cannot be defined beforehand (Figure 1B) and that in many cases the phenotypes that solve the intended selective problem are outcompeted by other phenotypes with increased fitness (Figure 1A).
A highly specific selective pressure can be applied by selection for antibiotic resistance, and mutation targets are often highly conserved between different strains and species and between laboratory and natural populations (O’Neill, et al. 2006; Schenk, et al. 2012; Brandis, et al. 2015; Jahn, et al. 2017; Lukacisinova and Bollenbach 2017; Sommer, et al. 2017). However, resistance phenotypes are typically explained solely by the molecular phenotype of a single protein and no alternative pathways to resistance are known, resulting in a relatively simple parameter and genotype space (Figure 1C, Figure 1D). Thus the prediction will be identical for all species and it cannot provide a test of prediction from general principles. In many cases mutants isolated after selection for high-level antibiotic resistance also lack the complexity that is inherent to many phenotypic traits where the genotype-to-phenotype map involves a large number of functional interactions and complex regulation (Figure 1C). This complexity comes to light in that mutations allowing adaptation to new environments are commonly found in global regulators of gene expression such as genes involved in the stringent response, DNA binding proteins, supercoiling and core genes for RNA and protein synthesis (Barrick, et al. 2009; Conrad, et al. 2009; Kishimoto, et al. 2010; Tenaillon, et al. 2012; Herron and Doebeli 2013; Sandberg, et al. 2014; LaCroix, et al. 2015; Deatherage, et al. 2017). The physiological effects of these mutations are diverse, sometimes affecting the expression of hundreds of genes, making the elucidation of the molecular underpinnings of the adaptive phenotype (Figure 1C, 1D) extremely complex and thus difficult to use for predictive modeling.
The wrinkly spreader model in P. fluorescens SBW25 (hereafter SBW25) is one of the best-characterized experimental evolution systems and has several properties that could make it possible to extend knowledge and principles from this species to related species (Rainey and Travisano 1998; Spiers, et al. 2002; Spiers, et al. 2003; Spiers and Rainey 2005; Goymer, et al. 2006; Bantinaki, et al. 2007; McDonald, et al. 2009; Silby, et al. 2009; Ferguson, et al. 2013; Lind, et al. 2015, 2017b; Lind, et al. 2019). When the wild type SBW25 is placed into a static growth tube the oxygen in the medium is rapidly consumed by growing bacteria (Figure 2A). However oxygen levels at the surface are high and mutants that are able to colonize the air-liquid interface have a major growth advantage and rapidly increase in frequency (Figure 2A). Several phenotypic solutions to air-liquid interface colonization, all involving increased cell-cell adhesion, have been described and are distinguishable by their colony morphology on agar plates (Figure 2A, 2B) (Rainey and Travisano 1998; Ferguson, et al. 2013; Lind, et al. 2017b). The most successful of these is the Wrinkly Spreader (WS) (Ferguson, et al. 2013; Lind, et al. 2017b) that overproduces a cellulosic polymer that is the main structural component of the mat at the air-liquid interface (Spiers, et al. 2002; Spiers, et al. 2003). The WS phenotype is caused by mutational activation of c-di-GMP production by a diguanylate cyclase (DGC) (Figure 2C) (Goymer, et al. 2006). While many different DGCs can be activated to reach the WS phenotype, some are greatly overrepresented due to larger mutational target sizes leading to a hierarchy of genetic routes to WS (Figure 2D) (McDonald, et al. 2009; Lind, et al. 2015). The genotype-to-phenotype map to WS has been characterized in detail (Goymer, et al. 2006; McDonald, et al. 2009; Lind, et al. 2019) allowing the development of mathematical models of the three main pathways to WS (Wsp, Aws and Mws) and the prediction of evolutionary outcomes (Figure 2E) (Lind, et al. 2019).
This study makes initial forecasts of phenotypic and genetic evolutionary outcomes after static experimental evolution for six Pseudomonas species based mainly on their genome sequence and data from SBW25 (McDonald, et al. 2009; McDonald, et al. 2011; Ferguson, et al. 2013; Lind, et al. 2015, 2017b). Predictions of evolutionary outcomes were then experimentally tested for the closely related species, P. protegens Pf-5 (hereafter Pf-5) with a highly conserved genetic repertoire of DGCs but that lacks the main structural component used by WS types in SBW25. Results show that phenotypes, order of pathways used and types of mutations can be predicted and that forecasts are robust to changes in environmental conditions.
Results
Six Pseudomonas species (Figure 3 legend) were chosen based on phylogenetic diversity and their complement of DGCs and EPSs for a first round of predictions. These species encode from none to all three of the main DGCs used in SBW25 and only three species contain genes related to cellulose biosynthesis, the main EPS used in SBW25 (Figure 3). Full details are available in Figure 3 - source data.
Ecotype predictions
Given the range of ways that cells can achieve increased adherence and surface colonization by use of different EPSs, LPS modification and cell chaining, as demonstrated by the studies with SBW25 (Spiers, et al. 2002; Ferguson, et al. 2013; Lind, et al. 2017b), all species are expected to colonize the air-liquid interface if access to oxygen is limiting for growth. This could be achieved simply by changes in gene expression in the wild type, but for an experimental evolution study a mutational solution is sought and environmental conditions are chosen so that the wild type strain does not colonize the air-liquid interface. However, changing the environment presents a further challenge because, as discussed above, it often leads to a different spectrum of adaptive mutations. Thus a foundational requirement for an extended experimental evolution system to be successful for different species is that the evolutionary solutions are robust to differences in environmental conditions.
Phenotype predictions
Several different phenotypic solutions can be used to colonize the air-liquid interface in SBW25 including at least two different EPSs (cellulose (WS) and PGA (PWS)), LPS modification (fuzzy spreaders, FS) and cell chaining (CC) (Spiers, et al. 2002; Ferguson, et al. 2013; Lind, et al. 2017b). However wrinkly spreaders that form cellulose-based mats are superior and rapidly outcompete all other types (Ferguson, et al. 2013; Lind, et al. 2017b). Based on the limited data available it is predicted that cellulose-based biofilms are superior in other species as well and that they will be the primary structural solution when available, as for P. syringae, P. putida and P. stutzeri. For the three species lacking genes for cellulose biosynthesis, other EPSs are predicted to be used. Based on studies of P. aeruginosa the primary EPS required for pellicle formation at the air-liquid interface in this species is Pel, encoded by the pelABCDEFG operon, which is also present in the Pf-5 genome and is predicted to be the primary phenotypic solution for these species. The genome of P. savastanoi lacks genes for biosynthesis of cellulose and Pel as well as other EPSs that are known to be able to support mat-formation, such as PGA, and there is not sufficient data at this point to make a prediction of which one is likely to be the primary phenotypic solution.
Overexpression of EPSs used for mat-formation at the air-liquid interface is, in SBW25 and P. aeruginosa, linked to mutations increasing c-di-GMP production rather than mutations in the promoters of, or genes in, the EPS operons themselves. This can be explained by the role of post-translational regulation by c-di-GMP in the production of cellulose, Pel, PGA and alginate (Lee, et al. 2007; Römling, et al. 2013; Steiner, et al. 2013; Morgan, et al. 2014; Liang 2015; Whitney, et al. 2015). In these cases transcriptional up-regulation alone is not sufficient to cause overproduction because of the lack of a c-di-GMP signal. Possibly there is also an additional benefit to using activation of the c-di-GMP network in that it reduces motility, which is not needed when established at the air-liquid interface and which consumes a large amount of energy to sustain and thus is likely to be selected against (Koskiniemi, et al. 2012; Lee and Marx 2012).
Prediction of types of mutations
Disabling mutations are expected to be more common than enabling mutations and therefore the prediction is that most mutations will be in genes where loss-of-function mutations produce an adaptive phenotype (Figure 2D) (Lind, et al. 2015). This is the case for the large majority of mutations in SBW25 including those activating the main DGCs WspR, AwsR and MwsR, which are all under negative regulation, as well as disruption of the genes underpinning the FS phenotype (fuzY, PFLU0478) and CC phenotype (nlpD, PFLU1301) (McDonald, et al. 2009; Ferguson, et al. 2013; Lind, et al. 2017b). Next in the hierarchy of mutations are promoter mutations, increasing transcription, and promoter capture events (Lind, et al. 2015). Less common are intragenic activating mutations that enable a particular function by, for example, an increase in catalytic activity or a strengthening of interactions with another molecule or another domain of the same protein (Lind, et al. 2015). Gene duplications occur at a high rate and clearly have the ability to increase gene expression of DGCs, but they have not yet been found to cause WS in SBW25, possibly because a two-fold increase in gene expression is insufficient.
Prediction of pathways used
There are at least 16 different pathways to the WS phenotype in SBW25 with similar fitness, but they are used at frequencies that vary over several orders of magnitude based on their differing capacity to translate genotypic variation into phenotypic variation (Figure 2D) (Lind, et al. 2015). Mutations in three pathways, Wsp, Aws, and Mws, account for >98% of WS mutations and, based on a detailed understanding of the molecular functions of the genes involved in each pathway, mathematical models predicting the relative rates at which the pathways should be used were constructed (Figure 2E) (Lind, et al. 2019). The predictions vary depending on the rates of disabling and enabling mutations, but if it is assumed that disabling mutations are an order of magnitude more common than enabling mutations the models predict that Wsp will account for about 54%, Aws 30% and Mws 16% of the WS mutations (Figure 2E) (Lind, et al. 2019).
When the three common pathways are deleted WS types evolve mainly by mutations in PFLU0085, which contains an intragenic negative regulator region (Lind, et al. 2015), and this is expected to be the fourth most common pathway when present. Less common promoter mutations will also appear at rates at least an order of magnitude lower, but which DGCs will be transcriptionally activated cannot be easily predicted except by assuming that they will be homologs of the ones used in SBW25. These DGCs must be catalytically active and also be localized to the membrane (Farr, et al. 2017). Possibly the subset of DGCs that are primarily activated by mutations to their promoters is mainly determined by mutation rate, and a higher mutation rate might be caused by higher transcription and also influenced by gene direction (Sankar, et al. 2016). Most promoter capture deletion events are less than 5 kilobases in size (Lind, et al. 2015) and the lack of an alternative promoter relatively close upstream of the DGC is likely to rule out these DGCs. The DGCs that can be activated by intragenic activating mutations cannot currently be predicted beyond the simple prediction that these are the same genes as in SBW25 (Lind, et al. 2015).
Prediction of mutated genes
In addition to predicting the relative rates of the three main pathways, the previously described mathematical model can also predict which proteins are likely to be mutated (Lind, et al. 2019). High rates of WS mutations are predicted for WspF, WspA, WspE, AwsX, AwsR and MwsR (Figure 2E). A significantly lower rate of enabling mutations is also predicted to occur in WspC, WspR and AwsO (Figure 2E). Despite the simplicity of the null model it closely predicted the mutational targets in SBW25 with equal rates for WspF, WspA and WspE and rare mutations in WspC and WspR, suggesting that it is a useful null model also for other species (Lind, et al. 2019).
Prediction of specific mutational targets and effects of mutations
The level of parallelism at the nucleotide level between species is expected to be dependent both on the number of possible mutations to WS and the degree of functional conservation of the proteins involved that define the genotype-to-phenotype map. Mutational hot spots are also expected to contribute to parallelism when they are conserved, but reduce parallelism when they are not. Based on previous analysis of patterns of mutations in SBW25 (McDonald, et al. 2009; McDonald, et al. 2011; Lind, et al. 2019) and homology modeling of protein structure using Phyre2 (Kelley, et al. 2015) regions expected to be mutated were predicted and the likely molecular consequences of different mutations suggested (Figure 4).
Prediction of fitness effects of WS mutations
While conservation of relative fitness of different phenotypic variants might be expected it is less clear if the relative fitness of different DGC pathways and mutations will be conserved between species. Despite this difficulty there might be a way forward to predict the relative fitness of a large range of mutations with limited experimental data. The distribution of fitness effects of new mutations has been found to be bimodal for a large number of genes with different functions, with one mode close to neutrality and one corresponding to a complete loss of a particular molecular function (Jacquier, et al. 2013; Jimenez, et al. 2013; Firnberg, et al. 2014; Lind, et al. 2017a; Lundin, et al. 2017). Given that mutations that allow colonization of the air-liquid interface have large phenotypic effects and are believed to also have large effects on molecular function, often a complete disruption of an interaction, adaptive mutations in the same region of a protein are likely to have similar fitness effects. Thus, an approximation of the distribution of fitness effects could be possible with relatively few mutations for each gene. This is supported by the relatively small number of WS mutants in SBW25 that have been characterized with sensitive fitness assays and where mutations in the same gene typically have similar fitness effects (Lind, et al. 2015; Lind, et al. 2019). If this assumption is true the distribution of beneficial fitness effects is not continuous and the most advantageous mutations are not predicted to be equally distributed between pathways or genes. Thus the prediction would be that mutants isolated after experimental evolution are concentrated in certain genes even if the mutation rate is similar, so that although the prediction from the null model is an equal number of mutations for WspA, WspE and WspF, such a distribution is unlikely to be found.
While the mutation rates to WS for the three genes are similar in SBW25, WspA mutants are rarely found after experimental evolution due to their lower fitness (McDonald, et al. 2009; Lind, et al. 2019). There is however no a priori reason to expect that the relative fitness of mutations in different genes or pathways will be conserved between species.
Inactivating mutations in fuzY and nlpD producing the alternative adaptive phenotypes based on LPS modification or cell chaining were also found to have similar fitness (Ferguson, et al. 2013; Farr 2015). Possibly there are other genes that can be mutated with similar phenotypes, but that those mutants have lower fitness and are outcompeted in SBW25. If relative fitness is not conserved between species this could lead to high convergence on the phenotypic level but with completely different genetic bases.
Experimental test of forecasts in Pseudomonas protegens Pf-5
In order to test the predictions presented above, P. protegens Pf-5 was used for a parallel experimental evolution study under static conditions (Figure 1A). Its genome encodes homologs of all except one of the DGCs used in SBW25 including all the three common pathways to WS (Wsp, Aws, Mws). Thus the genetic predictions in terms of types of mutations and mutated genes are nearly identical to SBW25 and the mathematical null models can be directly applied. However Pf-5 lacks genes for biosynthesis of cellulose meaning that if c-di-GMP overproduction is the main pathway used, as predicted, an alternative EPS component must be utilized. The experimental conditions were modified to test the robustness of predictions to changes in media composition, temperature and cell wall material compared to those used in the original SBW25 system (Materials and Methods). After experimental evolution for five days, dilutions were spread on agar plates and then screened for colonies with divergent colony morphology, characteristic of many phenotypes that colonize the air-liquid interface.
Genetic pathways
In total 43 independent mutants were isolated and the causal mutations were identified (Figure 5, Figure 5 – supporting data). As predicted by the null model the majority (40/43) of mutations were associated with the Wsp, Aws, and Mws pathways that are subject to negative regulation (Figure 1D). In addition the prediction that promoter mutations would be the second most common type of mutation was successful with two mutations found upstream of the aws operon, which were predicted to disrupt the terminator of a high expression ribosomal RNA operon representing a promoter capture event. Promoter mutations were also found upstream of PFL_3078, which is the first gene of a putative EPS locus (PFL_3078-3093) that has not previously been described and that is only present in closely related strains. The operon encodes genes typical of exopolysaccharide biosynthetic operons making it highly likely it encodes the main structural component used by these mutants.
The mathematical null model (Figure 2E) successfully predicted that of the three common pathways to WS, Wsp would be the most common one (16 mutants) followed by Aws (14) and then Mws (10). Mutations were predominately found in the negative regulators WspF (15 mutants) or AwsX (9), but also in interacting proteins WspE (1) and AwsR (3). Given that the mutational target size is estimated to be smaller for the interacting proteins (Figure 4) this is not surprising. No mutations were found in WspA despite a predicted high rate.
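As a rough illustration of how the observed pathway counts can be set against the null-model proportions quoted earlier (Wsp 54%, Aws 30%, Mws 16%), the sketch below computes expected counts for the 40 mutants in the three main pathways together with a chi-square goodness-of-fit statistic. This calculation is not an analysis reported in this study: the choice of test and the helper name `goodness_of_fit` are assumptions made purely for illustration, and the proportions themselves depend on the assumed tenfold ratio of disabling to enabling mutation rates.

```python
import math

# Null-model proportions quoted in the text (Lind, et al. 2019), which assume
# that disabling mutations are tenfold more common than enabling mutations.
PREDICTED = {"Wsp": 0.54, "Aws": 0.30, "Mws": 0.16}
# Numbers of Pf-5 mutants per pathway reported in this section.
OBSERVED = {"Wsp": 16, "Aws": 14, "Mws": 10}


def goodness_of_fit(observed, predicted):
    """Return expected counts, chi-square statistic and p-value.

    Hypothetical helper for illustration; not part of the original study.
    """
    total = sum(observed.values())
    expected = {name: total * share for name, share in predicted.items()}
    chi2 = sum(
        (observed[name] - expected[name]) ** 2 / expected[name]
        for name in observed
    )
    df = len(observed) - 1
    if df != 2:
        raise ValueError("closed-form p-value is only valid for 2 degrees of freedom")
    # With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    p_value = math.exp(-chi2 / 2)
    return expected, chi2, p_value


expected, chi2, p_value = goodness_of_fit(OBSERVED, PREDICTED)
for pathway in OBSERVED:
    print(f"{pathway}: observed {OBSERVED[pathway]}, expected {expected[pathway]:.1f}")
print(f"chi-square = {chi2:.2f}, p = {p_value:.2f}")
```

With only 40 mutants a test of this kind has limited power, so the rank order of the pathways is the comparison the text relies on.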
Mutations were predominantly found in predicted regions (Figure 4) for WspF, WspE, AwsX, AwsR and MwsR, but in most cases they were not identical to those in SBW25 (Figure 5 – supporting data). A mutational hot spot was apparent in WspF with 12 out of 15 mutations being identical V271G missense mutations. The previously described mutational hot spots in SBW25 in the awsX and mwsR genes (Lind, et al. 2019) appeared absent, demonstrating how mutation rate differences can skew evolutionary outcomes even for closely related species with similar genetic architecture.
To determine if there were also rare pathways to the WS phenotype the entire wsp, aws, and mws operons were deleted and experimental evolution repeated as was previously done for SBW25 (Lind, et al. 2015). Mutations in the DGC PFL_0087 accounted for six out of the seven WS types found (Figure 5B). This was also the dominant pathway in the SBW25 Δwsp Δaws Δmws strain where mutations in the corresponding region of PFLU0085 were responsible for 47% of WS mutants. Thus the fourth most common pathway is also the same for both species.
Mutations in WspA are predicted to be one of the major mutational routes to WS based on the mathematical model (Lind, et al. 2019), but no mutations were found either in this study or in SBW25 (McDonald, et al. 2009). However, when the mutational spectrum of WS mutants was determined in the absence of selection for growth at the air-liquid interface, wspA mutants occurred at rates similar to those of wspE and wspF mutants, as predicted by the model, and their low frequency after experimental evolution could be explained by their lower fitness. To investigate whether a WspA mutation could cause WS in P. protegens, a common deletion found in SBW25 (WspA T293-E299) was introduced and found to cause a WS phenotype; it is included in the experiments described below.
Phenotypic characterization
In total 60 wells were inoculated for the wild type and subjected to experimental evolution for five days after which air-liquid interface colonization was observed for the majority of the wells. Mutants with clearly visible changes in colony morphology were isolated from 43 wells. The experiment was repeated for the Δwsp Δaws Δmws triple deletion mutant and WS types were detected in 7 wells. Typically a single type of divergent colonies was observed and one colony for each well was selected for further characterization at random based on a pre-determined position on the agar plate. Representative mutations were reconstructed using an allelic exchange protocol to determine that the mutations are the sole cause of the air-liquid interface colonization and colony phenotypes and to exclude the influence of secondary mutations (Figure 6A) before further characterization.
The lack of cellulose biosynthetic genes also shows that these ecotypes can evolve by different phenotypes than in SBW25. Two clearly different phenotypes were observed, with one very similar to the original WS types in SBW25 with a clear motility defect and mutations in the Wsp, Aws, Mws and PFL_0087 pathways (Figure 6A, 6B). The other type was less wrinkly, had motility similar to that of the wild type and carried promoter mutations upstream of the PFL_3078-3093 operon (Figure 6A, 6B).
Fitness of adaptive mutants
Two types of fitness assays were performed, similar to those previously described (Lind, et al. 2015), to measure differences in fitness between the different WS mutants and the alternative phenotypic solution with the mutation upstream of PFL_3078. The first assay measures “invasion fitness” where the mutant is allowed to invade a wild type population from an initial frequency of 1%. This confirms that the mutations are adaptive and that mutants can colonize the air-liquid interface. The invasion assays showed that all reconstructed mutants could rapidly invade an ancestral wild type population (Figure 7A, Figure 7 – source data). Although there were significant differences between selection coefficients of the mutants (one-way ANOVA p < 0.0001), no mutant was significantly different from the most common mutant (WspF V271G, two-tailed t-test p > 0.01).
The second fitness assay measures “competition fitness” and here each mutant is instead mixed 1:1 with the most common WS type (WspF V271G) at the start of the competition. The competition assay showed that the ancestral wild type was rapidly outcompeted by the mutants also at a 1:1 initial ratio (Figure 7B). There was significant variation in fitness between the WS mutants (one-way ANOVA p < 0.0001): the AwsX mutant had a significantly lower selection coefficient (two-tailed t-test p = 0.009) compared to the reference WspF V271G, and one of the MwsR mutants (E1081K) had a significantly higher selection coefficient (two-tailed t-test p = 0.001). The alternative phenotypic solution used by the PFL_3078 promoter mutant resulted in the lowest fitness (s = −0.1, two-tailed t-test p < 0.0001), meaning that it is expected to be rapidly outcompeted by the WS mutants (Figure 7B). The PFL_0087 mutants that were only found when the common pathways were deleted had lower fitness (two-tailed t-tests p = 0.0005, p = 0.001) and this was also true for the wspA mutant (two-tailed t-test p = 0.002), which could explain why these were not found in the wild type population after experimental evolution.
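The selection coefficients discussed above can be understood through a commonly used definition, s = ln(R_end / R_start) / t, where R is the ratio of mutant to competitor and t is the number of generations. The sketch below assumes this definition; the exact calculation used in this study is not restated in this section, and the frequencies and generation number in the example are hypothetical values chosen only to show the sign and scale of s, not data from the assays.

```python
import math


def selection_coefficient(freq_start, freq_end, generations):
    """Per-generation selection coefficient from mutant frequencies.

    Assumes s = ln(R_end / R_start) / generations, where R = f / (1 - f)
    is the ratio of mutant to competitor. This definition is an assumption
    made for illustration.
    """
    if not (0 < freq_start < 1 and 0 < freq_end < 1):
        raise ValueError("frequencies must lie strictly between 0 and 1")
    if generations <= 0:
        raise ValueError("generations must be positive")
    ratio_start = freq_start / (1 - freq_start)
    ratio_end = freq_end / (1 - freq_end)
    return math.log(ratio_end / ratio_start) / generations


# Hypothetical invasion assay: the mutant starts at 1% and reaches 50%
# after an assumed 8 generations.
s_invasion = selection_coefficient(0.01, 0.50, 8)
# Hypothetical 1:1 competition in which the tested mutant declines to 31%.
s_competition = selection_coefficient(0.50, 0.31, 8)
print(f"invasion s = {s_invasion:.3f}, competition s = {s_competition:.3f}")
```

Under this definition a positive s means the mutant increases relative to its competitor, and a negative s, as reported for the PFL_3078 promoter mutant against WspF V271G, means it declines.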
Identification of EPS used for air-liquid colonization
SBW25 WS mutants use cellulose as the main structural component, but even though there is high parallelism at the genetic level for Pf-5 WS mutants this cannot be the case at the phenotypic level as its genome does not encode genes for cellulose biosynthesis. Given that production of Pel exopolysaccharide has been shown to be induced by mutations in wspF in P. aeruginosa (Hickman, et al. 2005) and that Pel in this species is required for pellicle formation under static growth (Friedman and Kolter 2004), this was predicted to be the main structural component used by Pf-5. To test this prediction the pel operon (PFL_2972-PFL_2978) was deleted from Pf-5 and combined with previously characterized WS mutations and fitness was measured. Both invasion fitness (Figure 8A) and competition fitness (Figure 8B) were significantly lower (two-tailed t-tests p < 0.01) compared to isogenic strains with an intact pel operon (Figure 7A, 7B), except invasion fitness for the AwsX mutant (two-tailed t-test p = 0.08, one outlier). This suggests that Pel polysaccharide serves as an important structural component for colonizing the air-liquid interface and that its production is activated by mutations leading to increased c-di-GMP levels. Although deletion of pel in WS mutants resulted in a less wrinkly colony morphology it did not result in a smooth ancestral type. Neither did deletion of pel abolish the ability to colonize the air-liquid interface (Figure 8C) or the ability to invade wild type populations (Figure 8A). This suggests that production of an additional EPS component is induced by increased c-di-GMP levels caused by mutations in Wsp, Aws, Mws and PFL_0087, at least in the absence of pel.
When the cellulose biosynthetic operon was deleted from SBW25, typical WS mutations in wsp, aws, and mws resulted in air-liquid colonization by use of the alternative structural component encoded by pgaABCD, and subsequent deletion of the pgaABCD operon in these mutants resulted in a wild type colony morphology (Lind, et al. 2017b). Deletion of the pgaABCD operon (PFL_0161-PFL_0164) in Pf-5 strains with deletion of the pel operon combined with WS mutations in wspF, awsX or mwsR did not result in a change in colony morphology or loss of ability to colonize the air-liquid interface, which suggests that PGA is not the secondary structural component used or that yet another EPS is also produced in response to increased c-di-GMP levels. As expected if the motility defect observed for WS mutants is primarily caused by high c-di-GMP levels rather than high production of Pel, the motility was also reduced for Wsp, Aws and Mws mutants with the pel operon deleted (Figure 8D).
Discussion
The extension of the P. fluorescens SBW25 experimental evolution system to related species shows promise for true testing of evolutionary forecasting methods and models. While there is a diversity of DGCs and EPSs between species leading to differences in forecasts, the conserved role of c-di-GMP and the limited number of phenotypes allow the use of previous data to improve predictions and make the experimental system robust to changes in environmental conditions. The experimental test of initial forecasts for P. protegens Pf-5 presented here provides support for the ability to predict some aspects of both genetic and phenotypic evolution while recognizing that the probability of specific mutations cannot in most cases be predicted.
That experimental populations of Pseudomonas will colonize the air-liquid interface when incubated under static conditions is a prerequisite for extending the model. Given that a range of phenotypic solutions is predicted to be available for all species the evolution of such mutants for P. protegens is not surprising. The specific environmental conditions used for experimental evolution often have a major impact on evolutionary outcomes and are also likely to influence relative fitness and possibly mutational biases in the WS system. However, despite major changes in growth medium, temperature and the material and physical dimensions of the growth vessel, predictions on both the genetic and phenotypic levels proved successful, demonstrating robustness to environmental change and the establishment of a dominant selective pressure, i.e. access to oxygen solved by air-liquid interface colonization.
Phenotypic predictions of the structural basis supporting air-liquid colonization are challenging given the limited previous experimental data. For SBW25 cellulose-based solutions are superior in fitness, but for Pf-5 this solution is not available. The prediction that overproduction of structural exopolysaccharides, rather than fuzzy, cell-chaining or mucoid types, would be the primary solution was successful. One of the two phenotypes found here used the Pel EPS, which could be predicted based on its role in P. aeruginosa. However, it appears to use a secondary EPS as well, which remains to be identified, given that mutants lacking Pel but with activated DGCs still colonize the air-liquid interface and have a distinct colony morphology. The second phenotype used another EPS, encoded by PFL_3078-3093, which had not previously been described, and given that several EPS loci are usually encoded in Pseudomonas genomes its use could not be predicted. However, repeating experimental evolution using other Pseudomonas species is likely to provide more information about which EPSs can be used to colonize the air-liquid interface and their relative fitness, allowing improved phenotypic predictions. Deletion of the pel operon, the unidentified secondary EPS and PFL_3078-3093 followed by experimental evolution could reveal less fit phenotypic solutions that are expected to exist, including fuzzy types caused by defects in LPS modification, cell-chaining types with defects in cell division, adhesive proteins, or mucoid types using alginate or levan, two EPSs with lower structural stability.
The general prediction of types of mutations, as described in the hierarchy in Figure 2D, was also successful, although the relatively few mutants identified here did not allow for detection of rare activating mutations or double inactivating mutations. The majority of mutations were loss-of-function mutations in negative regulators or interacting proteins, followed by less common promoter mutations and promoter captures. In contrast to SBW25, where all promoter mutations resulted in up-regulation of DGCs, the mutation upstream of PFL_3078-3093 demonstrates the possibility of direct transcriptional activation of EPS components that are not under post-translational control of c-di-GMP. Two identical mutations were found over 9 kb upstream of the aws operon, between a ribosomal RNA operon and the recCBD operon, which encodes key genes for recombination. The molecular effects of these mutations have not been further investigated, but the resulting WS phenotype is dependent on the presence of the aws operon, deletion of which reversed the phenotype. This is consistent with an up-regulation of c-di-GMP by AwsR, presumably caused by increased transcription. The mutation is located in the predicted terminator of the ribosomal RNA operon, and increased transcriptional read-through could put the aws operon under control of a very strong rrn promoter that is most highly transcribed during exponential growth. This could explain the relatively mild colony morphology phenotype as well as the high motility of this WS mutant (Figure 6).
A mathematical null model that incorporates information about the Wsp, Aws and Mws molecular networks (Null model IV in (Lind, et al. 2019)) successfully predicted that Wsp would be the most commonly used pathway, followed by Aws, with Mws the rarest. However, the number of mutants isolated here is rather small and the high frequency of Wsp mutants seems mainly to be caused by a mutational hot spot in wspF. Still, the prediction that the three pathways together would contribute the large majority of adaptive mutations (40 out of 43) is not trivial given that in SBW25 at least 13 additional pathways are available to the high fitness WS phenotype (Lind, et al. 2015). It is also worth noting that direct use of mutation rate data from SBW25 (Lind, et al. 2019) would result in poorer predictions than the mathematical null model due to a strong mutational hot spot in awsX in that species. In addition, the fourth most common pathway to WS, PFL_0087, could be predicted based on data from SBW25.
For the multi-protein pathways Wsp and Aws, the mathematical model (Lind, et al. 2019) predicted (Figure 2E) that mutations would primarily be found in WspA, WspE, WspF, AwsX and AwsR. Mutations were detected in all of these except WspA, and the majority were found in the negative regulators WspF and AwsX. WspA mutations were not found in the original study in SBW25 either (McDonald, et al. 2009), but this was shown to be due to lower fitness relative to WspF and WspE mutants rather than a lower mutation rate to WS (Lind, et al. 2019). This is a likely explanation for the absence of WspA mutants here as well (Figure 7B), but it is not clear if this fitness difference would be conserved in other species or if WspA mutants are sometimes more fit. Thus the null model prediction of equal rates for WspA, WspF and WspE is not changed for future experimental tests. Direct comparison between competitive fitness of mutants in SBW25 and Pf-5 is not possible, as it was measured under different experimental conditions and against different reference mutants, and in most cases the mutations are not identical. However, it is interesting to note that for both species high fitness WS types have mutations in the same proteins (WspF, WspE, MwsR) and low fitness WS types also appear in the same proteins (WspA, AwsX, AwsO, PFLU0085/PFL_0087).
The molecular effects of the mutations found here are unknown, but knowledge from SBW25 and P. aeruginosa and their positions in protein structure allowed some predictions to be made. Inactivating mutations in the negative regulator WspF were predicted to be either indels or missense mutations in four specific regions. Mutations were found in two of the predicted regions, one in the vicinity of the methylesterase active site, where mutations are predicted to cause large disruptions in protein structure, and the other directly disrupting the phosphorylation active site in the signal receiver domain. No mutations were found in the surface exposed regions hypothesized to be involved in interactions with WspA and WspE, which could be due to differences in function between SBW25 and Pf-5 or simply because they appear at lower frequency and would be detected if additional mutants were isolated. The sole mutation in WspE is, as predicted, located in the direct vicinity of the phosphorylation active site. Mutations in AwsX were, as predicted, amino acid substitutions throughout the gene as well as in-frame deletions inactivating the gene. Mutations in AwsR and MwsR were also found in predicted regions, but no mutations were found in the small periplasmic region of AwsR, which is the most commonly targeted region in SBW25. Known mutational hot spots in awsX, awsR and mwsR in SBW25 (Lind, et al. 2019) were not conserved in Pf-5, resulting in divergent spectra of mutations, while mutated regions and predicted functional effects remain conserved between the two species. Little is known about the molecular function of the putative DGC encoded by PFL_0087/PFLU0085, but it is clear that a multitude of amino acid substitutions, deletions and insertions in a region more than 40 amino acids long can lead to WS (Lind, et al. 2015). Thus this region functions as a small intragenic negative regulator that might be involved in oligomerization, and loss of this interaction results in constitutive activation of c-di-GMP production.
The diversity of phenotypic solutions observed after experimental evolution is dependent on fitness differences between the phenotypes, but also on the rate at which phenotypes are introduced by mutations, which is dependent on the genetic architecture underlying the trait as well as mutational biases. The Pf-5 strain has at least three DGC pathways (Wsp, Aws and Mws) that are subject to negative regulation, leading to the prediction of a high rate of WS mutants, which are then expected to outcompete other phenotypic solutions. If instead only one of these pathways were present, a larger diversity of phenotypes would be expected, with relative fitness becoming less important because the first mutant that gains a foothold at the air-liquid interface will have a large advantage, and priority effects, i.e. being first, will increasingly determine which adaptive mutants are observed.
Given that the mutational target upstream of PFL_3078-3093 is likely to be relatively small and that these mutants are rapidly outcompeted by all WS types tested, their relatively high frequency (3/43) is unexpected. Possibly this is due to a higher mutation rate at these sites (Sankar, et al. 2016) or to population structure limiting direct competition between these different phenotypes and reducing the importance of relative fitness. In SBW25, low fitness phenotypes that colonize the air-liquid interface based on LPS modification or cell-chaining are observed prior to the rise of WS to high frequencies (Lind, et al. 2017b) due to the presence of mutational hot spots in these genes, which make these mutants appear early during the growth phase despite their relatively small mutational targets (Ferguson, et al. 2013; Farr 2015; Lind, et al. 2017b).
In partially predicting evolutionary outcomes in P. protegens, this work lays the foundation for future tests of evolutionary forecasting in related Pseudomonas species by clearly stating predictions on several different levels, from phenotype down to which specific regions of proteins are likely to be mutated. Given what is already known about the effects of (for now) unpredictable mutational biases and differences in fitness between different WS types, many of the forecasts will inevitably fail. However, hopefully they will fail in interesting ways, thereby revealing erroneous assumptions. The ability to remove common genetic and phenotypic pathways provides a unique opportunity to also find those pathways that evolution does not commonly use. This is necessary to determine why forecasts fail and to update the predictive models for another cycle of prediction, experimental evolution and mutant characterization, an iterative process that makes it possible to define the information necessary to predict short-term evolutionary processes.
Materials and methods
Strains and media
Pseudomonas protegens Pf-5 (previously known as P. fluorescens Pf-5) and derivatives thereof were used for all experimental evolution and phenotypic characterization. E. coli DH5α was used for cloning PCR fragments for genetic engineering (Paulsen et al). P. protegens Pf-5 was grown in tryptic soy broth (Tryptone 17 g, Soytone 3 g, Glucose 2.5 g, NaCl 5 g, K2HPO4 2.5 g per liter) supplemented with 10 mM MgSO4 and 0.2% glycerol (TSBGM) for experimental evolution and fitness assays. Lysogeny broth (LB) was used during genetic engineering, and LB without NaCl and supplemented with 8% sucrose was used for counter-selection of the sacB marker. Solid media were LB or TSB with 1.5% agar, supplemented with 10 mM MgSO4, 0.2% glycerol and 10 mg/l Congo red. Motility assays were conducted in 0.3% agar TSB supplemented with 10 mM MgSO4 and 0.2% glycerol. Kanamycin was used at 50 mg/l for E. coli or 80 mg/l for P. protegens and gentamicin at 10 mg/l for E. coli or 15 mg/l for P. protegens. Selection plates for cloning contained 5-Bromo-4-Chloro-3-Indolyl β-D-Galactopyranoside (X-gal) at 40 mg/l. Nitrofurantoin at 100 mg/l was used to inhibit growth of E. coli donor cells after conjugation. All strains were stored at −80°C in LB with 10% DMSO.
Experimental evolution
Thirty central wells of a deep well plate (polypropylene, 1.1 ml, round walls, Axygen Corning Life Sciences) were inoculated with approximately 10^3 cells each and incubated at 36°C for 5 days without shaking on two different occasions. Suitable dilutions were plated on TSBGM plates with Congo red after 5 days and incubated at 36°C for 48 h. Plates were screened for colonies with a visible difference in colony morphology, and one divergent colony per well was randomly selected based only on its position on the agar plate. In total 43 independent mutants were streaked for single cells twice before overnight growth in LB and freezing. An identical protocol was used for the Δwsp Δaws Δmws strain.
Genome sequencing
Seven mutant strains that did not contain mutations in the wspF and awsX genes were analyzed by genome resequencing. The strains had mutations in awsR, mwsR, wspE, upstream of PFL_3078 (2 strains) and in the intergenic region between rrfB and recC upstream of the awsXRO operon. Genomic DNA was isolated with the Genomic DNA Purification Kit (Thermo Fisher). Sequencing libraries were prepared from 1 μg DNA using the TruSeq PCRfree DNA sample preparation kit (cat# FC-121-3001/3002, Illumina Inc.) targeting an insert size of 350 bp. The library preparation was performed according to the manufacturer's instructions (guide#15036187). Sequencing was performed with MiSeq (Illumina Inc.) using paired-end 300 bp read length and v3 sequencing chemistry.
Sequencing was performed by the SNP&SEQ Technology Platform in Uppsala. The facility is part of the National Genomics Infrastructure (NGI) Sweden and Science for Life Laboratory. The SNP&SEQ Platform is also supported by the Swedish Research Council and the Knut and Alice Wallenberg Foundation. Sequencing data were analyzed using Geneious v. 10.2.3 with reads assembled against the P. protegens Pf-5 genome sequence (CP000076.1).
Sanger sequencing
Sanger sequencing was performed by GATC Biotech and used to sequence candidate genes to find adaptive mutations and to confirm reconstructed mutations. Primer sequences are available in Table S1.
Reconstruction of mutations
Thirteen mutations representing all candidate genes found using Sanger or Illumina sequencing, as well as PFL_0087 and WspA mutations, were reconstructed in the wild type ancestral P. protegens Pf-5 to show that they are the cause of the adaptive phenotype and to be able to assay their fitness effects without the risk of secondary mutations that might have occurred during experimental evolution. A two-step allelic replacement protocol was used to transfer the mutation into the ancestor. First, a 1-2 kb fragment surrounding the putative adaptive mutation was amplified using PCR (Phusion High-Fidelity DNA polymerase, Thermo Scientific) and ligated into the multiple cloning site of the mobilizable pK18mobsac suicide plasmid (FJ437239) using standard molecular techniques. The ligation mix was then transformed into competent E. coli DH5α using heat shock. After confirmation of correct insert size by PCR, the plasmid was transferred to P. protegens Pf-5 by conjugation with the donor strain and an E. coli strain carrying the conjugation helper plasmid pRK2013. The recipient P. protegens Pf-5 was grown overnight (20 ml per conjugation at 30°C in LB), as were the donor and helper E. coli strains (2 ml each per conjugation at 37°C in LB with kanamycin). The culture of P. protegens Pf-5 was heat shocked for 10 minutes at 42°C prior to centrifugation at 4000 rpm for 10 minutes and resuspension in a small volume of LB. Donor and helper cells were collected by centrifugation at 4000 rpm for 10 minutes, resuspended in LB, and mixed with the concentrated recipient cells. After another round of centrifugation the conjugation mix was resuspended in 50 μl LB and spread onto several spots on an LA plate followed by incubation overnight at 30°C. Each spot of the conjugation mix was scraped off the plate, resuspended in 200 μl LB and plated on LA plates with kanamycin, to select for transfer of the plasmid, and nitrofurantoin, which prevents growth of the E. coli donor and helper cells. The pK18mobsac plasmid has a pBR322 type origin and cannot replicate in P. protegens Pf-5, so only cells where the plasmid has integrated into the chromosome by homologous recombination, with the homology provided by the cloned fragment, can grow in the presence of kanamycin. After streaking for single cells on LA plates with kanamycin, the P. protegens Pf-5 strains with integrated plasmids were grown overnight in LB at 30°C without antibiotics to allow for double crossover homologous recombination resulting in loss of the integrated plasmid. The plasmid also contains the sacB marker conferring sucrose sensitivity, which allows for counter-selection by plating on LA plates with sucrose. Sucrose resistant colonies were checked for loss of the kanamycin marker, and the cloned region was sequenced to find strains with the reconstructed mutation and no other mutations.
Deletion of the wsp, aws, mws and pelABCDEFG (PFL_2972-PFL_2978) regions was accomplished using the same two-step allelic exchange protocol, using SOE-PCR to generate a fragment surrounding the operon as previously described (Ferguson, et al. 2013; Farr 2015; Lind, et al. 2017b). Gene synthesis (Thermo Fisher) was used to make DNA fragments used for deletion of PFL_0161-PFL_0164 and WspA T293-E299. Primer sequences are available in Table S1.
Fitness assays
Two types of competition fitness assays were performed, essentially as previously described (Lind, et al. 2015). The first assay measures invasion fitness, where a mutant is mixed 1:100 with the wild type ancestor, simulating early stages of air-liquid interface colonization where a rare mutant establishes and grows at the surface with no competition from other mutants. The second assay instead measures competition fitness in a 1:1 competition against a reference mutant strain, which here was chosen to be the WspF V271G mutant because it was the most commonly found in the experimental evolution study and thus is highly successful either because of a high rate of emergence, i.e. a mutational hot spot, or higher fitness than most other WS mutants. In addition, the WspF V271G mutant has a temperature sensitive colony morphology phenotype in that it is highly wrinkly at 30°C, but has only a very mild phenotype when grown at room temperature, thus making it distinguishable from both the smooth ancestor and all other wrinkly mutants isolated here.
Fluorescent reference strains of the wild type ancestor and the WspF V271G mutant were created using a miniTn7 transposon (miniTn7(Gm) PA1/04/03 Gfp.AAV-a) (Lambertsen, et al. 2004) that allows integration at a defined locus (attTn7) in the chromosome. This allows the colonies to be distinguished not only by morphology but also by fluorescence under blue/UV light and gentamicin resistance, which provides a way to ascertain that secondary adaptive mutants that might occur during the competition experiment do not bias the results (for example the ancestor could evolve WS types or a WS mutant could evolve to cheat on the other type by inactivation of EPS production or reduced c-di-GMP signalling). Introduction of the transposon into P. protegens Pf-5 was performed by tri-parental conjugation from E. coli with helper plasmids pRK2013 (conjugation helper) and pUX-BF13 (containing the transposase genes) using the same conjugation protocol described above.
The invasion assay was performed by mixing shaken overnight cultures of the competitor 1:100 with the GFP-labeled reference ancestor followed by 1000-fold dilution and static incubation at 36°C for 48 h in TSBGM medium in deep well plates (1 ml per well, using only the central 60 wells). For the competition assay, the GFP-labeled reference strain WspF V271G was mixed 1:1 with the competitor, diluted 6-fold and grown for 4 h (shaken at 30°C) before plating to determine initial ratios, to ensure the cells were in a similar physiological state at the start of the competition. The competition cultures were then diluted 1000-fold in TSBGM medium and grown statically in deep well plates (1 ml per well, using only the central 60 wells) for 24 h at 36°C. Selection coefficients (s) were calculated as previously described (Dykhuizen 1990), where s = 0 is equal fitness, positive is increased fitness and negative is decreased fitness relative to the reference strain. Briefly, s is calculated as the change in logarithmic ratio over time according to s = [ln(R(t)/R(0))]/[t], where R is the ratio of mutant to reference and t is the number of generations of the entire population during the experiment (estimated from viable counts). The cost of the fluorescent marker was calculated from control competitions in which the GFP-labeled reference strains (wild type and WspF V271G) were competed against isogenic strains without the marker. These controls were included in each plate under identical conditions during the fitness assays and used to adjust the selection coefficients to compensate for the cost.
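The calculation of s described above can be illustrated with a minimal Python sketch. The function names, the example viable counts and, in particular, the way the marker cost is applied (a simple subtraction of the selection coefficient measured in the unmarked-versus-marked control competition) are assumptions made for illustration; the text does not specify how the adjustment was performed.

```python
import math


def generations(total_start, total_end):
    """Generations (doublings) of the entire population, from viable counts."""
    return math.log2(total_end / total_start)


def selection_coefficient(mut_start, ref_start, mut_end, ref_end):
    """s = ln(R(t)/R(0)) / t, with R the ratio of mutant to reference."""
    r0 = mut_start / ref_start
    rt = mut_end / ref_end
    t = generations(mut_start + ref_start, mut_end + ref_end)
    return math.log(rt / r0) / t


def adjust_for_marker_cost(s_measured, s_control):
    """ASSUMPTION: the marker cost is removed by simple subtraction.

    s_control is the selection coefficient of the unmarked isogenic strain
    competed against the GFP-labeled reference on the same plate.
    """
    return s_measured - s_control


# Hypothetical viable counts (CFU/ml), chosen only to illustrate the arithmetic.
s = selection_coefficient(mut_start=1e3, ref_start=1e3, mut_end=4e6, ref_end=1e6)
s_adjusted = adjust_for_marker_cost(s, s_control=0.01)
```

With these made-up counts the mutant-to-reference ratio increases four-fold over log2(2500) generations of the whole population, giving a positive s, i.e. higher fitness than the reference.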
Motility assays
Swimming motility assays were performed in TSBGM plates with 0.3% agar (BD) and the diameter was measured after 24 h of growth at room temperature. Each strain was assayed in duplicate on two different plates.
Bioinformatics analysis of DGCs and EPS genes
Homologs for all DGCs in P. fluorescens SBW25 were found using the Pseudomonas Ortholog Database at Pseudomonas.com (Winsor, et al. 2016). Blastp searches for GGDEF domains were performed to find remaining DGCs in the six Pseudomonas species, and their homologs were again found using the Pseudomonas Ortholog Database (Whiteside, et al. 2013) and manually inspected. Annotations (Pseudomonas.com. DB version 17.2) were also searched for diguanylate cyclase and GGDEF. Not all DGCs found are likely to have diguanylate cyclase activity, but given the difficulties of predicting which of the partly degenerate active sites are likely to be inactive, combined with the possibilities of mutational activation during experimental evolution, none were excluded.
There is no simple way to find all genes that can function as structural or regulatory genes to allow colonization of the air-liquid interface. Thus the selection in Figure 3B and Figure 3 – source data should not be considered complete. Putative EPS genes were found using blastp searches with sequences from known exopolysaccharide biosynthesis proteins including cellulose, PGA, Pel, Psl, Pea, Peb, alginate and levan. Homologs were then found using the Pseudomonas Ortholog Database (Whiteside, et al. 2013) at Pseudomonas.com (Winsor, et al. 2016). Annotations (Pseudomonas.com. DB version 17.2) were also searched for glycosyltransferase, glycosyl transferase, flippase, polysaccharide, lipopolysaccharide, polymerase, biofilm and adhesion. Based on previous work in SBW25 and literature searches a few additional genes were added.
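The annotation keyword search can be sketched as a simple case-insensitive substring filter. This is only an illustration of the idea, not the procedure actually used against the database; the function name, the input format (a mapping from locus tag to product description) and the locus tags in the usage example are hypothetical.

```python
# Search terms taken from the text above.
KEYWORDS = (
    "glycosyltransferase",
    "glycosyl transferase",
    "flippase",
    "polysaccharide",
    "lipopolysaccharide",
    "polymerase",
    "biofilm",
    "adhesion",
)


def find_candidates(annotations, keywords=KEYWORDS):
    """Return {locus_tag: [matched keywords]} for descriptions containing any keyword.

    Matching is a case-insensitive substring test, so a description containing
    "lipopolysaccharide" is also reported under "polysaccharide".
    """
    hits = {}
    for locus_tag, description in annotations.items():
        text = description.lower()
        matched = [kw for kw in keywords if kw in text]
        if matched:
            hits[locus_tag] = matched
    return hits


# Hypothetical records; the locus tags and descriptions are placeholders.
example = {
    "HYP_0001": "Glycosyl transferase, group 1",
    "HYP_0002": "DNA gyrase subunit A",
    "HYP_0003": "Lipopolysaccharide biosynthesis protein",
}
candidates = find_candidates(example)
```

A substring filter of this kind is deliberately permissive and will return false positives, which is consistent with the manual inspection and literature-based additions described above.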
Competing interests
The author declares no competing interests.
Acknowledgements
This work was supported by grants from Carl Tryggers Foundation for Scientific Research and Magnus Bergvalls Foundation.