Abstract
Predicting the course of evolution is critical for solving current biomedical challenges such as cancer and the evolution of drug-resistant pathogens. One approach to studying evolutionary predictability is to observe repeated, independent evolutionary trajectories of similar organisms under similar selection pressures in order to empirically characterize the adaptive fitness landscape. As this approach is infeasible for many natural systems, a number of recent studies have attempted to gain insight into the adaptive fitness landscape by testing the plausibility of different orders of appearance for a specific set of adaptive mutations in a single adaptive trajectory. While this approach is technically feasible for systems with very few available adaptive mutations, the usefulness of this approach for predicting evolution in situations with highly polygenic adaptation is unknown. It is also unclear whether the presence of stable adaptive polymorphisms can influence the predictability of evolution as measured by these methods. In this work, we simulate adaptive evolution under Fisher’s geometric model to study evolutionary predictability. Remarkably, we find that the predictabilities estimated by these methods are anti-correlated, and that the presence of stable adaptive polymorphisms can both qualitatively and quantitatively change the predictability of evolution.
1 Introduction
Predicting evolution is one of the fundamental challenges of evolutionary biology (reviewed in Lobkovsky and Koonin 2012; de Visser and Krug 2014). This question became particularly prominent with Gould’s famous thought-experiment on “replaying the tape of life” (Gould 1989), which asks how our evolutionary history would have changed if we could re-run evolution from some point in the past. Evolutionary predictability can be studied using multiple methods (de Visser and Krug 2014; Achaz et al. 2014), including studying populations evolving in parallel (e.g. Gould’s thought-experiment) or characterizing the likely order of mutations in a historical bout of adaptation (Weinreich et al. 2006) using an experimental design pioneered by Malcolm et al. (1990).
To study evolutionary predictability, one must be able to quantify the similarity between two different evolutionary paths. This quantification can be done in a number of different ways, which we classify using four different properties. The first property is the “type” of predictability one is studying, which can be either the predictability of the evolutionary future of a given system (“future predictability”, i.e. predictability as defined by Gould), or the predictability of the order of mutations in a completed adaptive path (“historical predictability”, i.e. predictability as inferred using the Weinreich et al. (2006) method). The second property is the “level” at which we are studying predictability, which can be the genotype, phenotype, or fitness level. More concretely, one can compare two evolutionary paths by the specific adaptive mutations that occurred in each path, the similarities of the organisms in some physiological trait (e.g. body-mass index), or by comparing their fitness. The third property is the “starting point” of the evolutionary paths being studied, and is only relevant for future predictability studies. In “parallelism” studies, one tries to characterize evolutionary predictability using organisms starting from similar or identical genetic, phenotypic or fitness states (e.g. when different populations of the same species independently evolve in similar ways when under similar selective pressures), while “convergence” studies characterize how organisms with distinct initial states reach similar evolutionary solutions for similar biophysical problems (e.g. the evolutionary convergence in both birds and bats to use wings for flight).
The fourth property is the “resolution” of the data being used to study evolutionary predictability, which can range from a single sample of genomes from an extant population, to a few major phenotypic transitions inferred from the fossil record, to detailed knowledge of every single adaptive mutation that ever reached appreciable frequency in the population during the study period.
Perhaps the best known experimental study of future predictability is a set of 12 parallel Escherichia coli lineages that have been evolving in controlled laboratory conditions for over 60,000 generations (Wiser et al. 2013; Lenski et al. 2015; Tenaillon et al. 2016). In our formalism, these are studies of parallel future predictability, and analyses of this system have been conducted at the genotypic, phenotypic and fitness levels with very high resolution. One remarkable result from these experiments is that replicate independent evolution experiments frequently acquired similar adaptive mutations and experienced similar gains in fitness over the course of evolution (Crozat et al. 2010; Wiser et al. 2013), which has also been shown in other short-term laboratory evolution experiments (Tenaillon et al. 2012; Lang et al. 2013; Kvitek and Sherlock 2013; Venkataram et al. 2016). This suggests that evolution can be future predictable to a surprising degree over both short and long timescales.
Studying future predictability is critical to many current problems in evolutionary biology, including cancer and the development of drug resistance. However, future predictability studies are either technically or ethically infeasible in many systems of interest, leading to the development of methods to study historical predictability. While it is possible to study historical predictability at the phenotypic level (e.g. by inferring the order of phenotypic transitions from the fossil record), most recent studies characterize genotypic historical predictability. In the first major study of historical predictability, Weinreich et al. (2006) reconstructed all 32 possible combinations of 5 mutations in the beta-lactamase gene in E. coli, which are known to confer resistance to the drug cefotaxime. They then quantitatively assayed the drug resistance of these 32 genotypes as a proxy for fitness, and used this information to analyze all 5! = 120 possible mutational paths (orders in which the 5 mutations could occur) from the ancestral to the resistant five-mutation genotype. A mutational path was deemed viable if resistance monotonically increased with every mutational step, and the relative likelihood of each viable path was calculated using standard population genetic methods. A number of other groups have also studied historical predictability, including studies in Aspergillus niger (Franke et al. 2011), an empirical RNA-protein binding landscape (Buenrostro et al. 2014), adaptive mutations identified from ancestral protein reconstruction (Bridgham et al. 2006; Ortlund et al. 2007) and adaptive mutations from a laboratory evolution experiment (Khan et al. 2011).
Despite these numerous studies, it is unknown whether historical predictability is truly a useful proxy for studying future predictability. While historical predictability is likely accurate in systems where adaptation is known to be limited to a very small number of mutations (e.g. the 5 mutations used by Weinreich et al. (2006) are the only major cefotaxime resistance mutations in that system), it is unclear whether conclusions drawn from studies of historical predictability are similar to the results of studies of future predictability when not every available adaptive mutation is used for inferring historical predictability. For example, it is unknown whether the historical predictability results of Franke et al. (2011) in 2–6-mutation subsets of an 8-locus system are informative in understanding adaptation on the entire 8-locus fitness landscape. Due to the combinatorial nature of studying historical predictability, where 2^n genotypes need to be considered in a system of n adaptive mutations, it is extremely challenging to study historical predictability over all known adaptive mutations in systems with highly polygenic adaptation.
In addition, all of these studies of both future and historical predictability were conducted under the assumption that all adaptive mutations successively fix in the evolving population. The presence of stable polymorphisms may significantly influence the relationship between future and historical predictability by modifying the fitness advantage of a mutation based on the alleles already present in the population. As stable polymorphisms can be generated by a variety of mechanisms, including heterozygote advantage, frequency dependent selection, and spatio-temporal variability in selective pressures, one might expect that they play a major role in the evolution of natural populations and thus influence the study of evolutionary predictability in natural systems.
In this work, we simulate adaptive evolution using Fisher’s geometric model (FGM, Fisher 1930; Orr 1999, 2005) to study the relationship between future and historical predictability, as well as the impact of balanced polymorphisms on the study of predictability. FGM is a phenotypic model where individuals have phenotypes defined as points in n-dimensional space. Mutations are arbitrary vectors in this n-dimensional space, allowing for an infinite number of functionally distinct possible mutations and providing an excellent model of polygenic adaptation. A fitness function is then used to map phenotypes into fitness, generating a fitness landscape that can be used to simulate adaptive evolution. FGM is a useful framework to study adaptation as it has been found to be consistent with many empirical results, including the distribution of fitness effects (DFE) of beneficial mutations, the distribution of epistasis and the effect of epistasis on the DFE, and the presence of antagonistic pleiotropy (Martin et al. 2007; Sousa et al. 2012; MacLean et al. 2010; Blanquart et al. 2014) (reviewed in Martin 2014; Tenaillon 2014). In addition, Sellis et al. (2011) showed that adaptive mutations in diploid FGM simulations are frequently overdominant (exhibit heterozygote advantage) if the mutations are sufficiently large in phenotype space, resulting in balanced polymorphisms. These overdominant mutations are temporarily stable, but they can be driven out of the population by subsequent adaptive mutations.
In this work, we compute both the parallel phenotypic future predictability and genotypic historical predictability of evolution from the same simulations of adapting populations to test whether future predictability and historical predictability are correlated. We then use both of these metrics to test whether overdominant mutations significantly impact the predictability of evolution. We find that these two types of predictability are anti-correlated in our simulations, and that the presence of stable polymorphisms can both quantitatively and qualitatively change our ability to predict evolution.
2 Model
2.1 Fisher’s Geometric Model Simulations
2.1.1 Overview of model and simulations
In order to study predictability, we first need to generate a large number of independent adaptive trajectories. We utilize a variant of the standard haploid continuum-of-alleles Fisher’s geometric model (FGM) framework (Fisher 1930; Kimura 1965) that has been modified to consider diploid individuals (Sellis et al. 2011).
In FGM, we model a set of n independent quantitative phenotypic traits, which can be considered a vector r in n-dimensional space. This vector defines an allele, thus resulting in a continuum of possible alleles across this n-dimensional space. In our simulations, the initial population was defined to be monomorphic with a single fixed allele ranc, where ||ranc|| was set to be 2 units from the optimum. Mutations in this model are defined by a mutation vector m, which is used to modify an existing allele to generate a new allele. These vectors are drawn from a continuous distribution, and thus new mutations can produce infinitely many different alleles. The phenotype vector associated with any allele can thus be calculated as ranc + Σmi, where we sum over all mutations that gave rise to the allele of interest. All mutations are assumed to be in complete linkage with each other. A haploid individual, which has one allele, has a phenotype identical to the phenotype of the allele, while the phenotype of a diploid individual is the average of the phenotype vectors of the constituent alleles. We assume sexual reproduction for diploids with free assortment of alleles and no recombination. The fitness of an individual is a spherically symmetric Gaussian function of the individual’s distance ||r|| from the optimum, w(r) = e^(−||r||^2 / (2a)), where a is a constant defined by our parameter regimes (described in the next section).
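The model above can be sketched in a few lines of Python. This is a minimal illustration, not the simulation code itself: the Gaussian form w(r) = exp(−||r||^2 / (2a)) and the mutation vector m_A are assumptions chosen for the example.

```python
import numpy as np

def fitness(r, a=2.0):
    """Spherically symmetric Gaussian fitness of a phenotype r; the exact
    form w(r) = exp(-||r||^2 / (2a)) is an assumption consistent with the
    parameter regimes described in the text."""
    return np.exp(-np.dot(r, r) / (2.0 * a))

def allele_phenotype(r_anc, mutations):
    """An allele's phenotype: the ancestral vector plus all of its mutation
    vectors (complete linkage, so mutations simply accumulate)."""
    return r_anc + np.sum(mutations, axis=0)

def diploid_phenotype(r1, r2):
    """A diploid's phenotype is the average of its two allele phenotypes."""
    return 0.5 * (r1 + r2)

# n = 2 traits; the ancestor sits 2 units from the optimum at the origin
r_anc = np.array([2.0, 0.0])
m_A = np.array([-1.0, 0.5])           # hypothetical mutation vector
r_A = allele_phenotype(r_anc, [m_A])  # allele A = anc + m_A
het = diploid_phenotype(r_anc, r_A)   # phenotype of the A/anc heterozygote
```

Note that the heterozygote phenotype lands at the midpoint of the two allele phenotypes, which is what makes heterozygote advantage possible when that midpoint is closer to the optimum than either homozygote.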
As a concrete example (Figure 1), let us consider a geometric model consisting of n = 2 phenotypic dimensions (traits 1 and 2) and considering two separate mutational events (A and B). We will begin with the haploid case (Figure 1a). The ancestral allele (anc) has a predefined phenotype (ranc, Figure 1a cross symbol) with fitness defined as a gaussian function of its distance from the optimal phenotype (||ranc||). A mutation mA on this ancestral allele would result in the allele A (plus symbol), defined by its phenotype rA = ranc + mA. A further mutation mB on the A allele would result in the AB allele with phenotype rAB = rA + mB (open circle).
We will now consider the diploid case (Figure 1b), using the same alleles and mutation vectors as in the haploid example. For clarity, we do not show the mutation vectors or phenotype vectors, but display the phenotypes associated with a given genotype as a point. In this case, as individuals have multiple alleles, the phenotype of an individual with a given genotype is the midpoint of the phenotypes of the component alleles. Thus, an individual homozygous for the A allele (A/A genotype) would have the same phenotype as an A haploid individual, while an individual heterozygous for the A mutant and ancestral alleles would have an intermediate phenotype (A/anc, star in Figure 1b).
This model is used to conduct forward Wright-Fisher simulations to generate the adaptive walks used for analysis throughout this work. The simulations use code modified from Sellis et al. (2011) to allow for more than 2 dimensions. We perform 2,500 replicate simulations using a standard Wright-Fisher approach (Fisher 1930; Wright 1931) in both haploid and diploid populations. Haploid simulations are conducted with a population size N = 10,000, while diploid simulations were conducted with N = 5,000. Simulations are conducted for 10,000 generations, where each generation consists of mutating alleles and then propagating alleles to the next generation, during which we also impose selection. Individuals to be mutated are uniformly sampled according to the mutation rate µ = 5 × 10^−6, while the mutation vectors m have a magnitude drawn from the exponential distribution with λ = 2 and a uniformly distributed direction. Propagation is conducted via sexual reproduction in diploids with the possibility of selfing, with an implicit assumption of random mating between all individuals present in the current generation to compute the genotype frequencies of the offspring. We assume that there are an infinite number of offspring, which are then multinomially sampled in a manner weighted by both the frequency of each genotype in the offspring pool and the fitness of that genotype to give rise to the population present in the next generation (i.e., viability selection on the offspring). Note that this sexual reproduction process is only meaningful in the diploid model to compute the offspring pool genotype frequencies. In haploids, the genotype weights used in the multinomial sampling pool are identical to their frequencies in the current generation multiplied by their fitness.
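The haploid propagation step described above can be sketched as follows. This is an illustrative stand-in (mutation is omitted, and the two-genotype setup is hypothetical), but the sampling scheme is the one described: an effectively infinite offspring pool, multinomially sampled with weights equal to frequency times fitness.

```python
import numpy as np

def wf_generation(counts, fitnesses, rng):
    """One haploid Wright-Fisher propagation step: the next generation is a
    multinomial sample of N individuals, with genotype weights equal to
    (current frequency) x (fitness), i.e. viability selection acting on an
    effectively infinite offspring pool."""
    N = counts.sum()
    weights = counts * fitnesses
    return rng.multinomial(N, weights / weights.sum())

rng = np.random.default_rng(0)
counts = np.array([5000, 5000])   # two competing genotypes
fit = np.array([1.0, 1.1])        # the second is 10% fitter
for _ in range(100):
    counts = wf_generation(counts, fit, rng)
```

After 100 generations the fitter genotype dominates, while the multinomial sampling still injects the genetic drift appropriate to a finite population of size N.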
For all of our statistical analyses, we considered only those mutations that are present on the most frequent allele at generation 10,000. These mutations are the set of mutations present in the adaptive walk that are most easily sampled in natural populations. The low per-generation mutation rate (0.05 mutations per generation) allowed us to use the strong-selection weak-mutation (SSWM) assumption for our analysis and consider each mutation by itself. This assumption is consistent with biological systems with small effective population sizes, which results in the periodic fixation of beneficial mutations. Some examples of systems with small enough effective population sizes to be in this regime (tens of thousands of individuals or less) include humans (Tenesa et al. 2007), microbial communities in nectar flowers (Herrera et al. 2009) and many species listed by the IUCN as endangered or vulnerable (IUCN 2016). We additionally limited our analysis to the first five mutations of each adaptive walk in order to compare adaptive walks of equal lengths, and re-ran simulations that produced fewer than five mutations until this criterion was met. This is comparable to many recent studies of genotypic historical predictability (Weinreich et al. 2006; Khan et al. 2011; Franke et al. 2011). For simplicity of analysis, diploid simulations that generated stable polymorphisms containing three or more alleles were discarded and re-run until they met this criterion. We partitioned the diploid simulations into those that did and did not contain overdominant mutations to study the impact of stable polymorphisms on the predictability of evolution (described in a subsequent section).
2.1.2 Parameter regimes
In all of our simulations, the population initially contains a single allele with a distance of 2 units from the optimum. Since we are using only spherically symmetric fitness functions, the exact position is irrelevant. We conduct our first set of simulations in a two-dimensional regime (n = 2) with a poorly adapted initial population that is far from the optimum (2D-Far regime). The Gaussian fitness function for this regime is defined with a = 2. We conduct simulations in two additional regimes to validate our results: one regime where the population is initialized close to the phenotypic optimum (2D-Close regime, a = 18), and one regime where the population is evolving in 10-dimensional space rather than 2-dimensional space (10D-Far regime). The 2D-Close regime was selected such that the initial population, at 2 units from the optimum, is in the “concave-down” portion of the fitness surface.
2.1.3 Partitioning adaptive walks
In order to explore the effect of overdominance on predictability we have separated the diploid simulations into those with and without overdominant mutations. The methodology for this separation is based on the fact that all overdominant mutations must be capable of creating a stable polymorphism with two alleles, so by inferring whether or not a mutation can create such a stable polymorphism, we can infer whether or not it is overdominant. We begin with the five-mutation adaptive walks identified previously and first determine the time t5 at which the allele containing these first five mutations exceeded 5% frequency in the population. All time-points after t5 are no longer considered for analysis. At each generation t ≤ t5, we isolate all alleles in the population at ≥ 1% frequency. For every subset of these alleles, we compute their equilibrium frequencies and mean fitness using the method of Kimura (1956).
Briefly, this is done by computing an n × n matrix A for the n alleles under consideration, where the value of Aij is the fitness of the genotype defined by alleles i and j. We also consider a matrix T, where Tij = Aij − Ain − Ajn + Ann. If we denote the equilibrium frequency of the ith allele by xi, we get xi = Δi / (Δ1 + Δ2 + … + Δn), where Δi is the determinant made by substituting ones for all the elements of the ith column in the matrix A. With these equilibrium frequencies, we can easily calculate the mean fitness at this equilibrium as w̄ = Δ / (Δ1 + Δ2 + … + Δn), where Δ is the determinant of A.
The necessary and sufficient conditions for determining whether a set of n alleles can make a stable equilibrium in the first place are: 1) that the quadratic form T be negative definite and 2) that (−1)^(n−1) Δi > 0 for all i = 1, 2, …, n. For further reading, please see Kojima (1959); Mandel (1959); Kingman (1961).
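The determinant-based computation above can be sketched as follows. The fitness matrix A is a hypothetical two-allele example with an overdominant heterozygote; for n = 2 the stability conditions reduce to simple heterozygote advantage.

```python
import numpy as np

def kimura_equilibrium(A):
    """Kimura (1956) equilibrium for n alleles with fitness matrix A
    (A[i, j] = fitness of the i/j genotype): x_i = D_i / sum_j D_j and
    wbar = D / sum_i D_i, where D = det(A) and D_i is det(A) with the
    i-th column replaced by ones."""
    n = A.shape[0]
    D_i = np.empty(n)
    for i in range(n):
        Ai = A.copy()
        Ai[:, i] = 1.0        # substitute ones into the i-th column
        D_i[i] = np.linalg.det(Ai)
    return D_i / D_i.sum(), np.linalg.det(A) / D_i.sum()

def is_stable_pair(A):
    """For n = 2, the conditions (T negative definite plus the sign
    condition on the D_i) reduce to heterozygote advantage."""
    return A[0, 1] > A[0, 0] and A[0, 1] > A[1, 1]

# Hypothetical overdominant pair: the heterozygote (fitness 1.2) is fittest
A = np.array([[1.0, 1.2],
              [1.2, 1.1]])
x, wbar = kimura_equilibrium(A)   # equilibrium frequencies and mean fitness
```

For this A the equilibrium is x = (1/3, 2/3), and the equilibrium mean fitness (≈1.133) exceeds the fitness of either homozygote, as expected for a balanced polymorphism.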
If a set of alleles generates a stable polymorphic state at equilibrium, we infer that there is an overdominant mutation present among those alleles. An FGM simulation is determined to contain an overdominant mutation if, for any generation t ≤ t5, the subset of alleles with the highest mean fitness at generation t is a stable polymorphism at equilibrium. For simplicity, we removed simulations that contained stable polymorphisms with ≥3 alleles for ≥50 generations so that we only need to consider 2 allele stable polymorphisms for the remainder of this work.
For the diploid simulations in each parameter regime, we ensured that at least 500 of the simulations did not contain any overdominant mutations (identified as described in this section) by simply rerunning some of the simulations until this criterion was met. This was done to ensure that we had a sufficient number of simulations with and without overdominant mutations for statistical analysis.
The rationale for this approach is as follows. An allele generated by an overdominant mutation that successfully invades the population must produce a stable polymorphism. This is also the only means by which a stable polymorphism can be generated in our simulations, as we do not allow for any other mode of balancing selection. However, it is not possible to robustly detect stable polymorphisms simply by tracking the frequencies of the mutations in the population and detecting whether they are being maintained in the population, due to (rare) issues with clonal interference (Desai and Fisher 2007; Herron and Doebeli 2013; Kvitek and Sherlock 2013; Langet al. 2013). It is also not possible to directly infer that an allele is overdominant by simply comparing the fitness of different genotypes to detect heterozygote advantage as there are potentially more than two alleles present in the population when the mutation reaches substantial frequency either due to clonal interference or due to a mutation invading the population when there is already a stable polymorphism from a prior overdominant mutation.
Therefore, we need to separately test whether each new mutation in the population could result in a stable polymorphism if no additional mutations were allowed (eliminating the clonal interference problem). Since there are an arbitrary number of alleles present in the population at any one time (again, due to clonal interference), we make the simplifying assumption that the set of alleles will not produce an equilibrium in which more than 2 alleles are stably maintained in the population, which is valid because we have prescreened all of our simulations to eliminate any that contain stable polymorphisms of more than 2 alleles.
We can now utilize the Kimura method, which calculates both whether a set of alleles can be stably maintained at equilibrium and the mean fitness of the population at that equilibrium, to determine 1) whether the new mutation can invade the population and 2) whether it will be maintained as a stable polymorphism if it does invade. The set of alleles under consideration is all of the alleles that already existed in the population plus the new allele generated by the new mutation. We use the Kimura method on all pairs of these alleles and identify the pair that generates the highest mean fitness at equilibrium. This highest-fitness equilibrium state is then used to determine 1) whether the new allele successfully invades (if the pair with the highest equilibrium fitness results in an equilibrium state that does not include the new allele, it cannot invade the population) and 2) whether it is maintained as a stable polymorphism or fixes in the population (the presence of a stable polymorphism implies that the mutation that generated the new allele was overdominant).
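This decision procedure can be sketched as follows, assuming the two-allele closed form of the Kimura method and hypothetical genotype fitnesses; the allele labels, the dictionary W, and the fitness values are purely illustrative.

```python
from itertools import combinations

def pair_equilibrium(wAA, wAB, wBB):
    """Equilibrium frequency x of the first allele and the population mean
    fitness for a two-allele system (closed form of the n = 2 Kimura method)."""
    if wAB > max(wAA, wBB):        # overdominance: stable interior equilibrium
        x = (wAB - wBB) / ((wAB - wAA) + (wAB - wBB))
    else:                          # no balancing: the fitter homozygote fixes
        x = 1.0 if wAA >= wBB else 0.0
    wbar = x * x * wAA + 2 * x * (1 - x) * wAB + (1 - x) * (1 - x) * wBB
    return x, wbar

def classify_new_allele(existing, new, w):
    """Among all pairs drawn from the existing alleles plus the new allele,
    find the pair whose equilibrium has the highest mean fitness, then ask
    whether the new allele invades and whether it fixes or is balanced.
    w(a, b) returns the fitness of the diploid genotype a/b."""
    candidates = []
    for a, b in combinations(sorted(existing) + [new], 2):
        x, wbar = pair_equilibrium(w(a, a), w(a, b), w(b, b))
        candidates.append((wbar, a, b, x))
    wbar, a, b, x = max(candidates)
    survivors = {al for al, freq in ((a, x), (b, 1 - x)) if freq > 0}
    if new not in survivors:
        return "cannot invade"
    return "balanced polymorphism (overdominant)" if len(survivors) == 2 else "fixes"

# Hypothetical fitnesses: the anc/A heterozygote is fittest (overdominance)
W = {("A", "A"): 1.1, ("A", "anc"): 1.3, ("anc", "anc"): 1.0}
w = lambda a, b: W[tuple(sorted((a, b)))]
result = classify_new_allele({"anc"}, "A", w)
```

With these fitnesses the highest-fitness equilibrium retains both alleles, so the new mutation is classified as overdominant; raising the homozygote fitness above the heterozygote instead yields fixation.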
2.1.4 Identification of hidden alleles
As described in section 2.1.3, for each generation of every FGM simulation we identified the expected equilibrium state of the population, considering only those alleles at >1% frequency in that generation. We then identify the hidden alleles of a simulation as those alleles, present in the equilibrium population states for any generation t < t5, that do not solely consist of a subset of the five mutations under consideration. In other words, hidden alleles are those alleles that reach substantial frequency before t5 and would have been present in a stable equilibrium if they had not been out-competed by a later allele, and were thus excluded from our original identification of the 5-mutation adaptive walk.
2.2 Studying future predictability
We first quantify the average adaptive walk in phenotype space for each ploidy and parameter regime used in our simulations. Given that all of our adaptive walks consist of exactly five mutations, the average adaptive walk also consists of five mutations, where the first mutation vector is the average of all of the observed first mutation vectors across all simulations, the second mutation vector in the average walk is the average of all observed second mutation vectors, and so on. This average adaptive walk closely follows the expected adaptive walk: a straight line in phenotype space leading from the ancestral phenotype directly towards the fitness optimum. We use the average instead of the straight line as the average is directly measurable from experimental data, while the line requires comprehensive knowledge of the fitness landscape, which would eliminate any need to study the predictability of the system.
In order to compute a summary statistic for the phenotypic parallel future predictability of an adaptive walk, we calculate the minimum distance of each observed allele during the adaptive walk from the average walk, and then take the maximum of these minimum distances across all observed alleles as a measure of the deviation of the adaptive walk from the average walk in phenotype space. An adaptive walk with a smaller deviation is more predictable than one with a larger deviation.
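This deviation statistic can be sketched as follows, under the simplifying assumption that the average walk is represented by its per-step mean phenotypes and distances are taken to those points (rather than to interpolated segments between them); the toy walks are hypothetical.

```python
import numpy as np

def average_walk(walks):
    """Step k of the average walk is the mean k-th phenotype across walks.
    walks: array of shape (n_walks, n_steps, n_traits)."""
    return walks.mean(axis=0)

def deviation(walk, avg):
    """Maximum, over the walk's observed phenotypes, of the minimum
    distance from each phenotype to the (pointwise) average walk."""
    dists = np.linalg.norm(walk[:, None, :] - avg[None, :, :], axis=-1)
    return dists.min(axis=1).max()

# Two toy 2-trait walks that agree except at the middle step
walks = np.array([[[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]],
                  [[0.0, 0.0], [1.0, 2.0], [2.0, 0.0]]])
avg = average_walk(walks)   # the middle step averages to [1, 1]
```

Here each toy walk deviates from the average walk by exactly 1 unit at the middle step, so both walks receive the same deviation score.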
2.3 Studying genotypic historical predictability
Studies of genotypic historical predictability seek to reconstruct the order in which a set of mutations arose in an adaptive trajectory. We can then study the probability distribution of all of the possible adaptive trajectories to understand how predictable the system is overall.
2.3.1 Computing the likelihood of a particular order of mutations
We begin with an overview of the historical predictability method used by Weinreich et al. (2006) for inference when assuming that each adaptive mutation fixes in the population, and then continue on to a description of our implementation of the method which is suitable when stable polymorphisms are possible. As mentioned before, we expect stable polymorphisms to frequently occur in our diploid simulations. We explicitly model the phenotypes of the alleles and mutation vectors, and use the same fitness functions as in the FGM simulations to compute fitness.
Weinreich et al (2006) inference method
Weinreich et al. (2006) describe the probability of the ancestral allele (Awt) evolving into the derived allele containing all 5 available mutations (Ader) through a particular order of mutations (Mi) with intermediate alleles a, b, c and d. This can be computed as Pr(Mi) = Pr(Awt → a) × Pr(a → b) × Pr(b → c) × Pr(c → d) × Pr(d → Ader), because “along any particular trajectory the choice of each next fixation is statistically independent of all previous fixations. Here, the Pr(i → j) are the conditioned fixation probabilities of a particular single mutant neighbor j of an allele i given by Pr(i → j) = Π(i → j) / Σ(k ∈ Ni) Π(i → k), where Π(i → j) is the unconditioned fixation probability of allele j from allele i, and Ni is the set of all mutational neighbors of allele i.” (modified from Weinreich et al. (2006) Supplementary Methods). In essence, Weinreich et al. (2006) compute the probability of a particular order of mutations as the product of the probabilities of each mutation in that order successfully fixing in the population in succession.
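The scheme above can be sketched as follows for a toy landscape. The two-mutation landscape and the choice of Π proportional to the fitness gain (zero for deleterious steps) are illustrative stand-ins, not the fixation probabilities used in the actual study.

```python
from itertools import permutations

def path_probability(order, fitness, pi):
    """Probability of one mutational order: the product over steps of the
    conditioned fixation probability Pr(i -> j) = Pi(i -> j) / sum_k Pi(i -> k),
    where k ranges over i's single-mutant neighbors (adding any unused mutation).
    fitness maps a frozenset of acquired mutations to a fitness value."""
    muts = set(order)
    current = frozenset()
    p = 1.0
    for m in order:
        neighbors = [current | {k} for k in muts - current]
        denom = sum(pi(fitness[current], fitness[n]) for n in neighbors)
        if denom == 0.0:
            return 0.0   # no beneficial next step: the path is inaccessible
        p *= pi(fitness[current], fitness[current | {m}]) / denom
        current = current | {m}
    return p

# Toy landscape on mutations {1, 2}; Pi proportional to the fitness gain
pi = lambda w_i, w_j: max(w_j - w_i, 0.0)
fitness = {frozenset(): 1.0, frozenset({1}): 1.2,
           frozenset({2}): 1.1, frozenset({1, 2}): 1.5}
probs = {order: path_probability(order, fitness, pi)
         for order in permutations([1, 2])}
```

On this toy landscape both orders are viable (fitness increases monotonically along each), so the conditioned path probabilities sum to 1, with the larger-first-step order being twice as likely.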
Our method to study genotypic historical predictability
As our simulations violate some of the assumptions of the historical predictability inference method of Weinreich et al. (2006), we need to modify the method to account for these violations (please see the supplementary text for a detailed example of the implementation described in this section). First, since we are using a diploid model, new mutations occur as heterozygotes and thus must invade the population as heterozygotes. Therefore, we cannot compute the fixation probability, but must compute the probability of an allele successfully invading the population from low frequency and reaching its equilibrium frequency. Secondly, in the presence of a stable polymorphism, new mutations can occur on multiple available backgrounds. This allows for the generation of hidden alleles. This also implies that it may take more than 5 mutations in a mutation order to generate the allele with all 5 mutations. Finally, a new mutation that successfully invades can either fix or balance with any of the alleles already present in the population, violating the Weinreich et al. (2006) assumption that the fitness of an allele is independent of the other alleles already present in the population. Therefore, we cannot treat the adaptive walk as a series of independent steps but need to take an integrated approach to study historical predictability. As it is challenging to describe the method using closed form analytic equations as in Weinreich et al. (2006), we will describe the recursive algorithm we use to implement the method using pseudocode. Every call to the algorithm requires a population state (set of alleles and their frequencies), a set of alleles observed during the recursion and the probability of the mutation order so far. Using global variables outside of the algorithm, we keep track of Φ(Mi), the unconditioned probability of every possible mutation order Mi. All Φ(Mi) are initialized to 0.
Historical predictability inference(Sexisting, Aexisting, Pexisting)
1: Sexisting ← the population state = a set of alleles and their frequencies
2: Aexisting ← the set of alleles observed so far in this mutation order
3: Pexisting ← the unconditioned probability of this order of mutations so far
4: if Ader ∈ Sexisting then
5:     determine the order Mi in which the mutations were introduced into Ader and add Pexisting to the unconditioned probability for this order of mutations (Φ(Mi))
6:     return // we are done since we have successfully generated Ader
7: else
8:     ρtotal = 0
9:     for all new alleles An that can be generated by a single mutation on the alleles in Sexisting, excluding those where An ∈ Aexisting do
10:        for all pairs of alleles Ai, Aj in the set of alleles including An and every allele in Sexisting do
11:            compute the frequencies of Ai and Aj and the mean fitness of the population at equilibrium, assuming these are the only two alleles in the population
12:        Snew = the pair of alleles and their frequencies with the highest mean fitness computed in the preceding for loop, excluding all alleles at frequency 0
13:        if An ∉ Snew then
14:            An cannot invade Sexisting and can thus be ignored
15:        else
16:            πn = the probability of invasion of An into Sexisting, computed through 10,000 forward Wright-Fisher simulations
17:            ρn = πn × the frequency of the allele in Sexisting that was mutated to generate An // the unconditioned probability of An succeeding in this population
18:            ρtotal += ρn
19:    for all new alleles An with ρn > 0 do
20:        Snew and ρn defined as above for An
21:        if using the Weinreich et al. method then
22:            Snew = An at frequency 1 (fixation)
23:        Anew = Aexisting ∪ {An}
24:        Pnew = Pexisting × ρn / ρtotal
25:        Historical predictability inference(Snew, Anew, Pnew) // recursive call
The initial call to this algorithm has Sexisting set to the ancestral population used in the FGM simulations, i.e. a population monomorphic for an allele two units from the optimum, Aexisting set to the set containing the single element Awt, and Pexisting = 1. Once we have computed the unconditioned probability Φ(Mi) for every Mi, we use this information to compute the conditioned probability for each mutation order. Note that we track mutation orders by the order in which the mutations were introduced into allele Ader, which is always five mutations long, not the order in which mutations were introduced into the population during the algorithm, which comprises five or more mutations. If multiple recursions through the algorithm use the same order of mutations in Ader, the likelihoods from all of these recursions are summed to obtain Φ(Mi) for that particular Mi.
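To make the bookkeeping of the recursion concrete, the following is a minimal, self-contained Python sketch in the simpler haploid, Weinreich et al. (2006)-style setting: each step fixes a single mutation, weighted by its relative fixation probability, and Φ(Mi) is accumulated in a dictionary. The diploid machinery described above (two-allele equilibria, Wright-Fisher invasion tests, hidden alleles) is deliberately omitted, and the function names and the use of Kimura's diffusion approximation for fixation probabilities are illustrative choices, not the exact implementation used in this work.

```python
import math
from collections import defaultdict

def fixation_prob(s, N=10000):
    """Kimura's diffusion approximation for the fixation probability of a
    single new haploid mutant with selection coefficient s. Used here only
    as an illustrative weight; as discussed below, this approximation
    breaks down for strongly beneficial mutations."""
    if abs(s) < 1e-12:
        return 1.0 / N
    x = 2.0 * N * s
    if x < -500:
        return 0.0  # strongly deleterious: effectively never fixes
    return (1.0 - math.exp(-2.0 * s)) / (1.0 - math.exp(-x))

def order_probabilities(fitness, n_sites, N=10000):
    """Unconditioned probability Phi(M_i) of every mutation order M_i on a
    haploid fitness landscape. `fitness` maps a genotype, encoded as a
    frozenset of mutated sites, to its fitness."""
    phi = defaultdict(float)

    def recurse(genotype, order, p_so_far):
        if len(genotype) == n_sites:
            # completed one full mutation order: record its probability
            phi[tuple(order)] += p_so_far
            return
        # weight each possible next mutation by its fixation probability
        weights = {}
        for site in range(n_sites):
            if site in genotype:
                continue
            mutant = genotype | frozenset([site])
            s = fitness[mutant] / fitness[genotype] - 1.0
            w = fixation_prob(s, N)
            if w > 0.0:
                weights[site] = w
        total = sum(weights.values())
        for site, w in weights.items():
            # relative probability that this mutation is the next to fix
            recurse(genotype | frozenset([site]), order + [site],
                    p_so_far * w / total)

    recurse(frozenset(), [], 1.0)
    return dict(phi)
```

Conditioning each Φ(Mi) on successfully generating the full genotype then amounts to dividing by the sum of Φ(Mj) over all orders.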
We compute the invasion probability of a new allele using 10,000 forward Wright-Fisher simulations. In these simulations, we set N = 5,000 diploid individuals as in our FGM simulations, with no new mutations allowed.
The probability of a new allele successfully invading and reaching the deterministically inferred stable equilibrium is then the fraction of Wright-Fisher simulations in which An reaches 90% of its expected equilibrium frequency in Snew. These simulations are entirely separate from the FGM simulations used to generate the adaptive walks analyzed throughout the rest of this work. We are forced to estimate invasion probabilities empirically through simulation rather than with the classical analytic solutions (Haldane 1927; Kimura 1962) because many of the observed mutations have a selective advantage exceeding 100%, violating the assumption of the analytic solutions that mutations are weakly beneficial. Our simulations suggest that the analytic solutions significantly overestimate the invasion probability under these conditions (data not shown).
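As an illustration, a stripped-down version of this invasion test for a single new allele in a diploid population can be sketched as follows. The 90% threshold, the diploid population size N = 5,000 and the binomial (Wright-Fisher) sampling follow the description above; the function names and the closed-form two-allele equilibrium frequency are a sketch under standard population-genetic assumptions, not the exact code used in this work.

```python
import numpy as np

def equilibrium_frequency(w11, w12, w22):
    """Deterministic equilibrium frequency of allele 1 given diploid
    genotype fitnesses w11 (homozygote 1), w12 (heterozygote), w22.
    Under heterozygote advantage the classical stable equilibrium is
    (w12 - w22) / ((w12 - w11) + (w12 - w22)); otherwise the fitter
    homozygote fixes."""
    if w12 > w11 and w12 > w22:
        return (w12 - w22) / ((w12 - w11) + (w12 - w22))
    return 1.0 if w11 >= w22 else 0.0

def invasion_probability(w11, w12, w22, N=5000, n_reps=10000,
                         threshold=0.9, seed=None):
    """Fraction of forward Wright-Fisher simulations in which allele 1,
    starting as a single copy (one heterozygote), reaches `threshold`
    (here 90%) of its deterministic equilibrium frequency."""
    rng = np.random.default_rng(seed)
    p_eq = equilibrium_frequency(w11, w12, w22)
    if p_eq == 0.0:
        return 0.0  # the new allele cannot invade even deterministically
    target = threshold * p_eq
    successes = 0
    for _ in range(n_reps):
        p = 1.0 / (2 * N)  # one mutant copy among 2N allele copies
        while 0.0 < p < target:
            # deterministic change due to selection under random mating
            w_bar = p * p * w11 + 2 * p * (1 - p) * w12 + (1 - p) ** 2 * w22
            p_sel = (p * p * w11 + p * (1 - p) * w12) / w_bar
            # binomial resampling of 2N allele copies = genetic drift
            p = rng.binomial(2 * N, p_sel) / (2 * N)
        if p >= target:
            successes += 1
    return successes / n_reps
```

For example, `invasion_probability(1.0, 1.5, 1.0)` estimates the invasion probability of a strongly overdominant mutation whose equilibrium frequency is 0.5.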
We study historical predictability in our haploid simulations using a similar algorithm to that used for the diploid simulations. The major differences are that we 1) consider only single alleles instead of pairs of alleles when identifying the equilibrium population after a mutation and 2) empirically compute the invasion probability of a new mutation using forward Wright-Fisher simulations of a population of N = 10,000 haploid individuals instead of a diploid model.
2.3.2 Quantifying genotypic historical predictability
To quantitatively study the results of our genotypic historical predictability analysis across simulations, we define the effective number of paths statistic as
n_eff = 1 / Σ_i p(Mi)^2, where p(Mi) = Φ(Mi) / Σ_j Φ(Mj)
is the conditioned probability of mutation order Mi. The effective number of trajectories is defined to be 0 when there are no viable trajectories, i.e. when Σ_i Φ(Mi) = 0. This is similar to the effective number of alleles in a population (Kimura and Crow 1964), the predictability metric of Roy (2009), and the entropy metric of Palmer et al. (2013).
When a single trajectory dominates the probability density, the effective number of trajectories is close to 1, indicating high historical predictability. On the other hand, a system of n mutations has n! possible mutation orders, so if every trajectory has equal probability 1/n!, the effective number of paths equals n!, the total number of possible mutation orders, indicating low historical predictability. This provides a single metric of the diversity of mutational orders that are possible while accounting for their relative likelihoods, and summarizes the historical predictability of the adaptive walk.
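The statistic is straightforward to compute from the table of unconditioned probabilities Φ(Mi). The following is a minimal sketch, assuming (by analogy with the effective number of alleles) that the statistic is the inverse Simpson index 1/Σ p_i^2 of the conditioned order probabilities p_i = Φ(Mi)/Σ_j Φ(Mj); the function name is ours.

```python
def effective_number_of_paths(phi):
    """Effective number of mutation orders from a mapping
    order -> Phi(M_i) of unconditioned probabilities. Conditions the
    probabilities by normalizing, then applies the inverse Simpson
    index 1 / sum(p_i ** 2), as for the effective number of alleles
    (Kimura and Crow 1964). Defined as 0 when no order is viable,
    i.e. when all Phi(M_i) are 0."""
    total = sum(phi.values())
    if total == 0.0:
        return 0.0
    return 1.0 / sum((v / total) ** 2 for v in phi.values())
```

With six equally likely orders (n = 3 mutations) this returns 3! = 6; with a single viable order it returns exactly 1.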
2.4 Source Code
Complete source code for the FGM simulations is available at https://github.com/sunthedeep/Fisher-Geometric-Model.
3 Results
In this work, we study the predictability of evolution when adaptation can be highly polygenic, such that comprehensively sampling the entire fitness landscape is combinatorially infeasible. We compute the phenotypic parallel future predictability of an FGM simulation by comparing it to the average simulation from the same parameter regime (see Model section 2.2 for details). Simulations with a larger deviation from the average are thus less future predictable. We then compare these results to our effective number of paths statistic, which measures genotypic historical predictability. This statistic captures the distribution of the likelihoods of all possible orders of mutations observed in a given simulation (see Model section 2.3.2 for details). If measuring historical predictability is useful in inferring the future predictability of evolution, then the historical predictability of a simulation should be positively correlated with its future predictability.
3.1 Comparison of future and historical predictability
We first correlate future and historical predictability using both haploid and diploid simulations in three different parameter regimes (2-dimensional regimes close and far from the optimum, and a 10-dimensional regime far from the optimum, see Model section 2.1.2 for details). We find a strong and significant negative correlation in all of these comparisons (Figure 2). In other words, adaptive walks that are highly historically predictable are future unpredictable as they are highly divergent from the average adaptive trajectory.
3.2 Stable polymorphisms and evolutionary predictability
We also study the impact of stable adaptive polymorphisms on the predictability of evolution. In our FGM simulations, stable polymorphisms are generated through overdominant mutations in our diploid simulations. In each of our three sets of diploid simulations, we separate the adaptive walks into those that do and those that do not contain a stable polymorphism using the method of Kimura (1956) (see Model section 2.1.3 for details). We can then compare the distributions of our historical and future predictability metrics between these two groups to test for a significant effect. We find that simulations with stable polymorphisms are significantly less future predictable than simulations without such polymorphisms (Figure 3, top row), which is concordant with the results of Sellis et al. (2011) that overdominant mutations allow an adaptive walk to explore a larger portion of the fitness landscape. In contrast, we find that simulations with overdominant mutations are significantly more historically predictable than simulations without such mutations (Figure 3, bottom row). These results again show the anti-correlation between future and historical predictability, and suggest that stable polymorphisms significantly impact the predictability of evolution.
3.3 Historical contingency generated by stable polymorphisms
In the course of our analysis, we were struck by the presence of 70 diploid simulations across all three parameter regimes where historical predictability analysis showed that the order of mutations that actually occurred in the FGM simulation was inviable. While some of these instances are due to multiple mutations, i.e. a population gaining a mildly deleterious mutation and then quickly gaining a highly beneficial mutation on the same background, the remaining simulations contain derived alleles that reached high frequency through a stable polymorphism but were eventually lost, which we term “hidden” alleles (Figure 4a). We hypothesized that in these simulations, hidden alleles were necessary to the evolutionary path, and not considering these alleles leads to the mistaken inference that no order of mutations was viable (Figure 4b). To test this, we selected one of these simulations at random to recompute its historical predictability while including all hidden alleles in the inference. This modification allowed us to successfully infer that the order of mutations observed in that FGM simulation was viable (data not shown), suggesting that our inability to detect hidden alleles in most systems can lead to significant errors in inference. In general we find that ≥ 25% of diploid simulations in each of the three parameter regimes contain at least one hidden allele (see Model section 2.1.4 for details). While we find no evidence for hidden alleles generating a systematic effect on historical predictability in our model, the finding that some rare bouts of evolution are highly dependent on detecting hidden alleles highlights their potential impact in natural systems.
4 Discussion
In this work, we sought to answer two major questions when studying the predictability of evolution in highly polygenic systems. First, we wanted to study the relationship between future and historical predictability. Second, we wanted to investigate how predictability changes when comparing simulations with and without stable polymorphisms.
In our simulations, we found that future and historical predictability are anti-correlated. This anti-correlation can be intuitively understood in the FGM framework used to conduct our simulations. Adaptive walks that are phenotypically similar to the average walk (high future predictability) tend to move relatively directly from the ancestral phenotype to the optimal phenotype on the fitness landscape. As each mutation in the adaptive walk changes the phenotype of the individual in a similar direction, there is very little sign epistasis between these mutations. A recent study has also shown that the amount of sign epistasis is correlated with the distance of the population from the optimal phenotype (Blanquart et al. 2014), consistent with our observations from the 2D-close and 2D-far parameter regimes (Figure S3). As previous work has shown that historical predictability is highly correlated with the amount of sign epistasis present in a system (Weinreich et al. 2005), high future predictability results in low historical predictability (i.e. most orders of these mutations are viable and have similar likelihood). In contrast, adaptive walks that are highly divergent from the average walk (low future predictability) are more likely to have mutations with sign epistasis that constrain their order, resulting in high historical predictability.
We can gain a similar intuitive understanding of the effect of overdominant mutations on predictability in our model. Overdominant mutations tend to overshoot the phenotypic optimum (Sellis et al. 2011), resulting in low future predictability and requiring subsequent mutations to be compensatory (i.e. move the phenotype in the opposite direction of the overdominant mutation). This generates large amounts of sign epistasis in simulations with overdominant mutations, resulting in high historical predictability compared to adaptive walks without overdominant mutations.
While previous work has shown how underdominant mutations in a metapopulation can generate priority effects (Altrock et al. 2010), our work is the first to show how overdominant mutations can generate historical dependencies in the evolutionary trajectory. The presence of transient stable polymorphisms during adaptive evolution creates unsampled "hidden" alleles, which, in extreme cases, leads us to infer that the true order in which the mutations occurred in the adaptive walk is inviable. This presents significant problems when attempting to study historical predictability in natural systems with single time-point sampling resolution, as our inability to detect hidden alleles in such systems would make any inferred likelihood suspect. Studies of historical predictability under such conditions would still yield mutation order likelihoods for the set of mutations under consideration; however, we would have no way of knowing whether these likelihoods are accurate, nor any estimate of how inaccurate they are, due to the missing "hidden" alleles. Functionally important hidden alleles are challenging to identify even in extant populations, due to the vast amounts of variation present in any natural population, making this a particularly difficult problem to solve.
Our simulations focused on one mechanism for the generation of stable adaptive polymorphisms, namely heterozygote advantage, but a number of other mechanisms exist in nature. These include negative frequency-dependent selection (Levin et al. 1988; Iserbyt et al. 2013), and spatially or temporally variable selection (Rainey and Travisano 1998; Kasumovic et al. 2008; Saltz and Nuzhdin 2014). Natural populations can also generate unstable polymorphisms through a number of other mechanisms, such as clonal interference (Desai and Fisher 2007; Herron and Doebeli 2013; Kvitek and Sherlock 2013; Lang et al. 2013), genetic drift, admixture and other demographic processes. Excluding clonal interference, which is minimized through our parameter choices in our simulations, none of these other processes are considered in this work, and some of them may also have a substantial qualitative effect on the relationship between future and historical predictability, as they may also modify the amount of sign epistasis between the mutations present in an evolutionary path.
The underlying genetic, phenotypic and fitness landscape models used in our simulations are also limited in a number of ways, and could be expanded by including the possibility of multiple adaptive optima, genetically unlinked loci that are capable of adaptation (i.e. recombination between mutations during sexual reproduction), epistasis between loci and the presence of standing genetic variation. Consideration of these processes will likely further complicate the inference of historical predictability. Finally, simulation systems have the advantage of exact knowledge of the fitness of every genotype. This information must be estimated in natural systems, which may introduce significant noise into the inference process. While phenomena such as hidden mutations are likely universal to all of these more complex scenarios, we suspect that the relationship between future and historical predictability may vary between different systems.
Despite our use of a very simple model, we have shown a number of limitations of studying historical predictability when attempting to predict evolution. Not only is historical predictability not directly correlated with future predictability, it is anti-correlated in our model, suggesting that studying historical predictability may give misleading information about the future predictability of evolution in a given system. In addition, these trends were only discovered through the study of a large number of independent simulations, so the analysis of single adaptive walks is likely of limited utility in systems with highly polygenic adaptation. We also find that the presence of polymorphisms can create historical dependencies in the evolutionary trajectory that are extremely difficult to account for in experimental studies. Thus, this work opens up a number of new questions for study. First, the relationship between future and historical predictability in natural systems is unknown, as is whether this relationship varies between different biological systems. If this relationship does vary, we need to understand the parameters that cause this variation, so that one can understand which systems may be amenable to historically predicting evolution and which are not. A similar approach should be taken to understand the conditions under which historical contingency significantly influences adaptive evolution. Our work shows that historical predictability cannot be used as a naive proxy for predicting future evolution, and highlights the need for new approaches to studying future predictability.
5. Funding Acknowledgements
The work was funded by the NIH/NHGRI training grant T32 HG000044 and a Stanford Center for Evolutionary and Human Genomics (CEHG) pre-doctoral fellowship to SV, a Stanford Graduate Fellowship and a Stanford CEHG pre-doctoral fellowship to DS, and NIH grants R01 GM115919, GM10036601 and GM097415 to DAP. The content of this work is solely the responsibility of the authors and does not necessarily represent the official views of Stanford University or the National Institutes of Health. The authors would like to thank five anonymous reviewers for their valuable feedback while preparing this manuscript.