ABSTRACT
Beneficial mutations drive adaptive evolution, yet their selective advantage does not ensure their fixation. Haldane’s application of single-type branching process theory showed that genetic drift alone could cause the extinction of newly-arising beneficial mutations with high probability. With linkage, deleterious mutations will affect the dynamics of beneficial mutations and might further increase their extinction probability. Here, we model the lineage dynamics of a newly-arising beneficial mutation as a multitype branching process; this approach allows us to account for the combined effects of drift and the stochastic accumulation of linked deleterious mutations, which we call lineage contamination. We first study the lineage contamination phenomenon in isolation, deriving extinction times and probabilities of beneficial lineages. We then put the lineage contamination phenomenon into the context of an evolving population by incorporating the effects of background selection. We find that the survival probability of beneficial mutations is simply Haldane’s classical formula multiplied by the correction factor $\exp(-\kappa U/\bar{s}_b)$, where U is the deleterious mutation rate, $\bar{s}_b$ is the mean selective advantage of beneficial mutations, κ ∈ (1, ε], and $\varepsilon = 2 - e^{-1}$. We also find there exists a genomic deleterious mutation rate, $\hat{U}$, that maximizes the rate of production of surviving beneficial mutations, and that $\hat{U} \in [\bar{s}_b/\varepsilon, \bar{s}_b)$. Both of these results, and others, are curiously independent of the fitness effects of deleterious mutations. We derive critical mutation rates above which: 1) lineage contamination alleviates competition among beneficial mutations, and 2) the adaptive substitution process all but shuts down.
Beneficial mutations are the ultimate source of the genetic variation that fuels evolutionary adaptation, but deleterious mutations are likely to be far more abundant (Muller 1950; Sturtevant 1937). Perhaps for the sake of simplicity, the evolutionary effects of these two types of fitness-affecting mutations were generally considered separately in early studies. For example, Muller (1964) assumed that beneficial mutations were negligible and reasoned verbally that deleterious mutations should have disastrous consequences for populations in the absence of recombination because of the recurrent, stochastic loss of genotypic classes with the fewest deleterious mutations–Muller’s ratchet (Felsenstein 1974). Haldane (1927), on the other hand, focused on the fate of single beneficial mutations in the absence of other fitness-affecting mutations and used single-type branching process theory to show that most such beneficial mutations are lost to what is now called genetic drift: the fixation probability of such a beneficial mutation is only about twice its selective effect, sb, for small sb.
In reality, of course, multiple fitness-affecting mutations (both beneficial and deleterious) can be present simultaneously in populations, and these mutations can influence each other’s fates and evolutionary effects as a consequence of linkage (reviewed in Gordo and Charlesworth (2001); Charlesworth (2013, 2009); Barton (2009)). Interactions between beneficial and deleterious mutations are of particular interest in this regard, because such interactions, in contrast to interactions between beneficial mutations alone, can determine whether a population will increase or decrease in fitness. Indeed, recent studies (Bachtrog and Gordo 2004a; Poon and Otto 2000; Goyal et al. 2012; Kaiser and Charlesworth 2009; Silander et al. 2007) have indicated that beneficial mutations (including reversions of deleterious mutations) can impede or halt the fitness loss predicted in asexual populations under Muller’s ratchet, as originally suggested by Haigh (1978a). Moreover, a number of theoretical studies (Johnson and Barton 2002a; Jiang et al. 2011; McFarland et al. 2014; Bachtrog and Gordo 2004a; Charlesworth 2013; Peck 1994) have shown that Haldane’s classical fixation probability of 2sb for a beneficial mutation can be reduced by the effects of selection against linked deleterious mutations (Good and Desai 2014; Hartfield et al. 2010; C W Birky and Walsh 1988): in principle, such effects include both background selection against deleterious mutations already present in the genome on which the beneficial mutation appears, and selection against deleterious mutations that arise and accumulate in genomes carrying the beneficial mutation. The latter form of selective effect has not previously been analyzed in isolation; it is the primary focus of the current paper and will be referred to as lineage contamination.
In preliminary computer simulations, we observed that the fixation probability of a beneficial mutation appearing in an otherwise initially homogeneous asexual population with a high genomic mutation rate is considerably reduced below Haldane’s classical 2sb expectation. We hypothesized that the lower probability of fixation of a beneficial mutation in this situation can be attributed to lineage contamination: specifically, Muller’s ratchet operates at a much faster rate in the small lineage founded by the beneficial mutation than in the rest of the population. Here, we present the results of analytical modeling and further computer simulations that support this hypothesis and show how lineage contamination affects fixation probabilities, dynamics of differential load (relative fitness dynamics), sojourn dynamics, and fitness effects of surviving beneficial mutations.
We model the influences of background selection and lineage contamination, both singly and jointly, on the fate of beneficial mutations. Under background selection alone, a beneficial mutation that lands on the best genetic background (the one least loaded with deleterious mutations) always has a non-zero probability of achieving fixation in an asexual population. In contrast, under lineage contamination alone, a beneficial mutation can have a probability of fixation that is zero if the mutation rate is high enough. Our simulations and analytical results suggest that when both background selection and lineage contamination are operating – as they do in real populations – asexual populations traverse a continuum of evolutionary regimes as the genomic mutation rate increases: at low mutation rates, beneficial mutations appear infrequently enough that they do not interfere with each other’s progress to fixation (the “periodic selection” regime; Sniegowski and Gerrish (2010)); as the mutation rate increases, alternative beneficial mutations begin to compete with each other (the “clonal interference” regime); as the mutation rate increases further, we find that lineage contamination can suppress a fraction of beneficial mutations that is sufficient to cause a population to revert to the periodic selection regime; ultimately, at very high mutation rates, a regime can be reached in which beneficial mutations are no longer substituting. Significantly, these last two regimes would not obtain without the operation of lineage contamination: at high mutation rates, background selection alone cannot shut down clonal interference or the adaptive substitution process, but lineage contamination can. Our results, therefore, indicate that the lineage contamination effect is central to determining the adaptive fate of a population when both beneficial and deleterious mutations are arising (Bull and Wilke 2008; Bull et al. 2007; Springman et al. 2009).
Beneficial lineages in a homogeneous population
We model the random accumulation of deleterious mutations within a growing lineage founded by the occurrence of a beneficial mutation (henceforth, beneficial lineage). Our main objective in this section is to study the effects of lineage contamination in isolation, and to this end we model beneficial lineages arising within initially homogeneous populations.
We are interested in how the accumulation of linked deleterious mutations affects the dynamics and fate of beneficial mutations; as a first approximation, we assume complete asexuality. We assume that relative fitness effects of mutations do not change over the relevant time span, i.e., the environment remains constant over this time span, and there are no frequency-dependent effects other than the one examined here (due to differential rates of Muller’s ratchet). All models assume that fitness effects of mutations are multiplicative, i.e., no epistasis. Finally, our multitype branching process model assumes that reproduction is by binary fission (e.g., bacteria, cell lines): thus, individuals can have a maximum of two offspring. Simulations that relax this assumption give qualitatively similar results; quantitatively, however, fixation probabilities derived from the binary fission model may be approximately halved for the more general model in which numbers of offspring are Poisson-distributed.
It will facilitate further reading to define three terms precisely: extinction probability, $p^b_{ext}$, is the probability that a beneficial lineage, arising in an otherwise infinite population, becomes extinct in finite time (the superscript b indicates that this probability pertains to the beneficial lineage in question and not the whole population); survival probability is the complement of the extinction probability: $p_{svl} = 1 - p^b_{ext}$; finally, fixation probability, pfix, is the probability that a lineage will displace the rest of a finite population (becomes fixed) in finite time. We note that we have dropped the superscript in psvl and pfix, as reference to the beneficial lineage is implied. We further note that it is possible for an ultimately doomed lineage to become fixed in a finite population, implying pfix > psvl.
Multitype branching process model
Our stochastic model is a discrete-time multitype branching process, where a “type” corresponds to the number of acquired deleterious mutations. The model describes the evolution of the composition of the population $X_t = (X_{t,0}, X_{t,1}, \ldots)$, $X_{t,i}$ being the number of individuals carrying i deleterious mutations at time t. We denote by U ⩾ 0 the deleterious mutation rate and by 0 ⩽ sb ⩽ 1 and 0 < sd < 1 the selective advantage of beneficial mutations and disadvantage of deleterious mutations, respectively. The model can be described as follows: at each time-step, each individual produces two descendants carrying as many deleterious mutations as itself. Each descendant might accumulate during this reproduction k additional deleterious mutations, with probability $e^{-U} U^k / k!$. If the parent was of type i, the descendant is then of type i + k and is selected according to its fitness, i.e. with probability proportional to $(1 + s_b)(1 - s_d)^{i+k}$. Therefore, an individual of type i produces a total number of 0, 1 or 2 descendants, each of them being of a type greater than or equal to i. We refer to the Supporting Information (SI) for a more detailed description of the model.
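The reproduction step described above can be sketched in a few lines of Python. This is a minimal illustration and not the simulation code used for the results reported here: the function names, the Knuth-style Poisson sampler and the stopping rule (`max_size`) are assumptions introduced for the example, and the normalization of the selection step (survival with probability (1 + sb)(1 − sd)^(i+k)/2) is an assumed constant, chosen so that an unloaded mutant leaves 1 + sb offspring on average in the absence of deleterious mutation.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def next_generation(counts, U, sb, sd, rng):
    """One time-step; counts maps type i -> number of individuals of type i."""
    new_counts = {}
    for i, n in counts.items():
        for _ in range(2 * n):  # two descendants per individual
            j = i + poisson(U, rng)  # k new deleterious mutations
            # Assumed normalization: survival probability (1+sb)(1-sd)^j / 2.
            if rng.random() < 0.5 * (1.0 + sb) * (1.0 - sd) ** j:
                new_counts[j] = new_counts.get(j, 0) + 1
    return new_counts


def lineage_fate(U, sb, sd, rng, max_gen=1000, max_size=10_000):
    """Follow one beneficial lineage founded by a single unloaded individual."""
    counts = {0: 1}
    for _ in range(max_gen):
        counts = next_generation(counts, U, sb, sd, rng)
        size = sum(counts.values())
        if size == 0:
            return "extinct"
        if size >= max_size:
            return "established"
    return "undecided"
```

Repeating `lineage_fate` over many independent replicates and counting the fraction that are not "extinct" gives a rough Monte Carlo estimate of survival probability; treating a lineage that reaches `max_size` as established is a convenience of the sketch, not part of the model.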
We consider a sub-population carrying a beneficial mutation (i.e., a single beneficial lineage) arising in a large wild-type population. In order to study the lineage contamination effect in isolation, we assume that both populations initially do not carry any deleterious mutations. For this purpose we consider two independent branching processes: $(X_t)$, describing the evolution of a wild-type population of initial size N, hence with sb = 0 and initial state $X_0 = (N, 0, 0, \ldots)$, and $(X^b_t)$, describing the evolution of a single beneficial lineage, with sb > 0 and initial state $X^b_0 = (1, 0, 0, \ldots)$.
Mean demographic dynamics of each sub-population
The mean wild-type population size (all types combined) at time t is given by $N e^{-Ut} e^{U(1-(1-s_d)^t)(1-s_d)/s_d}$ (S4). A beneficial mutation occurring within the wild-type population founds a beneficial lineage whose mean size is given by $(1 + s_b)^t e^{-Ut} e^{U(1-(1-s_d)^t)(1-s_d)/s_d}$ (S5). Note that as time tends to infinity this quantity tends to +∞ if U < ln (1 + sb), to $e^{U(1-s_d)/s_d}$ if U = ln (1 + sb), and to 0 if U > ln (1 + sb). The latter convergence will typically not be monotonic (Figure S1).
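The three limiting behaviours can be checked numerically with a short sketch of the mean-size expression. The function names and the `n0` argument are introduced here for illustration only; the formula is the one quoted above, evaluated in log space so that (1 + sb)^t does not overflow at large t.

```python
import math


def mean_lineage_size(t, U, sb, sd, n0=1.0):
    """Mean size at generation t of a lineage founded by n0 unloaded individuals.

    Use sb = 0 and n0 = N for the wild-type population.
    """
    load_term = U * (1.0 - (1.0 - sd) ** t) * (1.0 - sd) / sd
    return n0 * math.exp(t * math.log1p(sb) - U * t + load_term)


def critical_rate(sb):
    """Mutation rate ln(1 + sb) at which the mean size neither diverges nor vanishes."""
    return math.log1p(sb)
```

For instance, evaluating `mean_lineage_size` over a range of t for a value of U slightly above `critical_rate(sb)` shows the non-monotonic approach to zero mentioned in the text: the mean first rises while the load term builds up and only later declines.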
Extinction and survival probabilities
The previous result concerning the evolution of the mean beneficial lineage population size can be refined by looking at the extinction probability of the beneficial lineage. By this we mean the probability that the process does become extinct, i.e. . We show (Proposition 1; see SI) that although the number of types is infinite, this probability also equals . The beneficial lineage almost surely becomes extinct if and only if U ⩾ Uc, where the critical deleterious mutation rate is given by $U_c = \ln(1 + s_b)$.
Of course, this implies that if U < Uc then $p^b_{ext} < 1$, i.e. the beneficial lineage can survive with positive probability. We find that survival probability is bounded by: $f_l \leq p_{svl} \leq f_u$, where
Lower bound fl is achieved when sd > sb, and the upper bound fu is achieved when sd → 0. Exact computation of psvl is achieved numerically using algorithm (S11) derived in the SI. Figure 1 plots examples of such computations (thin intermediate curves) as well as limiting cases fl and fu (thick curves) as a function of the deleterious mutation rate U and selective advantage sb, respectively.
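A numerical computation of this kind can be sketched as follows. This is an illustrative recursion written for this passage, not algorithm (S11) from the SI, and it rests on two stated assumptions: each descendant of type j survives with probability (1 + sb)(1 − sd)^j/2 (an assumed normalization), and classes for which (1 + sb)(1 − sd)^j ⩽ 1 go extinct almost surely, which follows from applying the critical-rate condition to a lineage founded in class j. Because types can only increase, the extinction probabilities can be filled in from the most-loaded class downwards, taking at each step the smallest root of q = (a + bq)^2.

```python
import math


def survival_probability(U, sb, sd):
    """Survival probability of a lineage founded by one unloaded mutant.

    q[i] is the extinction probability of a lineage started from a single
    individual carrying i deleterious mutations.
    """
    # Classes with (1 + sb)(1 - sd)^j <= 1 are doomed for any U >= 0.
    J = max(1, math.ceil(math.log1p(sb) / -math.log1p(-sd)))
    w = [0.5 * (1.0 + sb) * (1.0 - sd) ** j for j in range(J)]  # assumed /2
    pmf = [math.exp(-U)]
    for k in range(1, J):
        pmf.append(pmf[-1] * U / k)  # Poisson(U) probabilities

    q = [1.0] * (J + 1)
    for i in range(J - 1, -1, -1):
        # Each descendant fails to found a surviving line with prob. a + b*q[i].
        b = pmf[0] * w[i]
        S = sum(pmf[k] * w[i + k] * (1.0 - q[i + k]) for k in range(1, J - i))
        a = 1.0 - S - b
        disc = max(0.0, 1.0 - 4.0 * a * b)
        # Smallest root of q = (a + b q)^2, in a form that is stable as b -> 0.
        q[i] = min(1.0, 2.0 * a * a / ((1.0 - 2.0 * a * b) + math.sqrt(disc)))
    return 1.0 - q[0]
```

With U = 0 the recursion reduces to the single-type binary-fission result 4sb/(1 + sb)^2, and for U ⩾ ln(1 + sb) it returns zero up to rounding, in agreement with the critical mutation rate above.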
Fixation probabilities
In this branching process model the fixation probability of the beneficial lineage is the probability pfix that at some point the whole population carries the beneficial mutation. Note that because we take into account stochastic variation in population size, the beneficial mutation might not be permanently established even after fixation, because the population might eventually become extinct afterward. The fixation probability corresponds here exactly to the probability that the wild-type population dies out before the single beneficial lineage does: $p_{fix} = P(T_{ext} < T^b_{ext})$, where $T_{ext}$ (alternatively, $T^b_{ext}$) is the extinction time of the wild-type population (alternatively, beneficial lineage). From what precedes we know that $T_{ext}$ is almost surely finite, whereas $T^b_{ext}$ is almost surely finite if and only if U ⩾ Uc. Note also that because of the strict inclusion of the probability events $\{T^b_{ext} = \infty\} \subset \{T_{ext} < T^b_{ext}\}$, we know that $p_{fix} > p_{svl}$. This implies in particular that in this model the fixation probability is never zero. Although we cannot provide a closed-form expression for pfix, this probability can be well approximated numerically by Equations (S12) - (S13). We illustrate this result in Figure S5 where we plot pfix as a function of the deleterious mutation rate.
Fitness dynamics of a beneficial lineage within a population
The fitness of the beneficial lineage at time t is given by
Because of the potential extinction of the population, the random variable is only defined for $t < T^b_{ext}$. We similarly define the fitness of the whole population , and focus our study on the dynamics of the relative fitness . Because we assume in our model that the wild-type population is initially large, we approximate the relative fitness by its almost certain limit as N tends to infinity (S14), namely
We also prove that, as time tends to infinity, the mean value of this relative fitness tends to (1 + sb) (1 − pext) (S15)-(S16). A plot of this long-term limit is given in Figure S6. In order to have a more accurate description of the evolution over time of the relative fitness, we provide in addition an upper and lower bound (S17)-(S18) of its mean value for each t, as illustrated in Figure 2.
Mutational meltdown of a beneficial lineage
Our goal here is to study the synergy between the loss of the least-loaded classes and the potentially decreasing size of the beneficial lineage. For technical reasons detailed in the SI, we consider in this section the continuous-time analog of the branching processes studied previously. Assuming that at time t the least-loaded class in the beneficial lineage’s population is of type i, the process at this time is of the form . Conditionally on , we define the extinction time of the least-loaded class as . The mutational meltdown effect is then fully described by the sequence of random variables T0, T1,…
Note that T1 strongly depends on the random value taken by the process at the beginning of the time interval [T0, T0 + T1]. Note also that assuming , the strong Markov property enables study of the process on the latter interval to be reduced to its study on [0, T1], conditionally on . We thus provide in Proposition 2 (SI) an explicit computation of the cumulative distribution function of the time to extinction of the least-loaded class of type i, for any i and any initial condition . From this we deduce its mean value . Again, three different regimes appear depending on whether U < Uc, U = Uc or U > Uc. We illustrate this result in Figures S2 and S3, where we plot the cumulative distribution function and mean value of the extinction time T0 of the first least-loaded class, with .
Finally, in order to study not only the behavior of each extinction time separately but to take into account the stochastic evolution of the process , we compute the sequence of the mean extinction times , where the deterministic sequence is chosen to reflect as accurately as possible the mean evolution of . We naturally choose n0 = (1,0,0,…), and then define n1 as the mean value of the process at the end of the first time interval [0, T0]. Because this mean value might not be integer-valued, we round each of its coordinates to the closest integer. Hence we set , and iteratively define in a similar manner n2, n3,…. As proved in Proposition 2 (SI), we can explicitly compute each , which combined with the previously mentioned computation of for any initial condition ni, enables us to obtain the desired sequence . Figure 3 illustrates this result and provides a visualization of the mutational meltdown effect in a single beneficial lineage for different values of U, sb and sd.
Beneficial lineages in an evolving population
Until now, we have examined the process of lineage contamination in isolation; that is, the accumulation of deleterious mutations occurring after the production of a beneficial mutation. In addition, we have assumed that we know the selective advantage of the focal mutation.
In real populations, however: 1) deleterious mutations can occur both after and before the appearance of a beneficial mutation, and 2) we generally will not know the selective advantages of beneficial mutations. Deleterious mutations that appear before create a deleterious background upon which the beneficial mutation arises; selection against this deleterious background is background selection (Charlesworth et al. 1993; Stephan 2010). Here, we model the growth and fate of beneficial mutations of varying selective advantages arising in a population already contaminated with deleterious mutations.
Angled-bracket notation in this section indicates average over all possible trajectories, or “states”, (ensemble average) of a beneficial lineage emerging in an otherwise heterogeneous (evolving) population. (The absence of angled-brackets indicates that the focal beneficial lineage arises in an otherwise homogeneous population, as in the previous section.)
Incorporating background selection
If a beneficial mutation is produced on a background carrying j deleterious mutations, the initial growth rate of the resulting lineage is: where Sb is a random variable denoting the selective advantage of the beneficial mutation, with mean . Essentially, to incorporate background selection, we simply replace 1 + sb in the previous section with Wj.
Of course, we do not know beforehand how many deleterious mutations will be present in the background upon which a beneficial mutation arises. But we do have accurate predictions for both the average number of deleterious mutations in the population, as well as the probability that a beneficial mutation will arise on a background with a given number of deleterious mutations.
When mutation rates are low and population sizes are large, classical theory (Haigh 1978a; Johnson 1999b) predicts that individuals in a population will acquire a Poisson-distributed number of deleterious mutations with parameter θ = U/sd. For our purposes, the assumptions of low mutation rate and large population size may be too restrictive, as we wish to explore effects of high mutation rates in finite populations. For exact computation of results, therefore, we will rely on the more encompassing results derived by Gessler (1995) that relax these assumptions, giving the probability of a background having j mutations (re-derived in SI): where λ = θ − k, and k and β are integers defined by , and . To derive approximate analytical expressions, where defensible, we nevertheless resort to the straight Poisson distribution from classical theory, the rationale being that tail probabilities lower than 1/N will have negligible effects on the quantities being derived.
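The classical approximation and the 1/N rationale can be illustrated with a short sketch. It implements only the straight Poisson distribution with parameter θ = U/sd, not the distribution of Gessler (1995); the function names and the truncation point of the sum are assumptions made for the example, and the recurrence underflows for very large θ (roughly θ > 700).

```python
import math


def background_distribution(U, sd):
    """Poisson(U / sd) probabilities that a background carries j = 0, 1, ... mutations."""
    theta = U / sd
    j_max = int(theta + 10.0 * math.sqrt(theta) + 20.0)  # far into the upper tail
    probs = [math.exp(-theta)]
    for j in range(1, j_max + 1):
        probs.append(probs[-1] * theta / j)
    return probs


def classes_present(U, sd, N):
    """Mutation classes whose probability is at least 1/N."""
    return [j for j, p in enumerate(background_distribution(U, sd)) if p >= 1.0 / N]
```

For U = 0.1, sd = 0.02 and N = 1000, for example, only classes 0 through 13 have probability of at least 1/N; classes further out in the tail are expected to hold less than one individual, which is the sense in which they are neglected in the approximate expressions.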
Survival probability
The ensemble-averaged probability of survival is bounded by: where K = 4 for binary fission, and K ≈ 2 for Poisson-distributed offspring; Wl = (1 − sd)Jl (1 + Sb) and and is also a random variable and, for each value of and , and Wj is defined by Equation (3). Figure 4 plots exact calculations of psvl and of 〈psvl〉 by fixing Sb = sb. Monte Carlo integration of (5), in which Sb was drawn from an exponential distribution with mean , gives probabilities that are indistinguishable from the approximations we now derive.
Approximate survival probability
This approximation is suggested by the observation (Figure 4) that fixation probability increases sharply at the critical selective advantage, , above which fixation probability becomes positive. We thus explored the possibility that fixation probability might be approximated as simply the probability that times the probability that the beneficial mutation survives. If the beneficial mutation in question arises on a background with j deleterious mutations, then , where c0 = 1 for and c0 = eU for , and . Taking the logarithm of this probability, multiplying by the corresponding Poisson probabilities and summing over j, we employ Jensen’s inequality to derive expressions providing exact minimums on both upper and lower bounds for the ensemble survival probability: where . This expression is a bound-of-bounds and thus of questionable utility. Comparison with simulations reveals the upper bound (for ) to be quite accurate but the lower bound (for ) to be overly conservative.
Employing a different approach that does not rely on Jensen’s inequality (SI), we find that, to a very good approximation:
From this, it is apparent that the smallest value of 〈psvl〉 is achieved when , so that survival probability is bounded as: where $\varepsilon = 2 - e^{-1} \approx 1.63$. Remarkably, the foregoing bounds on survival probability are independent of sd. Comparison with simulations (Figure 5) reveals that the bounds given by Equations (7) and (8) are very accurate. From (7), we can see that the upper bound in (8) is approximated under the wider range of circumstances, because: 1) when is small the upper bound is approximated, and 2) when the lower bound obtains, but as becomes increasingly larger than , 〈psvl〉 moves away from the lower bound and back towards the upper bound. There is nevertheless a restricted range of values for – namely when is close to – over which the lower bound obtains.
Selective advantages of surviving beneficial mutations
Because of lineage contamination, beneficial mutations of small effect will have a very small or zero chance of survival; beneficial mutations that do survive, therefore, will tend to be of larger selective advantage. Following logic similar to that of the previous subsection, we derive the ensemble survival probability for a beneficial mutation of given selective advantage sb. Given this survival probability, expected selective advantages of surviving beneficial mutations, as well as approximate bounds, are plotted in Figure 6 as a function of deleterious mutation rate.
Mutation rate that maximizes production of surviving beneficial mutations
The recruitment rate of beneficial mutations increases with genomic mutation rate, but because of lineage contamination, the survival probability of beneficial mutations decreases with genomic mutation rate. Therefore, there must exist a genomic mutation rate that maximizes the rate of production of surviving beneficial mutations. Setting $\partial_U (U \langle p_{svl} \rangle) = 0$ and solving for U, we find this maximum production rate occurs at mutation rate $\hat{U}$, bounded as:
Figure 7 compares the foregoing predictions to simulation results and shows them to be quite accurate. The smallest value of $\hat{U}$ is achieved when , resulting in the bounds: again displaying a curious independence of sd. For reasoning similar to that given after Equation (8), the upper bound on $\hat{U}$ is approximated under the wider range of circumstances.
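The location of this maximum can be illustrated with a short worked derivation. It is a sketch under an assumed functional form, not a reproduction of Equations (9) and (10): we suppose that the ensemble survival probability decays exponentially in U, $\langle p_{svl} \rangle \approx A\, e^{-\kappa U/\bar{s}_b}$, where the prefactor A is taken to be independent of U, $\bar{s}_b$ denotes the mean selective advantage of beneficial mutations, and κ ∈ (1, ε] as stated in the abstract.

```latex
\begin{align*}
\frac{\partial}{\partial U}\Bigl[ U A\, e^{-\kappa U/\bar{s}_b} \Bigr]
  &= A\, e^{-\kappa U/\bar{s}_b}\Bigl(1 - \frac{\kappa U}{\bar{s}_b}\Bigr) = 0
  \quad\Longrightarrow\quad \hat{U} = \frac{\bar{s}_b}{\kappa}, \\
1 < \kappa \le \varepsilon
  &\quad\Longrightarrow\quad \frac{\bar{s}_b}{\varepsilon} \le \hat{U} < \bar{s}_b .
\end{align*}
```

Under this assumed form the selection coefficient sd of deleterious mutations does not enter at all, which is one way to see why the bounds on the optimal mutation rate can be independent of sd.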
Effects of excessive mutation
The lineage contamination effect we describe will increase with increasing deleterious mutation rate. When the mutation rate is high enough, this effect can cause the within-population mutational meltdown of many newly-arising beneficial lineages causing a reduction in competition and clonal interference. At even higher mutation rates, this effect can suppress most or all newly-arising beneficial lineages, resulting in the partial or complete cessation of adaptive evolution.
Clonal interference threshold
Evolutionary dynamics may be naturally partitioned into different regimes, depending on the recruitment rate of beneficial mutations. At very low rates of recruitment of beneficial mutations, adaptive evolution proceeds through isolated selective sweeps – a regime that has been dubbed the “periodic selection” regime (Sniegowski and Gerrish 2010). As the recruitment rate of beneficial mutations increases, a point is reached at which two or more alternative beneficial mutations may coexist and compete for fixation (the “clonal interference” regime; Gerrish and Lenski (1998)). As recruitment rate of beneficials continues to increase, it may become likely that competition occurs not among single beneficial mutations but among genotypes carrying multiple beneficial mutations (the “multiple-mutations clonal interference” regime; Desai and Fisher (2007); Desai et al. (2007)).
What much of this previous work failed to account for (c.f. Orr (2000); Bachtrog and Gordo (2004b)) was the fact that, as beneficial recruitment rate increases via an increase in overall genomic mutation rate, the rate of deleterious mutation increases in parallel. The findings we have presented so far suggest an intriguing implication of this parallel increase: whereas beneficial recruitment rate increases linearly with genomic mutation rate, survival probability of beneficial mutations decreases exponentially with genomic mutation rate. This fact suggests that, at high genomic mutation rates, the effects of lineage contamination can overwhelm the increased production of beneficial mutations, such that the effective recruitment rate of beneficials (i.e., the rate of production of surviving beneficial mutations) can decrease as mutation rate increases further. As mutation rate increases, therefore, adaptive evolution may eventually revert to a regime in which it proceeds only through isolated selective sweeps; put differently, the population may revert from one of the clonal interference regimes back to the periodic selection regime at high mutation rates.
As delineated in Sniegowski and Gerrish (2010), the clonal interference regime is entered when a second, alternative beneficial mutation is likely to be produced on the ancestral background before the first, or focal, beneficial mutation becomes fixed. Mathematically, this transition occurs when Nμ〈psvl〉 ln(Nsb/2) / sb = 1 (SI), where μ = cU is the beneficial mutation rate and c = μ/U is thus the ratio of numbers of potential beneficial to deleterious mutations. In previous work, the transition considered was that which occurs as very low mutation rates increase, and 〈psvl〉 was taken to be some function ( or some variant thereof) that was independent of U. Here, we have shown that, at high mutation rates, 〈psvl〉 can depend strongly on U. We define the “clonal interference threshold” to be the critical mutation rate above which adaptive evolution reverts from a clonal interference regime back to the periodic selection regime. This threshold is defined as: where 〈psvl〉 is defined by (8), c is the ratio defined above, and ŝb is the expected selective advantage of beneficial mutations that survive (SI).
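The transition criterion quoted above, Nμ〈psvl〉 ln(Nsb/2)/sb = 1 with μ = cU, can be explored numerically with a small sketch. The function names and the default parameter values are assumptions chosen for illustration, and `example_psvl` is a hypothetical stand-in for the ensemble survival probability (a Haldane-type term K sb multiplied by an exponential decay in U); it is not the expression defined by Equation (8).

```python
import math


def interference_index(U, N, c, sb, psvl):
    """Left-hand side N * mu * <psvl> * ln(N sb / 2) / sb, with mu = c * U.

    psvl is a caller-supplied function U -> ensemble survival probability.
    Values above 1 indicate the clonal interference regime.
    """
    mu = c * U
    return N * mu * psvl(U) * math.log(N * sb / 2.0) / sb


def example_psvl(U, sb=0.05, K=4.0, kappa=1.0):
    """Hypothetical stand-in: K * sb * exp(-kappa * U / sb)."""
    return K * sb * math.exp(-kappa * U / sb)


def regime(U, N=1e6, c=1e-3, sb=0.05):
    """Classify a mutation rate using the index and the stand-in psvl."""
    index = interference_index(U, N, c, sb, example_psvl)
    return "clonal interference" if index > 1.0 else "periodic selection"
```

With these illustrative values, `regime` returns "periodic selection" at U = 1e-6, "clonal interference" at U = 0.05, and "periodic selection" again at U = 1.0, reproducing qualitatively the reversion described in the text; the upper crossing of 1 plays the role of the clonal interference threshold.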
Fixation threshold
The critical selective advantage below which a beneficial mutation does not survive increases approximately linearly with mutation rate when lineage contamination is considered in isolation, and faster than linearly when background selection is also accounted for. In contrast, the fittest mutation produced by a population has a selective advantage that increases approximately linearly with the log of the mutation rate. This necessarily implies that, as mutation rate increases, eventually a point will be reached at which even the selective advantage of the fittest beneficial mutation will not be sufficient to overcome the effects of lineage contamination. This point defines the “fixation threshold”, and its existence follows from the fact that the critical selective advantage required and the maximum selective advantage produced by a population have qualitatively different relationships with mutation rate.
The fixation threshold is exceeded when no beneficial mutation produced by a population has a selective advantage strong enough to survive the effects of lineage contamination. Concretely, in a given interval of time τ, we suppose a population produces a total of n beneficial mutations; then, the fixation threshold is defined as the mutation rate that ensures extinction of even the fittest of these mutations. This critical mutation rate, which defines the fixation threshold and which we will denote by Uf, thus ensures the extinction of all n beneficial mutations produced with specified probability pc; it is given by: where Û is given by (9) and bounded by (10), n(U) = NUcKŝbτ, and τ denotes the relevant time period; for example, to compute the mutation rate at which, over a time period of 5000 generations, all fixations will be suppressed with probability 95%, we set τ = 5000 and pc = 0.95.
Discussion
Summary
Evolutionary interactions between linked deleterious and beneficial mutations have received increasing attention in recent years. It is now well accepted, for example, that background selection caused by the continual rain of deleterious mutations into regions of low recombination decreases the fixation probability of beneficial mutations (Charlesworth et al. 1993; Peck 1994) and decreases nucleotide diversity (Stephan 2010; Kim and Stephan 2000; Birky 1988; Keightley and Otto 2006); moreover, theoretical and empirical studies have shown that selective sweeps of beneficial mutations can cause the fixation of linked deleterious mutations (McDonald et al. 2011; Bachtrog and Gordo 2004b; Good and Desai 2014; Hartfield et al. 2010; C W Birky and Walsh 1988). To date, work in this area has been focused on populations with relatively low genomic mutation rates. In such populations, the key consideration in analyzing the interaction between beneficial and deleterious mutations is the number of deleterious mutations already present in the linked genomic background on which a new beneficial mutation arises. In the current paper, we have focused, in contrast, on populations in which genomic mutation rates may be very high: we have examined the possibility that the genomic background on which a beneficial mutation arises can become progressively contaminated with newly arising deleterious mutations even as the beneficial mutation spreads into the larger population. Our work is motivated in part by numerous studies indicating that adapting asexual populations tend to evolve high mutation rates through genetic hitchhiking (Sniegowski et al. 1997, 2000; Johnson 1999a; Elena and Sanjuán 2005; Gentile et al. 2011; Söderberg and Berg 2011; M’Gonigle et al. 2009; Raynes et al. 2011) and by the substantial literature that has been devoted to the question of when the genomic mutation rate will be sufficiently high to cause population extinction (Gerrish et al. 2007, 2013; Gerrish and Sniegowski 2012; Bull and Wilke 2008; Bull et al. 2007; Springman et al. 2009; Biebricher and Eigen 2005; Eigen 2002, 2000, 1971; Eigen and Schuster 1977).
Multiple beneficial mutations
Our multitype branching process model assumes that beneficial mutations occur infrequently enough that acquiring a second beneficial mutation in linkage with the focal beneficial mutation is improbable in the time required for the focal mutation to either survive or go extinct. In reality, it might be the case that multiple beneficial mutations arise on the same background and sweep to fixation, collectively overcoming the lineage contamination effect.
To assess the strength of our assumption, we studied the effects of allowing additional beneficial mutations to arise at different rates within the lineage founded by the focal beneficial mutation. To this end we varied the parameter c, introduced above and defined as: c = μ/U, or the ratio of numbers of potential beneficial to deleterious mutations. We assessed the effects of doing so in both an extension of our analytical model and in simulations.
To assess the effects of additional within-lineage beneficial mutations on lineage contamination in isolation, we extended our multitype branching process model so as to allow a beneficial lineage to acquire a second beneficial mutation with the same selective advantage sb as the first. This additional beneficial mutation could be acquired during reproduction with probability 1 – e^(–cU). Figure S7 compares, for different values of c, survival probabilities of a single beneficial lineage in an otherwise homogeneous population (lineage contamination only) as a function of the deleterious mutation rate. It is apparent from this figure that, for reasonable values of c, there is minimal quantitative difference in survival probabilities and only a slight increase in the apparent threshold. This indicates that our assumption of no additional within-lineage beneficial mutations is a weak assumption. Mathematically, there is a qualitative difference in that, for c > 0, the critical deleterious mutation rate above which a beneficial lineage becomes extinct almost surely (the “hard” threshold) is twice what it is without the additional beneficial mutation (i.e., 2 ln(1 + sb)). Practically, this is of little consequence, however, because the survival probabilities are typically minuscule for mutation rates in the region between the hard threshold for which no additional beneficial mutation is allowed (ln(1 + sb)) and the hard threshold for which one additional beneficial mutation is allowed (2 ln(1 + sb)). And, while not shown here, survival probabilities become even smaller in regions between higher thresholds that allow more beneficial mutations.
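The hard thresholds discussed above can be illustrated numerically. The following is a minimal sketch, not the model as implemented for Figure S7: it assumes Poisson-distributed offspring numbers with mean equal to fitness, a Poisson(U) number of new deleterious mutations per offspring, and multiplicative fitness (1 + sb)(1 – sd)^i for a lineage member carrying i deleterious mutations. The function names and the numerical scheme (fixed-point iteration on the extinction probabilities) are illustrative choices, and no additional beneficial mutations are allowed (c = 0).

```python
import math


def hard_threshold(sb, n_beneficial=1):
    """Rate above which extinction is almost sure: ln(1+sb), or 2 ln(1+sb) with one extra mutation."""
    return n_beneficial * math.log1p(sb)


def survival_probability(sb, sd, U, tol=1e-13, max_iter=200_000):
    """Survival probability of a lineage founded by a single beneficial mutant.

    Assumed (hypothetical) model: Poisson offspring with mean equal to fitness,
    Poisson(U) new deleterious mutations per offspring, and fitness
    (1 + sb) * (1 - sd)**i for class i. Lineage contamination only.
    """
    if not 0.0 < sd < 1.0:
        raise ValueError("sd must lie in (0, 1)")
    # Classes with fitness <= 1 are not supercritical and only produce
    # descendants of equal or lower fitness, so they go extinct with probability 1.
    D = 0
    while (1.0 + sb) * (1.0 - sd) ** D > 1.0:
        D += 1
    # Poisson(U) probabilities of k new mutations, built recursively.
    pmf = [math.exp(-U)]
    for k in range(1, D + 1):
        pmf.append(pmf[-1] * U / k)
    q = [1.0] * (D + 1)  # q[i]: extinction probability starting from class i
    for i in range(D - 1, -1, -1):
        w = (1.0 + sb) * (1.0 - sd) ** i
        kmax = D - 1 - i
        rest = sum(pmf[k] * q[i + k] for k in range(1, kmax + 1))
        tail = 1.0 - sum(pmf[: kmax + 1])  # offspring landing in doomed classes
        x = 0.0  # iterating upward from 0 reaches the smallest fixed point
        for _ in range(max_iter):
            x_new = math.exp(-w * (1.0 - (pmf[0] * x + rest + tail)))
            if abs(x_new - x) < tol:
                break
            x = x_new
        q[i] = x_new
    return 1.0 - q[0]
```

Under these assumptions, `survival_probability(0.1, 0.03, 0.0)` recovers a value close to Haldane's 2sb, while any U above `hard_threshold(0.1)` gives a survival probability that is numerically zero, whatever value of sd is chosen.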
To assess the effects of additional within-lineage beneficial mutations arising in evolving populations, where both background selection and lineage contamination are operating, we performed simulations in which there was technically no limit on the number of additional beneficial mutations. Figure 5 plots survival probabilities computed from simulations, for the cases c = 0, 0.001, and 0.01, and would seem to indicate, again, that our original assumption of no additional within-lineage beneficial mutation is a very weak assumption. Survival probabilities such as those plotted in Fig. 5 were computed for a range of different parameters and in all cases, for what we considered to be reasonable beneficial-to-deleterious ratios (c ≤ 0.01), survival probabilities were essentially unaffected by the incorporation of additional beneficial mutations.
Independence of our results from the selective effects of deleterious mutations
The critical mutation rate above which lineage contamination in isolation ensures extinction of a beneficial mutation, derived in the first section (1), depends only on the selective advantage of the focal beneficial mutation; it does not depend on the selective disadvantages of deleterious mutations. In the next section, when we incorporate background selection, most of the solution bounds we derive are also independent of the selective disadvantages of deleterious mutations. These results stand in contrast to some previously published results that have focused primarily on the effects of background selection. Particularly striking is the contrast between our results, which show a surprising lack of dependence on the selective effects of deleterious mutations, and the results of Orr (2000), which instead show a surprising lack of dependence on the selective effects of beneficial mutations. For example, we find that the mutation rate that maximizes the production of surviving beneficial mutations is set by the selective advantage of beneficial mutations, whereas Orr finds that it is set by the selective disadvantage of deleterious mutations. A thorough exploration of the relevant parameter space and assessment of Orr’s result is found in Johnson and Barton (2002b). The single factor that accounts for the qualitative discrepancy between our results and Orr’s is lineage contamination: when only background selection is accounted for, this optimal mutation rate is governed by the effects of deleterious mutations, yet when lineage contamination is also accounted for, it is governed by the effects of beneficial mutations. These “opposite” results shine a light on the impact of lineage contamination generally.
Lineage contamination in nature
The effects of lineage contamination only become significant under linkage and relatively high mutation rates. While we have focused exclusively on the case of asexuality, lineage contamination should also operate in organisms that undergo some form of genetic exchange: the fitness of the linkage region carrying a newly-arising beneficial mutation will be eroded at a faster rate than that of the same linkage region in the rest of the population, on principles similar to those studied here. The requirement of high mutation rates would seem to restrict the relevance of our findings to organisms like RNA viruses, although the evolution of high mutation rates has been predicted (Gerrish et al. 2007) and increasingly reported in natural (Matic et al. 1997) and laboratory (Sniegowski et al. 1997; Shaver et al. 2002; Denver et al. 2009; Wichman 2005; Pal et al. 2007; Gentile et al. 2011; Chao and Cox 2008; Cox and Gibson 1974) populations of RNA and clonal DNA organisms and in somatic (esp. cancerous) cells. Indeed, the process we have analyzed has been implicated, in conjunction with background selection, as a mechanism that can slow the evolution of tumors (McFarland et al. 2013; Solé 2004).
Lineage contamination, mutational meltdown, and lethal mutagenesis
As alluded to in the presentation of our branching process model, lineage contamination may be thought of as within-population mutational meltdown. If this meltdown is induced by treatment of a population with a mutagenic agent, then it may be thought of as within-population lethal mutagenesis. Put differently, our findings may be thought of as the population-genetic analogues of these processes. Indeed, one of the processes we model – Muller’s ratchet in a growing beneficial lineage – is similar to previous models of Muller’s ratchet in freely-growing populations (Fontanari et al. 2003; Bull et al. 2007; Bull and Wilke 2008). Our work differs from these previous studies, however, in that we model the fitness erosion of a growing lineage within the context of a larger population.
In a recent experiment, lethal mutagenesis failed to cause extinction in a laboratory population of the bacteriophage T7 (Springman et al. 2009) because the accumulation of deleterious mutations opened up new genetic pathways that could increase fitness, i.e., it increased the number of available beneficial mutations. The theory we present here may offer insight into what would be required to thwart the evolutionary rescue afforded by these newly-available beneficial mutations. In particular, our “fixation threshold” might offer an appropriate quantitative guideline for the mutation rate required.
Lineage contamination and the error threshold
There is an intriguing relationship between our findings and predictions of “error threshold” models (Eigen 2002, 1971; Eigen and Schuster 1977; Biebricher and Eigen 2005; Bagnoli and Bezzi 1998; Nowak and Schuster 1989; Bonhoeffer and Stadler 1993). Generally and somewhat loosely speaking, an error threshold is a critical mutation rate, Uet, above which all genotypes deterministically converge to the same equilibrium frequency, independent of their fitness (in the absence of mutational biases).
Single-peak model
The simplest model of the error threshold – the so-called “single peak” model – assumes that there is a single fittest genotype of fitness 1 + sb (the beneficial mutant) and all other (mutationally accessible) genotypes have fitness equal to one (Wiehe 1997; Tejero et al. 2011), i.e., the fitness landscape has a set of two possible fitness classes {1,1 + sb}. This fitness landscape is obviously unrealistic; its original conjecture may have been based on the fact that many other such “phase transition” phenomena are robust to severe model simplification. When the number of possible genotypes may be assumed to be infinite, the error threshold is Uet = ln(1 + sb) (Wiehe 1997). Curiously, while the assumed fitness landscapes are very different, this critical mutation rate is identical to the one we derive for lineage contamination in isolation (1).
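The single-peak threshold can be checked with a short deterministic calculation. This sketch rests on assumptions not stated in the text: an infinite number of genotypes (so no back mutation to the fittest genotype), discrete generations, and a probability e^(–U) that an offspring is copied without error. Under these assumptions the frequency x of the fittest genotype follows x' = x(1 + sb)e^(–U)/(1 + sb·x), whose equilibrium is positive only when U < ln(1 + sb). The function names are illustrative.

```python
import math


def master_equilibrium(sb, U):
    """Closed-form equilibrium frequency of the fittest genotype (0 above the threshold)."""
    return max(0.0, ((1.0 + sb) * math.exp(-U) - 1.0) / sb)


def iterate_master_frequency(sb, U, x0=0.5, generations=20_000):
    """Iterate x' = x (1 + sb) e^{-U} / (1 + sb x) for a fixed number of generations."""
    x = x0
    keep = (1.0 + sb) * math.exp(-U)  # per-generation growth of error-free copies
    for _ in range(generations):
        x = x * keep / (1.0 + sb * x)  # divide by mean fitness 1 + sb * x
    return x
```

For sb = 0.1 the threshold is ln(1.1) ≈ 0.0953: iterating with U = 0.05 settles at the closed-form equilibrium, while U = 0.12 drives the fittest genotype to zero frequency.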
Multiplicative model
The set of possible fitness classes on the “multiplicative” fitness landscape is {(1 + sb)(1 – sd)^i ∀ i ∈ [0, D]}, where D is the maximum number of deleterious mutations allowed. Curiously, for the case D < ∞ and sb = sd, the error threshold is Uet = sb (we note that for small sb, Uet ≈ ln(1 + sb), in agreement with the “single-peak” model), whereas when D = ∞, there is no error threshold: Uet = ∞ (Wiehe 1997). Oddly, our lineage contamination model corresponds most closely to the case D = ∞, for which there is no error threshold, but there is a lineage contamination threshold.
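As a small illustration of the quantities involved, the sketch below lists the fitness classes of the multiplicative landscape and measures how far the multiplicative threshold sb lies from the single-peak threshold ln(1 + sb). The function names are illustrative and the parameter values are arbitrary.

```python
import math


def fitness_classes(sb, sd, D):
    """Fitness of genotypes carrying i = 0, 1, ..., D deleterious mutations."""
    return [(1.0 + sb) * (1.0 - sd) ** i for i in range(D + 1)]


def threshold_gap(sb):
    """Difference between the multiplicative (sb) and single-peak (ln(1 + sb)) thresholds."""
    return sb - math.log1p(sb)


if __name__ == "__main__":
    print(fitness_classes(0.1, 0.1, 3))
    for sb in (0.001, 0.01, 0.1):
        print(sb, math.log1p(sb), threshold_gap(sb))
```

Because ln(1 + sb) = sb – sb²/2 + …, the gap is of order sb²/2, which is why the two thresholds agree for small sb.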
Concluding remarks
Wittingly or not, the presence of lineage contamination has been implicit in many previous models of mutation-induced fitness erosion. To our knowledge, however, it has not previously been modeled in isolation, as a process separate from background selection. Our theoretical framework partitions these two processes, and allows lineage contamination to be scrutinized separately from other processes. We find, for example, that newly-arising beneficial mutations can be driven extinct almost surely by lineage contamination whereas background selection alone cannot ensure their extinction.
Acknowledgements
We thank Thomas Bataillon, Guillaume Martin, Alan Perelson, Nick Hengartner, and Thomas Burr for helpful discussions. SP received financial support from the “Soutien à la recherche des jeunes maîtres de conferences” program at the Université Paris-Est Créteil. PG carried out much of this work in, and received financial support from, visiting faculty programs at Aarhus University, Denmark, and at University of Montpellier (Labex/CeMEB/ISEM), France. PS and PG received financial support from NASA grant NNA15BB04A.