ABSTRACT
The within-host evolutionary dynamics of TB remain unclear, and underlying biological characteristics render standard population genetic approaches based upon the Wright-Fisher model largely inappropriate. In addition, the compact genome combined with an absence of recombination is expected to result in strong purifying selection effects. Thus, it is imperative to establish a biologically-relevant evolutionary framework incorporating these factors in order to enable an accurate study of this important human pathogen. Further, such a model is critical for inferring fundamental evolutionary parameters related to patient treatment, including mutation rates and the severity of infection bottlenecks. We here implement such a model and infer the underlying evolutionary parameters governing within-patient evolutionary dynamics. Results demonstrate that the progeny skew associated with the clonal nature of TB severely reduces genetic diversity and that the neglect of this parameter in previous studies has led to significant mis-inference of mutation rates. As such, our results suggest an underlying de novo mutation rate that is considerably faster than previously inferred, and a progeny distribution differing significantly from Wright-Fisher assumptions. This inference largely reconciles the seemingly contradictory observations of both rapid drug-resistance evolution but extremely low levels of genetic variation in both resistant and non-resistant populations.
INTRODUCTION
Tuberculosis (TB) is a public health threat worldwide (WHO, 2018). Despite clear motivation for study, the observed within- and between-host evolutionary dynamics of Mycobacterium tuberculosis (M.TB) are not well understood, and results to date represent something of a paradox. On the one hand, drug resistance evolves rapidly (Fonseca et al. 2015; Eldholm et al. 2015); on the other, the genomic characteristics of M.TB do not appear conducive for such rapid adaptation, with inferred mutation rates being amongst the slowest of any human pathogen (Rocha et al. 2006; Ford et al. 2011; Ford et al. 2013; Colangeli et al. 2014; Payne et al. 2019; Menardo et al. 2019) and remarkably little genetic variation observed within or between hosts. Furthermore, purifying selection has been argued to play both a dominant as well as a weak role in shaping patterns of variation (Hershberg et al. 2008; Pepperell et al. 2013), and demographic estimates suggest a population history of TB that either matches or is uncorrelated with that of its human host (Comas et al. 2013; Bos et al. 2014; Brites et al. 2015; Eldholm et al. 2016).
To obtain a more robust understanding of TB evolutionary dynamics, it is essential to first appreciate that between-population observations are simply an aggregation of within-population processes. As such, studying the population genetics of within-patient data is critical to understanding the genetic differences observed between patients as well as their treatment outcomes. Fortunately, recent advances in sequencing technologies have allowed for more abundant and higher quality within-patient data. These published datasets have revealed a few common features of M.TB, including low-levels of genome-wide variation. For instance, Trauner et al. (2017) deep-sequenced twelve patients across four-time points and observed fewer than 50 polymorphic sites per patient genome-wide. In addition, the observed site frequency spectrum (SFS) is generally characterized by an abundance of rare variants (i.e., it is strongly left-skewed). These patterns have partly led to the suggestion that purifying selection effects may be wide-spread in the M.TB genome (Brown et al. 2016; Phelan et al. 2016; Mortimer et al. 2018).
Additional evolutionary factors likely contribute to these genomic patterns as well. For example, population bottlenecks may reduce genetic variation and alter the shape of the SFS (see review Thornton et al. 2007). Previous M.TB studies have investigated these effects separately in both the deep-time view of the population bottleneck and subsequent growth experienced by the host human population (Hershberg et al. 2008, Liu et al. 2018), as well as the shallow-time view of the population bottleneck and subsequent growth characterizing each novel transmission event and treatment (e.g., Trauner et al. 2017). Additionally, in fitting the left-skewed SFS, Pepperell et al. (2013) found that such a demographic history combined with a mix of both deleterious and neutrally-evolving sites produced the nearest fit to the observed SFS. Finally, given the lack of recombination in M.TB, related linkage effects (i.e., background selection (Charlesworth et al. 1993)) have similarly been discussed within these contexts (Pepperell et al. 2010; Copin et al. 2016).
While these studies have provided many important insights, there remains a relatively unexplored, though potentially highly significant, effect: clonality. Indeed, clonality and the related progeny distribution represents an important violation of commonly used evolutionary inference approaches based upon the Wright-Fisher (WF) model and the related Kingman coalescent (Eldon et al. 2006; Dos Vultos et al. 2008; Huillet et al. 2011; Lapierre et al. 2016). Specifically, progeny distributions under the WF model are Poisson distributed with a mean and variance of 1. Therefore, when an individual produces many offspring, far in excess of simple replacement in the next generation, the assumption that only two lineages coalesce at a time is violated, resulting in multiple-merger coalescent (MMC) events (see reviews of Tellier & Lemaire 2014; Irwin et al. 2016).
While perhaps abstract at first blush, this violation has very important implications for the study of sequence variation and diversity. Namely, as M.TB has been found to exhibit strong progeny skew owing to obligate clonal reproduction (Baker et al. 2004; Dos Vultos et al. 2008), the null model against which the above studies are comparing becomes incorrect. For example, under a multiple-merger model, the effective population size (Ne) no longer scales linearly with census size (N) as it does under the Kingman coalescent (Huillet et al. 2011). As a result, genetic diversity is a nonlinear function of the underlying population size - a result of interest given the strongly constrained and similar levels of variation observed across TB patients, regardless of infection time or resistance status. Similarly, under these progeny-skew models, the SFS is skewed towards an excess of low-frequency variants, generating a negative Tajima’s D even under equilibrium neutrality (Eldon et al. 2006; Birkner et al. 2013; Blath et al. 2016) – which appears of relevance to M.TB populations given the pervasively left-skewed SFS observed both within and between TB patients. Finally, the fixation probability of beneficial mutations under progeny skew may become much larger than under the WF model, owing to the increased probability of rapidly escaping stochastic loss (Der et al. 2011). This is fundamental to understanding the rapidly and independently evolving drug-resistance mutations in global TB populations - a result seemingly at odds with the previously inferred mutation rates (Sherman et al. 2011; Colangeli et al. 2014; Duchêne et al. 2016). In sum, the general theoretical expectations owing to progeny skew alone appear to qualitatively match empirical observations from M.TB; observations which, to date, have been attributed to alternate processes.
Recent progress has been made in utilizing these models to disentangle and even co-estimate patterns of demography, progeny skew, and selection. While there exist a variety of potential MMC models (see review of Tellier & Lemaire 2014), the so-called Ψ-coalescent has been a major focus of this literature given the straight-forward biological interpretation. Namely, the parameter Ψ represents the proportion of the next generation arising from a single parent (e.g., Ψ = 0.05 implies that one individual contributes offspring that comprise 5% of the next generation). In addition to which, recent experimental measures from viral populations are offering real-time insights in to such progeny distributions (Vahey and Fletcher 2019). Three results are of particular importance here. First, Eldon et al. (2015) demonstrated that population growth may be distinguished from multiple-merger coalescent events owing to progeny-skew, given differing expectations in the SFS. Second, Matuszewski et al. (2018) derived analytical expectations for the SFS under a multiple merger coalescent model with changing population size and further demonstrated that these parameters can indeed be accurately inferred jointly within a likelihood framework. Finally, building upon the two above results as well as the approximate Bayesian statistical framework developed by Foll et al. (2014, 2015), Sackman et al. (2019) recently extended these results and demonstrated an ability to co-estimate progeny skew, effective population size, as well as per-site selection coefficients from time-sampled polymorphism data.
Thus, a tremendous opportunity now exists to understand better the impact of mutation, genetic drift, and selection in dictating patterns of M.TB evolution and genomic variation under this more realistic coalescent model accounting for the underlying progeny distributions inherent to clonal reproduction. While the approximate Bayesian approach of Sackman et al. (2019) would appear ideal for this purpose, it is a time-sampled estimator reliant upon considerable levels of segregating variation in order to track changing allele frequencies (as is commonly observed in viral populations, for example). Thus, this approach is under-powered given the minimal levels of variation observed in M.TB. Further, as the underlying mutation rate itself is a question of great interest and importance in M.TB (Payne et al. 2019), it is desirable to additionally co-estimate this parameter rather than assume it to be known.
With this motivation, we developed a novel statistical approach utilizing the insights described above pertaining to infection dynamics and widespread purifying selection, while overlaying inference of underlying mutation rates, progeny distributions, and the severities of infection-related population bottlenecks. By re-analyzing published data within this framework, our results demonstrate the great importance of the multiple merger consideration and resolve many of the evolutionary paradoxes surrounding M.TB.
MATERIALS AND METHODS
Simulations
We conducted forward-in-time simulations using the SLIM version 3 software package (Haller & Messer 2019). M.TB populations were modeled using a genome size of 441,153 kb, equivalent to a tenth portion of the true genome size, for computational efficiency. As M.TB is a compact, highly-coding genome (Cole et al. 1998; Fleischmann et al. 2002), we assumed a distribution of fitness effects (DFE) characterized equally by deleterious (s = −0.01) and nearly neutral (s=-0.001) mutations. Mutation rate (μ) measured in vitro has been reported to be as slow as 2e-10 (Ford et al. 2011). In contrast, higher estimates ranging from 1e-9 to 9e-6 (Ford et al. 2013) have been proposed; therefore, our study considered the full extent of this range. Furthermore, it is important to note that previous experiments have measured only the neutral mutation rate, not the total mutation rate. In other words, the large input of strongly deleterious mutations - comprising a substantial component of the total mutation rate - has not been included in earlier estimates as these mutations are unlikely to be sampled as segregating variation or as fixed differences. However, as these mutations are important for shaping diversity via both purifying selection and background selection effects, and as our interest is in understanding the total rate at which all de novo mutations are input into the population, we considered the total rather than the neutral mutation rate. In order to infer this parameter within the context of an appropriate progeny-skew model, μ was drawn from a prior uniform distribution between 1e-9 and 9e-6 per site per generation.
Following an initial burn-in period of 10N generations, we considered a three-stage demographic model characterizing a single patient infection: moving forward in time, we describe 1) a neutral equilibrium population of size N, 2) an initial infection bottleneck leading to an instantaneous population reduction to size N2, and 3) a subsequent population size recovery to size N. In stage 1, we modeled a population of size N = 1,000. In order to quantify the effects of underlying assumptions pertaining to population size, additional simulations and inference were performed at N = 25,000. During stage 2, the severity of the population bottleneck (β) was sampled from ∼U[0.001, 0.1], where N2 = N*β - as the distribution of infection size in humans is unknown. However, it has been reported that in cattle TB (M. bovis) infection can be established by a single cell forming unit (Dean et al. 2005). During stage 2, the degree of progeny skew (or Ψ) was sampled from a prior distribution of ∼U[0, 0.2]. A value of 0 corresponds to the standard WF model. Progeny skew was simulated following the procedure of Sackman et al. 2019. In brief, one individual is chosen from the primary population A and founds a separate subpopulation B, the single generation unidirectional migration rate from B to A is set to Ψ, and the chosen individual thus contributes NΨ offspring to the following generation of A. A series of mate choice callbacks in SLiM force the migration rate to be exact rather than stochastic (see Supplementary Materials of Sackman et al. 2019). Subpopulation B is removed, the next generation begins, and a new individual is randomly sampled for the following generation. As such, each generation is a combination of N(1-Ψ) replacement events and a single sweepstakes event of magnitude NΨ. To emulate patient sampling at the onset of symptoms (approximately three months minimum (Behr et al. 2018)), we allowed stage 2 to run for 90 generations, assuming a generation time of 24 hours (Cole et al. 1998), and stage 3 to run for 910 generations before outputting genome alignments in ms format. Thus, the total generation time of our model was 11N.
Drawing from these prior distributions, 10,000 points (i.e., parameter combinations) were sampled. For each parameter combination, we conducted 1,000 replicates in order to characterize both the mean and variance. Summary statistics were calculated in the R package PopGenome version 2.6.1 (Pfeifer et al. 2014).
Data analysis and joint posterior estimates
For comparison to patient data, we examined the distribution and mean of segregating sites in samples published by Trauner et al. (2017). (https://zenodo.org/record/322377#.XO2CAy2ZNBw). A subset of 1,000 populations was simulated under the conditions described above. Given that patient data are subject to stringent filtering criteria, removing SNPs under a particular frequency cut-off, it was necessary to filter the simulated data to allow fair comparison. Thus, 100 genomes were sampled per each simulated population, and variants < 2% frequency were filtered out before the calculation of summary statistics (Table S1). Afterward, only genomes with ≥ 2 segregating sites were considered for further analyses, and the fraction of “invariable genomes” was noted for comparison to patient samples (Figure 2 and Table S1).
RESULTS & DISCUSSION
Considering levels of variation
We here report the first joint consideration of mutation rate, purifying selection, infection history, and progeny-skew in M.TB populations. A correlation of summary statistics (Figure 1) demonstrates that while mutation rate (μ) increases the number of segregating sites as expected, progeny skew (Ψ) acts to reduce variation. Furthermore, as the Ψ-parameter is of relevance every generation (i.e., every reproduction event), the impact of this previously unconsidered progeny skew on levels of variation is in fact much stronger than the single bottleneck event associated with infection. Considering the full distribution of sampled μ values (Figure S1), it is apparent that Ψ drastically reduces the average proportion of segregating sites genome-wide even for fast mutation rates, and that the probability of producing genomes devoid of any variation will naturally increase as μ decreases.
In order to consider a range of μ consistent with observed data - once pervasive purifying selection and progeny skew have been taken in to account - two examples of patient data collected by Trauner et al. (2017) representing low (20 segregating sites genome-wide) and high (50 segregating sites genome-wide) variation samples were plotted and compared to the simulated data. In order to compare simulated data with patient data, the same filtering steps must be applied. In this case, SNPs under 2% frequency were filtered in the empirical data, and thus, the simulated data were similarly filtered in order to be comparable (Figure 2).
As shown, fast mutation rates (μ on the order of 1e-6 and 1e-7) routinely produce expected numbers of segregating sites far above that observed in patients, regardless of other parameter values, while slow mutation rates (μ on the order of 1e-9) generally result in too little variation to match observation. This result is of interest as M.TB mutation rates are generally believed to be exceedingly slow (Sherman et al. 2011) - though importantly, this inference has largely neglected the important contribution of these additional evolutionary processes. Thus, once accounting for the diversity-reducing effects inherent to clonality, as well as the extent of purifying selection effects inherent to a compact, non-recombining genome, it is evident that the de novo mutation rate is likely faster than previously believed, with mutation rates on the order 1e-8 well matching the range of observed data (Figure 2). In addition, in order to consider the impact of underlying assumptions pertaining to population size, simulations were re-performed with a 25x larger population size. As shown in Figure S2, owing to these diversity reducing effects, mutations rates on the order of 1e-8 remain the best fit to observed levels of variation, with slower mutation rates still producing too little variation and too many invariable genomes to be consistent with patient data.
Considering distributions of variation
For the general range of μ identified above, the number of genome-wide segregating sites in simulated population data ranged from a minimum average of 2.1 to a maximum average of 78.7 SNPs (Table S1), depending on the combination of μ and Ψ drawn from the priors. Specifically, higher values of μ may be off-set by higher values of Ψ, resulting in a similar number of segregating sites for multiple parameter combinations. For example, for the patient sample containing 10 SNPs, simulation results demonstrate that μ = 8.13e-08 and a Ψ =0.06 would produce an average of 10.15 +/-4.64 SNPs, potentially suggesting a good fit to the data. However, μ = 3.1e-08 and a Ψ =0.02 can also generate similar results, yielding 10.40 +/-4.45 SNPs. More generally, this ridge in the posterior distributions (Figure 3A) between these two parameters suggests that they will be difficult to estimate independently if only levels of variation are used.
Thus, while comparisons with general levels of variation are useful for identifying a range as in the above section, more information is needed to parse values further. Importantly, previous theoretical results (Eldon & Wakeley 2016; Matuszewski et al. 2018) have well described the effect of Ψ on the observed distribution of genetic variation (i.e., SFS). To utilize this information, a general summary of the SFS, Tajima’s D (1989), was calculated on the filtered simulated data. As shown (Figure S3), the shape of the SFS, and thus the value of the D- statistic, is related to the value of Ψ. As the degree of progeny skew initially increases, D becomes increasingly negative, as previously described. However, as progeny distributions become highly skewed, levels of variation are sharply reduced, resulting in an apparent increase in D values (Figure S3). Increasing D values could also be a result of the underlying filtering criteria, as after filtering simulations to match real data with segregating sites > 2%, D values increased proportionally (Figure S4). However, Tajima’s D is consistently negative in all mutation ranges, even after filtering.
Parameter inference from patient samples
Thus, we next considered these results in light of published patient data. Recent publications have suggested that NGS technologies could facilitate personalized treatment in TB patients, allowing for improved outcomes (Copin et al. 2016; Cancino-Muñoz et al. 2019). To utilize such data, however, it is vital to understand the evolutionary dynamics shaping within-host M.TB diversity. As an illustrative example, we have re-examined the number of segregating sites in patient samples and estimated a mean ∼10 segregating sites per sample (Trauner et al. 2017) (Figure S5). Further, using the results and expectations obtained above regarding the level and distribution of variation, the patient data appear best fit by simulated populations with μ values ranging from 7.3e-9 to 3.8e-7, with the strongest posterior density at μ ∼ 6e-8 and Ψ ∼0.06 (Figure 3A). Notably, faster μ values could produce similar results, but only if Ψ proportions are in excess of 0.15 (Figure 3A).
In addition, owing to the strong per-generation reductions associated with progeny skew, there remains no signal in the data to estimate the severity of the infection bottleneck accurately (Figure 3B). Specifically, while there is an increase in density towards stronger bottleneck values, the posterior distribution is not notably distinct from the prior distribution ∼U[0.001, 0.1] (Figure 3B). Apart from the population size reduction associated with infection, these results have important implications for the ability to detect other parameters of clinical interest - namely, the presence of selective sweeps associated with beneficial mutations (e.g., potentially owing to drug-resistance). First, while there is strong statistical power to infer both μ and Ψ from patient data, there is little power to detect isolated events in the past (Figure 3B). This result, though unexpected under standard WF assumptions, is intuitive given the non-WF progeny distributions related to clonality. Namely, the diversity reduction associated with a single bottleneck event multiple generations in the past is not discernible from the per-generation diversity reduction related to clonal reproduction. Further, given that a selective sweep is, in fact, a type of population bottleneck (Barton 1998), this result also demonstrates that detecting selective sweeps associated with resistance mutations in this non-recombining organism, based on levels and patterns of genomic variation, will be exceedingly difficult. However, this observation well reconciles the fact that levels of variation do not appear significantly different between resistant and non-resistant M.TB patient populations (Trauner et al. 2017) - that is, these additional evolutionary processes are shaping variation so strongly, that the presence or absence of a resistance-associated selective sweep does not result in strongly differentiable expectations.
Implications for characterizing the history of TB in humans
A topic of wide-spread interest in the literature pertains to the history of TB in the human host. This inference has primarily been made within a phylogenetic context, relying on the construction of a single consensus sequence per patient. While such a comparison of consensus sequences can be highly mis-leading when making evolutionary inference (see Renzette et al. 2017), these age-estimates also inherently rely on an accurate knowledge of mutation rates in order to invoke the ‘clock-like’ accumulation of neutral mutations as a proxy for time (Menardo et al. 2019). As our results demonstrate that previous mutation rate estimates have likely been downwardly-biased, it is of interest to consider what these revised mutation rates would imply for this evolutionary history. However, there are at least three difficulties in directly comparing population-level estimates with previous consensus-based phylogenetic inference. Firstly, estimates are generally given per year, whereas the preferred evolutionary rate is per generation (as given here). There is support for one generation per day as a conversion (Cole et al. 1998), though further study is necessary to quantify the correct scaling factor. Secondly, when invoking a divergence-based clock, previous studies are measuring the neutral mutation rate, given that the rate of mutation is equivalent to the rate of divergence for neutral mutations only (Kimura 1968). However, we are here interested in the total mutation rate (that is, the rate at which neutral and non-neutral mutations arise per generation); therefore, our rate must be parsed into neutral and non-neutral components to enable appropriate comparison. Similar to the first point, additional research is necessary in order to better quantify the distribution of fitness effects in M.TB, as understanding the fraction of total mutations represented by neutral mutations is necessary for the conversion. Finally, we here consider the alleles segregating within a population for inference (i.e., within a patient), whereas previous studies often call a consensus sequence per patient (i.e., per population) and make inferences based on a collection of such consensus sequences. Such a summary of population-level variation into a single sequence is difficult to interpret, though what is evident is that a great majority of rare alleles will be neglected, and thus only a small subset of total variation (i.e., common alleles) will be considered (Renzette et al. 2017). As such, we propose that future evolutionary inference pertaining to TB would benefit tremendously from a full consideration of within-patient diversity, as we demonstrate here. In sum, any direct comparison with consensus-based phylogenetic age estimates would be overly speculative at this juncture.
CONCLUSIONS
TB patient infection dynamics have remained enigmatic. We here argue that much of the difficulty in interpreting patterns of variation and evolution has owed to an inappropriate underlying null model, relying on classical expectations developed for organisms with very different underlying biological properties. Fortunately, recent theoretical extensions in non-WF and alternative coalescent models, more appropriate for clonally reproducing organisms, have created an opportunity to revisit existing M.TB patient data. By accounting for the pervasive purifying selection effects associated with this non-recombining, highly-coding genome, as well as the skewed progeny distributions inherent to clonal reproduction, we have provided improved insights into the evolutionary dynamics shaping within-host variation. Further, through a consideration of these diversity-reducing effects, results suggest an underlying de novo mutation rate that is considerably faster than previously inferred. This largely reconciles the seemingly contradictory observations of both rapid resistance evolution, but extremely low levels of population variation. Namely, the population mutation rate may indeed be sufficiently fast to provide a steady input of beneficial mutations, explaining the rapid resistance evolution clinically observed. However, recurrent purifying selection and progeny skew act together to rapidly eliminate segregating variation from the population, reconciling the minimal levels of variation observed as well as the general homogeneity in levels and distributions of variation in both resistant and non-resistant patient samples alike. Furthermore, the role of these per-generation evolutionary forces in shaping patterns of variation is sufficiently strong that periodic events, including the infection-associated bottleneck and selective sweeps centered on drug-resistance mutations, will be challenging to detect and quantify on top of these more common processes.