Response to a population bottleneck can be used to infer recessive selection

Daniel J. Balick; Ron Do; David Reich; Shamil R. Sunyaev

doi:10.1101/003491

Abstract

Here we present the first genome-wide statistical test for recessive selection. This test uses explicitly non-equilibrium demographic differences between populations to infer the mode of selection. By analyzing the transient response to a population bottleneck and subsequent re-expansion, we qualitatively distinguish between alleles under additive and recessive selection. We analyze the response of the average number of deleterious mutations per haploid individual and describe the time dependence of this quantity. We introduce a statistic, B_R, to compare the number of mutations in different populations and detail its functional dependence on the strength of selection and the intensity of the population bottleneck. This test can be used to detect the predominant mode of selection at the genome-wide or regional level, as well as among a sufficiently large set of medically or functionally relevant alleles.

1 Introduction

In diploid organisms, selection on an allele, or a group of alleles, can be categorized as additive, dominant or recessive, or as part of a more general epistatic network. A large body of existing work is devoted to statistical methods to detect and quantify selection using DNA sequencing data, including comparative genomics and the sequencing of population samples [1,2,3]. However, much less progress has been made toward identifying the predominant mode of selection as additive, recessive or dominant. Genetics of model organisms and of human disease provide plenty of anecdotal evidence in favor of the importance of dominance [4]. Although genome-wide association studies suggest that alleles of small effect involved in human complex traits frequently act additively, estimation of genetic variance components from large pedigrees suggests a substantial role for dominance in a number of human quantitative traits [5]. Alleles of large effect involved in human Mendelian diseases, as well as spontaneous and induced mutations in model organisms such as mouse, zebrafish, or Drosophila, are frequently recessive [6]. In spite of these observations, the role of dominance in population genetic variation and evolution remains unexplored, and no formal statistical framework to test for the dominance coefficient is currently available.

Using a combination of theoretical analysis and computer simulations, we demonstrate that recessive selection can be qualitatively distinguished from additive selection in populations that experienced a population bottleneck and subsequent re-expansion. Previous studies of non-additive variation in the presence of a bottleneck lack a complete description of the dynamics after re-expansion [7,8,9,11,3], or focus on epistatic interactions rather than recessive selection [12,13,14,15,16,17], with the notable exception of a recent independently conducted complementary analysis found in [18]. Contrary to naive expectation, the number of deleterious recessive alleles per haploid genome is transiently reduced after a population bottleneck, while the number of additively or dominantly acting alleles is increased. In spite of a well-documented increase in frequency of some recessively acting variants in founder populations, the average number of recessive alleles carried by an individual is reduced as a consequence of the bottleneck. With the growing availability of DNA sequencing data in multiple populations, these results demonstrate the potential to directly evaluate the role of dominance, either on a whole genome level, or in specific categories of genes.

Figure 1

A schematic representation of two populations is presented above (A). Initially a single population prior to the bottleneck event, the populations split and have distinct demographic profiles. The equilibrium population has constant size for easy comparison to the founded population. The latter drastically reduces its population size to N_B for a short time T_B during the founder’s event. Our statistical comparison between populations B_R is represented here for cases of purely additive (B) and purely recessive (C) variation. The statistic B_R > 1 for recessive variation (dominance coefficient h = 0) and B_R < 1 for additive variation (h = 1/2), providing a simple test for the primary mode of selection of polymorphic alleles in the populations.

Population bottlenecks are a common feature in the history of many human populations. For example, the “Out of Africa” bottleneck involved ancestors of many present-day human populations. Numerous recent bottlenecks affected, among others, well studied populations of Finland and Iceland. More generally, bottlenecks followed by expansions are standard features in the recent evolution of most domesticated organisms. We suggest that complex demographic history may assist rather than complicate statistical inference of selection in population genetics. Here we use the distinct demographic histories of two subpopulations to identify the type of selection dominating the dynamics, and show that the average number of mutations per individual, 〈x〉, is dependent on the mode of selection. We introduce a measure B_R (the “burden ratio” defined below) that provides a simple statistical test for any set of polymorphic alleles in the population, where B_R < 1 corresponds to predominantly additive selection and B_R > 1 to predominantly recessive selection, as shown in Figure 1. This test is not restricted to the simplified demographic model presented in this paper, but rather provides a quite generic qualitative test for the predominance of recessive selection in comparison between two populations, one of which experienced a bottleneck event.

2 Model

We work with a simple demography described by an ancestral population of N₀ individuals that splits into two subpopulations, one with population size N₀ equal to the initial population size (“equilibrium”), and one with reduced bottleneck population size N_B (“founded”). The latter population persists at this size for T_B generations before instantaneously re-expanding to the initial population size N₀, as shown in Figure 1. Time t is measured after the re-expansion from the bottleneck, as we are interested in the dynamics during this period. Quantities measured in the equilibrium population, and equivalently prior to the split, are denoted with a subscript “₀”. We consider only deleterious mutations with average selective effect of magnitude s > 0, such that s represents the strength of deleterious selection. Extensions of this analysis to a full distribution of selective effects can be found in the SI. The initial population is in steady state with 2N₀U_d deleterious alleles introduced into the population at a mutation rate U_d per haploid individual per generation. In a steady state equilibrium, the site frequency spectrum (SFS) of polymorphic alleles is given by Kimura [19].
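
The demography described above can be sketched as a piecewise-constant size function. This is a minimal illustration and not the authors' simulator (which is described in the SI). The function names and the convention of counting generations from the split are assumptions made here for clarity; the text itself measures t from the re-expansion. The example values follow the Figure 2 simulation (2N₀ = 20000, 2N_B = 2000, T_B = 200).

```python
def founded_population_size(generations_since_split, n0, n_b, t_b):
    """Diploid size of the founded population (hypothetical helper).

    Assumption: generation 0 is the split; the bottleneck occupies
    generations 0 .. t_b - 1, followed by instantaneous re-expansion to n0.
    """
    if generations_since_split < 0:
        return n0  # before the split there is a single ancestral population
    return n_b if generations_since_split < t_b else n0


def time_since_reexpansion(generations_since_split, t_b):
    """Convert to the clock t used in the text, which starts at re-expansion."""
    return generations_since_split - t_b


if __name__ == "__main__":
    n0, n_b, t_b = 10_000, 1_000, 200  # diploid sizes: 2N0 = 20000, 2N_B = 2000
    for g in (0, 199, 200, 1_000):
        size = founded_population_size(g, n0, n_b, t_b)
        print(g, size, time_since_reexpansion(g, t_b))
```

The equilibrium population needs no such function, since its size stays at N₀ throughout.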

Here h ≥ 0 is the dominance coefficient for deleterious mutations, where h = 1/2 corresponds to a purely additive set of alleles, and h = 0 corresponds to the purely recessive case. For the present analysis, we primarily focus on these two limits, contrasting their effects on the genetic diversity. The solution represents a mutation-selection-drift balance in which new mutations are exactly compensated for by the purging of currently polymorphic alleles due to selection and extinction due to stochastic drift. In this way, an approximately static number of polymorphic alleles exists in the population at any given time.
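
To make the two limits concrete, the sketch below applies one generation of deterministic viability selection to an allele at frequency x. It assumes genotype fitnesses of 1, 1 − hs and 1 − s for zero, one and two copies of the deleterious allele. This parameterization is a common textbook convention chosen for illustration, and the exact scaling of the heterozygous effect used in the paper may differ. The function names are hypothetical.

```python
def frequency_after_selection(x, s, h):
    """Expected deleterious-allele frequency after one round of selection.

    Assumed fitnesses (illustrative convention, not taken from the paper):
    wild-type homozygote 1, heterozygote 1 - h*s, mutant homozygote 1 - s.
    Genotypes are in Hardy-Weinberg proportions before selection.
    """
    if not (0.0 <= x <= 1.0 and 0.0 <= s < 1.0 and 0.0 <= h <= 1.0):
        raise ValueError("x and h must lie in [0, 1], and s in [0, 1)")
    hom = x * x * (1.0 - s)                     # surviving mutant homozygotes
    het = 2.0 * x * (1.0 - x) * (1.0 - h * s)   # surviving heterozygotes
    wild = (1.0 - x) ** 2                       # wild-type homozygotes
    mean_fitness = hom + het + wild
    # Each heterozygote carries one mutant copy out of two.
    return (hom + 0.5 * het) / mean_fitness


def selection_change(x, s, h):
    """Per-generation change in frequency due to selection alone."""
    return frequency_after_selection(x, s, h) - x


if __name__ == "__main__":
    s = 0.01
    for x in (0.01, 0.02):
        print(x, selection_change(x, s, 0.5), selection_change(x, s, 0.0))
```

Under this convention, doubling the frequency of a rare allele roughly doubles the selective change for h = 1/2 and roughly quadruples it for h = 0, which is the linear versus quadratic dependence on allele frequency that separates the two limits.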

3 Results

We follow the expected number of mutations per chromosome in the population, which is simply the first moment of the SFS.

When multiplied by s, this is the effective “mutation load” of each individual in the additive case, but in the case of purely recessive selection this is not proportional to the fitness, as selection acts only on homozygotes. We refer to this statistic generally as the “mutation burden” to avoid assumption of any given mode of selection. Comparison between the mutation burden in the equilibrium and founded populations in the form of the “burden ratio”, B_R, provides a test for recessive alleles.
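
As a minimal sketch of how these quantities could be computed from data, the code below sums derived-allele counts over sites and divides by the number of sampled chromosomes. The orientation of the ratio (equilibrium burden divided by founded burden) is an assumption inferred from the sign convention in the text, where B_R > 1 indicates a reduced burden in the founded population; the function names and the toy counts are hypothetical.

```python
def mutation_burden(derived_counts, n_chromosomes):
    """Average number of deleterious alleles per sampled chromosome.

    derived_counts: one entry per site, the number of sampled chromosomes
    carrying the deleterious allele. Summing count / n over sites gives the
    first moment of the sample site frequency spectrum.
    """
    if n_chromosomes <= 0:
        raise ValueError("n_chromosomes must be positive")
    return sum(derived_counts) / n_chromosomes


def burden_ratio(equilibrium_counts, n_equilibrium, founded_counts, n_founded):
    """B_R, assumed here to be <x>_equilibrium / <x>_founded."""
    founded = mutation_burden(founded_counts, n_founded)
    if founded == 0:
        raise ValueError("founded population carries no counted alleles")
    return mutation_burden(equilibrium_counts, n_equilibrium) / founded


if __name__ == "__main__":
    # Toy counts at four sites in samples of 10 chromosomes each.
    b_r = burden_ratio([1, 2, 3, 4], 10, [1, 1, 2, 4], 10)
    print("B_R =", b_r, "(recessive-like)" if b_r > 1 else "(additive-like)")
```

In practice the comparison would be restricted to a predefined class of putatively deleterious sites, and sampling error in both burdens would need to be assessed before interpreting the sign of B_R − 1.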

To gain intuition for this qualitative difference, we work to quantitatively understand the population dynamics in a simple demography, first for purely additive selection, and then for purely recessive selection for comparison.

3.1 Additive selection and response to a bottleneck

The initial site frequency spectrum for purely additive alleles is given by Equation (1) with h = 1/2.

Here . When , the SFS rapidly decays as x → 1 simplifying the functional form[20]. We approximately compute the initial mutation burden as follows.

Now we deviate from equilibrium by reducing the population size to 2N_B chromosomes, representing a population bottleneck. The effect that a bottleneck has on the site frequency spectrum is twofold: a fraction of alleles are removed from the population due to increased random drift, and the mean of the remaining alleles occurs at higher frequency. The dynamics of the distribution ϕ(x, t) during such a change in demography can be computed from Kolmogorov’s forward equation, as detailed in the SI. The first moment of the distribution, the mutation burden, follows the temporal dynamics derived from summing the Kolmogorov equation over all alleles in the genome, and takes the following form.

The burden of additive mutations is not directly affected by drift, as the drift term vanishes from the dynamics of the first moment, however the dependence on the second moment introduces an indirect dependence on drift. In the strong selection regime, in the limit where , extinction of some alleles is exactly compensated for by an increase in frequency of other alleles. This is true in the equilibrium distribution prior to the bottleneck when , where and . During the bottleneck, the mutation burden 〈x〉 monotonically increases; the second moment 〈x²〉 increases, as well, reaching a maximum value in the case of a long bottleneck where it scales as . Provided , the second moment is guaranteed to be subdominant to the first moment, simplifying the dynamics as follows.

For a bottleneck of duration T_B, this equation admits solutions of the form,

After plugging in the initial value , we find that the time dependence drops out completely, demonstrating that the population remains in mutation-selection balance throughout the bottleneck. After instantaneous re-expansion to the initial population size, the dynamics of the distribution ϕ(x) are completely analogous to those inside the bottleneck in this limit, such that the mutation burden never deviates during the demographic perturbation.

In the opposite limit of completely relaxed selection during the bottleneck, the dynamics of the mutation burden are completely driven by the influx of new mutations.

The net effect of this accumulation over the course of the bottleneck is simply the integral of this quantity. For a bottleneck with duration T_B generations, the net effect of mutation accumulation due to relaxed selection is given simply by the following expression.

Additionally, one can show that the second non-central moment gains an analogous contribution in addition to the net effect of drift.

Here we have expressed the second moment as a function of the bottleneck intensity . Immediately after re-expansion from the bottleneck, selection is again efficient, such that the dynamics are completely described by Equation (6). Although the second moment is increased due to relaxed selection during the bottleneck, we find that this increase is negligible in comparison to the direct accumulation in the first moment provided I_B « 1. As a result, the primary effect of the bottleneck in this limit is to accrue new mutations that are subsequently purged when selection is again efficient in the re-expanded population. The dynamics for the two limiting cases can be summarized as follows.

We note that at all times in both limiting cases, and asymptotically decays to the equilibrium frequency on a timescale given by the strength of selection of the accumulated deleterious mutations. In the case of an instantaneous bottleneck, we find that the mutation burden is only slightly shifted even if selection is fully relaxed, resulting in effectively no observable change in either limit. Our statistical measure, the burden ratio B_R, in the additive case can be written approximately as follows.

As we will see in the following sections, recessive selection results in a depleted mutation burden with corresponding values B_R > 1, providing a contrast to the additive scenario and justifying our use of this statistic as a test for recessivity.

3.2 Recessive selection and dynamics of the mutation burden

Prior to the bottleneck, the initial site frequency spectrum for alleles under recessive selection is given by the h = 0 limit of Equation (1).

At low frequencies the spectrum decays slower than in the additive case, representing alleles protected from recessive selection by existing primarily in heterozygous form. In contrast, at high frequencies the spectrum decays faster than the additive exponential decay, falling off as .

3.2.1 Instantaneous population bottlenecks

First, we restrict our analysis to an instantaneous bottleneck with intensity I_B = 1/2N_B, as this provides insight into the non-equilibrium response of the frequency spectrum to a downsampling event. Later, we extend our analysis to finite bottlenecks that persist for T_B generations, with total intensity I_B = T_B/2N_B. We represent the increase in drift due to a single generation bottleneck by downsampling. During this time step, N_B diploid individuals are chosen at random from the initial larger population of N₀ individuals.

Binomial sampling gives the distribution ϕ_Β of deleterious alleles with frequency x = k/2N_B. There is a loss of allelic variation due to the bottleneck, corresponding to the k = 0 term in Equation (13).
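
The effect of downsampling on a single allele can be illustrated with standard binomial identities. The sketch below considers one allele at pre-bottleneck frequency x sampled into 2N_B chromosomes; it does not integrate over the equilibrium spectrum as the full calculation does, and the function names are hypothetical.

```python
def loss_probability(x, n_b):
    """Probability that the allele is absent from the sample (the k = 0 term)."""
    return (1.0 - x) ** (2 * n_b)


def sampled_moments(x, n_b):
    """First and second moments of k / (2 N_B) for k ~ Binomial(2 N_B, x)."""
    n = 2 * n_b
    first = x                              # sampling does not shift the mean
    second = x * x + x * (1.0 - x) / n     # drift adds the variance x(1-x)/2N_B
    return first, second


def mean_frequency_given_survival(x, n_b):
    """Mean post-sampling frequency among alleles that were not lost (k >= 1)."""
    if not 0.0 < x <= 1.0:
        raise ValueError("x must lie in (0, 1]")
    return x / (1.0 - loss_probability(x, n_b))


if __name__ == "__main__":
    x, n_b = 0.001, 1_000
    print("lost:", loss_probability(x, n_b))
    print("moments:", sampled_moments(x, n_b))
    print("mean if retained:", mean_frequency_given_survival(x, n_b))
```

The first moment is unchanged by sampling, while the second moment grows by x(1 − x)/2N_B. Alleles that survive sit, on average, at a higher frequency than before, which is what exposes them to recessive selection as homozygotes after re-expansion.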

Re-expansion is modeled as up-sampling the distribution ϕ_Β(x) from N_B to N₀ diploid individuals, which has negligible effect on the first and second moments of the distribution. As a result of drift to higher frequencies during the bottleneck, much of the existing variation appears in homozygous form immediately after the increase in population size. These individuals are rapidly selected out of the population, driving high frequency alleles to lower frequencies on a very short time scale. Since drift is once again suppressed, selection becomes far more efficient, particularly for alleles of large selective effect.

The time evolution of ϕ after the bottleneck is given by the forward Kolmogorov equation for recessive selection (see SI). The mutation burden follows the time dependence,

Here we suppress a selection term proportional to 〈x³〉 of in analogy to the additive case. Since recessive selection depends quadratically, rather than linearly, on the allele frequency, the increased variance of the distribution drives the motion of the mutation burden. Alleles with frequency appear in homozygous form and are rapidly pushed down to lower frequencies. This happens on a time scale of order s^(−1/2) and effectively reduces the variance, slowing the decrease in the mutation burden 〈x〉. New mutations introduced during this period slowly drift to appreciable frequencies, replacing those lost in the bottleneck. This process is drift controlled, rather than selection controlled, and thus occurs on a time scale of O(2N₀) generations. As a result, the mutation burden quickly decreases due to selection immediately after the bottleneck until it slows to a stop, and then gradually increases as the population accumulates new mutations and re-equilibrates.

A minimum in the mutation burden 〈x(t)〉_founded occurs when the time derivative vanishes. This corresponds to a characteristic time scale associated with the selective effect s, where our statistical test is maximized. Since this time scale is shorter than the time scale of drift, we can imagine rescaling time by the effective population size 2N₀ and then working in the perturbative regime t/2N₀ ≪ 1. This allows us to Taylor expand near the re-expansion time t = 0 to understand the motion of the mutation burden at times soon after the bottleneck.

To understand the time dependence of 〈x²〉, specifically the time derivative, we analyze the higher moments in the same fashion as employed for the first moment in Equation (14). All relevant moments are computed in the SI and we note sufficient convergence to validate this expansion. This allows for the re-expression of Equation (15) to second order in t in terms of the first three moments of the site frequency spectrum immediately after re-expansion. The moments of the post-bottleneck initial distribution can be written in terms of the initial equilibrium distribution using the integral form given in Equation (13). Details of this calculation appear in the SI. In the strong selection limit 2N₀s ≫ 1 these initial equilibrium moments are readily approximated by standard convolutions of a polynomial with a Gaussian. Suppressing subdominant contributions in the limit , we find the following approximation to the trajectory of the mutation burden immediately after the bottleneck re-expands.

Concentrating on this second order expansion in t, we find that the curve first drops from its initial value , quickly reaches a minimum, and is then brought back up by the positive second order term. The location of the minimum is easily found to have the following parameter dependence.

The second derivative is positive at this extremum, implying a local minimum. Plugging t_min into our expression for 〈x(t)〉 in the limit N₀s ≫ 1, we find the following minimum value for the average number of recessive deleterious mutations per genome following a bottleneck.

We note that is the approximate mutation burden for the equilibrium distribution in the 2N₀s ≫ 1 limit, allowing us to simply write the extreme value of the B_R statistic as follows.

We find the following dependence on time in immediate response to a population bottleneck.

This expansion is only valid in the small time limit where the quadratic term is subdominant, such that all values are positive. Long before this simple quadratic expression becomes negative, higher order contributions become relevant and dominate. As seen in simulations described in the following section, for recessive deleterious mutations, the burden ratio remains positive at all times.

This precise result applies strictly in the limit of a strong, single generation bottleneck, where N₀ ≫ N_B. Additionally, the technique used to compute integral expressions required the strong selection limit 2N₀s ≫ 1. Analysis of higher order contributions to the trajectory are made substantially easier by restricting to the limit , which happens to be biologically reasonable, for example, in human populations where most examples of founding events are on the order of N₀ ~ 10⁴ and N_B ~ 10³ (see further discussion in the SI on general dominance coefficients). Despite these analytic restrictions in parameter space, our simulations described below indicate that the signature of B_R > 1 is ubiquitous for populations under predominantly recessive selection.

3.2.2 Extended population bottlenecks

We argue that for the case of relatively low intensity bottlenecks, where intensity is defined as I_B = T_B/2N_B ≪ 1, we can approximately express the magnitude of B_R using a simple substitution . This is equivalent to the claim that for low intensity bottlenecks, the B_R statistic depends only on the ratio of the bottleneck time to the bottleneck population size, and any explicit dependence on T_B occurs in subdominant contributions. This intuition is confirmed by simulations described below, where we show that the accuracy of our analytic approximation breaks down as I_B → 1 and the intensity becomes non-perturbative. For short bottlenecks with I_B < 1/10, the approximation of an instantaneous single generation sampling event remains sufficiently accurate, even for strong selective coefficients s ~ 0.1. Under this trivially extended instantaneous approximation, B_R(t) can be written in terms of the intensity of a short bottleneck as follows.
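
The intensity defined above is simple to compute, and the claim that only the ratio T_B/2N_B matters can be checked directly for any pair of bottlenecks. The sketch below uses exact fractions to avoid floating-point comparisons at the boundary; the function names are hypothetical, and the default threshold of 1/10 is the value quoted in the text for the instantaneous approximation.

```python
from fractions import Fraction


def bottleneck_intensity(t_b, n_b):
    """I_B = T_B / (2 N_B) for a bottleneck of t_b generations at diploid size n_b."""
    if t_b <= 0 or n_b <= 0:
        raise ValueError("t_b and n_b must be positive integers")
    return Fraction(t_b, 2 * n_b)


def within_instantaneous_regime(t_b, n_b, threshold=Fraction(1, 10)):
    """True when I_B is strictly below the threshold quoted in the text."""
    return bottleneck_intensity(t_b, n_b) < threshold


if __name__ == "__main__":
    # Two different bottlenecks with the same intensity.
    print(bottleneck_intensity(200, 1_000), bottleneck_intensity(20, 100))
    print(within_instantaneous_regime(50, 1_000))
```

A long, mild bottleneck and a short, severe one with equal I_B are predicted to give the same B_R at leading order, with explicit T_B dependence entering only through subdominant terms.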

The B_R of maximum effect, has a magnitude given approximately by,

Figure 2

The time dependence of B_R(t) after a population bottleneck is shown for various selective coefficients. Peak B_R values vary in both magnitude and time as a function of s. The founded population was simulated with 2N₀ = 20000, 2N_B = 2000, and T_B = 200 and plotted for 5000 generations after re-expansion.

For illustration of the behavior described in the above analytics we present a time series of recessive simulations with curves representing various selection coefficients in Figure 2. The time dependence of the B_R statistic is plotted to demonstrate the simulated population’s response to a founder’s event. Crucially, we find that the peak B_R values vary in both magnitude and time as a function of s, as is consistent with our analytic understanding and intuition.

3.3 Transient response and time of observation determine detectable selection coefficients

Thus far, we have detailed the dynamic dependence of a set of alleles in a population, all with selective effect s, in response to demographic perturbation in the form of a bottleneck. Notably, for recessive selection, a peak response occurs in the B_R statistic at some time t_min after re-expansion. In general, both the magnitude of B_R(t_min) and the time of the peak itself depend sensitively on the selection coefficient. In practice, a distribution of mutations with different selective effects will be present, many of which may be simultaneously polymorphic in a given population. Since alleles of different selective effect respond to the bottleneck on different time scales, one can ask what selective effect is most likely to be observed at a given time. For example, very strong selection has the tendency to peak and subsequently re-equilibrate immediately after the bottleneck, such that observation of alleles with large s is substantially more difficult at later times. On the other hand, alleles under relatively weak selection have a peak effect at very late times, such that at the time of data collection a statistically significant response may not yet have occurred.

Figure 3

At the time of observation t_obs, the value of B_R(t_obs) is determined by the average strength of selection s for additive or recessive variation, or variation with any intermediate dominance coefficient . A range of B_R values observed at a single time slice are plotted for various s values. Different dominance coefficients appear as solid lines with fully recessive selection (h = 0) at the top and purely additive selection at the bottom. B_R approaches one both in the limit of very strong selection s → 1 due to the rapid transient response, and in the very weak selection limit s → 0 due to the nearly neutral insensitivity to the bottleneck. For some intermediate dominance coefficients h_c, a crossover occurs ( in the example shown) where the effects of additive and recessive variation cancel such that B_R(h_c) ∼ 1. The parameter dependence of the crossover is explored analytically in the SI.

We would like to understand the transient behavior of the burden ratio B_R(t), as well as the value of the selection coefficient s for which B_R is largest at a given time. When comparing to population data, one has little control over the demographic history, and thus it becomes important to understand the selective coefficient that dominates at the time of observation. According to the time-dependent expression in Equation (21), we expect the effect to decrease quite rapidly for very large s. However, the peak occurs quite early in the case of larger s values, allowing the mutation burden to equilibrate over a longer period of time between the peak and observation and return to values close to B_R ~ 1. This tells us that the equilibration process is what reduces the magnitude of B_R for large s. In the case of very recent bottlenecks, the large s values dominate, but for later times of observation, this signal has partially equilibrated, potentially allowing a smaller s value to dominate the statistic. At a given time of observation t_obs, one can represent B_R(s, t_obs) as a function of various selection coefficients s. Figure 3 represents B_R(s) for a fixed t_obs for various dominance coefficients h. We concentrate here on recessive variation with h = 0, but note that a crossover occurs at some value h_c where additive and recessive effects offset each other in the B_R statistic (detailed in SI). Based on our analytics, we expect the peak to shift from extreme high s values at early times to extreme low s values at late times, eventually dissolving into neutrality. We take the s derivative of Equation (21) to find the maximum at t_obs.

One can easily show that the second derivative evaluated at this point is negative, confirming that this is a maximum. This result matches our intuition: maximum s values of B_R(s, t) are found at high s for early times, s_max(t → 0) ≫ 1, and at low s for late times, s_max(t → ∞) ≪ 1. This is qualitatively observed in our simulations by comparing the relative values of B_R(s) as a function of time.

As the effect is transient, we can define a relaxation time t_relax corresponding to the vanishing of any response to the bottleneck. This is given by determining when s_max is dominated by effectively neutral variation at roughly s_max ∼ 1/2N₀. After this time, B_R(s, t) cannot be differentiated from one for any s.

We note that the return to equilibrium happens on a time scale faster than random drift, even for the weakest selective effects, thus validating our perturbative approximations using t/2N₀ ≪ 1. Higher order time dependence in Equation (21) may substantially correct this estimate, but we feel that the presentation of this methodology is conceptually important and provides a greater understanding of the transient dynamics of population response to bottlenecks. As it is relevant to human populations, we note that if both populations expand exponentially after the bottleneck, the effect may persist long beyond t_relax. This is explored analytically in the SI and in simulations in an accompanying paper [21].

4 Comparison of analytic results to simulation

Figure 4

Maximum response values of the burden ratio B_R(t_min) are plotted for recessive selection as a function of bottleneck intensity. A wide range of parameter sets are plotted with all combinations of 2N_B = {2000,1000,400,200,100}, s = {0.1,0.02,0.01,0.001}, T_B = {200,100,50,20,10}, each simulated for 10⁸ nucleotide sites. For relatively low intensity bottlenecks we note excellent agreement over the parameter ranges plotted. Intensities with I_B = T_B/2N_B > 0.1 are excluded, as the instantaneous bottleneck scaling breaks down in favor of a long bottleneck scaling. The approximation necessarily weakens for simulations that represent longer bottlenecks, and only for strong selective coefficients, as expected. This quantifies the limitations of the instantaneous bottleneck approximation, as we observe substantial deviation only around I_B = 0.1 and with selection strength s = 0.1.

We checked our analytic results using a forward time population simulator, described in detail in the SI. Given the ubiquity and analytic simplicity of the exponential decay in the additive scenario, we focus here on our predictions for recessive variation. We compare analytic expressions of B_R(t_min) at the peak response given in Equation (22) for various selection coefficients. We simulated a wide range of bottleneck parameters to test the limitations of our theoretical understanding. In Figure 4, we demonstrate the accuracy of our analytic results by plotting the ratio of the simulated values of B_R(t_min, s, I_B) to our analytic predictions B_R(t_min, s, I_B) as presented in Equation (22). We arrange our simulated data by bottleneck intensity I_B, as we expect the instantaneous bottleneck approximation to break down as intensity is increased due to longer bottleneck duration T_B ≫ 1. As plotted, complete agreement between simulated data and analytic predictions is represented by a flat line at one. As expected, we find deviations as we approach the limitations of our perturbative approximation roughly around T_B ∼ 2N_B/10 when I_B ∼ 0.1. Below these higher intensities, we find quite good agreement for all parameter sets, with errors well below 10%, even at I_B = 0.05.
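
The parameter grid listed for Figure 4 can be enumerated to see which combinations fall within the low intensity range. The sketch below is an illustration only: it reads the caption's cutoff as I_B ≤ 0.1 (so that points at exactly I_B = 0.1 are kept), which is an assumption, and the names are hypothetical. The comparison is done in integers to avoid rounding at the boundary.

```python
from itertools import product

TWO_N_B_VALUES = (2000, 1000, 400, 200, 100)   # bottleneck sizes in chromosomes
SELECTION_VALUES = (0.1, 0.02, 0.01, 0.001)    # selection coefficients s
T_B_VALUES = (200, 100, 50, 20, 10)            # bottleneck durations in generations


def parameter_grid():
    """All (2N_B, s, T_B) combinations listed in the Figure 4 caption."""
    return list(product(TWO_N_B_VALUES, SELECTION_VALUES, T_B_VALUES))


def low_intensity_sets(grid):
    """Keep combinations with I_B = T_B / 2N_B <= 0.1 (assumed cutoff)."""
    return [(two_n_b, s, t_b) for two_n_b, s, t_b in grid if 10 * t_b <= two_n_b]


if __name__ == "__main__":
    grid = parameter_grid()
    kept = low_intensity_sets(grid)
    print(len(grid), "combinations in total;", len(kept), "with I_B <= 0.1")
    for two_n_b, s, t_b in kept[:5]:
        print(two_n_b, s, t_b, t_b / two_n_b)
```

Sorting the retained sets by T_B/2N_B reproduces the arrangement by intensity described above, with the largest deviations expected near I_B = 0.1 and s = 0.1.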

5 Discussion

The increase in prevalence of recessive phenotypes following population bottlenecks has long attracted the interest of geneticists [7,22]. Theoretical analysis of allele frequency dynamics in a population expanding after a bottleneck suggested that the frequency of an individual allele may rise due to increased drift [22,23,24]. Here, we focus on the more general question of the collective dynamics of recessively acting genetic variation. Surprisingly, our analysis suggests that the number of recessively acting variants per haploid genome is reduced in response to a bottleneck and subsequent re-expansion. Generally, we have demonstrated that the frequency spectra of recessive deleterious polymorphisms behave distinctly from additively acting variation following a population bottleneck and subsequent re-expansion. The response of additive variation depends crucially on the average number of deleterious alleles, and on the number of generations for which selection is relaxed during the bottleneck. In contrast, the dynamics of recessive variation depend crucially on the width of the site frequency spectrum, rather than the average number of mutations per individual, such that the accumulation of deleterious mutations can respond strongly even to a single generation bottleneck. Importantly, the temporal dynamics of the accumulation of deleterious alleles depend qualitatively on the dominance coefficient and quantitatively on the selection coefficient. The qualitative dependence on the dominance coefficient allows for a robust statistical test for recessivity. If the variation is additive, the number of deleterious variants per haploid genome is larger in a bottlenecked population than in a corresponding equilibrium population. If the variation acts recessively, this number is smaller. The selection coefficient determines the timing of the response to a bottleneck.

By explicitly analyzing the non-equilibrium response to a bottleneck, we have demonstrated a technique for using potentially confounding demographic features to probe the underlying population genetic forces. In realistic populations, for example in modern humans, substantial work has been done to identify and understand the recent demographic history of geographically disparate populations [25,26,27,28,29,30,31,32,33,34]. In the case of the “Out of Africa” event, a historically substantiated and believable demographic model can be used to model the difference between African and European populations since their divergence. Comparison between populations that have and have not undergone a bottleneck can be used to infer plausible selection and dominance coefficients. In an accompanying paper [21], we specialize this analysis using a realistic demographic model to attempt to bound the selection and dominance coefficients in modern human populations. Parameterizing only by the duration of the bottleneck T_B, along with s and h, one can show that a substantial fraction of this three-dimensional space is disallowed by the observation of even a single bottleneck.

Although the net number of recessive deleterious mutations is reduced as a consequence of a founder’s event and subsequent re-expansion, the fitness of individuals carrying these alleles is not increased, but rather decreased; selection acts only at homozygous sites and the number of homozygotes is known to increase after a population bottleneck. However, the number of heterozygous deleterious sites, or the average carrier frequency for associated alleles, is suppressed, such that the mating of individuals from disparate bottlenecked populations may result in a decreased incidence of recessive phenotypes in such mixed lineages. In studies of model organisms, this may have applications when comparing laboratory populations founded from a few wild type individuals to their corresponding natural population.

In principle, the results of this study are applicable to the analysis of specific groups of genes and pathways. Sufficiently large subsets of alleles that are medically relevant may be analyzed in humans to identify the mode of selection for candidate variants of recessive diseases. For model organisms with a significant density of deleterious alleles, it may be possible to create a dominance map of the genome.

In sum, the non-equilibrium dynamics induced by demographic events are an essential, and indeed informative, feature of most realistic populations. Population bottlenecks, abundant in laboratory populations and in natural species, have the potential to provide a novel perspective on the role of dominance in genetic variation.

6 Acknowledgements

The authors would like to thank Benjamin Good, Alexey Kondrashov, Nick Patterson, Jonathan Pritchard, and Guy Sella for particularly useful discussions. DJB and SS were generously supported by NIH grants R01 MH101244 and R01 GM078598. RD was supported by a CIHR Banting fellowship. DR is grateful for support from NIH grant R01 GM100233.

References

[1] A. D. Cutter and B. A. Payseur. Genomic signatures of selection at linked sites: unifying the disparity among species. Nat. Rev. Genet., 14:262–274, 2013.
[2] Y. Zhang, et al. Inaugural Article: Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proc. Natl. Acad. Sci., 109:15553–15559, 2012.
[3] W. G. Hill, M. E. Goddard, and P. M. Visscher. Data and theory point to mainly additive genetic variance for complex traits. PLoS Genet., 4:e1000008, 2008.
[4] M. Lynch and B. Walsh. Genetics and analysis of quantitative traits. 1998.
[5] D. L. Newman, et al. The Importance of Genealogy in Determining Genetic Associations with Complex Traits. Am. J. Hum. Genet., 69:1146–1148, Dec. 2001.
[6] B. J. Herron, et al. Efficient generation and mapping of recessive developmental mutations using ENU mutagenesis. Nat. Genet., 30:185–189, 2002.
[7] A. Robertson. The effect of inbreeding on the variation due to recessive genes. Genetics, 37:189–207, 1952.
[8] E. H. Bryant, S. A. McCommas, and L. M. Combs. The effect of an experimental bottleneck upon quantitative genetic-variation in the housefly. Genetics, 114:1191–1211, 1986.
[9] J. L. Wang, A. Caballero, P. D. Keightley, and W. G. Hill. Bottleneck effect on genetic variance: A theoretical investigation of the role of dominance. Genetics, 150:435–447, 1998.
[10] M. Kirkpatrick and P. Jarne. The Effects of a Bottleneck on Inbreeding Depression and the Genetic Load. Am. Nat., 155(2):154–167, Feb. 2000.
[11] X.-S. Zhang, J. Wang, and W. G. Hill. Redistribution of gene frequency and changes of genetic variation following a bottleneck in population size. Genetics, 167:1475–1492, 2004.
[12] J. M. Cheverud and E. J. Routman. Epistasis as a source of increased additive genetic variance at population bottlenecks. Evolution, 50(3):1042–1051, 1996.
[13] W. G. Hill, A. Caballero, and J. Wang. The effect of linkage disequilibrium and deviation from Hardy-Weinberg proportions on the changes in genetic variance with bottlenecking. Heredity, 81:174–186, 1998.
[14] Y. Naciri-Graven and J. Goudet. The additive genetic variance after bottlenecks is affected by the number of loci involved in epistatic interactions. Evolution, 57:706–716, 2003.
[15] N. H. Barton and M. Turelli. Effects of genetic drift on variance components under a general model of epistasis. Evolution, 58:2111–2132, 2004.
[16] W. G. Hill, N. H. Barton, and M. Turelli. Prediction of effects of genetic drift on variance components under a general model of epistasis. Theor. Popul. Biol., 70:56–62, 2006.
[17] M. Turelli and N. H. Barton. Will population bottlenecks and multilocus epistasis increase additive genetic variance? Evolution, 60:1763–1776, 2006.
[18] Y. B. Simons, M. C. Turchin, J. K. Pritchard, and G. Sella. The deleterious mutation load is insensitive to recent population history. arXiv:1305.2061v1, 2013.
[19] M. Kimura. Diffusion models in population genetics. J. Appl. Prob., 1:177–232, 1964.
[20] M. Nei. The frequency distribution of lethal chromosomes in finite populations. Proc. Natl. Acad. Sci. USA, 60:517–524, 1968.
[21] R. Do, D. J. Balick, et al. (submitted) No significant difference in deleterious load across modern humans. Nat. Gen., 2013.
[22] M. Slatkin. A population-genetic test of founder effects and implications for Ashkenazi Jewish diseases. Am. J. Hum. Genet., 75:282–293, 2004.
[23] E. Gazave, D. Chang, A. G. Clark, and A. Keinan. Population Growth Inflates the Per-Individual Number of Deleterious Mutations and Reduces Their Mean Effect. Genetics, 195(3):969–978, 2013.
[24] S. Peischl, I. Dupanloup, M. Kirkpatrick, and L. Excoffier. On the accumulation of deleterious mutations during range expansions. Mol. Ecol., 2013.
[25] A. Keinan, J. C. Mullikin, N. Patterson, and D. Reich. Measurement of the human allele frequency spectrum demonstrates greater genetic drift in East Asians than in Europeans. Nat. Genet., 39:1251–1255, 2007.
[26] K. E. Lohmueller, et al. Proportionally more deleterious genetic variation in European than in African populations. Nature, 451(7181):994–997, Feb. 2008.
[27] S. Gravel, et al. Demographic history and rare allele sharing among human populations. Proc. Natl. Acad. Sci. U. S. A., 108:11983–11988, 2011.
[28] W. Fu, et al. Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants. Nature, 493:216–220, 2013.
[29] J. A. Tennessen, et al. Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science, 337(6090):64–69, 2012.
[30] I. Gronau, et al. Bayesian inference of ancient human demography from individual genome sequences. Nat. Genet., 43:1031–1034, 2011.
[31] H. Li and R. Durbin. Inference of human population history from whole genome sequence of a single individual. Nature, 475:493–496, 2012.
[32] S. Sheehan, K. Harris, and Y. S. Song. Estimating variable effective population sizes from multiple genomes: a sequentially Markov conditional sampling distribution approach. Genetics, 194:647–662, 2013.
[33] K. Harris and R. Nielsen. Inferring demographic history from a spectrum of shared haplotype lengths. PLoS Genet., 9:e1003521, 2013.
[34] I. M. Macleod, et al. Inferring demography from runs of homozygosity in whole-genome sequence, with correction for sequence errors. Mol. Biol. Evol., 30:2209–2223, 2013.
[35] W. J. Ewens. Mathematical Population Genetics: Theoretical introduction. 2004.