Testing for ancient selection using cross-population allele frequency differentiation

Fernando Racimo

doi:10.1101/017566

1 Abstract

A powerful way to detect selection in a population is by modeling local allele frequency changes in a particular region of the genome under scenarios of selection and neutrality, and finding which model is most compatible with the data. Chen et al. [1] developed a composite likelihood method called XP-CLR that uses an outgroup population to detect departures from neutrality which could be compatible with hard or soft sweeps, at linked sites near a beneficial allele. However, this method is most sensitive to recent selection and may miss selective events that happened a long time ago. To overcome this, we developed an extension of XP-CLR that jointly models the behavior of a selected allele in a three-population tree. Our method - called 3P-CLR - outperforms XP-CLR when testing for selection that occurred before two populations split from each other, and can distinguish between those events and events that occurred specifically in each of the populations after the split. We applied our new test to population genomic data from the 1000 Genomes Project, to search for selective sweeps that occurred before the split of Africans and Eurasians, but after their split from Neanderthals, and that could have presumably led to the fixation of modern-human-specific phenotypes. We also searched for sweep events that occurred in East Asians, Europeans and the ancestors of both populations, after their split from Africans.

2 Introduction

Genetic hitchhiking will distort allele frequency patterns at regions of the genome linked to a beneficial allele that is rising in frequency [2]. This is known as a selective sweep. If the sweep is restricted to a particular population and does not affect other closely related populations, one can detect such an event by looking for extreme patterns of localized population differentiation, like high values of F_st at a specific locus [3]. This and other related statistics have in fact been used to scan the genomes of present-day humans from different populations, so as to detect signals of recent positive selection [4-7].

Once it became possible to sequence entire genomes of archaic humans (like Neanderthals) [8-10], researchers also began to search for selective sweeps that occurred in the ancestral population of all present-day humans. For example, ref. [8] searched for genomic regions with a depletion of derived alleles in a low-coverage Neanderthal genome, relative to what would be expected given the derived allele frequency in present-day humans. This is a pattern that would be consistent with a sweep in present-day humans. Later on, ref. [10] developed a hidden Markov model (HMM) that could identify regions where Neanderthals fall outside of all present-day human variation (also called “external regions”), and are therefore likely to have been affected by ancient sweeps in early modern humans. They applied their method to a high-coverage Neanderthal genome. Then, they ranked these regions by their genetic length, to find segments that were extremely long, and therefore highly compatible with a selective sweep. Finally, ref. [11] used summary statistics calculated in the neighborhood of sites that were ancestral in archaic humans but fixed derived in all or almost all present-day humans, to test if any of these sites could be compatible with a selective sweep model. While these methods harnessed different summaries of the patterns of differentiation left by sweeps, they did not attempt to explicitly model the process by which these patterns are generated over time.

Chen et al. [1] developed a method called XP-CLR, which is designed to test for selection in one population after its split from a second, outgroup, population t_AB generations ago. It does so by modeling the evolutionary trajectory of an allele under linked selection and under neutrality, and then comparing the likelihood of the data under each of the two models. The method detects local allele frequency differences that are compatible with the linked selection model [2], along windows of the genome.

XP-CLR is a powerful test for detecting selective events restricted to one population. However, it provides little information about when these events happened, as it models all sweeps as if they had immediately occurred in the present generation. Additionally, if one is interested in selective sweeps that took place before two populations a and b split from each other, one would have to run XP-CLR separately on each population, with a third outgroup population c that split from the ancestor of a and b t_ABC generations ago (with t_ABC > t_AB). Then, one would need to check that the signal of selection appears in both tests. This may miss important information about correlated allele frequency changes shared by a and b, but not by c, limiting the power to detect ancient events.

To overcome this, we developed an extension of XP-CLR that jointly models the behavior of an allele in all 3 populations, to detect selective events that occurred before or after the closest two populations split from each other. Below we briefly review the modeling framework of XP-CLR and describe our new test, which we call 3P-CLR. In the Results, we show this method outperforms XP-CLR when testing for selection that occurred before the split of two populations, and can distinguish between those events and events that occurred after the split, unlike XP-CLR. We then apply the method to population genomic data from the 1000 Genomes Project [12], to search for selective sweep events that occurred before the split of Africans and Eurasians, but after their split from Neanderthals. We also use it to search for selective sweeps that occurred in the Eurasian ancestral population, and to distinguish those from events that occurred specifically in East Asians or specifically in Europeans.

3 Methods

3.1 XP-CLR

First, we review the procedure used by XP-CLR to model the evolution of allele frequency changes of two populations a and b that split from each other t_AB generations ago (Figure 1.A). For neutral SNPs, Chen et al. [1] use an approximation to the Wright-Fisher diffusion dynamics [13]. Namely, the frequency of a SNP in a population a (p_A) in the present is treated as a random variable governed by a normal distribution with mean equal to the frequency in the ancestral population (β) and variance proportional to the drift time ω from the ancestral to the present population:

$$p_A \mid \beta \sim N\big(\beta,\ \omega\beta(1-\beta)\big) \tag{1}$$

where ω = t_AB/(2N_e) and N_e is the effective size of population a.
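As a minimal sketch of this approximation, the Python snippet below computes a drift time from a split time and an effective size, and evaluates the normal density of the present-day frequency. The function names are illustrative only. The variance form ω·β(1−β) is an assumption taken from the Chen et al. model (it matches the σ² definition used later in this section), and this is not the 3P-CLR source code.

```python
import math

def drift_time(t_generations, n_e):
    """Drift time omega = t / (2 * N_e)."""
    return t_generations / (2.0 * n_e)

def neutral_density(p_a, beta, omega):
    """Density of p_a under N(beta, omega * beta * (1 - beta)); assumes 0 < beta < 1."""
    var = omega * beta * (1.0 - beta)
    return math.exp(-(p_a - beta) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# Example: a split 2,000 generations ago with N_e = 10,000 (illustrative values).
omega = drift_time(2000, 10000)
print(omega)
print(neutral_density(0.35, 0.3, omega))
```

The approximation is only meaningful while the allele is segregating, which is why the sketch assumes 0 < β < 1.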

Figure 1. Schematic tree of selective sweeps detected by XP-CLR and 3P-CLR.

While XP-CLR can only use two populations (an outgroup and a test) to detect selection (panel A), 3P-CLR can detect selection in the ancestral branch of two populations (3P-CLR(Int), panel B) or on the branches specific to each population (3P-CLR(A) and 3P-CLR(B), panels C and D, respectively). The Greek letters denote the known drift times for each branch of the population tree.

If a SNP is segregating in both populations - i.e. has not hit the boundaries of fixation or extinction - this process is time-reversible. Thus, one can model the frequency of the SNP in population a with a normal distribution having mean equal to the frequency in population b and variance proportional to the sum of the drift time (ω) between a and the ancestral population, and the drift time between b and the ancestral population (ψ):

$$p_A \mid p_B \sim N\big(p_B,\ (\omega + \psi)\,p_B(1 - p_B)\big) \tag{2}$$

For SNPs that are linked to a beneficial allele that has undergone a sweep in population a only, Chen et al. [1] model the allele as evolving neutrally until the present and then apply a transformation to the normal distribution that depends on the distance to the selected allele r and the strength of selection s [14,15]. Let

$$c = 1 - q_0^{r/s}$$

where q₀ is the frequency of the beneficial allele in population a before the sweep begins. The frequency of a neutral allele is expected to increase from p to 1 − c + cp if the allele is linked to the beneficial allele, and this occurs with probability equal to the frequency of the neutral allele (p) before the sweep begins. Otherwise, the frequency of the neutral allele is expected to decrease from p to cp. This leads to the following transformation of the normal distribution:

$$f(p_A \mid p_B, r, s, \omega, \psi) = \frac{1}{\sqrt{2\pi}\,\sigma}\,\frac{p_A + c - 1}{c^2}\, e^{-\frac{(p_A + c - 1 - c\,p_B)^2}{2 c^2 \sigma^2}}\, I_{[1-c,\,1]}(p_A) + \frac{1}{\sqrt{2\pi}\,\sigma}\,\frac{c - p_A}{c^2}\, e^{-\frac{(p_A - c\,p_B)^2}{2 c^2 \sigma^2}}\, I_{[0,\,c]}(p_A) \tag{3}$$

where σ² = (ω + ψ) p_B (1 − p_B) and I_[x,y](z) is 1 on the interval [x,y] and 0 otherwise.
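The transformation described in this paragraph can be sketched in Python as follows. This is a hedged illustration, not the published implementation: it assumes the two-component form of the Chen et al. density (one component for neutral alleles that hitchhike to frequency 1 − c + cp, one for alleles pushed down to cp), and the function names and the default starting frequency `q0=0.001` are hypothetical choices made for the example.

```python
import math

def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def sweep_density(p_a, p_b, omega, psi, r, s, q0=0.001):
    """Density of p_a near a sweep in population a, given the frequency p_b.

    r  : recombination fraction to the selected site (assumed > 0)
    s  : selection coefficient (assumed > 0)
    q0 : starting frequency of the beneficial allele (hypothetical default)
    """
    c = 1.0 - q0 ** (r / s)
    sigma2 = (omega + psi) * p_b * (1.0 - p_b)
    norm = 1.0 / math.sqrt(2.0 * math.pi * sigma2)
    density = 0.0
    if 1.0 - c <= p_a <= 1.0:
        # Neutral allele linked to the beneficial allele: p -> 1 - c + c*p.
        z = p_a + c - 1.0
        density += norm * z / c ** 2 * math.exp(-(z - c * p_b) ** 2 / (2.0 * c ** 2 * sigma2))
    if 0.0 <= p_a <= c:
        # Neutral allele not linked to the beneficial allele: p -> c*p.
        density += norm * (c - p_a) / c ** 2 * math.exp(-(p_a - c * p_b) ** 2 / (2.0 * c ** 2 * sigma2))
    return density

# Far from the selected site (r much larger than s) the density approaches the neutral normal.
print(sweep_density(0.4, 0.3, 0.1, 0.1, r=0.1, s=0.01))
print(normal_pdf(0.4, 0.3, 0.2 * 0.3 * 0.7))
```

Integrated over [0, 1], this density equals the mass that the untransformed normal places on [0, 1], which is less than 1. That is why a likelihood built on it needs to be conditioned on the allele segregating.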

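For s → 0 or r >> s, this distribution converges to the neutral case. Let v be the vector of all drift times that are relevant to the scenario we are studying. In this case, it will be equal to (ω, ψ) but in more complex cases below, it may include additional drift times. Let r be the vector of recombination fractions between the beneficial alleles and each of the SNPs within a window of arbitrary size. We can then calculate the product of likelihoods over all k SNPs in that window for either the neutral or the linked selection model, after binomial sampling of alleles from the population frequency and conditioning on the event that the allele is segregating in the population:

$$CL_{XP\text{-}CLR}(\mathbf{r}, \mathbf{v}, s) = \prod_{j=1}^{k} \frac{\int_0^1 f(p_A^j \mid p_B^j, \mathbf{v}, s, r^j) \binom{n}{m_j} (p_A^j)^{m_j} (1 - p_A^j)^{n - m_j}\, dp_A^j}{\int_0^1 f(p_A^j \mid p_B^j, \mathbf{v}, s, r^j)\, dp_A^j} \tag{4}$$

where r^j is the recombination fraction for SNP j and m_j is the number of derived alleles observed at SNP j in a sample of n chromosomes from population a.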

We note that the denominator in the above equation is not explicitly stated in ref. [1] for ease of notation, but appears in the published online implementation of the method. Because we are ignoring the correlation in frequencies produced by linkage, this is a composite likelihood [16,17]. Finally, we obtain a composite likelihood ratio statistic S_XP-CLR of the hypothesis of linked selection over the hypothesis of neutrality:

$$S_{XP\text{-}CLR} = 2\left[\sup_{\mathbf{r}, \mathbf{v}, s} \log CL_{XP\text{-}CLR}(\mathbf{r}, \mathbf{v}, s) - \sup_{\mathbf{v}} \log CL_{XP\text{-}CLR}(\mathbf{r}, \mathbf{v}, s = 0)\right] \tag{5}$$

For ease of computation, Chen et al. [1] assume that r is given (via a recombination map) and we will do so too. Furthermore, they empirically estimate v using F₂ statistics [18] calculated over the whole genome, and assume selection is not strong or frequent enough to affect their genome-wide values. Because we are interested in selection over long time scales, the new methods we will present below are optimally run using drift times calculated from population split times and effective population sizes estimated using model-based demographic inference methods, like ∂a∂i [19] or fastsimcoal2 [20].
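To make the composite likelihood ratio concrete, here is a minimal Python sketch under stated assumptions. The per-SNP likelihood is a binomial sampling probability integrated against a frequency density and divided by the integral of that density (the conditioning on segregation), and the statistic is twice the difference in log composite likelihood between the best selection coefficient on a grid and neutrality. The function names, the midpoint integration rule and the grid search over s are illustrative choices, not the published implementation, and the toy densities in the usage lines are placeholders for the neutral and linked-selection densities.

```python
import math

def integrate(func, grid=2000):
    """Midpoint rule on the open interval (0, 1)."""
    h = 1.0 / grid
    return h * sum(func((i + 0.5) * h) for i in range(grid))

def snp_likelihood(density, n, m):
    """P(m derived alleles among n sampled chromosomes), conditional on segregation."""
    binom = math.comb(n, m)
    numerator = integrate(lambda p: density(p) * binom * p ** m * (1.0 - p) ** (n - m))
    denominator = integrate(density)
    return numerator / denominator

def composite_log_likelihood(snps, density_for):
    """Sum of per-SNP log-likelihoods; snps is a list of (n, m, info) tuples."""
    return sum(math.log(snp_likelihood(density_for(info), n, m)) for n, m, info in snps)

def clr_statistic(snps, neutral_for, sweep_for, s_grid):
    """2 * (max over s of log CL under selection - log CL under neutrality)."""
    neutral = composite_log_likelihood(snps, neutral_for)
    best = max(composite_log_likelihood(snps, lambda info: sweep_for(info, s)) for s in s_grid)
    return 2.0 * (best - neutral)

# Toy usage: a flat "neutral" density and a placeholder "sweep" density skewed to high frequency.
window = [(10, 9, None), (10, 8, None)]
flat = lambda info: (lambda p: 1.0)
skewed = lambda info, s: (lambda p: 1.0 + 50.0 * s * p)
print(snp_likelihood(flat(None), 10, 3))  # approximately 1/11 for a flat density
print(clr_statistic(window, flat, skewed, s_grid=[0.01, 0.05, 0.1]))
```

Treating SNPs as independent when summing log-likelihoods is exactly what makes this a composite, rather than a full, likelihood.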

3.2 3P-CLR

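We are interested in the case where a selective event occurred more anciently than the split of two populations (a and b) from each other, but more recently than their split from a third population c (Figure 1.B). We begin by modeling p_A and p_B as evolving from an unknown common ancestral frequency β:

$$p_A \mid \beta, \omega \sim N\big(\beta,\ \omega\beta(1-\beta)\big) \tag{6}$$

$$p_B \mid \beta, \psi \sim N\big(\beta,\ \psi\beta(1-\beta)\big) \tag{7}$$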

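Let χ be the drift time separating the most recent common ancestor of a and b from the most recent common ancestor of a, b and c. Additionally, let ν be the drift time separating population c in the present from the most recent common ancestor of a, b and c. Given these parameters, we can treat β as an additional random variable that either evolves neutrally or is linked to a selected allele that swept immediately before the split of a and b. In both cases, the distribution of β will depend on the frequency of the allele in population c (p_C) in the present. In the neutral case:

$$f_{neut}(\beta \mid p_C, \nu, \chi) = \frac{1}{\kappa\sqrt{2\pi}}\, e^{-\frac{(\beta - p_C)^2}{2\kappa^2}} \tag{8}$$

In the linked selection case:

$$f_{sel}(\beta \mid p_C, \nu, \chi, r, s) = \frac{1}{\sqrt{2\pi}\,\kappa}\,\frac{\beta + c - 1}{c^2}\, e^{-\frac{(\beta + c - 1 - c\,p_C)^2}{2 c^2 \kappa^2}}\, I_{[1-c,\,1]}(\beta) + \frac{1}{\sqrt{2\pi}\,\kappa}\,\frac{c - \beta}{c^2}\, e^{-\frac{(\beta - c\,p_C)^2}{2 c^2 \kappa^2}}\, I_{[0,\,c]}(\beta) \tag{9}$$

where κ² = (ν + χ) p_C (1 − p_C).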

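The frequencies in a and b given the frequency in c can be obtained by integrating β out. This leads to a density function that models selection in the ancestral population of a and b:

$$f(p_A, p_B \mid p_C, \mathbf{v}, r, s) = \int_0^1 f_{neut}(p_A \mid \beta, \omega)\, f_{neut}(p_B \mid \beta, \psi)\, f_{sel}(\beta \mid p_C, \nu, \chi, r, s)\, d\beta \tag{10}$$

Here f_neut(p_A | β, ω) and f_neut(p_B | β, ψ) are the normal densities of p_A and p_B given β, and f_sel is the linked-selection density of β given p_C.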
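Integrating β out has no simple closed form, so it is done numerically. The Python sketch below illustrates the idea with a midpoint rule over β: for each candidate ancestral frequency it multiplies the normal densities of p_A and p_B given β by the linked-selection density of β given p_C. All function names, the grid size and the default `q0=0.001` are assumptions made for this example, and the sweep density is assumed to have the same two-component form as in the two-population model.

```python
import math

def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def ancestral_density(beta, p_c, nu, chi, r, s, q0=0.001):
    """Density of the ancestral frequency beta given p_c, with a sweep on the internal branch."""
    kappa2 = (nu + chi) * p_c * (1.0 - p_c)
    c = 1.0 - q0 ** (r / s)
    norm = 1.0 / math.sqrt(2.0 * math.pi * kappa2)
    total = 0.0
    if 1.0 - c <= beta <= 1.0:
        z = beta + c - 1.0
        total += norm * z / c ** 2 * math.exp(-(z - c * p_c) ** 2 / (2.0 * c ** 2 * kappa2))
    if 0.0 <= beta <= c:
        total += norm * (c - beta) / c ** 2 * math.exp(-(beta - c * p_c) ** 2 / (2.0 * c ** 2 * kappa2))
    return total

def joint_density(p_a, p_b, p_c, omega, psi, nu, chi, r, s, grid=4000):
    """Density of (p_a, p_b) given p_c, integrating the ancestral frequency out."""
    h = 1.0 / grid
    total = 0.0
    for i in range(grid):
        beta = (i + 0.5) * h          # midpoints avoid beta = 0 and beta = 1
        unit_var = beta * (1.0 - beta)
        total += (normal_pdf(p_a, beta, omega * unit_var)
                  * normal_pdf(p_b, beta, psi * unit_var)
                  * ancestral_density(beta, p_c, nu, chi, r, s))
    return total * h

# High derived frequency in a and b but low in c: close to the selected site vs far from it.
near = joint_density(0.9, 0.85, 0.2, 0.05, 0.08, 0.1, 0.05, r=0.0005, s=0.05)
far = joint_density(0.9, 0.85, 0.2, 0.05, 0.08, 0.1, 0.05, r=0.5, s=0.05)
print(near, far)
```

In this toy setting the density close to the selected site is far larger than the density far from it, which reflects the correlated frequency shift in both daughter populations that the test is built to pick up.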

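Additionally, formula 10 can be modified to test for selection that occurred specifically in one of the terminal branches that lead to a or b (Figures 1.C and 1.D), rather than in the ancestral population of a and b. For example, the density of frequencies for a scenario of selection in the branch leading to a can be written as:

$$f(p_A, p_B \mid p_C, \mathbf{v}, r, s) = \int_0^1 f_{sel}(p_A \mid \beta, \omega, r, s)\, f_{neut}(p_B \mid \beta, \psi)\, f_{neut}(\beta \mid p_C, \nu, \chi)\, d\beta \tag{11}$$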

We will henceforth refer to the version of 3P-CLR that is tailored to detect selection in the internal branch that is ancestral to a and b as 3P-CLR(Int). In turn, the versions of 3P-CLR that are designed to detect selection in each of the daughter populations will be designated as 3P-CLR(A) and 3P-CLR(B).

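We can now calculate the probability density of specific allele frequencies in populations a and b, given that we observe m_C derived alleles in a sample of size n_C from population c:

$$f(p_A, p_B \mid m_C, \mathbf{v}, r, s) = \int_0^1 f(p_A, p_B \mid p_C, \mathbf{v}, r, s)\, f(p_C \mid m_C)\, dp_C \tag{12}$$

where

$$f(p_C \mid m_C) = \frac{1}{B(m_C,\ n_C - m_C + 1)}\, p_C^{\,m_C - 1} (1 - p_C)^{n_C - m_C} \tag{13}$$

and B(x,y) is the Beta function.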

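Conditioning on the event that the site is segregating in the population, we can then calculate the probability of observing m_A and m_B derived alleles in a sample of size n_A from population a and a sample of size n_B from population b, respectively, given that we observe m_C derived alleles in a sample of size n_C from population c, using binomial sampling:

$$P(m_A, m_B \mid m_C, \mathbf{v}, r, s) = \frac{\int_0^1 \int_0^1 P(m_A \mid p_A)\, P(m_B \mid p_B)\, f(p_A, p_B \mid m_C, \mathbf{v}, r, s)\, dp_A\, dp_B}{\int_0^1 \int_0^1 f(p_A, p_B \mid m_C, \mathbf{v}, r, s)\, dp_A\, dp_B} \tag{14}$$

where

$$P(m_A \mid p_A) = \binom{n_A}{m_A} p_A^{m_A} (1 - p_A)^{n_A - m_A} \tag{15}$$

and

$$P(m_B \mid p_B) = \binom{n_B}{m_B} p_B^{m_B} (1 - p_B)^{n_B - m_B} \tag{16}$$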

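This allows us to calculate a composite likelihood of the derived allele counts in a and b given the derived allele counts in c:

$$CL_{3P\text{-}CLR}(\mathbf{r}, \mathbf{v}, s) = \prod_{j=1}^{k} P(m_A^j, m_B^j \mid m_C^j, \mathbf{v}, r^j, s) \tag{17}$$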

As before, we can use this composite likelihood to produce a composite likelihood ratio statistic that can be calculated over regions of the genome to test the hypothesis of linked selection centered on a particular locus against the hypothesis of neutrality. Due to computational costs in numerical integration, we skip the sampling step for population c (formula 13) in our implementation of 3P-CLR. In other words, we assume p_C = m_C/n_C, which is also assumed in XP-CLR when computing its corresponding outgroup frequency. We implemented our method in a freely available C++ program that can be downloaded from here:

https://github.com/ferracimo [WILL POST IT AFTER PUBLICATION]

4 Results

4.1 Simulations

We generated simulations in SLiM [21] to test the performance of XP-CLR and 3P-CLR in a three-population scenario. We focused specifically on the performance of 3P-CLR(Int) in detecting ancient selective events that occurred in the ancestral branch of two sister populations. We assumed that the population history had been correctly estimated by the researcher (i.e. the drift parameters and population topology were known). First, we simulated scenarios in which a beneficial mutation arose in the ancestor of populations a and b, before their split from each other but after their split from c (Table 1). Although both XP-CLR and 3P-CLR are sensitive to partial or soft sweeps (as they do not rely on extended patterns of homozygosity [1]), we required the allele to have fixed before the split (at time t_ab) to ensure that the allele had not been lost before it, and also to ensure that the sweep was restricted to the internal branch of the tree. We fixed the effective size of all three populations at N_e = 10,000. Each simulation consisted of a 5 cM region and the beneficial mutation occurred in the center of this region. The mutation rate was set at 2.5 × 10^-8 per generation and the recombination rate was set at 10^-8 per generation.

Table 1. Description of models tested.

All times are in generations. Selection in the “ancestral population” refers to a selective sweep where the beneficial mutation and fixation occurred before the split time of the two most closely related populations. Selection in “daughter population A” refers to a selective sweep that occurred in one of the two most closely related populations (A), after their split from each other.

To make a fair comparison to 3P-CLR(Int), and given that XP-CLR is a two-population test, we applied XP-CLR in two ways. First, we pretended population b was not sampled, and so the “test” panel consisted of individuals from a only, while the “outgroup” consisted of individuals from c. In the second implementation (which we call “XP-CLR-avg”), we used the same outgroup panel, but pretended that individuals from a and b were pooled into a single panel, and this pooled panel was the “test”. The window size was set at 0.5 cM and the space between the center of each window was set at 600 SNPs. To speed up computation, and because we are largely interested in comparing the relative performance of the three tests under different scenarios, we used only 20 randomly chosen SNPs per window in all tests. We note, however, that the performance of all three tests can be improved by using more SNPs per window.

Figure 2 shows receiver operating characteristic (ROC) curves comparing the sensitivity and specificity of 3P-CLR(Int), XP-CLR and XP-CLR-avg in the first six demographic scenarios described in Table 1. Each ROC curve was made from 100 simulations under selection (with s = 0.1 for the central mutation) and 100 simulations under neutrality (with s = 0 and no fixation required). In each simulation, 100 individuals were sampled from population a, 100 from population b and 10 from the outgroup population c. This emulates a situation in which only a few individuals have been sequenced from the outgroup, while large numbers of sequences are available from the tests (e.g. two populations of present-day humans). For each simulation, we took the maximum value at a region in the neighborhood of the central mutation (+/- 0.5 cM) and used those values to compute ROC curves under the two models.
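The ROC construction described here can be sketched in a few lines of Python. Each simulation contributes one score (the maximum statistic near the central mutation), and the curve is traced by sweeping a threshold over the scores from simulations under selection and under neutrality. The function names and the threshold convention (a score counts as a detection when it is at or above the threshold) are assumptions for this illustration, not a description of the scripts used for Figure 2.

```python
def roc_points(selected_scores, neutral_scores):
    """(false positive rate, true positive rate) pairs, from strictest to loosest threshold."""
    thresholds = sorted(set(selected_scores) | set(neutral_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(x >= t for x in selected_scores) / len(selected_scores)
        fpr = sum(x >= t for x in neutral_scores) / len(neutral_scores)
        points.append((fpr, tpr))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0 for (x0, y0), (x1, y1) in zip(points, points[1:]))

def sensitivity_at_specificity(selected_scores, neutral_scores, specificity=0.95):
    """Fraction of selected simulations above the cutoff that yields the given specificity.

    Assumes 0 < specificity <= 1; the cutoff is taken from the sorted neutral scores.
    """
    ranked = sorted(neutral_scores)
    allowed_false_positives = int((1.0 - specificity) * len(ranked) + 1e-9)
    threshold = ranked[len(ranked) - allowed_false_positives - 1]
    return sum(x > threshold for x in selected_scores) / len(selected_scores)

# Toy scores standing in for 100 neutral and 100 selection simulations.
neutral = list(range(100))
selected = list(range(90, 190))
print(auc(roc_points(selected, neutral)))
print(sensitivity_at_specificity(selected, neutral, 0.95))
```

The second function corresponds to statements of the form "at a specificity of 95%, the test has X% sensitivity".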

Figure 2. ROC curves for performance of 3P-CLR(Int) and two variants of XP-CLR in detecting selective sweeps that occurred before the split of two populations a and b, under different demographic models.

In this case, the outgroup panel from population c contained 10 haploid genomes. The two sister population panels (from a and b) have 100 haploid genomes each.

When the split times are recent or moderately ancient (models A to D), 3P-CLR(Int) outperforms the two versions of XP-CLR. When the split times are very ancient (models E and F), none of the tests perform well. The root mean squared error (RMSE) of the genetic distance between the true selected site and the highest scored window is comparable across tests in all six scenarios (Figure S2). Finally, Figures S1 and S3 show the ROC curves and RMSE plots, respectively, for a case in which 100 individuals were sampled from all three populations (including the outgroup), with similar results.

Figure S1. ROC curves for performance of 3P-CLR(Int) and two variants of XP-CLR in detecting selective sweeps that occurred before the split of two populations a and b, under different demographic models.

In this case, the outgroup panel from population c contained 100 haploid genomes. The two sister population panels (from a and b) have 100 haploid genomes each.

Figure S2. Root-mean squared error for the location of the sweep inferred by 3P-CLR(Int) and two variants of XP-CLR under different demographic scenarios.

In this case, the outgroup panel from population c contained 10 haploid genomes and the two sister population panels (from a and b) have 100 haploid genomes each.

Figure S3. Root-mean squared error for the location of the sweep inferred by 3P-CLR(Int) and two variants of XP-CLR under different demographic scenarios.

In this case, the outgroup panel from population c contained 100 haploid genomes and the two sister population panels (from a and b) have 100 haploid genomes each.

Importantly, the usefulness of 3P-CLR(Int) resides not just in its performance at detecting selective sweeps in the ancestral population, but in its specific sensitivity to that particular type of event. Because the test relies on correlated allele frequency differences in both population a and population b (relative to the outgroup), selective sweeps that are specific to only one of the populations will not lead to high 3P-CLR(Int) scores. Figure 3 shows ROC curves in two scenarios in which a selective sweep event occurred only in population a (Models I and J in Table 1), using 100 sampled individuals from each of the 3 populations. Here, XP-CLR performs well, but 3P-CLR(Int) shows almost no sensitivity to the recent sweep, under reasonable specificity cutoffs. For example, in Model I, at a specificity of 95%, XP-CLR has 80% sensitivity, while at the same specificity, 3P-CLR(Int) only has 14% sensitivity. One can compare this to the same demographic scenario but with selection occurring in the ancestral population (Model C, Figure S1), where at 95% specificity, XP-CLR has 69% sensitivity, while 3P-CLR(Int) has 83% sensitivity.

Figure 3. 3P-CLR(Int) is tailored to detect selective events that happened before the split t_ab, so it is largely insensitive to sweeps that occurred after the split.

ROC curves show performance of 3P-CLR(Int) and two variants of XP-CLR for models where selection occurred in population a after its split from b.

4.2 Selection in Eurasians

We first applied 3P-CLR to modern human data from the 1000 Genomes Project [12]. We used the African-American recombination map [22] to convert physical distances into genetic distances. We focused on two populations (Europeans and East Asians), using Africans as the outgroup population (Figure S4.A). We randomly sampled 100 individuals from each population and obtained sample derived allele frequencies every 10 SNPs in the genome. We then calculated likelihood ratio statistics by a sliding window approach, where we sampled a “central SNP” once every 20 SNPs. The central SNP in each window was the candidate beneficial SNP for that window. We set the window size to 0.25 cM, and randomly sampled 100 SNPs from each window, centered around the candidate beneficial SNP. In each window, we calculated 3P-CLR to test for selection at three different branches of the population tree: the terminal branch leading to Europeans (3P-CLR Europe), the terminal branch leading to East Asians (3P-CLR East Asia) and the ancestral branch of Europeans and East Asians (3P-CLR Eurasia). Results are shown in Figure 4. For each scan, we selected the windows in the top 99.9% quantile of scores and merged them together if they were contiguous. Tables 2, 3 and 4 show the top hits for Europeans, East Asians and the ancestral Eurasian branch, respectively.
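The last step of this procedure, keeping windows above the 99.9% quantile and merging contiguous ones, can be sketched as follows. This is an illustration under stated assumptions: windows are given as hypothetical (start, end, score) tuples sorted by position on one chromosome, "contiguous" is taken to mean overlapping or abutting coordinates, and the quantile is read off the sorted scores with a simple index rule. The function name and the output fields (which mirror the Win start, Win end, Win max and Score max columns described for the tables) are choices made for the example.

```python
def top_regions(windows, quantile=0.999):
    """Merge contiguous top-scoring windows into candidate regions.

    windows: list of (start, end, score) tuples sorted by start position.
    """
    scores = sorted(score for _, _, score in windows)
    cutoff = scores[min(int(quantile * len(scores)), len(scores) - 1)]
    regions = []
    for start, end, score in windows:
        if score < cutoff:
            continue
        if regions and start <= regions[-1]["end"]:
            # Overlaps or abuts the previous top window: extend that region.
            region = regions[-1]
            region["end"] = max(region["end"], end)
            if score > region["max_score"]:
                region["max_score"] = score
                region["max_window"] = (start, end)
        else:
            regions.append({"start": start, "end": end,
                            "max_score": score, "max_window": (start, end)})
    return regions

# Toy scan with a 50% cutoff so that the example stays small.
scan = [(0, 10, 1.0), (5, 15, 9.0), (10, 20, 8.0), (30, 40, 2.0), (50, 60, 7.0), (70, 80, 0.5)]
for region in top_regions(scan, quantile=0.5):
    print(region)
```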

Figure S4. A. Three-population tree separating Europeans, East Asians and Africans. B. Three-population tree separating Eurasians, Africans and archaic humans (Neanderthal+Denisova).

Table 2. Top hits for 3P-CLR run on the European terminal branch, using Africans as the outgroup.

We show the windows in the top 99.9% quantile of scores. Windows were merged together if they were contiguous. Win max = Location of window with maximum score. Win start = left-most end of left-most window for each region. Win end = right-most end of right-most window for each region. All positions were rounded to the nearest 100 bp. Score max = maximum score within region.

Table 3. Top hits for 3P-CLR run on the East Asian terminal branch, using Africans as the outgroup.

Table 4. Top hits for 3P-CLR run on the Eurasian ancestral branch, using Africans as the outgroup.

Figure 4. 3P-CLR scan of Europeans (upper panel), East Asians (middle panel) and the ancestral population to Europeans and East Asians (lower panel), using Africans as the outgroup in all 3 cases.

The red line denotes the 99.9% quantile cutoff.

We observe several genes that have been identified in previous selection scans. In the East Asian branch, one of the top hits is EDAR. This gene codes for a protein involved in hair thickness and incisor tooth morphology [23,24]. It has been repeatedly identified in earlier selection scans as having undergone a sweep in East Asians [25,26].

Furthermore, 3P-CLR allows us to narrow down the specific time at which selection occurred in the history of particular populations. For example, ref. [1] performed a scan of the genomes of East Asians using XP-CLR with Africans as the outgroup, and identified a number of genes as being under selection. 3P-CLR confirms this signal in several of these loci when looking specifically at the East Asian branch: CYP26B1, EMX1, SPR, SFXN5, SLC30A9, PPARA, PKDREJ, GTSE1, TRMU, CELSR1, PINX1, XKR6, CD226, ACD, PARD6A, GFOD2, RANBP10, TSNAXIP1, CENPT, THAP11, NUTF2, CDH16, RRAD, FAM96B, CES2, CBFB, C16orf70, TRADD, FBXL8, HSF4, NOL3, EXOC3L1, E2F4, ELMO3, LRRC29, FHOD1, SLC9A5, PLEKHG4, LRRC36, ZDHHC1, HSD11B2, ATP6V0D1, AGRP, FAM65A, CTCF and RLTPR. However, when applied to the ancestral Eurasian branch, 3P-CLR finds some genes that were previously found in the XP-CLR analysis of East Asians, but that are not among the top hits in 3P-CLR applied to the East Asian branch: COMMD3, BMI1, SPAG6, CD226, SLC30A9, LONP2, SIAH1, ABCC11 and ABCC12. This suggests selection in these regions occurred earlier, i.e. before the European-East Asian split. Figure 5 shows a comparison between the 3P-CLR scores for the three branches in the region containing genes BMI1 (a proto-oncogene [27]) and SPAG6 (involved in sperm motility [28]). In that figure, the score within each window was standardized using its chromosome-wide mean and standard deviation, to make a fair comparison. One can observe that the signal of Eurasia-specific selection is evidently stronger than the other two signals.
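The standardization used to compare the three scans amounts to a z-score per chromosome. A minimal Python sketch is given below; the function name is illustrative, and the use of the sample standard deviation (rather than the population standard deviation) is an assumption, since the text does not say which one was used.

```python
import statistics

def standardize(scores):
    """Subtract the chromosome-wide mean and divide by the chromosome-wide standard deviation."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (assumption)
    return [(x - mean) / sd for x in scores]

# Toy scores for one chromosome from one scan.
print(standardize([1.0, 2.0, 3.0]))
```

Putting each scan on this common scale is what allows raw scores from branches with different drift times to be overlaid in a single plot.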

Figure 5. 3P-CLR scan of Europeans (blue), East Asians (black) and the ancestral Eurasian population (red) reveals the region containing genes SPAG6 and BMI1 to be candidates for selection in the ancestral population.

To make a fair comparison, all 3P-CLR scores were standardized by subtracting the chromosome-wide mean from each window and dividing the resulting score by the chromosome-wide standard deviation. The image was built using the GenomeGraphs package in Bioconductor.

Other selective events that 3P-CLR infers to have occurred in Eurasians include the region containing HERC2 and OCA2, which are major determinants of eye color [29-31]. There is also evidence that these genes underwent selection more recently in the history of Europeans [32], which could suggest an extended period of selection - perhaps influenced by migrations between Asia and Europe - or repeated selective events at the same locus.

When running 3P-CLR to look for selection specific to Europe, we find that TYRP1 and MYO5A, which play a role in human skin pigmentation [33-36], are among the top hits. Both of these genes have been previously found to be under strong selection in Europe [37], using a statistic called iHS, which measures extended patterns of homozygosity that are characteristic of selective sweeps. Interestingly, a change in the gene TYRP1 has also been found to cause a blonde hair phenotype in Melanesians [38].

4.3 Selection in ancestral modern humans

We applied 3P-CLR to modern human data combined with recently sequenced archaic human data [10]. We sought to find selective events that occurred in modern humans after their split from archaic groups. We used the combined Neanderthal and Denisovan high-coverage genomes [9,10] as the outgroup population, and, for our two test populations, we randomly sampled 100 Eurasian genomes and 100 African genomes from the 1000 Genomes data (Figure S4.B). We used previously estimated drift times as fixed parameters [10], and tested for selective events that occurred more anciently than the split of Africans and Eurasians, but more recently than the split from Neanderthals. We ran 3P-CLR using 0.25 cM windows as above, but also verified that the density of scores was robust to the choice of window size and spacing (Figure S5). As before, we selected the windows in the top 99.9% quantile of scores and merged them together if they were contiguous. Table 5 and Figure S6 show the top hits. To find putative candidates for the beneficial variants in each region, we queried the catalogs of modern human-specific high-frequency or fixed derived changes that are ancestral in the Neanderthal and/or the Denisova genomes [10,39].

Figure S5. Comparison of 3P-CLR on the modern human ancestral branch under different window sizes and central SNP spacing.

The red density is the density of standardized scores for 3P-CLR run using 0.25 cM windows, 100 SNPs per window and a spacing of 20 SNPs between each central SNP. The blue dashed density is the density of standardized scores for 3P-CLR run using 1 cM windows, 200 SNPs per window and a spacing of 80 SNPs between each central SNP.

Figure S6. 3P-CLR scan of the ancestral branch to Africans and Eurasians, using the Denisovan and Neanderthal genomes as the outgroup.

The red line denotes the 99.9% quantile cutoff.

Table 5. Top hits for 3P-CLR run on the ancestral branch to Eurasians and Africans, using archaic humans as the outgroup.

We observe several genes that have been identified in previous scans that looked for selection in modern humans after their split from archaic groups [8,10]: SIPA1L1, ANAPC10, ABCE1, RASA1, CCNH, KCNJ3, HBP1, COG5, GPR22, DUS4L, BCAP29, CADPS2, RNF133, RNF148, FAM172A, POU5F2, FGF7, RABGAP1, GPR21, STRBP, SMURF1, GABRA2, ALMS1, PVRL3, EHBP1, VPS54, OTX1, UGP2, HCN1, GTDC1, ZEB2, OIT3, USP54, MYOZ1 and DPYD. One of our strongest candidate genes among these is ANAPC10. This gene is a core subunit of the cyclosome, is involved in progression through the cell cycle [40], and may play a role in oocyte maturation and human T-lymphotropic virus infection (KEGG pathway [41]). ANAPC10 is noteworthy because it was found to be significantly differentially expressed in humans compared to other great apes and macaques: it is up-regulated in the testes [42]. The gene also contains two intronic changes that are fixed derived in modern humans, ancestral in both Neanderthals and Denisovans and that have evidence for being highly disruptive, based on a composite score that combines conservation and regulatory data (PHRED-scaled C-scores > 11 [10,43]). The changes, however, appear not to lie in any obvious regulatory region [44,45].

We also find ADSL among the list of candidates. This gene is known to contain a nonsynonymous change that is fixed in all present-day humans but homozygous ancestral in the Neanderthal genome, the Denisova genome and two Neanderthal exomes [39] (Figure 6.A). It was previously identified as lying in a region with strong support for positive selection in modern humans, using summary statistics implemented in an ABC method [11]. The gene is interesting because it is one of the members of the Human Phenotype ontology category “aggression hyperactivity” which is enriched for nonsynonymous changes that occurred in the modern human lineage after the split from archaic humans [39,46]. ADSL codes for adenylosuccinase, an enzyme involved in purine metabolism [47]. A deficiency of adenylosuccinase can lead to apraxia, speech deficits, delays in development and abnormal behavioral features, like hyperactivity and excessive laughter [48]. The nonsynonymous mutation (A429V) is in the C-terminal domain of the protein (Figure 6.B) and lies in a highly conserved position (primate PhastCons = 0.953; GERP score = 5.67 [43,49,50]). The ancestral amino acid is conserved across the tetrapod phylogeny, and the mutation is only three residues away from the most common causative SNP for severe adenylosuccinase deficiency [51-55]. The change has the highest probability of being disruptive to protein function, out of all the nonsynonymous modern-human-specific changes that lie in the top-scoring regions (C-score = 17.69). While ADSL is an interesting candidate and lies in the center of the inferred selected region (Figure 6.A), there are other genes in the region too, including TNRC6B and MKL1. TNRC6B may be involved in miRNA-guided gene silencing [56], while MKL1 may play a role in smooth muscle differentiation [57], and has been associated with acute megakaryocytic leukemia [58].

Figure 6. ADSL is a candidate for selection in the modern human lineage, after the split from Neanderthal and Denisova.

A) One of the top-scoring regions when running 3P-CLR on the modern human lineage contains genes TNRC6B, ADSL, MKL1, MCHR1, SGSM3 and GRAP2. The most disruptive nonsynonymous modern-human-specific change in the entire list of top regions is in an exon of ADSL and is fixed derived in all present-day humans but ancestral in archaic humans. It is highly conserved across tetrapods and lies only 3 residues away from the most common mutation leading to severe adenylosuccinase deficiency. B) The gene codes for a tetrameric protein. The mutation is in the C-terminal domain of each tetramer (red arrows), which are near the active sites (light blue arrows). Scores in panel A were standardized using the chromosome-wide mean and standard deviation. Vertebrate alignments were obtained from the UCSC genome browser (Vertebrate Multiz Alignment and Conservation track) and the image was built using the GenomeGraphs package in Bioconductor and Cn3D.

RASA1 was also a top hit in a previous scan for selection [8], and was additionally inferred to have a high Bayes factor in favor of selection in ref. [11]. The gene codes for a protein involved in the control of cellular differentiation [59]. Human diseases associated with RASA1 include basal cell carcinoma [60] and arteriovenous malformation [61,62].

The GABA_A gene cluster in chromosome 4p12 is also among the top regions. The genes within the putatively selected region code for three of the subunits of the GABA_A receptor (GABRA2, GABRA4, GABRB1), a ligand-gated ion channel that plays a key role in synaptic inhibition in the central nervous system (see review by ref. [63]). GABRA2 is significantly associated with the risk of alcohol dependence in humans [64], perception of pain [65] and asthma [66]. In turn, GABRA4 is associated with autism risk [67,68].

Two other candidate genes that may be involved in brain development are FOXG1 and CADPS2. FOXG1 was not identified in any of the previous selection scans, and codes for a protein called forkhead box G1, which plays an important role during brain development. Mutations in this gene have been associated with a slow-down in brain growth during childhood resulting in microcephaly, which in turn causes various intellectual disabilities [69,70]. CADPS2 was identified in ref. [8] as a candidate for selection, and has been associated with autism [71]. The gene has been suggested to be specifically important in the evolution of all modern humans, as it was not found to be selected earlier in great apes or later in particular modern human populations [72].

Finally, we find a signal of selection in a region containing the genes EHBP1 and OTX1. This region was identified in both previous scans for modern human selection [8,10]. EHBP1 codes for a protein involved in endocytic trafficking [73] and has been associated with prostate cancer [74]. OTX1 is a homeobox family gene that may play a role in brain development [75]. Interestingly, EHBP1 contains a single-nucleotide intronic change (chr2:63206488) that is almost fixed in all present-day humans and homozygous ancestral in Neanderthal and Denisova [10]. This change is also predicted to be highly disruptive (C-score = 13.1) and lies in a position that is extremely conserved across primates (PhastCons = 0.942), mammals (PhastCons = 1) and vertebrates (PhastCons = 1). The change is 18 bp away from the nearest splice site and overlaps a VISTA conserved enhancer region (element 1874) [76], which suggests a putative regulatory role for the change.

4.4 Modern human-specific high-frequency changes in GWAS catalog

We overlapped the genome-wide association studies (GWAS) database [77,78] with the list of fixed or high-frequency modern human-specific changes that are ancestral in archaic humans [10] and that are located within our top putatively selected regions in modern humans (Table 6). None of the resulting SNPs are completely fixed derived, because GWAS can only yield associations from sites that are segregating. Among these SNPs, the one with the highest probability of being disruptive (rs10003958, C-score = 16.58, GERP score = 6.07) is located in a highly conserved regulatory (“strong enhancer”) region in the RAB28 gene [44,45] (primate PhastCons = 0.951), and is significantly associated with obesity [79] (Figure 7.A). Interestingly, the region containing RAB28 is inferred to have been under positive selection in both the modern human and the Eurasian ancestral branches (Tables 4,5). In line with this evidence, the derived allele of rs10003958 is absent in archaic humans, at very high frequencies in Eurasians (> 94%), and only at moderately high frequencies in Africans (74%) (Figure 7.B).
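The overlap step described above amounts to an interval-membership query. The following is a minimal sketch of one way to do it, not the pipeline actually used for Table 6. The names `Region`, `Snp` and `snps_in_regions` are hypothetical, the coordinates in the usage note are toy values rather than real positions, and the sketch assumes that regions on the same chromosome do not overlap one another.

```python
from bisect import bisect_right
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # inclusive
    end: int    # inclusive


class Snp(NamedTuple):
    chrom: str
    pos: int
    snp_id: str


def snps_in_regions(snps, regions):
    """Return the SNPs that fall inside any region.

    Assumes regions on the same chromosome are non-overlapping, so only
    the last region starting at or before the SNP needs to be checked.
    """
    by_chrom = {}
    for r in regions:
        by_chrom.setdefault(r.chrom, []).append((r.start, r.end))
    for intervals in by_chrom.values():
        intervals.sort()

    hits = []
    for s in snps:
        intervals = by_chrom.get(s.chrom, [])
        # Index of the last interval whose start is <= the SNP position.
        i = bisect_right(intervals, (s.pos, float("inf"))) - 1
        if i >= 0 and intervals[i][1] >= s.pos:
            hits.append(s)
    return hits
```

With toy inputs such as `regions = [Region("chr4", 100, 200)]` and `snps = [Snp("chr4", 150, "toy1"), Snp("chr4", 250, "toy2")]`, only `toy1` is returned. Sorting the regions once keeps each lookup logarithmic in the number of regions per chromosome.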

We also find a highly disruptive SNP (rs10171434, C-score = 8.358) associated with urinary metabolites [80] and suicidal behavior in patients with mood disorders [81]. The SNP is located in an enhancer regulatory feature [44,45] located between genes PELI1 and VPS54, in the same putatively selected region as genes EHBP1 and OTX1 (see above). Finally, there is a highly disruptive SNP (rs731108, C-score = 10.31) that is associated with renal cell carcinoma [82]. This SNP is also located in an enhancer regulatory feature [44,45], in an intron of ZEB2. In this last case, though, only the Neanderthal genome has the ancestral state, while the Denisova genome carries the modern human variant.

Table 6. Overlap between GWAS catalog and catalog of modern human-specific high-frequency changes in the top modern human selected regions.

Chr = chromosome. Pos = position (hg19). ID = SNP rs ID. Hum = Present-day human major allele. Anc = Human-Chimpanzee ancestor allele. Arch = Archaic human allele states (Altai Neanderthal, Denisova) where H=human-like allele and A=ancestral allele. Freq = present-day human derived frequency. Cons = consequence. C = C-score. PubMed = PubMed article ID for GWAS study.

Figure 7. RAB28 is a candidate for selection in both the Eurasian and the modern human ancestral lineages.

A) The gene lies in the middle of a 3P-CLR peak for both ancestral populations. The putatively selected region also contains several SNPs that are significantly associated with obesity and that are high-frequency derived in present-day humans (> 93%) but ancestral in archaic humans (red dots). The SNP with the highest C-score among these (rs10003958, pink circle) lies in a highly conserved strong enhancer region adjacent to the last exon of the gene. Color code for ChromHMM segmentation regions in UCSC genome browser: red = promoter, orange = strong enhancer, yellow = weak enhancer, green = weak transcription, blue = insulator. The image was built using the GenomeGraphs package in Bioconductor and the UCSC Genome Browser. B) Derived allele frequencies of SNP rs10003958 in the Denisova and Neanderthal genomes, and in different 1000 Genomes continental populations. AFR = Africans. AMR = Native Americans. SAS = South Asians. EUR = Europeans. EAS = East Asians.

5 Discussion

We have developed a new method called 3P-CLR, which allows us to detect positive selection along the genome. The method is based on an earlier test (XP-CLR [1]) that uses linked allele frequency differences between two populations to detect population-specific selection. Unlike XP-CLR, however, 3P-CLR allows us to distinguish between selective events that occurred before and after the split of two populations. Our method also has some similarities to an earlier method developed by Schlebusch et al. [83], which used an F_st-like score to detect selection ancestral to two populations. In that case, though, the authors used summary statistics and did not explicitly model the process leading to allele frequency differentiation.

We used our method to confirm previously found candidate genes in particular human populations, like EDAR, TYRP1 and HERC2, and find some novel candidates too (Tables 2,3,4). Additionally, we can infer that certain genes, which were previously known to have been under selection in East Asians (like SPAG6), are more likely to have undergone a sweep in the population ancestral to both Europeans and East Asians than in East Asians only.

We also used 3P-CLR to detect selective events that occurred in the ancestors of modern humans, after their split from Neanderthals and Denisovans (Table 5). These events could perhaps have led to the spread of phenotypes that set modern humans apart from other hominin groups. We find several interesting candidates, like SIPA1L1, ADSL, RASA1, OTX1, EHBP1, FOXG1, RAB28 and ANAPC10, some of which were previously detected using other types of methods [8,10,11].

An advantage of differentiation-based tests like XP-CLR and 3P-CLR is that, unlike other patterns detected by tests of neutrality (like extended haplotype homozygosity [84]) that are exclusive to hard sweeps, the patterns that both XP-CLR and 3P-CLR are tailored to find are based on regional allele frequency differences between populations. These patterns can also be produced by soft sweeps from standing variation or by partial sweeps [1], and there is some evidence that the latter phenomena may have been more important than classic sweeps during human evolutionary history [85].

Another advantage of both XP-CLR and 3P-CLR is that they do not rely on an arbitrary division of genomic space. Unlike other methods which require the partition of the genome into small windows of fixed size, our composite likelihood ratios can theoretically be computed over windows that are as big as each chromosome, while only switching the central candidate site at each window. This is because the likelihood ratios use the genetic distance to the central SNP as input. SNPs that are very far away from the central SNP will not contribute much to the likelihood function of both the neutral and the selection models, while those that are close to it will. While we heuristically limit the window size in our implementation in the interest of speed, this can be arbitrarily adjusted by the user as needed. The use of genetic distance in the likelihood function also allows us to take advantage of the spatial distribution of SNPs as an additional source of information, rather than only relying on patterns of population differentiation restricted to tightly linked SNPs.
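The idea of sliding a central candidate site along the chromosome while capping the window by genetic distance can be sketched as follows. This is an illustration only and is not taken from the 3P-CLR implementation: the function names and the `max_dist_cm` parameter are hypothetical, and genetic positions are assumed to be given in centimorgans and sorted in ascending order.

```python
from bisect import bisect_left, bisect_right


def window_indices(gen_pos_cm, center_idx, max_dist_cm):
    """Indices of SNPs within max_dist_cm of the central SNP.

    gen_pos_cm must be sorted in ascending order.
    """
    center = gen_pos_cm[center_idx]
    lo = bisect_left(gen_pos_cm, center - max_dist_cm)
    hi = bisect_right(gen_pos_cm, center + max_dist_cm)
    return range(lo, hi)


def distances_to_center(gen_pos_cm, center_idx, max_dist_cm):
    """(index, genetic distance) pairs for the linked SNPs in the window.

    The central SNP itself is skipped; the distances are the quantities
    a per-site likelihood would take as input.
    """
    center = gen_pos_cm[center_idx]
    return [
        (i, abs(gen_pos_cm[i] - center))
        for i in window_indices(gen_pos_cm, center_idx, max_dist_cm)
        if i != center_idx
    ]
```

Because the window is defined in genetic rather than physical distance, raising `max_dist_cm` only adds distant SNPs whose contribution to both likelihoods is small, which is why the cap can be treated as a speed heuristic.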

3P-CLR also has an advantage over HMM-based selection methods, like the one implemented in ref. [10]. The likelihood ratio scores obtained from 3P-CLR can provide an idea of how credible a selection model is for a particular region, relative to the rest of the genome. The HMM-based method previously used to scan for selection in modern humans [10] can only rank putatively selected regions by genetic distance, but cannot output a statistical measure that may indicate how likely each region is to have been selected in ancient times. In contrast, 3P-CLR provides a composite likelihood ratio score, which allows for a statistically rigorous way to compare the neutral model and a specific selection model (for example, recent or ancient selection). The score also gives an idea of how much fainter the signal of ancient selection in modern humans is, relative to recent selection specific to a particular present-day population. For example, the outliers from Figure 4 have much higher scores (relative to the rest of the genome) than the outliers from Figure S6. This may be due to both the difference in time scales in the two sets of tests and to the uncertainty that comes from estimating outgroup allele frequencies using only two archaic genomes. This pattern can also be observed in Figure S7, where the densities of the scores looking for patterns of ancient selection have much shorter tails than the densities of scores looking for patterns of recent selection.

Figure S7. Genome-wide densities of each of the 3P-CLR scores described in this work.

The distributions of scores testing for recent selection (Europeans and East Asians) have much longer tails than the distributions of scores testing for more ancient selection (Modern Humans and Eurasians). All scores were standardized using their genome-wide means and standard deviations.
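The standardization mentioned in this caption is an ordinary z-score transformation. A minimal sketch is given below; the function name is hypothetical, and the use of the population (rather than sample) standard deviation is an assumption, since the text does not specify which was used.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Center scores on their mean and scale by their standard deviation.

    Uses the population standard deviation (an assumption of this sketch).
    """
    mu = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        raise ValueError("all scores are identical; cannot standardize")
    return [(s - mu) / sd for s in scores]
```

Standardizing each set of scores separately puts them on a common scale, so that the tail lengths of the four distributions can be compared directly.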

Like XP-CLR, 3P-CLR is largely robust to the underlying population history, even when this is wrongly specified, as it relies on looking for extreme allele frequency differences that are restricted to a particular region. We have noticed, however, that these types of tests may not be robust to admixture events from the outgroup population used. For example, we observe that 3P-CLR finds evidence for selection in the region containing HYAL2 (involved in the cellular response to ultraviolet radiation), when run in the East Asian branch. This makes sense, as a variant of this gene is known to have been pushed to high frequencies by selection specifically in East Asians. However, this variant likely came from Neanderthals via introgression [86,87]. While we do not observe that the HYAL2 region is a top hit in either the European or the Eurasian ancestral branches, we do observe it as a top hit in the modern human ancestral branch. This is puzzling, given that the selected haplotype should not have been introduced into modern humans until after the split of Africans and non-Africans. One explanation for the appearance of this region in both the East Asian and the modern human top hits is that the introgression event could perhaps confound the signal that 3P-CLR targets, as we are assuming no admixture in our demographic model. Another possibility is that the region has undergone multiple episodes of selection. Incorporating admixture into the modeling procedure may help to disentangle this pattern better, but we leave this to future work.

A further limitation of composite likelihood ratio tests is that the composite likelihood calculated for each model under comparison is obtained from a product of individual likelihoods at each site, and so it underestimates the correlation that exists between SNPs due to linkage effects [1,16,17,88]. One way to mitigate this problem is by using corrective weights based on linkage disequilibrium (LD) statistics calculated on the outgroup population [1]. Our implementation of 3P-CLR allows the user to incorporate such weights, if appropriate LD statistics are available from the outgroup. However, in cases where these are unreliable, it may not be possible to fully correct for this (for example, when only a few unphased genomes are available, as in the case of the Neanderthal and Denisova genomes).
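The structure of a weighted composite likelihood can be sketched as below. This is a generic illustration and not the 3P-CLR code: the function names are hypothetical, the per-site likelihoods are assumed to have been computed elsewhere under each model, and the returned log ratio omits any constant factor or maximization over parameters that the actual test statistic may involve.

```python
import math


def composite_log_likelihood(site_likelihoods, weights=None):
    """Weighted sum of per-site log-likelihoods.

    With no weights, every site counts fully, which corresponds to
    treating the sites as independent. LD-based weights below 1 shrink
    the contribution of sites that are correlated with their neighbors.
    """
    if weights is None:
        weights = [1.0] * len(site_likelihoods)
    if len(weights) != len(site_likelihoods):
        raise ValueError("need exactly one weight per site")
    return sum(w * math.log(lik) for w, lik in zip(weights, site_likelihoods))


def composite_log_ratio(selection_liks, neutral_liks, weights=None):
    """Log composite likelihood ratio of a selection model vs neutrality."""
    return (composite_log_likelihood(selection_liks, weights)
            - composite_log_likelihood(neutral_liks, weights))
```

The same weights are applied to both models, so down-weighting a site reduces its influence on the ratio without favoring either model.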

While 3P-CLR relies on integrating over the possible allele frequencies in the ancestors of populations a and b (formula 10), one could envision using ancient DNA to avoid this step. Thus, if enough genomes could be sampled from that ancestral population, one could use the sample frequency in the ancient set of genomes as a proxy for the ancestral population frequency. This may soon be possible, as several early modern human genomes have already been sequenced in recent years [89-91].

Though we have limited ourselves to a three-population model in this manuscript, it should be straightforward to expand our model to a larger number of populations, albeit with additional costs in terms of speed and memory. Our method relies on a similar framework to the demographic inference method implemented in TreeMix [92], which can estimate complex population trees that include migration events, using genome-wide data. With a more complex modeling framework, it may be possible to estimate the time and strength of selective events with better resolution, and to incorporate additional demographic forces, like continuous migration between populations or pulses of admixture.

Acknowledgments

We thank Montgomery Slatkin, Rasmus Nielsen, Joshua Schraiber, Nicolas Duforet-Frebourg, Emilia Huerta-Sanchez, Hua Chen, Nick Patterson, David Reich, Joachim Hermisson, Graham Coop and members of the Slatkin and Nielsen labs for helpful advice and discussions. This work was supported by NIH grant R01-GM40282 to Montgomery Slatkin.

Footnotes

* Email: fernandoracimo{at}gmail.com

References

↵
Chen H, Patterson N, Reich D (2010) Population differentiation as a test for selective sweeps. Genome research 20: 393–402.
OpenUrl Abstract/FREE Full Text
↵
Smith JM, Haigh J (1974) The hitch-hiking effect of a favourable gene. Genetical research 23: 23–35.
OpenUrl CrossRef PubMed Web of Science
↵
Lewontin R, Krakauer J (1973) Distribution of gene frequency as a test of the theory of the selective neutrality of polymorphisms. Genetics 74: 175–195.
OpenUrl Abstract/FREE Full Text
↵
Akey JM, Zhang G, Zhang K, Jin L, Shriver MD (2002) Interrogating a high-density snp map for signatures of natural selection. Genome research 12: 1805–1814.
OpenUrl Abstract/FREE Full Text
Weir BS, Cardon LR, Anderson AD, Nielsen DM, Hill WG (2005) Measures of human population structure show heterogeneity among genomic regions. Genome research 15: 1468–1476.
OpenUrl Abstract/FREE Full Text
Oleksyk TK, Zhao K, Francisco M, Gilbert DA, O’Brien SJ, et al. (2008) Identifying selected regions from heterozygosity and divergence using a light-coverage genomic dataset from two human populations. PLoS ONE 3: e1712.
OpenUrl CrossRef PubMed
↵
Yi X, Liang Y, Huerta-Sanchez E, Jin X, Cuo ZXP, et al. (2010) Sequencing of 50 human exomes reveals adaptation to high altitude. Science 329: 75–78.
OpenUrl Abstract/FREE Full Text
↵
Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, et al. (2010) A draft sequence of the neandertal genome. Science 328: 710–722.
OpenUrl Abstract/FREE Full Text
↵
Meyer M, Kircher M, Gansauge MT, Li H, Racimo F, et al. (2012) A high-coverage genome sequence from an archaic denisovan individual. Science 338: 222–226.
OpenUrl Abstract/FREE Full Text
↵
Prüfer K, Racimo F, Patterson N, Jay F, Sankararaman S, et al. (2014) The complete genome sequence of a neanderthal from the altai mountains. Nature 505: 43–49.
OpenUrl CrossRef GeoRef PubMed Web of Science
↵
Racimo F, Kuhlwilm M, Slatkin M (2014) A test for ancient selective sweeps and an application to candidate sites in modern humans. Molecular Biology and Evolution 31: 3344–3358.
OpenUrl CrossRef PubMed
↵
Consortium GP, et al. (2012) An integrated map of genetic variation from 1,092 human genomes. Nature 491: 56–65.
OpenUrl CrossRef PubMed Web of Science
↵
Nicholson G, Smith AV, Jónsson F, Gústafsson Ó, Stefánsson K, et al. (2002) Assessing population differentiation and isolation from single-nucleotide polymorphism data. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64: 695–715.
OpenUrl
↵
Durrett R, Schweinsberg J (2004) Approximating selective sweeps. Theoretical population biology 66: 129–138.
OpenUrl CrossRef PubMed Web of Science
↵
Fay JC, Wu CI (2000) Hitchhiking under positive darwinian selection. Genetics 155: 1405–1413.
OpenUrl Abstract/FREE Full Text
↵
Lindsay BG (1988) Composite likelihood methods. Contemporary Mathematics 80: 221–39.
OpenUrl CrossRef
↵
Varin C, Reid N, Firth D (2011) An overview of composite likelihood methods. Statistica Sinica 21: 5–42.
OpenUrl Web of Science
↵
Patterson N, Moorjani P, Luo Y, Mallick S, Rohland N, et al. (2012) Ancient admixture in human history. Genetics 192: 1065–1093.
OpenUrl Abstract/FREE Full Text
↵
Gutenkunst RN, Hernandez RD, Williamson SH, Bustamante CD (2009) Inferring the joint demographic history of multiple populations from multidimensional snp frequency data. PLoS genetics 5: e1000695.
OpenUrl CrossRef
↵
Excoffier L, Dupanloup I, Huerta-Sánchez E, Sousa VC, Foll M (2013) Robust demographic inference from genomic and snp data. PLoS genetics 9: e1003905.
OpenUrl CrossRef PubMed
↵
Messer PW (2013) Slim: simulating evolution with selection and linkage. Genetics 194: 1037–1039.
OpenUrl Abstract/FREE Full Text
↵
Hinch AG, Tandon A, Patterson N, Song Y, Rohland N, et al. (2011) The landscape of recombination in african americans. Nature 476: 170–175.
OpenUrl CrossRef PubMed Web of Science
↵
Fujimoto A, Kimura R, Ohashi J, Omi K, Yuliwulandari R, et al. (2008) A scan for genetic determinants of human hair morphology: Edar is associated with asian hair thickness. Human Molecular Genetics 17: 835–843.
OpenUrl CrossRef PubMed Web of Science
↵
Kimura R, Yamaguchi T, Takeda M, Kondo O, Toma T, et al. (2009) A common variation in edar is a genetic determinant of shovel-shaped incisors. The American Journal of Human Genetics 85: 528–535.
OpenUrl CrossRef PubMed
↵
Sabeti PC, Varilly P, Fry B, Lohmueller J, Hostetter E, et al. (2007) Genome-wide detection and characterization of positive selection in human populations. Nature 449: 913–918.
OpenUrl CrossRef PubMed Web of Science
↵
Grossman SR, Shylakhter I, Karlsson EK, Byrne EH, Morales S, et al. (2010) A composite of multiple signals distinguishes causal variants in regions of positive selection. Science 327: 883–886.
OpenUrl Abstract/FREE Full Text
↵
Siddique HR, Saleem M (2012) Role of bmi1, a stem cell factor, in cancer recurrence and chemoresistance: preclinical and clinical evidences. Stem Cells 30: 372–378.
OpenUrl CrossRef PubMed Web of Science
↵
Sapiro R, Kostetskii I, Olds-Clarke P, Gerton GL, Radice GL, et al. (2002) Male infertility, impaired sperm motility, and hydrocephalus in mice deficient in sperm-associated antigen 6. Molecular and cellular biology 22: 6298–6305.
OpenUrl Abstract/FREE Full Text
↵
Eiberg H, Troelsen J, Nielsen M, Mikkelsen A, Mengel-From J, et al. (2008) Blue eye color in humans may be caused by a perfectly associated founder mutation in a regulatory element located within the herc2 gene inhibiting oca2 expression. Human genetics 123: 177–187.
OpenUrl CrossRef PubMed Web of Science
Han J, Kraft P, Nan H, Guo Q, Chen C, et al. (2008) A genome-wide association study identifies novel alleles associated with hair color and skin pigmentation. PLoS genetics 4: e1000074.
OpenUrl
↵
Branicki W, Brudnik U, Wojas-Pelc A (2009) Interactions between herc2, oca2 and mc1r may influence human pigmentation phenotype. Annals of human genetics 73: 160–170.
OpenUrl CrossRef PubMed Web of Science
↵
Mathieson I, Lazaridis I, Rohland N, Mallick S, Llamas B, et al. (2015) Eight thousand years of natural selection in europe. bioRxiv: 016477.
↵
Pastural E, Ersoy F, Yalman N, Wulffraat N, Grillo E, et al. (2000) Two genes are responsible for griscelli syndrome at the same 15q21 locus. Genomics 63: 299–306.
OpenUrl CrossRef PubMed Web of Science
Fukuda M, Kuroda T, Mikoshiba K (2002) Slac2-a/melanophilin, the missing link between rab27 and myosin va: implications of a tripartite protein complex for melanosome transport. The Journal of biological chemistry 277: 12432.
OpenUrl Abstract/FREE Full Text
Halaban R, Moellmann G (1990) Murine and human b locus pigmentation genes encode a glycoprotein (gp75) with catalase activity. Proceedings of the National Academy of Sciences 87: 4809–4813.
OpenUrl Abstract/FREE Full Text
↵
Sulem P, Gudbjartsson DF, Stacey SN, Helgason A, Rafnar T, et al. (2008) Two newly identified genetic determinants of pigmentation in europeans. Nature genetics 40: 835–837.
OpenUrl CrossRef PubMed Web of Science
↵
Voight BF, Kudaravalli S, Wen X, Pritchard JK (2006) A map of recent positive selection in the human genome. PLoS biology 4: e72.
OpenUrl CrossRef PubMed
↵
Kenny EE, Timpson NJ, Sikora M, Yee MC, Moreno-Estrada A, et al. (2012) Melanesian blond hair is caused by an amino acid change in tyrp1. Science 336: 554–554.
OpenUrl Abstract/FREE Full Text
↵
Castellano S, Parra G, Sánchez-Quinto FA, Racimo F, Kuhlwilm M, et al. (2014) Patterns of coding variation in the complete exomes of three neandertals. Proceedings of the National Academy of Sciences 111: 6666–6671.
OpenUrl Abstract/FREE Full Text
↵
Pravtcheva DD, Wise TL (2001) Disruption of apc10/doc1 in three alleles of oligosyndactylism. Genomics 72: 78–87.
OpenUrl CrossRef PubMed
↵
Kanehisa M, Goto S (2000) Kegg: kyoto encyclopedia of genes and genomes. Nucleic acids research 28: 27–30.
OpenUrl CrossRef PubMed Web of Science
↵
Brawand D, Soumillon M, Necsulea A, Julien P, Csárdi G, et al. (2011) The evolution of gene expression levels in mammalian organs. Nature 478: 343–348.
OpenUrl CrossRef PubMed Web of Science
↵
Kircher M, Witten DM, Jain P, O’Roak BJ, Cooper GM, et al. (2014) A general framework for estimating the relative pathogenicity of human genetic variants. Nature genetics 46: 310–315.
OpenUrl CrossRef PubMed
↵
Consortium EP, et al. (2012) An integrated encyclopedia of dna elements in the human genome. Nature 489: 57–74.
OpenUrl CrossRef PubMed Web of Science
↵
Rosenbloom KR, Dreszer TR, Long JC, Malladi VS, Sloan CA, et al. (2011) Encode whole-genome data in the ucsc genome browser: update 2012. Nucleic acids research: gkr1012.
↵
Robinson PN, Köhler S, Bauer S, Seelow D, Horn D, et al. (2008) The human phenotype ontology: a tool for annotating and analyzing human hereditary disease. The American Journal of Human Genetics 83: 610–615.
OpenUrl CrossRef PubMed Web of Science
↵
Van Keuren M, Hart I, Kao FT, Neve R, Bruns G, et al. (1987) A somatic cell hybrid with a single human chromosome 22 corrects the defect in the cho mutant (ade–i) lacking adenylosuccinase activity. Cytogenetic and Genome Research 44: 142–147.
OpenUrl CrossRef
↵
Gitiaux C, Ceballos-Picot I, Marie S, Valayannopoulos V, Rio M, et al. (2009) Misleading behavioural phenotype with adenylosuccinate lyase deficiency. European Journal of Human Genetics 17: 133–136.
OpenUrl CrossRef PubMed
Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, et al. (2005) Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome research 15: 1034–1050.
OpenUrl Abstract/FREE Full Text
Cooper GM, Goode DL, Ng SB, Sidow A, Bamshad MJ, et al. (2010) Single-nucleotide evolutionary constraint scores highlight disease-causing mutations. Nature methods 7: 250–251.
OpenUrl
Kmoch S, Hartmannová H, Stibùrková B, Krijt J, Zikánová M, et al. (2000) Human adenylosuccinate lyase (adsl), cloning and characterization of full-length cdna and its isoform, gene structure and molecular basis for adsl deficiency in six patients. Human molecular genetics 9: 1501–1513.
OpenUrl CrossRef PubMed Web of Science
Maaswinkel-Mooij P, Laan L, Onkenhout W, Brouwer O, Jaeken J, et al. (1997) Adenylosuccinase deficiency presenting with epilepsy in early infancy. Journal of inherited metabolic disease 20: 606–607.
OpenUrl CrossRef PubMed Web of Science
Marie S, Cuppens H, Heuterspreute M, Jaspers M, Tola EZ, et al. (1999) Mutation analysis in adenylosuccinate lyase deficiency: Eight novel mutations in the re-evaluated full adsl coding sequence. Human mutation 13: 197–202.
OpenUrl CrossRef PubMed Web of Science
Race V, Marie S, Vincent MF, Van den Berghe G (2000) Clinical, biochemical and molecular genetic correlations in adenylosuccinate lyase deficiency. Human molecular genetics 9: 2159–2165.
OpenUrl CrossRef PubMed
Edery P, Chabrier S, Ceballos-Picot I, Marie S, Vincent MF, et al. (2003) Intrafamilial variability in the phenotypic expression of adenylosuccinate lyase deficiency: a report on three patients. American Journal of Medical Genetics Part A 120: 185–190.
OpenUrl
↵
Meister G, Landthaler M, Peters L, Chen PY, Urlaub H, et al. (2005) Identification of novel argonaute-associated proteins. Current biology 15: 2149–2155.
OpenUrl CrossRef PubMed Web of Science
↵
Du KL, Chen M, Li J, Lepore JJ, Mericko P, et al. (2004) Megakaryoblastic leukemia factor-1 transduces cytoskeletal signals and induces smooth muscle cell differentiation from undifferentiated embryonic stem cells. Journal of Biological Chemistry 279: 17578–17586.
OpenUrl Abstract/FREE Full Text
↵
Mercher T, Busson-Le Coniat M, Monni R, Mauchauffé M, Khac FN, et al. (2001) Involvement of a human gene related to the drosophila spen gene in the recurrent t (1; 22) translocation of acute megakaryocytic leukemia. Proceedings of the National Academy of Sciences 98: 5776–5779.
OpenUrl Abstract/FREE Full Text
↵
Trahey M, Wong G, Halenbeck R, Rubinfeld B, Martin GA, et al. (1988) Molecular cloning of two types of gap complementary dna from human placenta. Science 242: 1697–1700.
OpenUrl Abstract/FREE Full Text
↵
Friedman E, Gejman PV, Martin GA, McCormick F (1993) Nonsense mutations in the c–terminal sh2 region of the gtpase activating protein (gap) gene in human tumours. Nature genetics 5: 242–247.
OpenUrl CrossRef PubMed Web of Science
↵
Eerola I, Boon LM, Mulliken JB, Burrows PE, Dompmartin A, et al. (2003) Capillary malformation–arteriovenous malformation, a new clinical and genetic disorder caused by rasa1 mutations. The American Journal of Human Genetics 73: 1240–1249.
OpenUrl CrossRef PubMed Web of Science
↵
Hershkovitz D, Bercovich D, Sprecher E, Lapidot M (2008) Rasa1 mutations may cause hereditary capillary malformations without arteriovenous malformations. British Journal of Dermatology 158: 1035–1040.
OpenUrl CrossRef PubMed Web of Science
↵
Whiting PJ, Bonnert TP, McKernan RM, Farrar S, Bourdelles BL, et al. (1999) Molecular and functional diversity of the expanding gaba-a receptor gene family. Annals of the New York Academy of Sciences 868: 645–653.
OpenUrl CrossRef PubMed Web of Science
↵
Edenberg HJ, Dick DM, Xuei X, Tian H, Almasy L, et al. (2004) Variations in gabra2, encoding the a 2 subunit of the gaba a receptor, are associated with alcohol dependence and with brain oscillations. The American Journal of Human Genetics 74: 705–714.
OpenUrl CrossRef PubMed Web of Science
↵
Knabl J, Witschi R, Hösl K, Reinold H, Zeilhofer UB, et al. (2008) Reversal of pathological pain through specific spinal gabaa receptor subtypes. Nature 451: 330–334.
OpenUrl CrossRef PubMed Web of Science
↵
Xiang YY, Wang S, Liu M, Hirota JA, Li J, et al. (2007) A gabaergic system in airway epithelium is essential for mucus overproduction in asthma. Nature medicine 13: 862–867.
OpenUrl CrossRef PubMed Web of Science
↵
Ma D, Whitehead P, Menold M, Martin E, Ashley-Koch A, et al. (2005) Identification of significant association and gene-gene interaction of gaba receptor subunit genes in autism. The American Journal of Human Genetics 77: 377–388.
OpenUrl CrossRef PubMed Web of Science
↵
Collins AL, Ma D, Whitehead PL, Martin ER, Wright HH, et al. (2006) Investigation of autism and gaba receptor subunit genes in multiple ethnic groups. Neurogenetics 7: 167–174.
OpenUrl CrossRef PubMed Web of Science
↵
Ariani F, Hayek G, Rondinella D, Artuso R, Mencarelli MA, et al. (2008) Foxg1 is responsible for the congenital variant of rett syndrome. The American Journal of Human Genetics 83: 89–93.
OpenUrl CrossRef PubMed Web of Science
↵
Mencarelli M, Spanhol-Rosseto A, Artuso R, Rondinella D, De Filippis R, et al. (2010) Novel foxg1 mutations associated with the congenital variant of rett syndrome. Journal of medical genetics 47: 49–53.
OpenUrl Abstract/FREE Full Text
↵
Sadakata T, Furuichi T (2010) Ca 2+-dependent activator protein for secretion 2 and autistic-like phenotypes. Neuroscience research 67: 197–202.
OpenUrl CrossRef PubMed
↵
Crisci JL, Wong A, Good JM, Jensen JD (2011) On characterizing adaptive events unique to modern humans. Genome biology and evolution 3: 791–798.
OpenUrl CrossRef PubMed
↵
Guilherme A, Soriano NA, Furcinitti PS, Czech MP (2004) Role of ehd1 and ehbp1 in perinuclear sorting and insulin-regulated glut4 recycling in 3t3-l1 adipocytes. Journal of Biological Chemistry 279: 40062–40075.
OpenUrl Abstract/FREE Full Text
↵
Gudmundsson J, Sulem P, Rafnar T, Bergthorsson JT, Manolescu A, et al. (2008) Common sequence variants on 2p15 and xp11. 22 confer susceptibility to prostate cancer. Nature genetics 40: 281–283.
OpenUrl CrossRef PubMed
↵
Gong S, Zheng C, Doughty ML, Losos K, Didkovsky N, et al. (2003) A gene expression atlas of the central nervous system based on bacterial artificial chromosomes. Nature 425: 917–925.
OpenUrl CrossRef PubMed Web of Science
↵
Pennacchio LA, Ahituv N, Moses AM, Prabhakar S, Nobrega MA, et al. (2006) In vivo enhancer analysis of human conserved non-coding sequences. Nature 444: 499–502.
OpenUrl CrossRef PubMed Web of Science
↵
Li MJ, Wang P, Liu X, Lim EL, Wang Z, et al. (2011) Gwasdb: a database for human genetic variants identified by genome-wide association studies. Nucleic acids research: gkr1182.
↵
Welter D, MacArthur J, Morales J, Burdett T, Hall P, et al. (2014) The nhgri gwas catalog, a curated resource of snp-trait associations. Nucleic acids research 42: D1001–D1006.
OpenUrl CrossRef PubMed Web of Science
↵
Paternoster L, Evans DM, Nohr EA, Holst C, Gaborieau V, et al. (2011) Genome-wide populationbased association study of extremely overweight young adults–the goya study. PLoS ONE 6: e24303.
OpenUrl CrossRef PubMed
↵
Suhre K, Wallaschofski H, Raffler J, Friedrich N, Haring R, et al. (2011) A genome-wide association study of metabolic traits in human urine. Nature genetics 43: 565–569.
Perlis RH, Huang J, Purcell S, Fava M, Rush AJ, et al. (2010) Genome-wide association study of suicide attempts in mood disorder patients. Genome 167.
Henrion M, Frampton M, Scelo G, Purdue M, Ye Y, et al. (2013) Common variation at 2q22.3 (zeb2) influences the risk of renal cancer. Human molecular genetics 22: 825–831.
Schlebusch CM, Skoglund P, Sjödin P, Gattepaille LM, Hernandez D, et al. (2012) Genomic variation in seven khoe-san groups reveals adaptation and complex african history. Science 338: 374–379.
Sabeti PC, Reich DE, Higgins JM, Levine HZ, Richter DJ, et al. (2002) Detecting recent positive selection in the human genome from haplotype structure. Nature 419: 832–837.
Hernandez RD, Kelley JL, Elyashiv E, Melton SC, Auton A, et al. (2011) Classic selective sweeps were rare in recent human evolution. Science 331: 920–924.
Ding Q, Hu Y, Xu S, Wang J, Jin L (2013) Neanderthal introgression at chromosome 3p21.31 was under positive natural selection in east asians. Molecular Biology and Evolution: mst260.
Sankararaman S, Mallick S, Dannemann M, Prüfer K, Kelso J, et al. (2014) The genomic landscape of neanderthal ancestry in present-day humans. Nature 507: 354–357.
Pace L, Salvan A, Sartori N (2011) Adjusting composite likelihood ratio statistics. Statistica Sinica 21: 129.
Fu Q, Li H, Moorjani P, Jay F, Slepchenko SM, et al. (2014) Genome sequence of a 45,000-year-old modern human from western siberia. Nature 514: 445–449.
Seguin-Orlando A, Korneliussen TS, Sikora M, Malaspinas AS, Manica A, et al. (2014) Genomic structure in europeans dating back at least 36,200 years. Science 346: 1113–1118.
Lazaridis I, Patterson N, Mittnik A, Renaud G, Mallick S, et al. (2014) Ancient human genomes suggest three ancestral populations for present-day europeans. Nature 513: 409–413.
Pickrell JK, Pritchard JK (2012) Inference of population splits and mixtures from genome-wide allele frequency data. PLoS genetics 8: e1002967.


Posted April 06, 2015.


Subject Area

Evolutionary Biology


[1] ↵
Chen H, Patterson N, Reich D (2010) Population differentiation as a test for selective sweeps. Genome research 20: 393–402.
OpenUrl Abstract/FREE Full Text

[2] ↵
Smith JM, Haigh J (1974) The hitch-hiking effect of a favourable gene. Genetical research 23: 23–35.
OpenUrl CrossRef PubMed Web of Science

[3] ↵
Lewontin R, Krakauer J (1973) Distribution of gene frequency as a test of the theory of the selective neutrality of polymorphisms. Genetics 74: 175–195.
OpenUrl Abstract/FREE Full Text

[4] ↵
Akey JM, Zhang G, Zhang K, Jin L, Shriver MD (2002) Interrogating a high-density snp map for signatures of natural selection. Genome research 12: 1805–1814.
OpenUrl Abstract/FREE Full Text

[5] Weir BS, Cardon LR, Anderson AD, Nielsen DM, Hill WG (2005) Measures of human population structure show heterogeneity among genomic regions. Genome research 15: 1468–1476.
OpenUrl Abstract/FREE Full Text

[6] Oleksyk TK, Zhao K, Francisco M, Gilbert DA, O’Brien SJ, et al. (2008) Identifying selected regions from heterozygosity and divergence using a light-coverage genomic dataset from two human populations. PLoS ONE 3: e1712.
OpenUrl CrossRef PubMed

[7] ↵
Yi X, Liang Y, Huerta-Sanchez E, Jin X, Cuo ZXP, et al. (2010) Sequencing of 50 human exomes reveals adaptation to high altitude. Science 329: 75–78.
OpenUrl Abstract/FREE Full Text

[8] ↵
Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, et al. (2010) A draft sequence of the neandertal genome. Science 328: 710–722.
OpenUrl Abstract/FREE Full Text

[9] ↵
Meyer M, Kircher M, Gansauge MT, Li H, Racimo F, et al. (2012) A high-coverage genome sequence from an archaic denisovan individual. Science 338: 222–226.
OpenUrl Abstract/FREE Full Text

[10] ↵
Prüfer K, Racimo F, Patterson N, Jay F, Sankararaman S, et al. (2014) The complete genome sequence of a neanderthal from the altai mountains. Nature 505: 43–49.
OpenUrl CrossRef GeoRef PubMed Web of Science

[11] ↵
Racimo F, Kuhlwilm M, Slatkin M (2014) A test for ancient selective sweeps and an application to candidate sites in modern humans. Molecular Biology and Evolution 31: 3344–3358.
OpenUrl CrossRef PubMed

[12] ↵
Consortium GP, et al. (2012) An integrated map of genetic variation from 1,092 human genomes. Nature 491: 56–65.
OpenUrl CrossRef PubMed Web of Science

[13] ↵
Nicholson G, Smith AV, Jónsson F, Gústafsson Ó, Stefánsson K, et al. (2002) Assessing population differentiation and isolation from single-nucleotide polymorphism data. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64: 695–715.
OpenUrl

[14] ↵
Durrett R, Schweinsberg J (2004) Approximating selective sweeps. Theoretical population biology 66: 129–138.
OpenUrl CrossRef PubMed Web of Science

[15] ↵
Fay JC, Wu CI (2000) Hitchhiking under positive darwinian selection. Genetics 155: 1405–1413.
OpenUrl Abstract/FREE Full Text

[16] ↵
Lindsay BG (1988) Composite likelihood methods. Contemporary Mathematics 80: 221–39.
OpenUrl CrossRef

[17] ↵
Varin C, Reid N, Firth D (2011) An overview of composite likelihood methods. Statistica Sinica 21: 5–42.
OpenUrl Web of Science

[18] ↵
Patterson N, Moorjani P, Luo Y, Mallick S, Rohland N, et al. (2012) Ancient admixture in human history. Genetics 192: 1065–1093.
OpenUrl Abstract/FREE Full Text

[19] ↵
Gutenkunst RN, Hernandez RD, Williamson SH, Bustamante CD (2009) Inferring the joint demographic history of multiple populations from multidimensional snp frequency data. PLoS genetics 5: e1000695.
OpenUrl CrossRef

[20] ↵
Excoffier L, Dupanloup I, Huerta-Sánchez E, Sousa VC, Foll M (2013) Robust demographic inference from genomic and snp data. PLoS genetics 9: e1003905.
OpenUrl CrossRef PubMed

[21] ↵
Messer PW (2013) Slim: simulating evolution with selection and linkage. Genetics 194: 1037–1039.
OpenUrl Abstract/FREE Full Text

[22] ↵
Hinch AG, Tandon A, Patterson N, Song Y, Rohland N, et al. (2011) The landscape of recombination in african americans. Nature 476: 170–175.
OpenUrl CrossRef PubMed Web of Science

[23] ↵
Fujimoto A, Kimura R, Ohashi J, Omi K, Yuliwulandari R, et al. (2008) A scan for genetic determinants of human hair morphology: Edar is associated with asian hair thickness. Human Molecular Genetics 17: 835–843.
OpenUrl CrossRef PubMed Web of Science

[24] ↵
Kimura R, Yamaguchi T, Takeda M, Kondo O, Toma T, et al. (2009) A common variation in edar is a genetic determinant of shovel-shaped incisors. The American Journal of Human Genetics 85: 528–535.
OpenUrl CrossRef PubMed

[25] ↵
Sabeti PC, Varilly P, Fry B, Lohmueller J, Hostetter E, et al. (2007) Genome-wide detection and characterization of positive selection in human populations. Nature 449: 913–918.
OpenUrl CrossRef PubMed Web of Science

[26] ↵
Grossman SR, Shylakhter I, Karlsson EK, Byrne EH, Morales S, et al. (2010) A composite of multiple signals distinguishes causal variants in regions of positive selection. Science 327: 883–886.
OpenUrl Abstract/FREE Full Text

[27] ↵
Siddique HR, Saleem M (2012) Role of bmi1, a stem cell factor, in cancer recurrence and chemoresistance: preclinical and clinical evidences. Stem Cells 30: 372–378.
OpenUrl CrossRef PubMed Web of Science

[28] ↵
Sapiro R, Kostetskii I, Olds-Clarke P, Gerton GL, Radice GL, et al. (2002) Male infertility, impaired sperm motility, and hydrocephalus in mice deficient in sperm-associated antigen 6. Molecular and cellular biology 22: 6298–6305.
OpenUrl Abstract/FREE Full Text

[29] ↵
Eiberg H, Troelsen J, Nielsen M, Mikkelsen A, Mengel-From J, et al. (2008) Blue eye color in humans may be caused by a perfectly associated founder mutation in a regulatory element located within the herc2 gene inhibiting oca2 expression. Human genetics 123: 177–187.
OpenUrl CrossRef PubMed Web of Science

[30] Han J, Kraft P, Nan H, Guo Q, Chen C, et al. (2008) A genome-wide association study identifies novel alleles associated with hair color and skin pigmentation. PLoS genetics 4: e1000074.
OpenUrl

[31] ↵
Branicki W, Brudnik U, Wojas-Pelc A (2009) Interactions between herc2, oca2 and mc1r may influence human pigmentation phenotype. Annals of human genetics 73: 160–170.
OpenUrl CrossRef PubMed Web of Science

[32] ↵
Mathieson I, Lazaridis I, Rohland N, Mallick S, Llamas B, et al. (2015) Eight thousand years of natural selection in europe. bioRxiv: 016477.

[33] ↵
Pastural E, Ersoy F, Yalman N, Wulffraat N, Grillo E, et al. (2000) Two genes are responsible for griscelli syndrome at the same 15q21 locus. Genomics 63: 299–306.
OpenUrl CrossRef PubMed Web of Science

[34] Fukuda M, Kuroda T, Mikoshiba K (2002) Slac2-a/melanophilin, the missing link between rab27 and myosin va: implications of a tripartite protein complex for melanosome transport. The Journal of biological chemistry 277: 12432.
OpenUrl Abstract/FREE Full Text

[35] Halaban R, Moellmann G (1990) Murine and human b locus pigmentation genes encode a glycoprotein (gp75) with catalase activity. Proceedings of the National Academy of Sciences 87: 4809–4813.
OpenUrl Abstract/FREE Full Text

[36] ↵
Sulem P, Gudbjartsson DF, Stacey SN, Helgason A, Rafnar T, et al. (2008) Two newly identified genetic determinants of pigmentation in europeans. Nature genetics 40: 835–837.
OpenUrl CrossRef PubMed Web of Science

[37] ↵
Voight BF, Kudaravalli S, Wen X, Pritchard JK (2006) A map of recent positive selection in the human genome. PLoS biology 4: e72.
OpenUrl CrossRef PubMed

[38] ↵
Kenny EE, Timpson NJ, Sikora M, Yee MC, Moreno-Estrada A, et al. (2012) Melanesian blond hair is caused by an amino acid change in tyrp1. Science 336: 554–554.
OpenUrl Abstract/FREE Full Text

[39] ↵
Castellano S, Parra G, Sánchez-Quinto FA, Racimo F, Kuhlwilm M, et al. (2014) Patterns of coding variation in the complete exomes of three neandertals. Proceedings of the National Academy of Sciences 111: 6666–6671.
OpenUrl Abstract/FREE Full Text

[40] ↵
Pravtcheva DD, Wise TL (2001) Disruption of apc10/doc1 in three alleles of oligosyndactylism. Genomics 72: 78–87.
OpenUrl CrossRef PubMed

[41] ↵
Kanehisa M, Goto S (2000) Kegg: kyoto encyclopedia of genes and genomes. Nucleic acids research 28: 27–30.
OpenUrl CrossRef PubMed Web of Science

[42] ↵
Brawand D, Soumillon M, Necsulea A, Julien P, Csárdi G, et al. (2011) The evolution of gene expression levels in mammalian organs. Nature 478: 343–348.
OpenUrl CrossRef PubMed Web of Science

[43] ↵
Kircher M, Witten DM, Jain P, O’Roak BJ, Cooper GM, et al. (2014) A general framework for estimating the relative pathogenicity of human genetic variants. Nature genetics 46: 310–315.
OpenUrl CrossRef PubMed

[44] ↵
Consortium EP, et al. (2012) An integrated encyclopedia of dna elements in the human genome. Nature 489: 57–74.
OpenUrl CrossRef PubMed Web of Science

[45] ↵
Rosenbloom KR, Dreszer TR, Long JC, Malladi VS, Sloan CA, et al. (2011) Encode whole-genome data in the ucsc genome browser: update 2012. Nucleic acids research: gkr1012.

[46] ↵
Robinson PN, Köhler S, Bauer S, Seelow D, Horn D, et al. (2008) The human phenotype ontology: a tool for annotating and analyzing human hereditary disease. The American Journal of Human Genetics 83: 610–615.
OpenUrl CrossRef PubMed Web of Science

[47] ↵
Van Keuren M, Hart I, Kao FT, Neve R, Bruns G, et al. (1987) A somatic cell hybrid with a single human chromosome 22 corrects the defect in the cho mutant (ade–i) lacking adenylosuccinase activity. Cytogenetic and Genome Research 44: 142–147.
OpenUrl CrossRef

[48] ↵
Gitiaux C, Ceballos-Picot I, Marie S, Valayannopoulos V, Rio M, et al. (2009) Misleading behavioural phenotype with adenylosuccinate lyase deficiency. European Journal of Human Genetics 17: 133–136.
OpenUrl CrossRef PubMed

[49] Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, et al. (2005) Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome research 15: 1034–1050.
OpenUrl Abstract/FREE Full Text

[50] Cooper GM, Goode DL, Ng SB, Sidow A, Bamshad MJ, et al. (2010) Single-nucleotide evolutionary constraint scores highlight disease-causing mutations. Nature methods 7: 250–251.
OpenUrl

[51] Kmoch S, Hartmannová H, Stibùrková B, Krijt J, Zikánová M, et al. (2000) Human adenylosuccinate lyase (adsl), cloning and characterization of full-length cdna and its isoform, gene structure and molecular basis for adsl deficiency in six patients. Human molecular genetics 9: 1501–1513.
OpenUrl CrossRef PubMed Web of Science

[52] Maaswinkel-Mooij P, Laan L, Onkenhout W, Brouwer O, Jaeken J, et al. (1997) Adenylosuccinase deficiency presenting with epilepsy in early infancy. Journal of inherited metabolic disease 20: 606–607.
OpenUrl CrossRef PubMed Web of Science

[53] Marie S, Cuppens H, Heuterspreute M, Jaspers M, Tola EZ, et al. (1999) Mutation analysis in adenylosuccinate lyase deficiency: Eight novel mutations in the re-evaluated full adsl coding sequence. Human mutation 13: 197–202.
OpenUrl CrossRef PubMed Web of Science

[54] Race V, Marie S, Vincent MF, Van den Berghe G (2000) Clinical, biochemical and molecular genetic correlations in adenylosuccinate lyase deficiency. Human molecular genetics 9: 2159–2165.
OpenUrl CrossRef PubMed

[55] Edery P, Chabrier S, Ceballos-Picot I, Marie S, Vincent MF, et al. (2003) Intrafamilial variability in the phenotypic expression of adenylosuccinate lyase deficiency: a report on three patients. American Journal of Medical Genetics Part A 120: 185–190.
OpenUrl

[56] ↵
Meister G, Landthaler M, Peters L, Chen PY, Urlaub H, et al. (2005) Identification of novel argonaute-associated proteins. Current biology 15: 2149–2155.
OpenUrl CrossRef PubMed Web of Science

[57] ↵
Du KL, Chen M, Li J, Lepore JJ, Mericko P, et al. (2004) Megakaryoblastic leukemia factor-1 transduces cytoskeletal signals and induces smooth muscle cell differentiation from undifferentiated embryonic stem cells. Journal of Biological Chemistry 279: 17578–17586.
OpenUrl Abstract/FREE Full Text

[58] ↵
Mercher T, Busson-Le Coniat M, Monni R, Mauchauffé M, Khac FN, et al. (2001) Involvement of a human gene related to the drosophila spen gene in the recurrent t (1; 22) translocation of acute megakaryocytic leukemia. Proceedings of the National Academy of Sciences 98: 5776–5779.
OpenUrl Abstract/FREE Full Text

[59] ↵
Trahey M, Wong G, Halenbeck R, Rubinfeld B, Martin GA, et al. (1988) Molecular cloning of two types of gap complementary dna from human placenta. Science 242: 1697–1700.
OpenUrl Abstract/FREE Full Text

[60] ↵
Friedman E, Gejman PV, Martin GA, McCormick F (1993) Nonsense mutations in the c–terminal sh2 region of the gtpase activating protein (gap) gene in human tumours. Nature genetics 5: 242–247.
OpenUrl CrossRef PubMed Web of Science

[61] ↵
Eerola I, Boon LM, Mulliken JB, Burrows PE, Dompmartin A, et al. (2003) Capillary malformation–arteriovenous malformation, a new clinical and genetic disorder caused by rasa1 mutations. The American Journal of Human Genetics 73: 1240–1249.
OpenUrl CrossRef PubMed Web of Science

[62] ↵
Hershkovitz D, Bercovich D, Sprecher E, Lapidot M (2008) Rasa1 mutations may cause hereditary capillary malformations without arteriovenous malformations. British Journal of Dermatology 158: 1035–1040.
OpenUrl CrossRef PubMed Web of Science

[63] ↵
Whiting PJ, Bonnert TP, McKernan RM, Farrar S, Bourdelles BL, et al. (1999) Molecular and functional diversity of the expanding gaba-a receptor gene family. Annals of the New York Academy of Sciences 868: 645–653.
OpenUrl CrossRef PubMed Web of Science

[64] ↵
Edenberg HJ, Dick DM, Xuei X, Tian H, Almasy L, et al. (2004) Variations in gabra2, encoding the a 2 subunit of the gaba a receptor, are associated with alcohol dependence and with brain oscillations. The American Journal of Human Genetics 74: 705–714.
OpenUrl CrossRef PubMed Web of Science

[65] ↵
Knabl J, Witschi R, Hösl K, Reinold H, Zeilhofer UB, et al. (2008) Reversal of pathological pain through specific spinal gabaa receptor subtypes. Nature 451: 330–334.
OpenUrl CrossRef PubMed Web of Science

[66] ↵
Xiang YY, Wang S, Liu M, Hirota JA, Li J, et al. (2007) A gabaergic system in airway epithelium is essential for mucus overproduction in asthma. Nature medicine 13: 862–867.
OpenUrl CrossRef PubMed Web of Science

[67] ↵
Ma D, Whitehead P, Menold M, Martin E, Ashley-Koch A, et al. (2005) Identification of significant association and gene-gene interaction of gaba receptor subunit genes in autism. The American Journal of Human Genetics 77: 377–388.
OpenUrl CrossRef PubMed Web of Science

[68] ↵
Collins AL, Ma D, Whitehead PL, Martin ER, Wright HH, et al. (2006) Investigation of autism and gaba receptor subunit genes in multiple ethnic groups. Neurogenetics 7: 167–174.
OpenUrl CrossRef PubMed Web of Science

[69] ↵
Ariani F, Hayek G, Rondinella D, Artuso R, Mencarelli MA, et al. (2008) Foxg1 is responsible for the congenital variant of rett syndrome. The American Journal of Human Genetics 83: 89–93.
OpenUrl CrossRef PubMed Web of Science

[70] ↵
Mencarelli M, Spanhol-Rosseto A, Artuso R, Rondinella D, De Filippis R, et al. (2010) Novel foxg1 mutations associated with the congenital variant of rett syndrome. Journal of medical genetics 47: 49–53.
OpenUrl Abstract/FREE Full Text

[71] ↵
Sadakata T, Furuichi T (2010) Ca 2+-dependent activator protein for secretion 2 and autistic-like phenotypes. Neuroscience research 67: 197–202.
OpenUrl CrossRef PubMed

[72] ↵
Crisci JL, Wong A, Good JM, Jensen JD (2011) On characterizing adaptive events unique to modern humans. Genome biology and evolution 3: 791–798.
OpenUrl CrossRef PubMed

[73] ↵
Guilherme A, Soriano NA, Furcinitti PS, Czech MP (2004) Role of ehd1 and ehbp1 in perinuclear sorting and insulin-regulated glut4 recycling in 3t3-l1 adipocytes. Journal of Biological Chemistry 279: 40062–40075.
OpenUrl Abstract/FREE Full Text

[74] ↵
Gudmundsson J, Sulem P, Rafnar T, Bergthorsson JT, Manolescu A, et al. (2008) Common sequence variants on 2p15 and xp11. 22 confer susceptibility to prostate cancer. Nature genetics 40: 281–283.
OpenUrl CrossRef PubMed

[75] ↵
Gong S, Zheng C, Doughty ML, Losos K, Didkovsky N, et al. (2003) A gene expression atlas of the central nervous system based on bacterial artificial chromosomes. Nature 425: 917–925.
OpenUrl CrossRef PubMed Web of Science

[76] ↵
Pennacchio LA, Ahituv N, Moses AM, Prabhakar S, Nobrega MA, et al. (2006) In vivo enhancer analysis of human conserved non-coding sequences. Nature 444: 499–502.
OpenUrl CrossRef PubMed Web of Science

[77] ↵
Li MJ, Wang P, Liu X, Lim EL, Wang Z, et al. (2011) Gwasdb: a database for human genetic variants identified by genome-wide association studies. Nucleic acids research: gkr1182.

[78] ↵
Welter D, MacArthur J, Morales J, Burdett T, Hall P, et al. (2014) The nhgri gwas catalog, a curated resource of snp-trait associations. Nucleic acids research 42: D1001–D1006.
OpenUrl CrossRef PubMed Web of Science

[79] ↵
Paternoster L, Evans DM, Nohr EA, Holst C, Gaborieau V, et al. (2011) Genome-wide populationbased association study of extremely overweight young adults–the goya study. PLoS ONE 6: e24303.
OpenUrl CrossRef PubMed

[80] ↵
Suhre K, Wallaschofski H, Raffler J, Friedrich N, Haring R, et al. (2011) A genome-wide association study of metabolic traits in human urine. Nature genetics 43: 565–569.
OpenUrl CrossRef PubMed

[81] ↵
Perlis RH, Huang J, Purcell S, Fava M, Rush AJ, et al. (2010) Genome-wide association study of suicide attempts in mood disorder patients. Genome 167.

[82] ↵
Henrion M, Frampton M, Scelo G, Purdue M, Ye Y, et al. (2013) Common variation at 2q22. 3 (zeb2) influences the risk of renal cancer. Human molecular genetics 22: 825–831.
OpenUrl CrossRef PubMed Web of Science

[83] ↵
Schlebusch CM, Skoglund P, Sjödin P, Gattepaille LM, Hernandez D, et al. (2012) Genomic variation in seven khoe-san groups reveals adaptation and complex african history. Science 338: 374–379.
OpenUrl Abstract/FREE Full Text

[84] ↵
Sabeti PC, Reich DE, Higgins JM, Levine HZ, Richter DJ, et al. (2002) Detecting recent positive selection in the human genome from haplotype structure. Nature 419: 832–837.
OpenUrl CrossRef PubMed Web of Science

[85] ↵
Hernandez RD, Kelley JL, Elyashiv E, Melton SC, Auton A, et al. (2011) Classic selective sweeps were rare in recent human evolution. Science 331: 920–924.
OpenUrl Abstract/FREE Full Text

[86] ↵
Ding Q, Hu Y, Xu S, Wang J, Jin L (2013) Neanderthal introgression at chromosome 3p21. 31 was under positive natural selection in east asians. Molecular Biology and Evolution: mst260.

[87] ↵
Sankararaman S, Mallick S, Dannemann M, Prüfer K, Kelso J, et al. (2014) The genomic landscape of neanderthal ancestry in present-day humans. Nature 507: 354–357.
OpenUrl CrossRef PubMed Web of Science

[88] ↵
Pace L, Salvan A, Sartori N (2011) Adjusting composite likelihood ratio statistics. Statistica Sinica 21: 129.
OpenUrl

[89] ↵
Fu Q, Li H, Moorjani P, Jay F, Slepchenko SM, et al. (2014) Genome sequence of a 45,000-year-old modern human from western siberia. Nature 514: 445–449.
OpenUrl CrossRef PubMed Web of Science

[90] Seguin-Orlando A, Korneliussen TS, Sikora M, Malaspinas AS, Manica A, et al. (2014) Genomic structure in europeans dating back at least 36,200 years. Science 346: 1113–1118.
OpenUrl Abstract/FREE Full Text

[91] ↵
Lazaridis I, Patterson N, Mittnik A, Renaud G, Mallick S, et al. (2014) Ancient human genomes suggest three ancestral populations for present-day europeans. Nature 513: 409–413.
OpenUrl CrossRef PubMed Web of Science

[92] ↵
Pickrell JK, Pritchard JK (2012) Inference of population splits and mixtures from genome-wide allele frequency data. PLoS genetics 8: e1002967.
