Signatures of natural selection in the drug metabolizing enzyme genes: Opportunity for developing personalized and precision medicine

Sheikh Nizamuddin; K. Thangaraj

doi:10.1101/113514

Abstract

Modern human experienced various selective pressures; including range of xenobiotics which contributed to heterogeneity of drug response. Many genes involve in pharmacokinetics and dynamics of drug, have been reported under natural selection. However, none of the studies have utilized comprehensive information of drug-centered PharmGKB pathways. We have extended this work and aimed to investigate sweep signals, using 1,798 subjects, from 53 Indian and 15 other world populations. We observed that modifiers which alters the biochemical function of other genes, have excess of natural selection (median std-z score=0.033±0.95; p-value=1.7×10⁻⁹-3.7×10⁻³). Taxane and statin primarily used for chemotherapy and lowering cholesterol level, respectively; and well known for heterogeneous drug response. We observed that pharmacokinetic pathway of taxane and statins are under natural selection (p-value=2.53×10⁻⁹and 2.73×10⁻⁹-1.09×10⁻⁴; q-value=1.28×10⁻⁷ and 6.91×10⁻⁶-1.1×10⁻³). We also observed signal of selection in Ibuprofen pharmacokinetics (p-value=1.76×10⁻⁵; q-value =2.22×10⁻⁴), beta-agonist/beta-blocker pharmacodynamics (p-value=4.79×10⁻⁴; q-value =4.04×10⁻⁴) and Zidovudin pharmacokinetics/dynamic pathway (p-value=7.0×10⁻⁴; q-value =5.06×10⁻⁴). Hard sweeps signals were observed in a total of 322 loci. Of which, 53 affect mRNA expression (p-value<0.001) and 16 were already reported with therapeutic response. Interestingly, we observed that Africans have experience 2 phases of natural selection, one at ~30,000 another at ~10,000 years before present.

Background

Modern human migrated out of Africa ~75,000 years BP (before present) and colonized at diverse ecological conditions ^1-3. During the migration, human populations have experienced various selection pressures, including range of xenobiotics, which were either present in new food sources or in new environment ⁴. Moreover, introduction of fire during the cultural revolution also had a major impact, as cooked food produce novel toxins and carcinogens ^5-9. In order to cope up with different selection pressure, humans have acquired beneficial variations in the genome, including variation in the drug metabolizing enzyme gene(s) that are responsible for heterogeneous therapeutic response in the contemporary populations ¹⁰.

Many genes, which influence the drug level and their kinetics, have been designated as ADME (absorption, digestion, metabolism and excretion of drug) genes and categorized into core, extended, phase-I, phase-II and modifiers on the basis of their function; and were in natural selection among various populations ^10,11. Several lines of evidences argue the existence of two type of selections; hard and soft sweeps ¹². In hard sweep, advantageous mutation rise to higher frequency and single mutation event is sufficient to influence the phenotype; for example: SLC24A5 mutation influences the skin pigmentation and DARC mutation influence the resistance against malaria^12-14; while in soft sweeps, multiple advantageous mutations rise simultaneously and influence in polygenic manner; for example: cytokine-cytokine receptor signaling pathways ¹⁵. Although many studies on natural selection of ADME genes explored hard sweeps signals, soft sweeps signals remain unexplored. Hence, in the present study, we utilized the drug pharmacokinetics and dynamics pathway from PharmGKB database to explore the soft sweeps signals ¹⁶.

Several ADME genes that were reported in natural selection were found in out-of-Africa populations ¹⁰. But, it is not clear what happened to the populations those were left behind in Africa? To the best of our knowledge, none of the studies have revealed about the signals in pharmacogenomically important genes that were originated in Africa, after early human migration and colonization events. In the present study, we explored these ancient signals, using genome-wide markers.

Results and discussions

Evidence of natural selection

In this study, we explored overall signal of selection based on genome-wide genotype data (623,462 SNPs). Genic regions are considered more influenced under natural selection compared to non-genic regions and must be having high average std-z score. To explore it, initially we compared the whole genic SNPs with non-genic. Then, we compared genic SNPs of 873 pharmacogenomically important genes with non-genic, to understand that natural selection on pharmacogenomically important genes, are similar to whole genome genic region or not.

We observed significantly higher std-z score in genic region (299,317 SNPs; median = 0.0033) comparative to non-genic SNPs (324,145; median = −0.0033) (F = 1.0608, p-value <2.2×10⁻¹⁶; t = 9.8221, p-value < 2.2×10⁻¹⁶), which reflect that genic regions are under natural selection. To further explore that whether natural selection really acting on the genic region, variable range of flanking sequences (from 0 to 10⁵base pairs) around genes was considered (Table S3). SNPs that are far from genes are more towards non-genic designation and hence less std-z score is expected with increment of flanking region around gene. We observed that median std-z score of genic and non-genic SNPs decrease with increment of flanking region (Figure 1C, 1D and Table S3), except median score of non-genic increases at 2000 base pairs (bp). We also observed that per base-pair decrease in std-z score of genic region is less compared to non-genic regions, which is evident in Figure 1E.

Figure 1.

Simulation analysis using hierarchical island model, samples genetic clustering and signal of natural selection in whole genic and in pharmacogenomically important genes. A. Principal component analysis to explore genic cluster of samples prior to simulation analysis. B. Outcome of the simulation analysis with Hierarchical island model. Figure C and D represents decrease in the median std-z score with increment of flanking region while E represents the ratio of genic’s median std-z score to non-genic’s score.

Similar pattern was found in pharmacogenomically important genes. A total of 19,721 SNPs from 873 genes have significantly higher std-z score compared to non-genic SNPs (std-z score: F = 1.0406, p-value = 1.1×10⁻⁴; t = 3.455, p-value = 5.5×10⁻⁴), suggests that natural selection acted on these genes that are pharmacogenomically important.

Modifiers ADME genes are most influenced under natural selection

To find out whether high frequency of natural selection among different categories of pharmacogenomically important genes, we have classified SNPs into core (494 SNPs), extended (3,900), phase-I (1,533), phase-II (767), modifier (285), transporter (1,805) and other pathway (15,333) genes; and compared them with other genic, non-genic and among themselves. Std-z score distribution of each category was compared for all combinations, using median and median absolute deviation (MAD) (Table S4). We observed that high frequency of modifier genes are under natural selection (std z-score = 0.033± 0.95) compared to other ADME categories, non-genic and other genic regions (Table 1, S4 and S5).

View this table:

Table 1.

Statistical significance of std-z score among different categories

We speculated that above observation could be biased by other SNPs, which were not in natural selection, but present within the genes. Hence, to prove the previous observation, SNPs that were in strong natural selection (std-z score >3) were only considered (reason for considering this cut-off value is explained in “Hard sweeps signals of natural selection”). Thus, 83 SNPs of ADME and 239 of other pathway genes were selected. It was observed that 27.27% modifier genes (6 out of 22) were under natural selection; while other categories ranges from 11.11% to 20.83% (Table S5). Among these, Phase-II genes (11.11%; 6 out of 54) were least influenced category. Moreover, the number of genes could also be biased by linkage disequilibrium (LD) among SNPs of different genes; hence, haplotype blocks of all populations were explored. Only rs3805322 (ADH4) and rs1042026 (ADH1B) were in the same haplotype block in HapMap-JPT, while they were in different blocks in other populations. Hence, it is clear that percentage of genes in each category is not the artifacts of LD (linkage disequilibrium).

It is interesting to note that modifiers influence the expression and alters the biochemical function of other ADME genes (www.pharmaadme.org). Alteration in the function of this category can cause major influence on pharmacogenomic system.

Soft sweeps (polygenic adaptation) signals of natural selection

Much of selection acts as polygenic way (soft sweeps) with subtle influence on variation irrespective of hard sweeps, where either selection coefficient (>1%) or time of fixation is high¹². To find the “soft sweep” signals, we utilized 102 pharmacogenomically important pathways (PharmGKB database).Of which, 45 pathways with less number of genes (< 10) were excluded from further analysis (Table S6). It is evident from previous reports that highest and median std-z score of SNPs assigned to gene have significant correlation and hence, highest score can be best representative of std-z score distribution within gene ¹⁵. Same is true in the present study, where a significant correlation (r = 0.436; p-value < 2.2×10⁻¹⁶; −0.448 95% C.I.) was observed between highest and median std-z score. Therefore, genic SNPs with highest std-z score were assigned as representative for the genes (total 16,594) in further soft-sweep signal analysis.

We observed multimodal distribution of std-z score, with 3 humps, having different range of score and density (Figure 2A). Maximum number of genes (11398: 68.69% of total gene) exists in the first hump (island-I) and have std-z score distribution from −2.378 to 0.824 (median = 0.54); intermediate number of genes (5031: 30.32 %) exist in second hump (island-II) and have distribution from 1.774 to 5.0 (median = 2.878); while in third hump (island-III), minimum number of genes (164: 0.99%) exists with std-z score from 5.1024 to 5.733 (median = 5.6091).

Figure 2.

Soft sweep signals. In A, upper panel represents the density of std-z score where 3 humps can be observed while lower panel represents the distribution of std-z score of gene’s representative SNPs and few previously reported hard and soft sweep signals. The curve with dotted lines in B, C, D, E, F, G and H represent the distribution of std-z scores for Taxane pharmacokinetics, Fluvastatin pharmacokinetics, Atorvastatin-Lovastatin-Simvastatin pharmacokinetics, Statin generalized pharmacokinetics, ibuprofen pharmacokinetics, β-agonist β-blocker pharmacodynamics and Zidovudin pharmacokinetics/dynamics pathway’s genes respectively, while other curve with solid line in B, C, D, E, F, G and H represents the distribution of std-z scores for representative SNPs of all genes.

Further, we used the number of genes, present within 3 islands and compared them with genes of pathways, using simple chi-square test with 2 degree of freedom. For multiple test correction, p-values were corrected and converted into q-values (Figure S3) (false discovery rate control method).¹⁷ We have observed that 7 pathways were under natural selection (p-value < 7×10⁻⁴; q-value < 5.06×10⁻³), which includes pharmacokinetics pathway of Taxane, Fluvastatin, Atorvastatin-Lovastatin-Simvastatin, Ibuprofen and Statin; pharmacodynamic pathway of β-agonist/β-blocker; and pharmacodynamic and kinetics pathway of Zidovudin (Table 2 and S7) (Figure 2B, 2C, 2D, 2E, 2F, 2G and 2H). We speculated that these 7 pathways might be having the significant number of common genes which are under natural selection. But, we did not find the same (Table S8).

View this table:

Table 2.

Statistical significance of pathways. Three humps/islands were found in distribution curve of std-z score (Figure 3B). The corresponding number of the genes and percentage in each hump is given in the header of this table. Moreover, for each pathway, distribution of genes (number: percentage) in each hump is given with corresponding p-value and q-value.

Taxane and statin drugs are primarily used for chemotherapy and lowering cholesterol level. Taxane produced by the plants of the Taxus sp., while statins naturally produced by Peniecillium and Aspergillus fungi. Interestingly, statins are also observed in various natural sources including dairy products, whole grains, almonds and other nuts, fatty fish, pure sugar cane, apple cider vinegar, and many vegetables. On the basis of our results, we speculate that modern human experienced selection pressure in the past against taxane and statin like organic molecules, which are present in environment. Due to which beneficial mutations have increased only in those populations experienced in selective pressure; while in others frequency is less; and this can be the reason for different therapeutic responses in the present day populations. We presume that the same is true for Zidovudin and Ibuprofen.

Hard sweeps signals of natural selection

The cut-off std-z score ≥ 3 was chosen to filter the SNPs in “hard sweeps” (Table S9). The best examples of “hard sweeps”: African Duffy/DARC, SLC24A5, OCA2 and KITLG, ¹² exists in hump2/Island-2 with std-z score 4.08, 2.6, 5.199 and 3.9 (Figure 2A). It justify our cut-off std-z score 3 for “hard sweep” signals. Frequency distribution of the 322 SNPs in the population, utilized in the present study, are given in Figure S4.

Clinical annotation of hard sweeps signals through literatures

In the further step, we annotated the hard sweep signals with Ensembl-BioMart (version 75). Out of 322 SNPs, 16 (8 in ADME and 8 in other pathway genes) have been reported in association with drug-response (Table S10). Variant in the ADH1B (rs 1042026; missense variation: Arg47His) is well-known for its role in alcohol metabolism, had std-z score 3.14; ^18,19 and previously reported under natural selection during Neolithic period ^20,21. The rs363333 (SLC18A2: std-z score = 3.155) and rs892413 (CHRNA6; std-z score = 3.047), which are associated with tobacco and alcohol metabolism/dependence were also observed under selection ^22-25.

The G allele of rs1056836 plays protective role against prostate cancer through metabolism of carcinogens including those ones, which are produced during meat cooking ^26-29. We observed this SNP (rs1056836) under natural selection with std-z score 3.14. Moreover, we also observed high frequency of G allele in ‘out of Africa’ population (>0.558). East-Asians have very high frequency (0.92) of G allele and Europeans have less (0.558) among “out-of-African” population, while Africans have frequency ranging 0.124-0.357. Indo-Africans (Siddi) (0.3571) were close to African populations. Intriguingly, CC genotype is associated with decrease survival rate of prostate cancer patients, when treated with docetaxel compared to GC and GG genotypes ³⁰. Natural selection is the reason for different frequency spectrum of these genotypes and hence, for heterogeneous drug response.

Another important variant, rs472660 (intronic) in CYP3A43 is significantly associated with olanzapine clearance and having std-z score 4.32 ³¹. The AA genotype of rs472660 is well-known for the reason of racial differences in inefficacy and/or adverse reaction. Within Indian subcontinent, only Indo-African (Siddi) has AA genotype with 0.357 frequency, similar to other African populations (0.444-0.70%). GIH and Singapore-Indians (2.27 and 1.2, respectively) have similar frequency as European populations (CEU = 1.21 and TSI = 3.41). Interestingly, we found 2 SNPs, rs11103482 (RXRA; std-z score = 3.3) and rs2984915 (JUN; std-z score = 3.47),were under natural selection, which are known to influence measles vaccine immunity, an important finding which suggests that not only present day drug responses are outcome of natural selection, but also vaccine responses ^32,33. CYP3A4 is one of the important ADME enzymes, which metabolize ~50% of the drug and known to influence inter-individual responses. We found rs1851426 (std-z score = 5.61) in CYP3A4 gene that influence phenotype variability with probe drug quinine ³⁴. Similarly, rs2854450 (EPHX1) was found to be associated with diisocyanate-induced asthma was also in natural selection (std-z score = 3.07) ³⁵.

Anti-psychotic drug Olanzapine is having higher rate of discontinuation, due to its inadequate response or hypersensitive reaction. The rs472660 (CYP3A43; AA genotype) has reported in association with clearance of Olanzapine, was observed under natural selection (std-z score = 4.32) ³¹. Frequency of AA genotype of rs472660 is higher in African population (44-70%), while Europeans (1.21-3.41%), HapMap-GIH (2.27%) and Singapore-Indians (1.2%) have significantly low frequency. Interestingly, the “AA” genotype was observed only in Indo-Africans (35.71%). We have also identified the variants that are associated with dose/response of anticoagulants (Phenoprocoumon: rs11150604: STX4: std-z score = 4.78), ³⁶ resistant to anti-angiogenic therapies (Bevacizumab: rs9582036: FLT1: std-z score = 3.884) ^37-39, response to psychological stress (Cortisol: rs242924: CRHR1: std-z score = 3.05) ^40-42 and response in chemotherapy (Bortezomib and Vincristine: rs6457816: PPARD: std-z score = 4.0; Celecoxib: rs6017996: SRC: std-z score = 3.065; Sunitinib, Panitumumab, Oxaliplatin and Bevacizumab: rs9582036: FLT1: std-z score = 3.884) ^43-46. Besides these, rs10059859 (PDE4D: std-z score = 5.59) was found to be associated with esophageal function and rs1403527 (NR1I2: std-z score = 3.36) was reported to be in worldwide differentiation, in “Hard sweeps” ^47,48.

Expression analysis

After clinically annotating the hard sweep signals, we explored the effect of these 322 SNPs on the mRNA expression. For this, we utilized whole genome expression data and genotype of HapMap individuals (details are given in method section); and observed 53 SNPs (19 in ADME and 34 in other pathway genes) affecting the mRNA expression with p-value < 0.001 (Figure S5 and Table 3). Of which, 3 SNPs (rs1056836, rs 11103482 and rs6457816) in ADME genes and 3 SNPs (rs2984915, rs9554316 and rs9582036) in other pathway genes have already been clinically annotated.

View this table:

Table 3:

Expression analysis of hard sweep signals

Signal of ancient positive natural selection

Late Pleistocene and Holocene age of modern human experienced high ecological and demographic changes; and rise of agriculture. To find the acceleration of selective sweeps, we calculated age of variants (only for ADME genes) based on the length of extended haplotype homozygosity (EHH) on either side of derived allele, in East-Asians (JPT and CHB), Europeans (CEU), Mexicans (MEX), African (YRI), Indians (Indo-Europeans and Dravidians), Tibeto-Burmans & Tibetans, Onge and Siddi. We observed that in all populations, variants have split into 2 cluster: in the 1^st cluster SNPs have high range of age and std-z score (< 0.824) while in 2^nd cluster SNPs having less range with std-z score (> 1.774) (Figure S6). Since, our aim was to find the acceleration of natural selection, we proceeded further with the 2^nd cluster (std-z score>2). Further, we considered only those variants having age of < 60ky and estimated kernel density with 100 breakpoints. In all 8 populations analyzed, we found high density of SNPs near to ~ 10,000 year BP (before present).

Interestingly, we observed African population has 2 phase of acceleration (Figure 3). First phase of acceleration in this population was ~30,000 years BP while second phase was ~10,000 (same as other population). We did not find signal for natural selection at ~30,000 YBP in other populations suggesting that acceleration of natural selection in Africans might have happened after out-of-Africa migration. To understand this event, we explored the signals in Onge and Siddi populations. Onge is a “Negrito” population, who are the descendent of the first “out-of-Africa” population and inhabited in isolation for tens of thousands years in the Andaman and Nicobar islands of Indian sub-continent ⁴⁹ while Siddi population is the recent (~400 years) migrant from Africa and has admixed with Indian populations ⁵⁰. If natural selection would have acted after “out-of-Africa” migration, Onge populations must be carrying recent signals; but not the ancient signals due to their isolation while others African and Siddi population should be having both. Presence of signals only in Siddi not in Onge suggests that African continent possess 2 wave of acceleration compared to populations from rest of the world. Further, we tried to explore the difference between these 2 phases and found that the differences exist only in SNP level, not genes level. This is a preliminary observation and needs to be explored further.

Figure 3.

Evidence of two phase of natural selection in Africans and its comparison to other world populations.

Conclusions

We identified signature of natural selection in pharmacogenomically important genes. Of which, modifiers which alters the biochemical function of other genes, have excess of natural selection. Using comprehensive information of drug-centered pathway from PharmGKB, we observed 7 pharmacokinetic and dynamic pathways under natural selection. Our study also identified 322 hard sweep signals. We have demonstrated that 53 affect mRNA expression level. In future, these SNPs can be utilized to understand heterogeneous drug response, personalized therapy in contemporary populations.

Methods

SNP data

A total of 53 Indian populations, who belongs to diverse social, linguistic and geographical background (4 Tibeto-Burman, 1 Tibetan-refugees, 12 Indo-Europeans, 22 Dravidians, 10 Austro-Asiatic, 2 Indo-Africans, 1 Great-Andamanese and 1 Onge) were selected for this study. The source of genotype data for these 53 Indian populations, are given in Table S1. In addition, 11 populations from HapMap, 3 from Singapore genome diversity project and 1 Tibetan population from NCBI-GEO database were also included (Table S1). This study was approved by the Institutional ethical committee of CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India.

Further, 811 genes from 102 pharmacokinetic and pharmacodynamic important pathways have been selected from PharmGKB database ¹⁶. Of which, 31 were core ADME genes (8 transporter, 12 phase-I and 11 phase-II) and 92 were extended ADME genes (27 transporter, 39 phase-I, 13 phase-II and 13 modifier). Besides these, 1 core and 165 extended ADME (40 transporter, 72 phase-I, 42 phase-II and 11 modifier) genes from PharmaADME database, which are not in pathway, were also been selected. SNPs, which are within 10,000 base pairs (bps) of above described 977 genes were considered as genic and extracted from whole genome data after imputation. In quality control filtering (imputation R2<0.95), genotype information of 104 genes (including 5 ADME genes on X-chromosome) were excluded (Figure S1) and analyzed using the remaining 20,225 SNPs.

Imputation of missing genotypes

We have already demonstrated that imputation of missing genotypes is more accurate with Indian haplotype reference panel ⁵¹. Hence, we utilized 56 Indian-specific haplotype (http://www.ccmb.res.in/bic/database_pagelink.php?page=snpdata) as reference panel for Indo-European, Dravidian and Austro-Asiatic; and performed imputation with Beagle-v3.3.1. Of which, only 623,462 SNPs having R2> 0.9, were selected for further analysis. For Tibeto-Burman and Tibetan, we utilized phased haplotype reference panel of JPT, CHB and CHD HapMap populations while for Siddi population, we utilized haplotype reference panel of YRI.

Principal component analysis (PCA)

To genetically cluster the populations for coalescence simulations with hierarchical island model (discussed in subheading “Test for natural selection”), we performed PCA with EIGENSOFT package ⁵². Since, populations can differentiate on spurious axis due to linkage disequilibrium (LD) between SNPs, we excluded 193,014 SNPs with r²>0.75 prior to PCA and utilized remaining 444,610 SNPs. Further, we explored the clusters on initial 3 eigenvectors, having highest eigenvalues.

Test for natural selection

To detect the signal of selection, population differentiation measured by F_st, were used, because it gives collective information about all populations in single statistics in comparison to others i.e. HaploPS, XP-CLR, XP-EHH, his ^13,15,53-55. However, absolute value of F_st could be misleading because of its correlation with heterozygosity ⁵⁶, hence, p-value for F_st was assessed. For this, we generated empirical distribution of F_st with respect to their heterozygosity in 100,000 coalescence simulations with hierarchical island model (10 groups and 100 demes per group) ⁵⁷ As proposed earlier that in this model, demes within the same group (continent) are assumed to exchange migrants at a higher rate than demes in different groups, which reflect the hierarchical nature of human continental regions and appropriate to use in simulation. To perform this simulation, we clustered the samples in 10 groups on the basis of their linguistic affiliations and genetic clustering observed in principal component analysis (PCA) (figure 1A); 1) Onge, 2) Great-Andamanese, 3) Austro-Asiatic, 4) Dravidians, 5) Indo-Africans, 6) Indo-Europeans, 7) Tibeto-Burman and Tibetans, 8) HapMap-Africans, 9) Singapore-Malay, Singapore-Chinese and HapMap-East Asians; and 10) Admixed populations with Indian ancestry(GIH and Singapore-Indians), HapMap-Europeans and HapMap-Mexicans.

Moreover, to understand the differentiation not only between populations but also between pharmacogenomically important genes and other genic (within 10,000 base pair of gene) & non-genic regions, above distribution was used to calculate the p-value for all 623,462 autosomal SNPs. It is evident in Figure 1B that SNPs having both low and high F_st value are under natural selection. Since, the aim of the study is to find pharmacogenomically important SNPs which are highly differentiated among populations and are under natural selection, we converted p-value into z-score, in a way that highly differentiated SNPs (having high F_st value) score high positive value ^15,57 The qnorm function of R with Epanechnikov kernel density was used, for conversion of p-value to z score.

We observed that 623,462 SNPs are not equally distributed in genome (Figure S2). In this scenario, genomic regions with high density of SNPs are more likely to possess extreme value of z-score, only by chance and hence, standardization is needed. We split the whole genome into 13,933 fragments (each 2×10⁵ base pairs) and clustered in 20 bins according to their SNP density (Table S2). Further, z-score distribution within bin was used for standardization as proposed by Daub, et al. (2013) ¹⁵. It is briefly, explained below:

Suppose, an x locus has z score “z_x” and on basis of SNP density belongs to i^th bin, then standardized z-score (std z-score) for x will be;

Where; median absolute deviation,

To represent the central tendency of the distribution, median has always been used in place of mean; while for representing the statistical dispersion, median absolute deviation (MAD) divided by 0.6745 was used in place of standard error/deviation, as these values are more realistic even in case of long tail or exponential distribution. Interestingly, if distribution is symmetrical, median becomes equal to mean. Hence, if, we do not know the pattern of distribution, median is the best choice. All statistical analysis i.e. F test, t test, pathway significant analysis was performed with R basic package.

Clinical annotation of Hard-sweep signals

In order to explore the clinical significant of naturally selected loci, Ensembl-BioMart (Ensembl version 75: February 2014) was used to extract literatures in which hard-sweep loci obtained in the present study, was associated with the drug responses.

Expression analysis

To find the significance of hard sweep signals, we also utilized whole genome expression dataset of HapMap subjects from Gene Expression Omnibus (GEO) with GSE6536 ID and genotype data of same individuals from ftp://ftp.ncbi.nlm.nih.gov/hapmap/genotypes/2009-01_phaseIII/plink_format. For the expression of each mRNA, we used whole genome normalized expression score and performed Wald-test for quantitative trait association analysis with Plink software ⁵⁸.

Age of variants

To calculate the rough estimate of age of variants, we used the formula mention by Zhernakova, A. et. al ⁵⁹. For this, initially, we calculated extended haplotype homozygosity score x on both side of core allele with “rehh” package of R. The alleles, which are matching with the Chimpanzee are considered as ancestral while other as derived allele. Only the age of derived alleles were calculated and considered as core allele. Suppose the genetic distance between left and right side of core SNP is r, the generation time G can be calculated as:

(Block)

For the calculation of age, we consider only those genetic distance where x is equal to 0.25. In the present study, we consider that 1 generation is equal to 25 years.

Declarations

Ethics approval and consent to participate

Collection of DNA samples and use of genetic data, were approved by ethical committee of CSIR-Centre for Cellular and Molecular Biology, India.

Consent for publication

Not applicable.

Availability of data and material

The datasets generated during and/or analyzed during the current study are not publicly available due to data and privacy protection considerations but may be available on justified request.

Competing interests

Authors declare that they have no competing interests.

Funding

This work was supported by CSIR Network project—EpiHeD (BSC0118), Government of India. Sheikh Nizamuddin was supported by ICMR JRF-SRF research fellowship.

Authors’ contributions

KT was project leaders. SN analyzed data. SN and KT wrote the manuscript. All authors read and approved the final manuscript.

Acknowledgements

Sheikh Nizamuddin acknowledges ICMR for JRF-SRF research fellowship.

References

↵
Underhill, P. A. & Kivisild, T. Use of y chromosome and mitochondrial DNA population structure in tracing human migrations. Annual review of genetics 41, 539–564, doi:10.1146/annurev.genet.41.110306.130407 (2007).
OpenUrl CrossRef PubMed Web of Science
↵
Harris, K. & Nielsen, R. Inferring demographic history from a spectrum of shared haplotype lengths. PLoS genetics 9, e1003521, doi:10.1371/journal.pgen.1003521 (2013).
OpenUrl CrossRef PubMed
↵
Gronau, I., Hubisz, M. J., Gulko, B., Danko, C. G. & Siepel, A. Bayesian inference of ancient human demography from individual genome sequences. Nature genetics 43, 1031–1034, doi:10.1038/ng.937 (2011).
OpenUrl CrossRef PubMed
↵
Leopold, A. C. & Ardrey, R. Toxic substances in plants and the food habits of early man. Science 176, 512–514 (1972).
OpenUrl Abstract/FREE Full Text
↵
Jagerstad, M. & Skog, K. Genotoxicity of heat-processed foods. Mutation research 574, 156–172, doi:10.1016/j.mrfmmm.2005.01.030 (2005).
OpenUrl CrossRef PubMed Web of Science
Sugimura, T., Wakabayashi, K., Nakagama, H. & Nagao, M. Heterocyclic amines: Mutagens/carcinogens produced during cooking of meat and fish. Cancer science 95, 290–299 (2004).
OpenUrl CrossRef PubMed
Ito, N. et al. A new colon and mammary carcinogen in cooked food, 2-amino-1-methyl-6-phenylimidazo[4,5-b]pyridine (PhIP). Carcinogenesis 12, 1503–1506 (1991).
OpenUrl CrossRef PubMed Web of Science
Sinha, R. et al. Lower levels of urinary 2-amino-3,8-dimethylimidazo[4,5-f]-quinoxaline (MeIQx) in humans with higher CYP1A2 activity. Carcinogenesis 16, 2859–2861 (1995).
OpenUrl CrossRef PubMed Web of Science
↵
Moonen, H., Engels, L., Kleinjans, J. & Kok, T. The CYP1A2-164A-->C polymorphism (CYP1A2*1F) is associated with the risk for colorectal adenomas in humans. Cancer Lett 229, 25–31 (2005).
OpenUrl CrossRef PubMed Web of Science
↵
Li, J., Zhang, L., Zhou, H., Stoneking, M. & Tang, K. Global patterns of genetic diversity and signals of natural selection for human ADME genes. Human molecular genetics 20, 528–540, doi:10.1093/hmg/ddq498 (2011).
OpenUrl CrossRef PubMed Web of Science
↵
Janha, R. E. et al. Inactive alleles of cytochrome P450 2C19 may be positively selected in human evolution. BMC evolutionary biology 14, 71, doi:10.1186/1471-2148-14-71 (2014).
OpenUrl CrossRef PubMed
↵
Pritchard, J. K., Pickrell, J. K. & Coop, G. The genetics of human adaptation: hard sweeps, soft sweeps, and polygenic adaptation. Current biology: CB 20, R208–215, doi:10.1016/j.cub.2009.11.055 (2010).
OpenUrl CrossRef PubMed Web of Science
↵
Voight, B. F., Kudaravalli, S., Wen, X. & Pritchard, J. K. A map of recent positive selection in the human genome. PLoS biology 4, e72, doi:10.1371/journal.pbio.0040072 (2006).
OpenUrl CrossRef PubMed
↵
Langhi, D. M., Jr.. & Bordin, J. O. Duffy blood group and malaria. Hematology 11, 389–398, doi:10.1080/10245330500469841 (2006).
OpenUrl CrossRef PubMed
↵
Daub, J. T. et al. Evidence for polygenic adaptation to pathogens in the human genome. Molecular biology and evolution 30, 1544–1558, doi:10.1093/molbev/mst080 (2013).
OpenUrl CrossRef PubMed Web of Science
↵
Whirl-Carrillo, M. et al. Pharmacogenomics knowledge for personalized medicine. Clinical pharmacology and therapeutics 92, 414–417, doi:10.1038/clpt.2012.96 (2012).
OpenUrl CrossRef PubMed
↵
Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences of the United States of America 100, 9440–9445, doi:10.1073/pnas.1530509100 (2003).
OpenUrl Abstract/FREE Full Text
↵
Birley, A. J. et al. ADH single nucleotide polymorphism associations with alcohol metabolism in vivo. Human molecular genetics 18, 1533–1542, doi:10.1093/hmg/ddp060 (2009).
OpenUrl CrossRef PubMed Web of Science
↵
Luo, X. et al. Multiple ADH genes modulate risk for drug dependence in both African- and European-Americans. Human molecular genetics 16, 380–390, doi:10.1093/hmg/ddl460 (2007).
OpenUrl CrossRef PubMed
↵
Han, Y. et al. Evidence of positive selection on a class I ADH locus. American journal of human genetics 80, 441–456, doi:10.1086/512485 (2007).
OpenUrl CrossRef PubMed
↵
Peng, Y. et al. The ADH1B Arg47His polymorphism in east Asian populations and expansion of rice domestication in history. BMC evolutionary biology 10, 15, doi:10.1186/1471-2148-10-15 (2010).
OpenUrl CrossRef PubMed
↵
Schwab, S. G. et al. Association of DNA polymorphisms in the synaptic vesicular amine transporter gene (SLC18A2) with alcohol and nicotine dependence. Neuropsychopharmacology: official publication of the American College of Neuropsychopharmacology 30, 2263–2268, doi:10.1038/sj.npp.1300809 (2005).
OpenUrl CrossRef PubMed Web of Science
Weiss, R. B. et al. A candidate gene approach identifies the CHRNA5-A3-B4 region as a risk factor for age-dependent nicotine addiction. PLoS genetics 4, e1000125, doi:10.1371/journal.pgen.1000125 (2008).
OpenUrl CrossRef PubMed
Hoft, N. R. et al. SNPs in CHRNA6 and CHRNB3 are associated with alcohol consumption in a nationally representative sample. Genes, brain, and behavior 8, 631–637, doi:10.1111/j.1601-183X.2009.00495.x (2009).
OpenUrl CrossRef PubMed Web of Science
↵
Hoft, N. R. et al. Genetic association of the CHRNA6 and CHRNB3 genes with tobacco dependence in a nationally representative sample. Neuropsychopharmacology: official publication of the American College of Neuropsychopharmacology 34, 698–706, doi:10.1038/npp.2008.122 (2009).
OpenUrl CrossRef PubMed Web of Science
↵
Catsburg, C. et al. Polymorphisms in carcinogen metabolism enzymes, fish intake, and risk of prostate cancer. Carcinogenesis 33, 1352–1359, doi:10.1093/carcin/bgs175 (2012).
OpenUrl CrossRef PubMed Web of Science
Wang, J. et al. Carcinogen metabolism genes, red meat and poultry intake, and colorectal cancer risk. Int J Cancer 130, 1898–1907 (2011).
OpenUrl
Cotterchio, M. et al. Red meat intake, doneness, polymorphisms in genes that encode carcinogen-metabolizing enzymes, and colorectal cancer risk. Cancer epidemiology, biomarkers & prevention: a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology 17, 3098–3107, doi:10.1158/1055-9965.EPI-08-0341 (2008).
OpenUrl Abstract/FREE Full Text
↵
Stephenson, N., Beckmann, L. & Chang-Claude, J. Carcinogen metabolism, cigarette smoking, and breast cancer risk: a Bayes model averaging approach. Epidemiologic perspectives & innovations: EP+I 7, 10, doi:10.1186/1742-5573-7-10 (2010).
OpenUrl CrossRef
↵
Sissung, T. M. et al. Association of the CYP1B1*3 allele with survival in patients with prostate cancer receiving docetaxel. Mol Cancer Ther 7, 19–26 (2008).
OpenUrl Abstract/FREE Full Text
↵
Bigos, K. L. et al. Genetic variation in CYP3A43 explains racial difference in olanzapine clearance. Molecular psychiatry 16, 620–625, doi:10.1038/mp.2011.38 (2011).
OpenUrl CrossRef PubMed
↵
Ovsyannikova, I. G. et al. The role of polymorphisms in Toll-like receptors and their associated intracellular signaling genes in measles vaccine immunity. Human genetics 130, 547–561, doi:10.1007/s00439-011-0977-x (2011).
OpenUrl CrossRef PubMed
↵
Ovsyannikova, I. G. et al. Effects of vitamin A and D receptor gene polymorphisms/haplotypes on immune responses to measles vaccine. Pharmacogenetics and genomics 22, 20–31, doi:10.1097/FPC.0b013e32834df186 (2012).
OpenUrl CrossRef PubMed
↵
Rodriguez-Antona, C., Sayi, J. G., Gustafsson, L. L., Bertilsson, L. & Ingelman-Sundberg, M. Phenotype-genotype variability in the human CYP3A locus as assessed by the probe drug quinine and analyses of variant CYP3A4 alleles. Biochemical and biophysical research communications 338, 299–305, doi:10.1016/j.bbrc.2005.09.020 (2005).
OpenUrl CrossRef PubMed Web of Science
↵
Yucesoy, B. et al. Genetic variants in antioxidant genes are associated with diisocyanate-induced asthma. Toxicological sciences: an official journal of the Society of Toxicology 129, 166–173, doi:10.1093/toxsci/kfs183 (2012).
OpenUrl CrossRef PubMed Web of Science
↵
Teichert, M. et al. Dependency of phenprocoumon dosage on polymorphisms in the VKORC1, CYP2C9, and CYP4F2 genes. Pharmacogenetics and genomics 21, 26–34, doi:10.1097/FPC.0b013e32834154fb (2011).
OpenUrl CrossRef
↵
Clarke, J. M. & Hurwitz, H. I. Understanding and targeting resistance to antiangiogenic therapies. Journal of gastrointestinal oncology 4, 253–263, doi:10.3978/j.issn.2078-6891.2013.036 (2013).
OpenUrl CrossRef
Schneider, B. P., Shen, F. & Miller, K. D. Pharmacogenetic biomarkers for the prediction of response to antiangiogenic treatment. The Lancet. Oncology 13, e427–436, doi:10.1016/S1470-2045(12)70275-9 (2012).
OpenUrl CrossRef PubMed
↵
Lambrechts, D. et al. VEGF pathway genetic variants as biomarkers of treatment outcome with bevacizumab: an analysis of data from the AViTA and AVOREN randomised trials. The Lancet. Oncology 13, 724–733, doi:10.1016/S1470-2045(12)70231-0 (2012).
OpenUrl CrossRef PubMed Web of Science
↵
Mahon, P. B., Zandi, P. P., Potash, J. B., Nestadt, G. & Wand, G. S. Genetic association of FKBP5 and CRHR1 with cortisol response to acute psychosocial stress in healthy adults. Psychopharmacology 227, 231–241, doi:10.1007/s00213-012-2956-x (2013).
OpenUrl CrossRef PubMed
Tyrka, A. R. et al. Interaction of childhood maltreatment with the corticotropinreleasing hormone receptor gene: effects on hypothalamic-pituitary-adrenal axis reactivity. Biological psychiatry 66, 681–685, doi:10.1016/j.biopsych.2009.05.012 (2009).
OpenUrl CrossRef PubMed Web of Science
↵
Ozomaro, U., Wahlestedt, C. & Nemeroff, C. B. Personalized medicine in psychiatry: problems and promises. BMC medicine 11, 132, doi: 10.1186/1741-7015-11-132 (2013).
OpenUrl CrossRef PubMed
↵
Broyl, A. et al. Mechanisms of peripheral neuropathy associated with bortezomib and vincristine in patients with newly diagnosed multiple myeloma: a prospective analysis of data from the HOVON-65/GMMG-HD4 trial. The Lancet. Oncology 11, 1057–1065, doi:10.1016/S1470-2045(10)70206-0 (2010).
OpenUrl CrossRef PubMed Web of Science
Kraus, S. et al. Impact of genetic polymorphisms on adenoma recurrence and toxicity in a COX2 inhibitor (celecoxib) trial: results from a pilot study. Pharmacogenetics and genomics 23, 428–437, doi:10.1097/FPC.0b013e3283631784 (2013).
OpenUrl CrossRef
Beuselinck, B. et al. VEGFR1 single nucleotide polymorphisms associated with outcome in patients with metastatic renal cell carcinoma treated with sunitinib - a multicentric retrospective analysis. Acta oncologica 53, 103–112, doi:10.3109/0284186X.2013.770600 (2014).
OpenUrl CrossRef PubMed Web of Science
↵
Miles, D. W. et al. Biomarker results from the AVADO phase 3 trial of first-line bevacizumab plus docetaxel for HER2-negative metastatic breast cancer. British journal of cancer 108, 1052–1060, doi:10.1038/bjc.2013.69 (2013).
OpenUrl CrossRef PubMed Web of Science
↵
Scheiman, J. M., Patel, P. M., Henson, E. K. & Nostrant, T. T. Effect of naproxen on gastroesophageal reflux and esophageal function: a randomized, double-blind, placebo-controlled study. The American journal of gastroenterology 90, 754–757 (1995).
OpenUrl PubMed
↵
Maisano Delser, P. & Fuselli, S. Human loci involved in drug biotransformation: worldwide genetic variation, population structure, and pharmacogenetic implications. Human genetics 132, 563–577, doi:10.1007/s00439-013-1268-5 (2013).
OpenUrl CrossRef PubMed
↵
Thangaraj, K. et al. Reconstructing the origin of Andaman Islanders. Science 308, 996, doi:10.1126/science.1109987 (2005).
OpenUrl Abstract/FREE Full Text
↵
Shah, A. M. et al. Indian Siddis: African descendants with Indian admixture. American journal of human genetics 89, 154–161 (2011).
OpenUrl CrossRef PubMed
↵
Govindaraj, P. et al. Genome-wide analysis correlates Ayurveda Prakriti. Scientific reports 5, 15786, doi:10.1038/srep15786 (2015).
OpenUrl CrossRef
↵
Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS genetics 2, e190, doi:10.1371/journal.pgen.0020190 (2006).
OpenUrl CrossRef PubMed
↵
Liu, X., Saw, W. Y., Ali, M., Ong, R. T. & Teo, Y. Y. Evaluating the possibility of detecting evidence of positive selection across Asia with sparse genotype data from the HUGO Pan-Asian SNP Consortium. BMC genomics 15, 332, doi:10.1186/1471-2164-15-332 (2014).
OpenUrl CrossRef PubMed
Chen, H., Patterson, N. & Reich, D. Population differentiation as a test for selective sweeps. Genome research 20, 393–402, doi:10.1101/gr.100545.109 (2010).
OpenUrl Abstract/FREE Full Text
↵
Sabeti, P. C. et al. Genome-wide detection and characterization of positive selection in human populations. Nature 449, 913–918 (2007).
OpenUrl CrossRef PubMed Web of Science
↵
Beaumont MA, N. R. Evaluating Loci for Use in the Genetic Analysis of Population Structure. Proc R Soc Lond B 263, 1619–1626 (1996).
OpenUrl CrossRef
↵
Hofer, T., Foll, M. & Excoffier, L. Evolutionary forces shaping genomic islands of population differentiation in humans. BMC genomics 13, 107, doi:10.1186/1471-2164-13-107 (2012).
OpenUrl CrossRef PubMed
↵
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81, 559–575, doi:10.1086/519795 (2007).
OpenUrl CrossRef PubMed
↵
Zhernakova, A. et al. Evolutionary and functional analysis of celiac risk loci reveals SH2B3 as a protective factor against bacterial infection. American journal of human genetics 86, 970–977, doi:10.1016/j.ajhg.2010.05.004 (2010).
OpenUrl CrossRef PubMed Web of Science

View the discussion thread.

Posted March 03, 2017.

Download PDF

Supplementary Material

Citation Tools

Subject Area

Evolutionary Biology

Subject Areas

All Articles

Animal Behavior and Cognition (5210)
Biochemistry (11736)
Bioengineering (8746)
Bioinformatics (29186)
Biophysics (14964)
Cancer Biology (12084)
Cell Biology (17401)
Clinical Trials (138)
Developmental Biology (9418)
Ecology (14176)
Epidemiology (2067)
Evolutionary Biology (18299)
Genetics (12235)
Genomics (16793)
Immunology (11863)
Microbiology (28066)
Molecular Biology (11580)
Neuroscience (60925)
Paleontology (451)
Pathology (1870)
Pharmacology and Toxicology (3238)
Physiology (4956)
Plant Biology (10422)
Scientific Communication and Education (1683)
Synthetic Biology (2883)
Systems Biology (7338)
Zoology (1650)

[1] ↵
Underhill, P. A. & Kivisild, T. Use of y chromosome and mitochondrial DNA population structure in tracing human migrations. Annual review of genetics 41, 539–564, doi:10.1146/annurev.genet.41.110306.130407 (2007).
OpenUrl CrossRef PubMed Web of Science

[2] ↵
Harris, K. & Nielsen, R. Inferring demographic history from a spectrum of shared haplotype lengths. PLoS genetics 9, e1003521, doi:10.1371/journal.pgen.1003521 (2013).
OpenUrl CrossRef PubMed

[3] ↵
Gronau, I., Hubisz, M. J., Gulko, B., Danko, C. G. & Siepel, A. Bayesian inference of ancient human demography from individual genome sequences. Nature genetics 43, 1031–1034, doi:10.1038/ng.937 (2011).
OpenUrl CrossRef PubMed

[4] ↵
Leopold, A. C. & Ardrey, R. Toxic substances in plants and the food habits of early man. Science 176, 512–514 (1972).
OpenUrl Abstract/FREE Full Text

[5] ↵
Jagerstad, M. & Skog, K. Genotoxicity of heat-processed foods. Mutation research 574, 156–172, doi:10.1016/j.mrfmmm.2005.01.030 (2005).
OpenUrl CrossRef PubMed Web of Science

[6] Sugimura, T., Wakabayashi, K., Nakagama, H. & Nagao, M. Heterocyclic amines: Mutagens/carcinogens produced during cooking of meat and fish. Cancer science 95, 290–299 (2004).
OpenUrl CrossRef PubMed

[7] Ito, N. et al. A new colon and mammary carcinogen in cooked food, 2-amino-1-methyl-6-phenylimidazo[4,5-b]pyridine (PhIP). Carcinogenesis 12, 1503–1506 (1991).
OpenUrl CrossRef PubMed Web of Science

[8] Sinha, R. et al. Lower levels of urinary 2-amino-3,8-dimethylimidazo[4,5-f]-quinoxaline (MeIQx) in humans with higher CYP1A2 activity. Carcinogenesis 16, 2859–2861 (1995).
OpenUrl CrossRef PubMed Web of Science

[9] ↵
Moonen, H., Engels, L., Kleinjans, J. & Kok, T. The CYP1A2-164A-->C polymorphism (CYP1A2*1F) is associated with the risk for colorectal adenomas in humans. Cancer Lett 229, 25–31 (2005).
OpenUrl CrossRef PubMed Web of Science

[10] ↵
Li, J., Zhang, L., Zhou, H., Stoneking, M. & Tang, K. Global patterns of genetic diversity and signals of natural selection for human ADME genes. Human molecular genetics 20, 528–540, doi:10.1093/hmg/ddq498 (2011).
OpenUrl CrossRef PubMed Web of Science

[11] ↵
Janha, R. E. et al. Inactive alleles of cytochrome P450 2C19 may be positively selected in human evolution. BMC evolutionary biology 14, 71, doi:10.1186/1471-2148-14-71 (2014).
OpenUrl CrossRef PubMed

[12] ↵
Pritchard, J. K., Pickrell, J. K. & Coop, G. The genetics of human adaptation: hard sweeps, soft sweeps, and polygenic adaptation. Current biology: CB 20, R208–215, doi:10.1016/j.cub.2009.11.055 (2010).
OpenUrl CrossRef PubMed Web of Science

[13] ↵
Voight, B. F., Kudaravalli, S., Wen, X. & Pritchard, J. K. A map of recent positive selection in the human genome. PLoS biology 4, e72, doi:10.1371/journal.pbio.0040072 (2006).
OpenUrl CrossRef PubMed

[14] ↵
Langhi, D. M., Jr.. & Bordin, J. O. Duffy blood group and malaria. Hematology 11, 389–398, doi:10.1080/10245330500469841 (2006).
OpenUrl CrossRef PubMed

[15] ↵
Daub, J. T. et al. Evidence for polygenic adaptation to pathogens in the human genome. Molecular biology and evolution 30, 1544–1558, doi:10.1093/molbev/mst080 (2013).
OpenUrl CrossRef PubMed Web of Science

[16] ↵
Whirl-Carrillo, M. et al. Pharmacogenomics knowledge for personalized medicine. Clinical pharmacology and therapeutics 92, 414–417, doi:10.1038/clpt.2012.96 (2012).
OpenUrl CrossRef PubMed

[17] ↵
Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences of the United States of America 100, 9440–9445, doi:10.1073/pnas.1530509100 (2003).
OpenUrl Abstract/FREE Full Text

[18] ↵
Birley, A. J. et al. ADH single nucleotide polymorphism associations with alcohol metabolism in vivo. Human molecular genetics 18, 1533–1542, doi:10.1093/hmg/ddp060 (2009).
OpenUrl CrossRef PubMed Web of Science

[19] ↵
Luo, X. et al. Multiple ADH genes modulate risk for drug dependence in both African- and European-Americans. Human molecular genetics 16, 380–390, doi:10.1093/hmg/ddl460 (2007).
OpenUrl CrossRef PubMed

[20] ↵
Han, Y. et al. Evidence of positive selection on a class I ADH locus. American journal of human genetics 80, 441–456, doi:10.1086/512485 (2007).
OpenUrl CrossRef PubMed

[21] ↵
Peng, Y. et al. The ADH1B Arg47His polymorphism in east Asian populations and expansion of rice domestication in history. BMC evolutionary biology 10, 15, doi:10.1186/1471-2148-10-15 (2010).
OpenUrl CrossRef PubMed

[22] ↵
Schwab, S. G. et al. Association of DNA polymorphisms in the synaptic vesicular amine transporter gene (SLC18A2) with alcohol and nicotine dependence. Neuropsychopharmacology: official publication of the American College of Neuropsychopharmacology 30, 2263–2268, doi:10.1038/sj.npp.1300809 (2005).
OpenUrl CrossRef PubMed Web of Science

[23] Weiss, R. B. et al. A candidate gene approach identifies the CHRNA5-A3-B4 region as a risk factor for age-dependent nicotine addiction. PLoS genetics 4, e1000125, doi:10.1371/journal.pgen.1000125 (2008).
OpenUrl CrossRef PubMed

[24] Hoft, N. R. et al. SNPs in CHRNA6 and CHRNB3 are associated with alcohol consumption in a nationally representative sample. Genes, brain, and behavior 8, 631–637, doi:10.1111/j.1601-183X.2009.00495.x (2009).
OpenUrl CrossRef PubMed Web of Science

[25] ↵
Hoft, N. R. et al. Genetic association of the CHRNA6 and CHRNB3 genes with tobacco dependence in a nationally representative sample. Neuropsychopharmacology: official publication of the American College of Neuropsychopharmacology 34, 698–706, doi:10.1038/npp.2008.122 (2009).
OpenUrl CrossRef PubMed Web of Science

[26] ↵
Catsburg, C. et al. Polymorphisms in carcinogen metabolism enzymes, fish intake, and risk of prostate cancer. Carcinogenesis 33, 1352–1359, doi:10.1093/carcin/bgs175 (2012).
OpenUrl CrossRef PubMed Web of Science

[27] Wang, J. et al. Carcinogen metabolism genes, red meat and poultry intake, and colorectal cancer risk. Int J Cancer 130, 1898–1907 (2011).
OpenUrl

[28] Cotterchio, M. et al. Red meat intake, doneness, polymorphisms in genes that encode carcinogen-metabolizing enzymes, and colorectal cancer risk. Cancer epidemiology, biomarkers & prevention: a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology 17, 3098–3107, doi:10.1158/1055-9965.EPI-08-0341 (2008).
OpenUrl Abstract/FREE Full Text

[29] ↵
Stephenson, N., Beckmann, L. & Chang-Claude, J. Carcinogen metabolism, cigarette smoking, and breast cancer risk: a Bayes model averaging approach. Epidemiologic perspectives & innovations: EP+I 7, 10, doi:10.1186/1742-5573-7-10 (2010).
OpenUrl CrossRef

[30] ↵
Sissung, T. M. et al. Association of the CYP1B1*3 allele with survival in patients with prostate cancer receiving docetaxel. Mol Cancer Ther 7, 19–26 (2008).
OpenUrl Abstract/FREE Full Text

[31] ↵
Bigos, K. L. et al. Genetic variation in CYP3A43 explains racial difference in olanzapine clearance. Molecular psychiatry 16, 620–625, doi:10.1038/mp.2011.38 (2011).
OpenUrl CrossRef PubMed

[32] ↵
Ovsyannikova, I. G. et al. The role of polymorphisms in Toll-like receptors and their associated intracellular signaling genes in measles vaccine immunity. Human genetics 130, 547–561, doi:10.1007/s00439-011-0977-x (2011).
OpenUrl CrossRef PubMed

[33] ↵
Ovsyannikova, I. G. et al. Effects of vitamin A and D receptor gene polymorphisms/haplotypes on immune responses to measles vaccine. Pharmacogenetics and genomics 22, 20–31, doi:10.1097/FPC.0b013e32834df186 (2012).
OpenUrl CrossRef PubMed

[34] ↵
Rodriguez-Antona, C., Sayi, J. G., Gustafsson, L. L., Bertilsson, L. & Ingelman-Sundberg, M. Phenotype-genotype variability in the human CYP3A locus as assessed by the probe drug quinine and analyses of variant CYP3A4 alleles. Biochemical and biophysical research communications 338, 299–305, doi:10.1016/j.bbrc.2005.09.020 (2005).
OpenUrl CrossRef PubMed Web of Science

[35] ↵
Yucesoy, B. et al. Genetic variants in antioxidant genes are associated with diisocyanate-induced asthma. Toxicological sciences: an official journal of the Society of Toxicology 129, 166–173, doi:10.1093/toxsci/kfs183 (2012).
OpenUrl CrossRef PubMed Web of Science

[36] ↵
Teichert, M. et al. Dependency of phenprocoumon dosage on polymorphisms in the VKORC1, CYP2C9, and CYP4F2 genes. Pharmacogenetics and genomics 21, 26–34, doi:10.1097/FPC.0b013e32834154fb (2011).
OpenUrl CrossRef

[37] ↵
Clarke, J. M. & Hurwitz, H. I. Understanding and targeting resistance to antiangiogenic therapies. Journal of gastrointestinal oncology 4, 253–263, doi:10.3978/j.issn.2078-6891.2013.036 (2013).
OpenUrl CrossRef

[38] Schneider, B. P., Shen, F. & Miller, K. D. Pharmacogenetic biomarkers for the prediction of response to antiangiogenic treatment. The Lancet. Oncology 13, e427–436, doi:10.1016/S1470-2045(12)70275-9 (2012).
OpenUrl CrossRef PubMed

[39] ↵
Lambrechts, D. et al. VEGF pathway genetic variants as biomarkers of treatment outcome with bevacizumab: an analysis of data from the AViTA and AVOREN randomised trials. The Lancet. Oncology 13, 724–733, doi:10.1016/S1470-2045(12)70231-0 (2012).
OpenUrl CrossRef PubMed Web of Science

[40] ↵
Mahon, P. B., Zandi, P. P., Potash, J. B., Nestadt, G. & Wand, G. S. Genetic association of FKBP5 and CRHR1 with cortisol response to acute psychosocial stress in healthy adults. Psychopharmacology 227, 231–241, doi:10.1007/s00213-012-2956-x (2013).
OpenUrl CrossRef PubMed

[41] Tyrka, A. R. et al. Interaction of childhood maltreatment with the corticotropinreleasing hormone receptor gene: effects on hypothalamic-pituitary-adrenal axis reactivity. Biological psychiatry 66, 681–685, doi:10.1016/j.biopsych.2009.05.012 (2009).
OpenUrl CrossRef PubMed Web of Science

[42] ↵
Ozomaro, U., Wahlestedt, C. & Nemeroff, C. B. Personalized medicine in psychiatry: problems and promises. BMC medicine 11, 132, doi: 10.1186/1741-7015-11-132 (2013).
OpenUrl CrossRef PubMed

[43] ↵
Broyl, A. et al. Mechanisms of peripheral neuropathy associated with bortezomib and vincristine in patients with newly diagnosed multiple myeloma: a prospective analysis of data from the HOVON-65/GMMG-HD4 trial. The Lancet. Oncology 11, 1057–1065, doi:10.1016/S1470-2045(10)70206-0 (2010).
OpenUrl CrossRef PubMed Web of Science

[44] Kraus, S. et al. Impact of genetic polymorphisms on adenoma recurrence and toxicity in a COX2 inhibitor (celecoxib) trial: results from a pilot study. Pharmacogenetics and genomics 23, 428–437, doi:10.1097/FPC.0b013e3283631784 (2013).
OpenUrl CrossRef

[45] Beuselinck, B. et al. VEGFR1 single nucleotide polymorphisms associated with outcome in patients with metastatic renal cell carcinoma treated with sunitinib - a multicentric retrospective analysis. Acta oncologica 53, 103–112, doi:10.3109/0284186X.2013.770600 (2014).
OpenUrl CrossRef PubMed Web of Science

[46] ↵
Miles, D. W. et al. Biomarker results from the AVADO phase 3 trial of first-line bevacizumab plus docetaxel for HER2-negative metastatic breast cancer. British journal of cancer 108, 1052–1060, doi:10.1038/bjc.2013.69 (2013).
OpenUrl CrossRef PubMed Web of Science

[47] ↵
Scheiman, J. M., Patel, P. M., Henson, E. K. & Nostrant, T. T. Effect of naproxen on gastroesophageal reflux and esophageal function: a randomized, double-blind, placebo-controlled study. The American journal of gastroenterology 90, 754–757 (1995).
OpenUrl PubMed

[48] ↵
Maisano Delser, P. & Fuselli, S. Human loci involved in drug biotransformation: worldwide genetic variation, population structure, and pharmacogenetic implications. Human genetics 132, 563–577, doi:10.1007/s00439-013-1268-5 (2013).
OpenUrl CrossRef PubMed

[49] ↵
Thangaraj, K. et al. Reconstructing the origin of Andaman Islanders. Science 308, 996, doi:10.1126/science.1109987 (2005).
OpenUrl Abstract/FREE Full Text

[50] ↵
Shah, A. M. et al. Indian Siddis: African descendants with Indian admixture. American journal of human genetics 89, 154–161 (2011).
OpenUrl CrossRef PubMed

[51] ↵
Govindaraj, P. et al. Genome-wide analysis correlates Ayurveda Prakriti. Scientific reports 5, 15786, doi:10.1038/srep15786 (2015).
OpenUrl CrossRef

[52] ↵
Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS genetics 2, e190, doi:10.1371/journal.pgen.0020190 (2006).
OpenUrl CrossRef PubMed

[53] ↵
Liu, X., Saw, W. Y., Ali, M., Ong, R. T. & Teo, Y. Y. Evaluating the possibility of detecting evidence of positive selection across Asia with sparse genotype data from the HUGO Pan-Asian SNP Consortium. BMC genomics 15, 332, doi:10.1186/1471-2164-15-332 (2014).
OpenUrl CrossRef PubMed

[54] Chen, H., Patterson, N. & Reich, D. Population differentiation as a test for selective sweeps. Genome research 20, 393–402, doi:10.1101/gr.100545.109 (2010).
OpenUrl Abstract/FREE Full Text

[55] ↵
Sabeti, P. C. et al. Genome-wide detection and characterization of positive selection in human populations. Nature 449, 913–918 (2007).
OpenUrl CrossRef PubMed Web of Science

[56] ↵
Beaumont MA, N. R. Evaluating Loci for Use in the Genetic Analysis of Population Structure. Proc R Soc Lond B 263, 1619–1626 (1996).
OpenUrl CrossRef

[57] ↵
Hofer, T., Foll, M. & Excoffier, L. Evolutionary forces shaping genomic islands of population differentiation in humans. BMC genomics 13, 107, doi:10.1186/1471-2164-13-107 (2012).
OpenUrl CrossRef PubMed

[58] ↵
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81, 559–575, doi:10.1086/519795 (2007).
OpenUrl CrossRef PubMed

[59] ↵
Zhernakova, A. et al. Evolutionary and functional analysis of celiac risk loci reveals SH2B3 as a protective factor against bacterial infection. American journal of human genetics 86, 970–977, doi:10.1016/j.ajhg.2010.05.004 (2010).
OpenUrl CrossRef PubMed Web of Science

Signatures of natural selection in the drug metabolizing enzyme genes: Opportunity for developing personalized and precision medicine

Abstract

Background

Results and discussions

Evidence of natural selection

Modifiers ADME genes are most influenced under natural selection

Soft sweeps (polygenic adaptation) signals of natural selection

Hard sweeps signals of natural selection

Clinical annotation of hard sweeps signals through literatures

Expression analysis

Signal of ancient positive natural selection

Conclusions

Methods

SNP data

Imputation of missing genotypes

Principal component analysis (PCA)

Test for natural selection

Clinical annotation of Hard-sweep signals

Expression analysis

Age of variants

Declarations

Ethics approval and consent to participate

Consent for publication

Availability of data and material

Competing interests

Funding

Authors’ contributions

Acknowledgements

References

Citation Manager Formats

Subject Area