Genomic analysis of European Drosophila populations reveals longitudinal structure and continent-wide selection

Martin Kapun; Maite G. Barrón; Fabian Staubach; Jorge Vieira; Darren J. Obbard; Clément Goubert; Omar Rota-Stabelli; Maaria Kankare; Annabelle Haudry; R. Axel W. Wiberg; Lena Waidele; Iryna Kozeretska; Elena G. Pasyukova; Volker Loeschcke; Marta Pascual; Cristina P. Vieira; Svitlana Serga; Catherine Montchamp-Moreau; Jessica Abbott; Patricia Gibert; Damiano Porcelli; Nico Posnien; Sonja Grath; Élio Sucena; Alan O. Bergland; Maria Pilar Garcia Guerreiro; Banu Sebnem Onder; Eliza Argyridou; Lain Guio; Mads Fristrup Schou; Bart Deplancke; Cristina Vieira; Michael G. Ritchie; Bas J. Zwaan; Eran Tauber; Dorcas J. Orengo; Eva Puerma; Montserrat Aguadé; Paul S. Schmidt; John Parsch; Andrea J. Betancourt; Thomas Flatt; Josefa González

doi:10.1101/313759

Abstract

Genetic variation is the fuel of evolution. However, analyzing evolutionary dynamics in natural populations is challenging: sequencing of entire populations remains costly and comprehensive sampling is logistically difficult. To tackle this issue and to define relevant spatial and temporal scales of variation, we have founded the European Drosophila Population Genomics Consortium (DrosEU). Here we present the first analysis of 48 D. melanogaster population samples collected across Europe in 2014. Our analysis uncovers novel patterns of variation at multiple levels: genome-wide neutral SNPs, mtDNA haplotypes, inversions, and TEs showing previously cryptic longitudinal population structure; signatures of selective sweeps shared among populations; presumably adaptive clines in inversions; and geographic variation in TEs. Additionally, we document highly variable microbiota and identify several new Drosophila viruses. Our study reveals novel aspects of the population biology of D. melanogaster and illustrates the power of extensive sampling and pooled sequencing of populations on a continent-wide scale.

Introduction

Genetic variation is the raw material for evolutionary change. Understanding the processes that create and maintain variation in natural populations remains a fundamental goal in evolutionary biology. The identification of patterns of genetic variation within and among taxa (Dobzhansky 1970; Lewontin 1974; Kreitman 1983; Kimura 1984; Hudson et al. 1987; McDonald & Kreitman 1991; e.g., Adrian & Comeron 2013) provides fundamental insights into the action of various evolutionary forces. Historically, due to technological constraints, studies of genetic variation were limited to single loci or small genomic regions and to static sampling of small numbers of individuals from natural populations. The development of population genomics has extended such analyses to patterns of variation on a genome-wide scale (e.g., Black et al. 2001; Jorde et al. 2001; Luikart et al. 2003; Begun et al. 2007; Sella et al. 2009; Charlesworth 2010; Casillas & Barbadilla 2017). This has resulted in fundamental advances in our understanding of historical and contemporaneous evolutionary dynamics in natural populations (e.g., Sella et al. 2009; Hohenlohe et al. 2010; Cheng et al. 2012; Fabian et al. 2012; Pool et al. 2012; Messer & Petrov 2013; Ellegren 2014; Harpur et al. 2014; Kapun et al. 2014; Bergland et al. 2014; Charlesworth 2015; Zanini et al. 2015; Kapun et al. 2016a; Casillas & Barbadilla 2017).

However, large-scale sampling and genome sequencing of entire populations remain largely prohibitive because of sequencing costs and labor-intensive sample collection, limiting the number of populations that can be analyzed. Evolution is a highly dynamic process across a variety of spatial scales in many taxa; thus, to generate a comprehensive context for population genomic analyses, it is essential to define the appropriate spatial scales of analysis, from meters to thousands of kilometers (Levins 1968; Endler 1977; Richardson et al. 2014). Furthermore, one-time sampling of natural populations provides only a static view of patterns of genetic variation. Allele frequency changes can be highly dynamic even across very short timescales (e.g., Umina et al. 2005; Bergland et al. 2014; Behrman et al. 2018), and theoretical work suggests that such temporal dynamics may be an important yet largely understudied mechanism by which genetic variation is maintained (Wittmann et al. 2017). It is thus essential to define the relevant spatio-temporal scales for sampling and population genomic analyses accordingly.

To generate a population genomic framework that can deliver appropriate high-resolution sampling and to provide a unique resource to the research community, we formed the European Drosophila Population Genomics Consortium (DrosEU; https://droseu.net). Our primary objective is to utilize the strengths of this consortium to extensively sample and sequence European populations of Drosophila melanogaster on a continent-wide scale and across distinct timescales. In close cooperation with a complementary effort focused on North American populations, the Drosophila Real Time Evolution Consortium (Dros-RTEC; http://web.sas.upenn.edu/paul-schmidt-lab/dros-rtec/), our long-term goal is to define the appropriate spatio-temporal scales at which populations should be sampled and analyzed and to gain novel insights into the dynamics of genetic variation.

D. melanogaster offers several advantages for such a concerted sampling and analysis effort: a relatively small genome, a broad geographic range, a multivoltine life history that allows sampling across generations over short timescales, ease of sampling natural populations using standardized techniques, an extensive research community and a well-developed context for population genomic analysis (Powell 1997; Keller 2007; Hales et al. 2015). The species is native to sub-Saharan Africa and has subsequently expanded its range into novel habitats in Europe over the last 10,000-15,000 years and in North America and Australia in the last several hundred years (e.g., Lachaise et al. 1988; David & Capy 1988; Keller 2007). On both the North American and Australian continents, the prevalence of latitudinal clines in frequencies of alleles (e.g., Schmidt & Paaby 2008; Turner et al. 2008; Kolaczkowski et al. 2011b; Fabian et al. 2012; Bergland et al. 2014; Machado et al. 2016; Kapun et al. 2016a), structural variants such as chromosomal inversions (Mettler et al. 1977; Voelker et al. 1978; Knibb et al. 1981; Knibb 1982; 1986; Anderson et al. 1991; Rako et al. 2006; Kapun et al. 2014; Rane et al. 2015; Kapun et al. 2016a) and transposable elements (TEs) (Boussy et al. 1998; González et al. 2008; 2010), as well as complex phenotypes (de Jong & Bochdanovits 2003; Schmidt & Paaby 2008; Schmidt et al. 2008; Flatt et al. 2013; Adrion et al. 2015 and references therein; Kapun et al. 2016b; Behrman et al. 2018) have been interpreted to result from local adaptation to environmental factors that co-vary with latitude or as the legacy of an out-of-Africa dispersal history. However, sampling across these latitudinal gradients has not been replicated outside of a single transect on the east coasts of both continents. 
The observed latitudinal clines on the east coasts of North America and Australia may have been generated, at least in part, by demography and differential colonization histories of populations at high and low latitudes (Bergland et al. 2016). In North America, for example, temperate populations appear to be largely of European origin, whereas low-latitude populations show evidence of greater admixture from ancestral African populations and the Caribbean (Caracristi & Schlötterer 2003; Yukilevich & True 2008a; b; Duchen et al. 2013; Kao et al. 2015; Bergland et al. 2016). More intensive sampling and analysis of both African as well as European populations is thus essential to disentangling the relative importance of local adaptation versus colonization history and demography in generating the clinal patterns that have been widely observed. While there has been a great deal of progress in the analysis of ancestral African populations (e.g., Begun & Aquadro 1993; Corbett-Detig & Hartl 2012; Pool et al. 2012; Fabian et al. 2015; Lack et al. 2015; 2016), the European continent remains largely uncharacterized at the population genomic level (Božičević et al. 2016; Pool et al. 2016; Mateo et al. 2018).

Here, we present the first analysis of the DrosEU pool-sequencing data from a set of 48 European population samples collected in 2014. Our main focus is on describing spatial variation across the European continent. A similar consortium, Dros-RTEC, has been organized mainly in the United States. While the two consortia share the common goal of widespread and coordinated sampling, the Dros-RTEC consortium concentrates on seasonal dynamics in North American populations (Machado et al. 2018). We examine the 2014 DrosEU data at four levels: (1) patterns of variation at single-nucleotide polymorphisms (SNPs) in the nuclear (∼5.5 × 10⁶ SNPs) and mitochondrial (mtDNA) genomes; (2) variation in copy number of transposable elements (TEs); (3) cosmopolitan chromosomal inversions previously associated with climate adaptation; and (4) variation among populations in microbiota, including endosymbionts, bacteria, and viruses (Figure 1).

Figure 1. The conceptual framework of the DrosEU consortium.

By intensive spatio-temporal sampling of natural populations of Drosophila melanogaster, the European Drosophila Population Genomics Consortium (DrosEU; http://droseu.net/) aims to uncover the factors that shape the evolutionary dynamics of this exemplary model organism. Each of the repeatedly and consistently sampled DrosEU populations is subject to evolutionary forces (“Evolution axis”, from neutral to adaptive evolution) in interaction with the environment (“Environment axis”, from local aspects to global patterns, including spatial factors). In addition, there are several dimensions along which the fly genomes can be studied, from single SNPs and genes to structural variants and co-evolving genomes (“Genomics axis”), both over short and long timescales (“Timescale axis”).

We find that European populations of D. melanogaster exhibit novel patterns of variation at all levels investigated: neutral SNPs in the nuclear genome and mtDNA haplotypes that reveal previously unknown longitudinal population structure; genomic regions consistent with selective sweeps that indicate selection on a continent-wide scale; new evidence for inversion clines in Europe; and spatio-temporal variation in TE frequencies. We also identify four new DNA viruses and for the first time assemble the complete genome of a fifth. These novel features are revealed by the comprehensive magnitude of our coordinated sampling, thus demonstrating the utility of this approach.

Together with other large-scale genomic datasets for D. melanogaster (Casillas & Barbadilla 2017) our data provide a rich and powerful community resource for studies of molecular population genetics. Importantly, the DrosEU dataset represents the first comprehensive characterization of genetic variation in D. melanogaster on the European continent and might yield important insights into how this species has adapted to temperate climates after its migration out of Africa.

Results

As part of the DrosEU effort, we collected and sequenced 48 population samples of D. melanogaster from 32 geographical locations across Europe in 2014 (Table 1; Figure 2 and Figure 3A).

Figure 2. The geographic distribution of population samples.

The map shows the geographic locations of all samples in the 2014 DrosEU dataset. The color of the circles indicates the sampling season for each location (see Table 1 and Supplemental Table 1). Note that some of the 12 Ukrainian locations overlap in the map.

Figure 3. Sampling and data analysis pipeline.

The schematic diagram shows the workflow of data collection and processing (A) followed by bioinformatic approaches used for quality assessment and read mapping (B) as well as the downstream analyses (C) conducted in this study (see Materials and Methods for further information).

Table 1. Sample information for all populations in the DrosEU dataset.

The table shows the origin, collection date, season, and sample size (number of chromosomes: n) of the 48 samples in the DrosEU dataset. Additional information can be found in the supporting information in Table S1.

While our analyses focus on spatial patterns, thirteen of the 32 locations were sampled repeatedly during the year (at least twice, once in summer and once in fall), allowing a first, crude analysis of seasonal changes in allele frequencies on a genome-wide level (Figure 2). For an extensive analysis of temporal (seasonal) patterns in mainly North American populations see the companion paper by Machado et al. (2018). All 48 samples were sequenced to high coverage, with a mean coverage per population of >50x (Table S1 and Figure 4).

Figure 4. Chromosome-wide average coverages.

Barplot showing the distribution of chromosome-wide coverages for X-chromosomes (black) and autosomes (grey). The error bars represent the standard deviation of coverages across all autosomal arms (2L, 2R, 3L, 3R).

Using this high-quality dataset, we performed the first comprehensive, continent-wide genomic analysis of European D. melanogaster populations (Figure 3). In addition to nuclear SNPs, we also investigated variation in mtDNA, TE insertions, chromosomal inversion polymorphisms, and the Drosophila-associated microbiome (Figure 3).

Most SNPs are widespread throughout Europe

We identified a total of 5,558,241 “high confidence” SNPs with frequencies > 0.1% across all 48 samples (Figure 3B, Table S1 and S2). Of these, 17% (941,080) were shared among all samples, whereas 62% were polymorphic in fewer than 50% of the samples (Figure 5A).

Figure 5 with 2 supplements. SNP sharing.

(A) Shared SNPs among DrosEU samples. Number and proportion of SNPs in different samples, ranging from one specific sample to being shared among all 48 samples. (B) Shared SNPs among three worldwide populations. Elliptic Venn diagram showing the number and proportion of SNPs overlapping among the 5,361,256 biallelic SNPs in DrosEU (Europe), 3,953,804 biallelic SNPs in DGRP (North America) and 4,643,511 biallelic SNPs in Zambia (Africa) populations.

Figure 5 - figure supplement 1. SNP sharing among DrosEU samples.

Number and proportion of SNPs in 33 different samples, each from one location. For locations with more than one sample, we took (A) the earliest sample or (B) the latest sample of the season.

Figure 5 - figure supplement 2. SNP sharing with other populations by frequency bins.

Elliptic Venn-diagram showing the proportion of SNPs shared and specific to each population, DrosEU (Europe), DGRP (North America) and Zambia (Africa) for different frequency bins: (A) frequency ≥ 0.5, (B) frequency ≥ 0.1 and < 0.5 and (C) frequency < 0.1. For proportions <1%, the number of SNPs is not depicted; DrosEU-Zambia: 4,088 SNPs, DrosEU-DGRP: 24,049 SNPs, DrosEU-DGRP-Zambia: 3,627 SNPs.

Due to our filtering scheme, SNPs that are private or nearly private to a sample will be recovered only if they are at a substantial frequency in that sample (∼5%). In fact, only a small proportion of SNPs (1% = 3,645) was found in fewer than 10% of the samples, and only 0.004% (210) were specific to a single sample (Figure 5A). To avoid an excess contribution of SNPs from populations with multiple (seasonal) sampling, we repeated the analysis by considering only the earliest (Figure 5 - figure supplement 1A) or the latest (Figure 5 - figure supplement 1B) sample from populations with seasonal data. We observed similar patterns across the three analyses: (i) a very small number of sample-specific, private SNPs (210, 527 and 455, respectively), (ii) a majority of SNPs shared among 20% to 40% of the samples (53%, 52% and 52%, respectively), and (iii) a substantial proportion shared among all samples (17%, 20% and 19%, respectively; Figure 5A and Figure 5 - figure supplement 1). These results suggest that most SNPs are geographically widespread in Europe and that genetic differentiation among populations is moderate, consistent with high levels of gene flow across the European continent.
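The sharing analysis described above can be illustrated with a minimal sketch. It assumes a hypothetical input in which each SNP is represented by a list of minor-allele frequencies, one per sample; the function names and the toy data are illustrative and do not reproduce the DrosEU pipeline or its filtering scheme.

```python
from collections import Counter


def sharing_spectrum(freqs_by_snp, min_freq=0.0):
    """Count the SNPs that are polymorphic in exactly k samples.

    freqs_by_snp: one list of allele frequencies per SNP (one value per sample).
    min_freq: hypothetical frequency cutoff; a SNP counts as polymorphic in a
              sample if min_freq < frequency < 1 - min_freq.
    Returns a Counter mapping k (number of samples) to the number of SNPs.
    """
    spectrum = Counter()
    for freqs in freqs_by_snp:
        n_poly = sum(1 for f in freqs if min_freq < f < 1.0 - min_freq)
        if n_poly > 0:
            spectrum[n_poly] += 1
    return spectrum


def proportion_shared_by_all(spectrum, n_samples):
    """Proportion of SNPs that are polymorphic in every sample."""
    total = sum(spectrum.values())
    return spectrum[n_samples] / total if total else 0.0


# Toy data for three samples (illustrative values only).
toy = [
    [0.2, 0.3, 0.1],  # polymorphic in all three samples
    [0.0, 0.4, 0.0],  # private to the second sample
    [0.5, 0.0, 0.2],  # shared by two samples
    [0.1, 0.1, 0.3],  # polymorphic in all three samples
]
spec = sharing_spectrum(toy)
print(dict(spec))
print(proportion_shared_by_all(spec, 3))
```

Restricting the input to one sample per location, as in the earliest-sample and latest-sample analyses, only changes which columns are passed in.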

Derived European and North American populations share more SNPs with each other than they do with an ancestral African population

D. melanogaster originated in sub-Saharan Africa, migrated to Europe ∼10,000-15,000 years ago, and subsequently colonized the rest of the world, including North America and Australia ∼150 years ago (Lachaise et al. 1988; David & Capy 1988; Keller 2007). To search for genetic signatures of this shared history, we investigated the amount of allele sharing between African, European, and North American populations. We compared our SNP set to two published datasets, one from Zambia in sub-Saharan Africa (DPGP3; Lack et al. 2015) and one from North Carolina in North America (DGRP; Huang et al. 2014).

Populations from Zambia inhabit the ancestral geographical range of D. melanogaster (Pool et al. 2012; Lack et al. 2015); North American populations are thought to be derived from European populations, with some degree of admixture from African populations, particularly in the southern United States and the Caribbean (Caracristi & Schlötterer 2003; Yukilevich & True 2008a; b; Yukilevich et al. 2010; Duchen et al. 2013; Kao et al. 2015; Pool 2015; Bergland et al. 2016). The population from North Carolina exhibits primarily European ancestry, with ∼15% admixture from Africa (Bergland et al. 2016).

Approximately 10% of the SNPs (∼1 million) were shared among all three datasets (Figure 5B). Since the out-of-Africa range expansion and the subsequent colonization of the North American continent by European (and to a lesser degree African) ancestors was likely accompanied by founder effects, leading to a loss of African alleles, and adaptation to temperate climates (Mettler et al. 1977), we predicted that a relatively high proportion of SNPs would be shared between Europe and North America. As expected, the proportion of shared SNPs was higher between Europe and North America (22%) than between either Europe or North America and Zambia (11% and 13%, respectively; Figure 5B).
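As a rough illustration of how such overlap proportions can be tabulated, the sketch below treats each dataset as a set of SNP keys. The key format (chromosome, position, allele), the choice of the three-way union as denominator, and the separation of three-way from pairwise-only sharing are all assumptions made for illustration; the exact definitions behind Figure 5B are not given in this section.

```python
def sharing_proportions(europe, north_america, africa):
    """Partition the union of three SNP sets into the seven Venn regions.

    Each argument is a set of hashable SNP keys. Proportions are expressed
    relative to the union of all three sets (an assumption of this sketch).
    """
    union = europe | north_america | africa
    if not union:
        raise ValueError("at least one SNP is required")

    def frac(subset):
        return len(subset) / len(union)

    return {
        "all_three": frac(europe & north_america & africa),
        "europe_north_america_only": frac((europe & north_america) - africa),
        "europe_africa_only": frac((europe & africa) - north_america),
        "north_america_africa_only": frac((north_america & africa) - europe),
        "europe_private": frac(europe - north_america - africa),
        "north_america_private": frac(north_america - europe - africa),
        "africa_private": frac(africa - europe - north_america),
    }


# Toy SNP sets (illustrative values only).
europe = {("2L", 100, "A"), ("2L", 200, "T"), ("2R", 50, "G"), ("X", 10, "C")}
north_america = {("2L", 100, "A"), ("2L", 200, "T"), ("3L", 5, "C")}
africa = {("2L", 100, "A"), ("3R", 7, "G")}

props = sharing_proportions(europe, north_america, africa)
for region, value in props.items():
    print(f"{region}: {value:.3f}")
```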

When we analyzed SNPs in variant frequency bins, the proportion of SNPs shared across at least two continents increased from 26% to 41% for SNPs with variant frequencies larger than 50% (Figure 5 - figure supplement 2A). In contrast, only 6% of the SNPs at low frequency (<10%; Figure 5 - figure supplement 2C) were shared. These results are consistent with the loss of low-frequency variants during the colonization of the European continent; they suggest that intermediate frequency alleles are more likely to be ancestral and thus shared across broad geographic scales. Interestingly, as compared to Africa and North America, we identified nearly 3 million private SNPs that are specific to Europe (Figure 5B). Given that North American and Australian populations are – at least partly – of European ancestry (see Lemeunier & Aulard 1992 for more details), future analysis of our data may be able to shed light on the demography and adaptation of these derived populations.

European and other derived populations exhibit similar amounts of genetic variation

Next, we estimated genome-wide levels of nucleotide diversity within the European population samples using population genetic summary statistics. Nucleotide diversity (π) and Watterson’s θ, both corrected for pooling (Stalker 1976; Mettler et al. 1977; Voelker et al. 1978; Stalker 1980; Sezgin et al. 2004), ranged from 0.0047 to 0.0057 and from 0.0045 to 0.0064, respectively (Figure 6 and Figure 7), with our estimates being qualitatively similar to those from non-African D. melanogaster populations sequenced as individuals (Knibb et al. 1981; Knibb 1982; Anderson et al. 1987) or as pools (Inoue & Watanabe 1979; Inoue et al. 1984).
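For readers unfamiliar with the two estimators, the following sketch computes per-site π and Watterson’s θ from allele counts at biallelic sites. It uses the textbook formulas for individually sequenced chromosomes and deliberately omits the pooling correction applied in this study; the function names and toy counts are illustrative assumptions.

```python
def harmonic_a1(n):
    """a1 = sum of 1/i for i = 1 .. n-1, with n sampled chromosomes."""
    return sum(1.0 / i for i in range(1, n))


def pi_per_site(alt_counts, n, seq_len):
    """Average pairwise differences per site.

    alt_counts: count of the alternative allele at each segregating site.
    n: number of sampled chromosomes; seq_len: number of sites surveyed.
    Uses the unbiased per-site term 2p(1-p) * n/(n-1).
    """
    total = 0.0
    for k in alt_counts:
        p = k / n
        total += 2.0 * p * (1.0 - p) * n / (n - 1)
    return total / seq_len


def watterson_theta(alt_counts, n, seq_len):
    """Watterson's estimator per site: S / (a1 * L)."""
    seg_sites = sum(1 for k in alt_counts if 0 < k < n)
    return seg_sites / (harmonic_a1(n) * seq_len)


# Toy example: 4 chromosomes, 100 sites, 2 segregating sites.
counts = [1, 2]
pi_hat = pi_per_site(counts, n=4, seq_len=100)
theta_hat = watterson_theta(counts, n=4, seq_len=100)
print(pi_hat, theta_hat)
```

Because θ depends only on the number of segregating sites while π weights sites by allele frequency, an excess of rare variants lowers π relative to θ.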

Figure 6. Chromosome-wide average Tajima’s π.

Barplot showing the distribution of chromosome-wide estimates of Tajima’s π for X-chromosomes (black) and autosomes (grey). The error bars represent the standard deviation of π across all autosomal arms (2L, 2R, 3L, 3R).

Figure 7. Chromosome-wide average Watterson’s θ.

Barplot showing the distribution of chromosome-wide estimates of Watterson’s θ for X-chromosomes (black) and autosomes (grey). The error bars represent the standard deviation of θ across all autosomal arms (2L, 2R, 3L, 3R).

Figure 8. Genetic variation in worldwide samples.

Barplot showing the distribution of genome-wide estimates of Tajima’s π of the DrosEU and other genomic datasets (see Materials and Methods for more details). The error bars in the DrosEU dataset represent the standard deviation of π across all 48 population samples.

Estimates of π were slightly lower than, but in close agreement with, estimates of θ, leading to a slightly negative average of Tajima’s D (Das & Singh 1990; 1991; Singh & Das 1992; Singh 2018). Due to our SNP calling approach (see Materials and Methods), we found a deficiency of alleles with frequencies ≤ 0.01, both in the sample-wise site frequency spectra (SFS) and in the combined SFS by SNP type, with the sample-wise SFS being skewed towards low frequency variants (Figure 9A).
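The link between π, θ and the sign of Tajima’s D can be made explicit with a short sketch of the classical statistic (Tajima 1989) for individually sequenced chromosomes. This is not the pool-corrected estimator used here; the function name is an assumption, and the numerical example (10 chromosomes, 16 segregating sites, mean pairwise difference of about 3.89) is a standard textbook case.

```python
import math


def tajimas_d(mean_pairwise_diff, seg_sites, n):
    """Classical Tajima's D.

    mean_pairwise_diff: average number of pairwise differences (not per site).
    seg_sites: number of segregating sites S.
    n: number of sampled chromosomes (at least 4, otherwise the variance is zero).
    """
    if n < 4 or seg_sites < 1:
        raise ValueError("need n >= 4 and at least one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    # Numerator: pi minus Watterson's estimate, both on the per-locus scale.
    diff = mean_pairwise_diff - seg_sites / a1
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return diff / math.sqrt(variance)


d_example = tajimas_d(3.888889, 16, 10)
print(round(d_example, 3))
```

A negative value arises whenever π falls below S/a1, that is, when low-frequency variants are in excess relative to the neutral expectation.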

Figure 9. Site frequency spectra (SFS).

(A) Chromosome-wise SFS: The five panels show the means (dark blue line) and standard deviations (light blue area) of the folded SFS across all 48 samples. The red dashed line indicates the expected SFS based on a θ of 10⁻⁴. (B) SFS by SNP type: The two panels show the means (dark line) and standard deviations (light area) of the folded SFS for different SNP types (intergenic, intronic, non-synonymous and synonymous), combining SNPs across the autosomes (left) and the X-chromosome (right).

In addition, we observed an excess of low-frequency SNPs at non-synonymous sites as compared to other types of sites, which is consistent with purifying selection eliminating deleterious non-synonymous mutations (Endler 1977).

Overall, we detected only minor differences in the amount of genetic variation among populations. Specifically, genome-wide π ranged from 0.005 (Yalta, Ukraine) to 0.006 (Chalet à Gobet, Switzerland) for autosomes, and from 0.003 (Odesa, Ukraine) to 0.0035 (Chalet à Gobet, Switzerland) for the X chromosome (Table S1 and Figure 6). When testing for associations between geographic variables and genome-wide average levels of genetic variation, we found that both π and θ were strongly negatively correlated with altitude, but neither was correlated with latitude or longitude (Table 2). There were no correlations between the season in which the samples were collected and levels of average genome-wide genetic variation as measured by π and θ (Table 2).

Table 2. Clinality of genetic variation and population structure.

Effects of geographic variables and/or seasonality on genome-wide average levels of diversity (π, θ and Tajima’s D; top rows) and on the first three axes of a PCA based on allele frequencies at neutrally evolving sites (bottom rows). The values represent F-ratios from general linear models. Bold type indicates F-ratios that are significant after Bonferroni correction (adjusted α’=0.0055). Asterisks in parentheses indicate significance when accounting for spatial autocorrelation by spatial error models. These models were only calculated when Moran’s I test, as shown in the last column, was significant. *p < 0.05; **p < 0.01; ***p < 0.001.

The X chromosome showed markedly lower genetic variation than the autosomes, with the ratio of X-linked to autosomal variation (π_X/π_A) ranging from 0.53 to 0.66. These values are well below the ratio of 0.75 (one-sample Wilcoxon rank test, p < 0.001) expected under standard neutrality and equal sex ratio, but are consistent with previous findings for European populations of D. melanogaster and can be attributed to either selection (Knibb et al. 1981) or changes in population size (Knibb et al. 1981). This pattern is consistent with previous estimates of relatively low X-linked diversity for European (Kapun et al. 2014) and other non-African populations (Kapun et al. 2016a). Interestingly, the ratio π_X/π_A was significantly, albeit weakly, positively correlated with latitude (Spearman’s r = 0.315, p = 0.0289), with northern populations having slightly higher X/A ratios than southern populations. This is at odds with the prediction of periodically bottlenecked populations leading to a lower X/A ratio in the north and perhaps reflects more complex demographic scenarios (Mettler et al. 1977; Voelker et al. 1978; Knibb et al. 1981; Knibb 1982; Das & Singh 1991; Van ‘t Land et al. 2000; de Jong & Bochdanovits 2003; Anderson et al. 2005; Umina et al. 2005; Rako et al. 2006; Kapun et al. 2014; 2016a).
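The neutral expectation of 0.75 follows from the standard formulas for the effective population sizes of X-linked and autosomal loci. The sketch below assumes a Wright-Fisher population with the given numbers of breeding females and males; the function name is hypothetical, and the calculation ignores selection and changes in population size.

```python
def expected_x_to_a_ratio(n_females, n_males):
    """Expected ratio of X-linked to autosomal effective population size.

    Standard results for a population with separate sexes:
      Ne(autosomes) = 4 * Nf * Nm / (Nf + Nm)
      Ne(X)         = 9 * Nf * Nm / (4 * Nm + 2 * Nf)
    Under neutrality, expected diversity scales with Ne, so the same ratio
    applies to pi_X / pi_A.
    """
    if n_females <= 0 or n_males <= 0:
        raise ValueError("both sexes must be present")
    ne_auto = 4.0 * n_females * n_males / (n_females + n_males)
    ne_x = 9.0 * n_females * n_males / (4.0 * n_males + 2.0 * n_females)
    return ne_x / ne_auto


print(expected_x_to_a_ratio(1000, 1000))  # equal sex ratio
print(expected_x_to_a_ratio(2000, 1000))  # female-biased breeding population
```

Under these formulas the ratio falls below 0.75 only when the breeding population is male-biased, so observed values of 0.53 to 0.66 cannot be explained by a female-biased sex ratio alone.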

In contrast to π and θ, we observed major differences in the genome-wide averages of Tajima’s D among samples (Figure 10).

Figure 10. Chromosome-wide average Tajima’s D.

Barplot showing the distribution of chromosome-wide estimates of Tajima’s D for X-chromosomes (black) and autosomes (grey). The error bars represent the standard deviation of D across all autosomal arms (2L, 2R, 3L, 3R).

The chromosome-wide Tajima’s D was negative in approximately half of all samples and close to zero or slightly positive in the remaining samples, possibly due to heterogeneity in the proportion of sequencing errors among the multiplexed sequencing runs. However, models that included sequence run as a covariate did not explain more of the observed variance than models without the covariate, suggesting that associations of π and θ with geographic variables were not confounded by sequencing heterogeneity (see Supporting Information; Table S4). Moreover, our results for π, θ and D are unlikely to be confounded by spatio-temporal autocorrelations: after accounting for similarity among spatial neighbors (Moran’s I ≈ 0, p > 0.05 for all tests), there were no significant residual autocorrelations among samples for these estimators.
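Moran’s I, used above to test for spatial autocorrelation, can be sketched as follows. The spatial weight matrix is an assumption of this example (here a simple neighbor indicator); how neighbors were defined for the DrosEU samples is not specified in this section.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a square spatial weight matrix.

    weights[i][j] is the weight between locations i and j (zero on the
    diagonal). I is near zero without spatial autocorrelation, positive
    when neighbors are similar, and negative when they are dissimilar.
    """
    n = len(values)
    mean_value = sum(values) / n
    dev = [v - mean_value for v in values]
    weight_sum = sum(weights[i][j] for i in range(n) for j in range(n))
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    total_sq = sum(d * d for d in dev)
    if weight_sum == 0 or total_sq == 0:
        raise ValueError("weights and values must both vary")
    return (n / weight_sum) * (cross / total_sq)


# Four locations on a line; adjacent locations are neighbors (toy example).
line_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_gradient = morans_i([1.0, 2.0, 3.0, 4.0], line_weights)
i_alternating = morans_i([1.0, -1.0, 1.0, -1.0], line_weights)
print(i_gradient, i_alternating)
```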

Genetic variation was not distributed homogeneously across the genome. Both π and θ were markedly reduced close to centromeric and telomeric regions (Figure 11), which is in good agreement with previous studies reporting systematic reductions in genetic variation in regions with reduced recombination (Kennison 2008).

Figure 11 with 1 supplement. Genome-wide estimates of genetic diversity and recombination rates.

The distribution of Tajima’s π, Watterson’s θ and Tajima’s D (from top to bottom) in 200 kb non-overlapping windows plotted for each chromosomal arm separately. Bold black lines depict statistics which were averaged across all 48 samples and the upper and lower grey area show the corresponding standard deviations for each window. Red dashed lines highlight the vertical position of a zero value. The bottom row shows log-transformed recombination rates (r) in 100 kb non-overlapping windows as obtained from Comeron et al. (2012).

Figure 11 - figure supplement 1. Correlation between recombination and genetic diversity.

Smooth local regression (LOESS) of genetic diversity (π), averaged across the 48 samples, on the recombination rate of Comeron et al. (2012) in cM/Mb, in 100 kb non-overlapping windows and by chromosome arm.

Consistent with this, we detected strong correlations between genetic diversity and estimates of recombination rates based on the data of Comeron et al. (2012) (linear regression, p < 0.001; not accounting for autocorrelation), suggesting that the distribution of genome-wide genetic variation is strongly influenced by the recombination landscape (Table S5). For autosomes, fine-scale recombination rates explained 41-47% of the variation in π, whereas broad-scale recombination rates (Roberts 1998; Pimpinelli et al. 2010) explained 50-56% of the variation in diversity. We obtained similar results for X-chromosomes, with recombination rates explaining 31-38% (Dobzhansky & Sturtevant 1938; Kunze-Mühl & Müller 1957; Ashburner & Lemeunier 1976) or 24-33% (Wesley & Eanes 1994; Andolfatto et al. 1999; Matzkin et al. 2005; Corbett-Detig et al. 2012) of the variation (Figure 11, Table S5, Figure 11 - figure supplement 1).
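The “proportion of variation explained” in such a regression is the coefficient of determination. A minimal sketch, assuming window-wise recombination rates and π values are already paired in two lists (the data below are toy values, not estimates from this study):

```python
def r_squared(x, y):
    """Coefficient of determination of a simple linear regression of y on x.

    For a single predictor, R^2 equals the squared Pearson correlation.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    return (sxy * sxy) / (sxx * syy)


# Toy windows: recombination rate (cM/Mb) and diversity (illustrative only).
rec_rate = [0.0, 1.0, 2.0, 3.0]
r2_perfect = r_squared(rec_rate, [1.0, 3.0, 5.0, 7.0])
r2_weak = r_squared(rec_rate, [1.0, -1.0, 1.0, -1.0])
print(r2_perfect, r2_weak)
```

As noted in the text, a regression of this kind does not account for autocorrelation between neighboring windows.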

We also observed variation in Tajima’s D with respect to genomic position (Figure 11). Notably, Tajima’s D was markedly lower than the corresponding chromosome-wide average in the proximity of telomeric and centromeric regions on all chromosomal arms. These patterns possibly reflect purifying selection or selective sweeps close to heterochromatic regions (Navarro & Faria 2014; Kapun et al. 2014; 2016a), or might alternatively be a result of sequencing errors having a stronger effect on the SFS in low SNP density regions.

Localized reductions in Tajima’s D are consistent with selective sweeps

We identified 144 genomic locations on the autosomes with non-zero recombination, reduced genetic variation, and a local reduction in Tajima’s D (see Methods, Table S6), which jointly may be indicative of selective sweeps. Although we cannot rule out that these patterns are the result of non-selective demographic effects (e.g., bottlenecks), two observations suggest that at least some of these regions are affected by positive selection. First, bottlenecks are typically expected to cause genome-wide, non-localized reductions in Tajima’s D. Second, several of the genomic regions in our data coincide with previously identified, well-supported selective sweeps in the proximity of Hen1, Cyp6g1 (Andolfatto et al. 1999; Corbett-Detig & Hartl 2012), wapl (Andolfatto et al. 1999; Matzkin et al. 2005; Kennington et al. 2007; Corbett-Detig & Hartl 2012; Kennington & Hoffmann 2013; Kapun et al. 2014; 2016a), HDAC6 (Begun 2015; Lavington & Kern 2017), and around the chimeric gene CR18217 (Kirkpatrick 2010).

Figure 12 with 3 supplements. Signals of selective sweeps.

The central figure shows the distribution of Tajima’s D in 50 kb sliding windows with 40 kb overlap. The red and green dashed lines show Tajima’s D = 0 and Tajima’s D = −1, respectively. The top panel magnifies a genomic region on chromosomal arm 2R that harbors well-known candidate loci for pesticide resistance, Cyp6g1 and Hen1 (highlighted in red), where strong selection resulted in a selective sweep. This sweep is characterized by an excess of low-frequency SNP variants, indicated by an overall negative Tajima’s D in all samples. The colored solid lines depict Tajima’s D for each sample separately, whereas the black dashed line shows Tajima’s D averaged across all samples. (A legend for the color codes of the samples can be found in the Supporting Information file in Figure 12 - figure supplement 3). The bottom figure shows a genomic region on 3L which has not been previously identified as a potential target of selection but shows Tajima’s D patterns similar to the top figure. Notably, both regions are also characterized by a strong reduction of genetic variation (Figure 12 - figure supplement 1).
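The windowing scheme in this figure (50 kb windows with 40 kb overlap, i.e. a 10 kb step) can be sketched as follows. The function name and the half-open, 0-based coordinate convention are assumptions of this example.

```python
def sliding_windows(chrom_len, window=50_000, overlap=40_000):
    """Return (start, end) coordinates of overlapping windows.

    Coordinates are 0-based and half-open; the step equals window - overlap.
    Windows that would extend past the end of the chromosome are dropped.
    """
    step = window - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the window size")
    return [
        (start, start + window)
        for start in range(0, chrom_len - window + 1, step)
    ]


wins = sliding_windows(100_000)
print(len(wins), wins[0], wins[-1])
```

With a 10 kb step, every position away from the chromosome ends is covered by five windows, which smooths the window-wise statistic but makes adjacent windows strongly non-independent.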

Figure 12 - figure supplement 1. Genetic variation in regions of putative selective sweeps.

This figure is equivalent to Figure 12 in the main text but shows the distribution of genetic variation (π) in regions with depressed Tajima’s D around the well-studied Cyp6g1 locus (A) and a previously unknown candidate region on 3L (B). Similar to Tajima’s D, π was calculated in 50 kb sliding windows with 40 kb overlap. See Table S6 for more examples. A legend for the color codes of the samples can be found in Figure 12 - figure supplement 3.

Figure 12 - figure supplement 2. Signals of selective sweeps in local populations.

This figure is equivalent to Figure 12 in the main text but shows depressions in Tajima’s D in single population samples. Such regions were identified by selecting samples with window-wise Tajima’s D values < −0.9 that were also more than two standard deviations below the average Tajima’s D of all samples in the corresponding genomic region. The top and bottom panels show two such examples where samples with reduced Tajima’s D are highlighted by thick lines and the corresponding sample names. See Table S6 for additional details. A legend for the color codes of the samples can be found in Figure 12 - figure supplement 3.

Figure 12 - figure supplement 3.

Legend for color code in Figure 12, Figure 12 - figure supplement 1 and Figure 12 - figure supplement 2.

However, some regions, such as those around wapl or HDAC6, are characterized by low recombination rates (< 0.5 cM/Mb; Table S5), which can itself lead to reduced variation and Tajima’s D (see also Nolte et al. 2013). Our screen also uncovered several regions that have not previously been described as harboring sweeps (Table S6). These represent promising candidate regions containing putative targets of positive selection. For several of these candidate regions, patterns of variation were highly similar across the majority of European samples, suggesting the existence of continent-wide selective sweeps that either predate the colonization of Europe (e.g., Beisswanger et al. 2006) or that have swept across all European populations more recently. In contrast, some candidate regions were restricted to only a few populations and characterized by highly negative values of Tajima’s D, i.e. deviating from the among-population average by more than two standard deviations, thus possibly hinting at cases of local, population-specific adaptation (Figure 12 - figure supplement 2 and Table S6 for examples).
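The window scheme and the local-outlier criterion described above can be sketched as follows. This is a minimal illustration rather than the pipeline used for the analysis: the function names, the sample names, and the use of the sample standard deviation are assumptions, and the Tajima’s D values are invented for the example.

```python
from statistics import mean, stdev


def sliding_windows(chrom_length, size=50_000, step=10_000):
    """Yield (start, end) windows; a 10 kb step leaves 40 kb of overlap."""
    start = 0
    while start + size <= chrom_length:
        yield start, start + size
        start += step


def local_outliers(d_by_sample, threshold=-0.9, n_sd=2.0):
    """Samples whose D is below `threshold` AND more than `n_sd` standard
    deviations below the among-sample mean for the same window."""
    values = list(d_by_sample.values())
    cutoff = mean(values) - n_sd * stdev(values)  # assumption: sample SD
    return sorted(s for s, d in d_by_sample.items()
                  if d < threshold and d < cutoff)


# Hypothetical window: nine samples near -0.2, one strongly depressed.
window_d = {f"sample_{i}": -0.2 for i in range(9)}
window_d["sample_local"] = -1.5

windows = list(sliding_windows(100_000))
flagged = local_outliers(window_d)
```

Requiring both conditions keeps the screen from flagging windows in which all samples share a similarly low Tajima’s D, which would instead point to a continent-wide pattern.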

European populations are strongly structured along an east-west gradient

We next investigated patterns of genetic differentiation due to demographic substructure. Overall, pairwise differentiation as measured by F_ST was relatively low, though markedly higher for X-chromosomes (0.043–0.076) than for autosomes (0.013–0.059; Student’s t-test; p < 0.001; Figure 13), possibly reflecting differences in effective population size between the X chromosome and the autosomes (Hutter et al. 2007). One population, from Sheffield (UK), showed an unusually high amount of differentiation on the X-chromosome as compared to other populations (Figure 13).

Figure 13. Chromosome-wide average F_ST.

Bar-plot showing the distribution of chromosome-wide F_ST averaged across all possible pairwise comparisons for a given sample. The black bars show observed F_ST values and the dark grey bars expected F_ST values (based on autosomal values; see Materials and Methods) for the X chromosome. The light grey bars show autosomal F_ST values and error bars represent the standard deviation of average chromosome-wide F_ST across all autosomal arms (2L, 2R, 3L, 3R).

Despite these overall low levels of among-population differentiation, European populations showed some evidence of geographic substructure. To analyze this pattern in more detail, we focused on a set of SNPs located in short introns (< 60 bp), as these sites are relatively unaffected by selection (Haddrill et al. 2005; Singh et al. 2009; Parsch et al. 2010; Clemente & Vogl 2012; Lawrie et al. 2013). We analyzed the extent of isolation by distance (IBD) within Europe by correlating genetic and geographic distance and using pairwise F_ST between populations as a measure of genetic isolation. F_ST was overall low but significantly correlated with distance across the continent, indicating weak but significant IBD (Mantel test; p < 0.001; max. F_ST ∼ 0.05; Figure 14A). We also examined those populations that were most and least separated by genetic differentiation, estimated by pairwise F_ST (Figure 14B). In general, longitude had a stronger effect on isolation than latitude, with populations showing the strongest differentiation separated along an east-west, rather than a north-south, axis (Figure 14B). This pattern remained unchanged when the number of populations sampled from Ukraine was reduced to avoid overrepresentation (Figure 14 - figure supplement 1).
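The isolation-by-distance test can be illustrated with a small permutation-based Mantel test. The sketch below makes several assumptions: great-circle (haversine) distances stand in for geographic distance, the test is one-sided, and the site coordinates and genetic distances are invented. It is not the implementation used for Figure 14A.

```python
import math
import random


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(geo, gen, permutations=999, seed=1):
    """One-sided Mantel test: permute the rows/columns of `gen` jointly."""
    n = len(geo)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    geo_vec = [geo[i][j] for i, j in pairs]

    def corr(order):
        return pearson(geo_vec, [gen[order[i]][order[j]] for i, j in pairs])

    observed = corr(list(range(n)))
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        if corr(order) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


# Hypothetical sites along one latitude; genetic distance scales with geography.
sites = [(48.0, lon) for lon in (0.0, 5.0, 10.0, 20.0, 30.0, 40.0)]
geo = [[haversine_km(*a, *b) for b in sites] for a in sites]
gen = [[d / 100_000 for d in row] for row in geo]
r, p = mantel(geo, gen)
```

Permuting population labels rather than individual matrix entries preserves the dependence among pairwise distances that share a population, which is why a Mantel test is used here instead of an ordinary correlation test.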

Figure 14 with 2 supplements. Genetic differentiation among European populations.

(A) Isolation by distance estimated by average genetic differentiation (F_ST) of 21,008 SNPs located in short introns (< 60 bp) plotted against geographic distance. Mantel tests and linear regression (red dashed line and statistics in upper left box) indicate significance. (B) Average neutral F_ST among populations. The center plot shows the distribution of average neutral F_ST values for all 1,128 pairwise combinations. Mean neutral F_ST values were calculated by averaging individual F_ST values from 20,008 genome-wide intronic SNPs for each pairwise comparison. The plots on the left and the right show population pairs in the lower (blue) and upper (red) 5% tails of the F_ST distribution. (C) Population structure of all DrosEU samples as determined by PCA of allele frequencies of 20,008 SNPs located in short introns (< 60 bp). The optimal number of five clusters was estimated by hierarchical model fitting using the first four principal components. Cluster assignment of each population, which was estimated by k-means clustering, is indicated by color.

Figure 14 - figure supplement 1:

Genetic differentiation among European populations. Similar to Figure 14, the top plot shows the population structure of all DrosEU samples as determined by PCA of allele frequencies of 20,008 SNPs located in short introns (< 60 bp) but only on a reduced dataset with one randomly drawn Ukrainian sample to test if the longitudinal pattern found in the full dataset was an artifact of the excessive sampling in the Ukraine. The bottom left plot shows the cumulative variance explained by each of the principal components (PC) and the bottom right barplot depicts the Eigenvalues of each PC.

Figure 14 - figure supplement 2:

PCA of neutrally evolving SNPs in the 48 DrosEU samples. The top plot shows the cumulative variance explained by each of the principal components (PC) from a PCA based on the allele frequencies of 20,008 putatively neutrally evolving SNPs located in short (<60bp) introns distant from chromosomal inversions. The bottom barplot depicts the Eigenvalues of each PC.

To further explore these patterns, we performed a principal component analysis (PCA) on the allele frequencies of SNPs in short introns. The first three principal components (PC) explained more than 25% of the total variance (PC1: 16.3%, PC2: 5.4%, PC3: 4.8%, eigenvalues = 599.2, 199.1, and 178.5 respectively; Figure 14C and Figure 14 - figure supplement 2). As expected, PC1 was strongly correlated with longitude. Despite significant signals of autocorrelation, as indicated by Moran’s test on residuals from linear regressions with PC1, the association with longitude was not due to spatial autocorrelation, since a spatial error model also resulted in a significant association. PC2 was similarly, but to a lesser extent, correlated with longitude and also with altitude. PC3, by contrast, was not associated with any variable examined (Table 2). None of the major PC axes were correlated with season, indicating that there were no shared seasonal differences across samples in our dataset. Hierarchical model fitting based on the first three PC axes resulted in five distinct clusters (Figure 14C) that were oriented along the axis of PC1, supporting the notion of strong longitudinal differentiation among European populations. To the best of our knowledge, such a pronounced longitudinal signature of differentiation has not previously been reported in European D. melanogaster.

Remarkably, this pattern is qualitatively similar to that observed for human populations (Cavalli-Sforza 1966; Xiao et al. 2004; Francalacci & Sanna 2008), perhaps consistent with co-migration of this commensal species.
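In a PCA, the share of variance explained by a component equals its eigenvalue divided by the sum of all eigenvalues. The short check below uses only the percentages and eigenvalues reported above for PC1 to PC3. The sum of all eigenvalues is not reported in the text and is only inferred here, so the snippet should be read as a consistency check under that assumption.

```python
# Values reported in the text for the first three principal components.
variance_pct = {"PC1": 16.3, "PC2": 5.4, "PC3": 4.8}
eigenvalue = {"PC1": 599.2, "PC2": 199.1, "PC3": 178.5}

# Cumulative variance explained by PC1 to PC3 (the "more than 25%" in the text).
cumulative_pct = sum(variance_pct.values())

# Each eigenvalue divided by its variance share should imply the same total;
# small differences are expected because the percentages are rounded.
implied_total = {pc: eigenvalue[pc] / (variance_pct[pc] / 100)
                 for pc in eigenvalue}
spread = max(implied_total.values()) / min(implied_total.values())
```

The three implied totals agree to within about 1%, as expected if the reported percentages and eigenvalues come from the same decomposition.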

Mitochondrial haplotypes also exhibit longitudinal population structure

Our finding that European populations are longitudinally structured is also supported by an analysis of mitochondrial haplotypes. We identified two main mitochondrial haplotypes in Europe, separated by at least 41 mutations (between G1.2 and G2.1; Figure 15A). Our findings are consistent with similar analyses of mitochondrial haplotypes from a North American D. melanogaster population (Cooper et al. 2015) as well as from worldwide samples (Wolff et al. 2016), revealing varying degrees of differentiation among haplotypes, ranging from only a few to hundreds of substitutions. The two G1 subtypes (G1.1 and G1.2) are separated by only four mutations, and the three G2 subtypes are separated by a maximum of four mutations (between G2.1 and G2.3). The estimated frequency of these haplotypes varied greatly among populations (Figure 15B). Qualitatively, three types of European populations can be distinguished based on these haplotypes, namely those with (1) a high frequency (> 60%) of the G1 haplotypes, characteristic of central European samples, (2) a low frequency (< 40%) of G1 haplotypes, a pattern common for Eastern European populations in summer, and (3) a combined frequency of G1 haplotypes between 40-60%, which is typical of samples from the Iberian Peninsula and from Eastern Europe in fall (Figure 15 - figure supplement 1).

Figure 15 with 1 supplement. Mitochondrial haplotypes.

(A) TCS network showing the relationship of 5 common mitochondrial haplotypes; (B) estimated frequency of each mitochondrial haplotype in 48 European samples.

Figure 15 - figure supplement 1. Mitochondrial haplotypes.

(A) Graphical summary of the combined frequency of G1 haplotypes in Europe. Summer and Fall are represented at the top and bottom of the circles, respectively. White – no information; green, yellow and red represent a combined frequency of G1 haplotypes lower than 40%, in between 40% and 60% and higher than 60%, respectively. (B) Correlations between the combined frequency of G1 haplotypes and longitude (red diamonds for populations below 20° and red circles for populations above 20°).

We observed a significant shift in the relative frequencies of the two haplotype classes between summer and fall samples in only two of the nine possible comparisons among haplotypes. While there was no correlation between latitude and the combined frequency of G1 haplotypes, we found a weak but significant negative correlation between the combined frequency of G1 haplotypes and longitude (r² = 0.10; p < 0.05), which is consistent with the longitudinal east-west population structure observed for intronic SNPs. In a subsequent analysis, we divided the dataset at 20° longitude into an eastern and a western subset since in northern Europe 20° longitude corresponds to the division of two major climatic zones, namely C (temperate) and D (cold), according to the Köppen-Geiger climate classification (Peel et al. 2007). When splitting the populations into a western (longitude < 20° E) and an eastern group (longitude > 20° E), we found a clear correlation between longitude and the combined frequency of G1 haplotypes, explaining as much as 50% of the variation in the western group (Figure 15 - figure supplement 1B). Similarly, in the eastern populations longitude and the combined frequency of G1 haplotypes were correlated, explaining approximately 20% of the variance (Figure 15 - figure supplement 1B). Thus, our data on mitochondrial haplotypes clearly confirm the existence of pronounced east-west population structure and differentiation in European D. melanogaster. While this might be due to climatic selection, as recently found for clinal mitochondrial haplotypes in Australia (Camus et al. 2017), we cannot presently rule out an effect of demography.
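The split at 20° E and the per-group proportion of variance explained can be sketched as below. The helper names are hypothetical, and the (longitude, frequency) pairs are invented placeholders with no biological meaning. The sketch assumes a simple linear regression, for which r² equals the squared Pearson correlation.

```python
def r_squared(x, y):
    """Coefficient of determination of a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def split_by_longitude(samples, boundary=20.0):
    """Split (longitude, frequency) pairs into western and eastern groups."""
    west = [s for s in samples if s[0] < boundary]
    east = [s for s in samples if s[0] > boundary]
    return west, east


# Invented (longitude in degrees E, combined G1 frequency) pairs.
samples = [(-8.0, 0.45), (2.0, 0.55), (10.0, 0.70), (16.0, 0.80),
           (24.0, 0.35), (30.0, 0.30), (36.0, 0.20)]
west, east = split_by_longitude(samples)
r2_west = r_squared([s[0] for s in west], [s[1] for s in west])
r2_east = r_squared([s[0] for s in east], [s[1] for s in east])
```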

The majority of TEs vary with longitude and altitude

To examine the population genetics of structural variants in our data, we first focused on transposable elements (TEs). The repetitive content of the 48 samples analyzed ranged from 16% to 21% with respect to nuclear genome size (Figure 16). The vast majority of detected repeats were TEs, mostly represented by long terminal repeats (LTR) and long interspersed nuclear elements (LINE; Class I), as well as a few DNA elements (Class II). LTR content best explained total TE content (LINE+LTR+DNA) (Pearson’s r = 0.87, p < 0.01, vs. DNA r = 0.58, p = 0.0117, and LINE r = 0.36, p < 0.01; Figure 16 - figure supplement 1A).

Figure 16 with 1 supplement. Transposable elements.

Relative abundances of repeats among samples. Proportion of each repeat class was estimated from sampled reads with dnaPipeTE (2 samples per run, 0.1X coverage per sample).

Figure 16 - figure supplement 1. Transposable Elements.

(A) shows the contribution of each of the main TE classes to variation of the total TE content. Correlations (Pearson’s correlation tests) between each of the three main TE classes (LTR, LINE and DNA) and the total TE content of each pool (LTR+LINE+DNA) in Kb. (B) The site frequency spectrum of TE frequencies per chromosome arm. Each dot represents the proportion of TEs in each bin per sample and a smoother geometric line had been added to highlight the trend. Lower panel is a zoom in of the above panel.

We next estimated population-wise frequencies of 1,630 TE insertions annotated in the D. melanogaster reference genome v.6.04 using T-lex2 (Table S7, Fiston-Lavier et al. 2010). On average, 56% of the TEs annotated in the reference genome were fixed in all samples. The remaining polymorphic TEs usually segregated at low frequency in all samples (Figure 16 - figure supplement 1B), potentially due to the effect of purifying selection (González et al. 2008; Petrov et al. 2011; Kofler et al. 2012; Cridland et al. 2013; Blumenstiel et al. 2014). However, we also observed 142 TE insertions present at intermediate (>10% and <95%) frequencies (Figure 16 - figure supplement 1B), which might be consistent with transposition-selection balance (Charlesworth et al. 1994).

In each of the 48 samples, TE frequency and recombination rate were negatively correlated on a genome-wide level (Spearman’s rank correlation test; p < 0.01), as previously reported (Bartolomé et al. 2002; Petrov et al. 2011; Kofler et al. 2012). This pattern still holds when only polymorphic TEs (population frequency <95%) are analyzed, although it becomes statistically non-significant for some chromosomes and populations (Table S8). In either case, the correlation is more negative when using broad-scale, rather than fine-scale, recombination rate estimates (Materials and Methods, Tables S8B, S8D). This indicates that broad-scale recombination patterns may best capture long-term population recombination patterns.

We further tested whether the distribution of TE frequencies among samples could be explained by geographical or temporal variables. We focused on the 141 TE insertions that showed frequency variability among samples (interquartile range (IQR) > 10; see Materials and Methods). Of these, 73 TEs showed significant associations with geographical or temporal variables after multiple testing correction (Table S9). Note that we used a conservative p-value threshold (< 0.001), and we did not find significant residual spatio-temporal autocorrelation among samples for any TE tested (Moran’s I test; p > 0.05 for all tests; Table S9). Sixteen of the 73 TEs were located in regions of very low recombination (0 cM/Mb for either of the two recombination measures used). Among the 57 significant TEs located in high recombination regions, we observed significant correlations of 13 TEs with longitude, 13 with altitude, 5 with latitude, and 3 with season (Table S9). In addition, the frequencies of the other 23 insertions were significantly correlated with more than one of the above-mentioned variables (Table S9). These significant TEs were scattered along the main five chromosome arms (Table S9). Among the 57 significant TEs located in high recombination regions, two TE families were enriched (χ² p-values after Yates’ correction < 0.05): the LTR 297 family with 11 copies, and the DNA pogo family with 5 copies (Table S10). We also checked the genomic localization of the 57 TEs. Most of them (42) were located inside genes: two in 5’UTR, four in 3’UTR, 18 in the first intron, and 18 TEs in subsequent introns. Additionally, 7 TEs are <1 kb from the nearest gene, indicating that these might potentially affect the regulation of nearby genes (Table S9). Interestingly, 14 of these 57 TEs coincide with previously identified candidate adaptive TEs (Table S9), suggesting that our dataset might be enriched for adaptive insertions. However, further analyses are needed to discard the effect of non-selective forces on the patterns observed.
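The variability filter applied before testing for geographic or temporal associations can be sketched as follows. This assumes that frequencies are expressed in percent, so that an IQR threshold of 10 corresponds to ten percentage points, and it uses the default quartile method of Python's `statistics.quantiles`, which may differ from the method used in the actual analysis. The TE identifiers and frequencies are hypothetical.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range of a list of per-sample frequencies."""
    q1, _, q3 = quantiles(values, n=4)  # assumption: default 'exclusive' method
    return q3 - q1


def variable_insertions(freq_by_te, min_iqr=10.0):
    """Identifiers of insertions whose frequency IQR exceeds `min_iqr`."""
    return sorted(te for te, freqs in freq_by_te.items()
                  if iqr(freqs) > min_iqr)


# Hypothetical TE identifiers and per-sample frequencies (percent).
freq_by_te = {
    "TE_variable": [0, 5, 10, 20, 40, 60, 80, 90],
    "TE_near_fixed": [95, 96, 97, 98, 99, 100, 100, 100],
}
selected = variable_insertions(freq_by_te)
```

Filtering on the IQR rather than on the full range makes the selection robust to a single sample with an unusual frequency estimate.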

Inversion polymorphisms in Europe exhibit latitudinal and longitudinal clines

Chromosomal inversions are another class of important and common structural genomic variants, often exhibiting frequency clines on multiple continents, some of which have been shown to be adaptive (e.g. Knibb 1982; Umina et al. 2005; Kapun et al. 2014; 2016a). However, little is known yet about the spatial distribution and clinality of inversions in Europe. We used a panel of inversion-specific marker SNPs (Kapun et al. 2014) to examine the presence and frequency of six cosmopolitan inversion polymorphisms (In(2L)t, In(2R)NS, In(3L)P, In(3R)C, In(3R)Mo, In(3R)Payne) in the 48 samples. All populations were polymorphic for one or more inversions (Figure 17). However, only In(2L)t segregated at substantial frequencies in most populations (average frequency = 20.2%). All other inversions were either absent or occurred at low frequencies (average frequencies: In(2R)NS = 6.2%, In(3L)P = 4%, In(3R)C = 3.1%, In(3R)Mo = 2.2%, In(3R)Payne = 5.7%).

Figure 17 with 1 supplement. Distribution of inversion frequencies.

Cumulative bar plots showing the absolute frequencies of six cosmopolitan inversions (In(2L)t, In(2R)NS, In(3L)P, In(3R)C, In(3R)Mo, In(3R)Payne) in all 48 population samples of the DrosEU dataset.

Figure 17 - figure supplement 1. Clinal variation of the inversion In(3R)Payne across continents.

Parallel frequency clines of In(3R)Payne along the latitudinal axis at the North American east coast (red) and in Europe (blue) (see also Table S10).

Despite their overall low frequencies, several inversions exhibited clinal patterns across space (Table 3). We observed significant latitudinal clines for In(3L)P, In(3R)C and In(3R)Payne. Although they differed in overall frequencies, In(3L)P and In(3R)Payne showed latitudinal clines in Europe that are qualitatively similar to the clines previously observed along the North American and Australian east coasts (Figure 17 - figure supplement 1 and Table S11; Kapun et al. 2016a). For the first time, we also detected a longitudinal cline for In(2L)t and In(2R)NS, with both inversions decreasing in frequency from east to west, a result that is consistent with our finding of strong longitudinal among-population differentiation in Europe. In(2L)t also increased in frequency with altitude (Table 3). Except for In(3R)C, we did not find significant residual spatio-temporal autocorrelation among samples for any inversion tested (Moran’s I ≈ 0, p > 0.05 for all tests; Table 3), suggesting that our analysis was not confounded by spatial autocorrelation for most of the inversions. It will clearly be interesting to examine the extent to which clines in inversions (and other genomic variants) across Europe are shaped by selection and/or demography in future work.

Table 3. Clinality and/or seasonality of chromosomal inversions.

The values represent F-ratios from generalized linear models with a binomial error structure to account for frequency data. Bold type indicates deviance values that were significant after Bonferroni correction (adjusted α’=0.0071). Stars in parentheses indicate significance when accounting for spatial autocorrelation by spatial error models. These models were only calculated when Moran’s I test, as shown in the last column, was significant. *p < 0.05; **p < 0.01; ***p < 0.001
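The adjusted significance threshold quoted in the legend can be reproduced with a one-line Bonferroni calculation. The number of tests is not stated in the legend. The value of seven used below is an assumption, inferred only because 0.05 / 7 rounds to the reported α’ = 0.0071.

```python
alpha = 0.05
n_tests = 7  # assumption: inferred from the reported threshold, not stated
alpha_adjusted = alpha / n_tests


def is_significant(p_value, threshold=alpha_adjusted):
    """True if a p-value passes the Bonferroni-adjusted threshold."""
    return p_value < threshold
```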

European Drosophila microbiomes contain trypanosomatids and novel viruses

We were also interested in determining the abundance of microbiota associated with D. melanogaster from the Pool-Seq data – these endosymbionts often have crucial functions in affecting the life history, immunity, hormonal physiology, and metabolic homeostasis of their fly hosts (e.g., Trinder et al. 2017; Martino et al. 2017). The taxonomic origin of a total of 262 million non-Drosophila reads was inferred using MGRAST, which identifies and counts short protein motifs (‘features’) within reads (Meyer et al. 2008). The largest fraction of protein features was assigned to Wolbachia (on average 53.7%; Figure 18), a well-known endosymbiont of Drosophila (Werren et al. 2008). The relative abundance of Wolbachia protein features varied strongly between samples ranging from 8.8% in a sample from the UK to almost 100% in samples from Spain, Portugal, Turkey and Russia (Table 1). Similarly, Wolbachia loads varied 100x between samples if we use the ratio of Wolbachia protein features divided by the number of Drosophila sequences retrieved for that sample as a proxy for relative micro-organismal load (for a full table of micro-organismal loads standardized by Drosophila genome coverage see Table S12).
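The two quantities used in this paragraph, relative abundance and the load proxy, can be written out as a short sketch. All counts below are hypothetical, the function names are not taken from the analysis pipeline, and the load proxy follows the ratio described in the text (taxon protein features divided by the number of Drosophila sequences in the same sample).

```python
def relative_abundance(feature_counts):
    """Percentage of microbial protein features assigned to each taxon."""
    total = sum(feature_counts.values())
    return {taxon: 100 * n / total for taxon, n in feature_counts.items()}


def load_proxy(taxon_features, drosophila_sequences):
    """Taxon features per Drosophila sequence, a proxy for relative load."""
    return taxon_features / drosophila_sequences


# Hypothetical counts for two samples.
sample_a = {"Wolbachia": 8_000, "Acetobacter": 1_500, "Serratia": 500}
abundance_a = relative_abundance(sample_a)
load_a = load_proxy(sample_a["Wolbachia"], 40_000_000)
load_b = load_proxy(200, 100_000_000)
fold_difference = load_a / load_b
```

Relative abundance is compositional, so a taxon can appear to rise simply because another falls. Standardizing by host sequences gives a measure that is comparable across samples with different microbial totals.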

Figure 18: Microbiome.

Relative abundance of Drosophila-associated microbes as assessed by MGRAST classified shotgun sequences. Microbes had to reach at least 3% relative abundance in one of the samples to be presented.

Acetic acid bacteria of the genera Gluconobacter, Gluconacetobacter, and Acetobacter were the second largest group, with an average relative abundance of 34.4%.

Furthermore, we found evidence for the presence of several genera of Enterobacteria (Serratia, Yersinia, Klebsiella, Pantoea, Escherichia, Enterobacter, Salmonella, and Pectobacterium). Serratia occurs only at low frequencies or is absent from most of our samples, but reaches a very high relative abundance in the Nicosia summer collection (54.5%). This high relative abundance was accompanied by an 80x increase in Serratia bacterial load. We detected several eukaryotic microorganisms, although they were less abundant than the bacteria. The fraction of fungal protein features is larger than 3% in only three of our samples from Finland, Austria and Turkey (Table 1). Interestingly, we detected the presence of trypanosomatids in 16 of our samples, consistent with other recent evidence that Drosophila can host these organisms (Wilfert et al. 2011; Chandler & James 2013; Hamilton et al. 2015).

Our data also allowed us to detect the presence of five different DNA viruses (Table S13). These included approximately two million reads from Kallithea nudivirus (Webster et al. 2015), allowing us to assemble the complete Kallithea genome for the first time (>300-fold coverage in the Ukrainian sample UA_Kha_14_46; Genbank accession KX130344). We also identified around 1,000 reads from a novel nudivirus that is closely related to Kallithea virus and to Drosophila innubila nudivirus (Unckless 2011) in sample DK_Kar_14_41 from Karensminde, Denmark (Table 1). These sequences permitted us to identify a publicly available dataset (SRR3939042: 27 male D. melanogaster from Esparto, California; Machado et al. 2016) that contained sufficient reads to complete the genome (provisionally named “Esparto Virus”; KY608910). We further identified two novel Densoviruses (Parvoviridae), which we have provisionally named “Viltain virus”, a relative of Culex pipiens densovirus found at 94-fold coverage in sample FR_Vil_14_07 (Viltain; KX648535), and “Linvill Road virus”, a relative of Dendrolimus punctatus densovirus that was represented by only 300 reads here, but which has previously been found to have a high coverage in dataset SRR2396966 from a North American sample of D. simulans (KX648536; Machado et al. 2016). In addition, we detected a novel member of the Bidnaviridae family, “Vesanto virus”, a bidensovirus related to Bombyx mori densovirus 3 with approximately 900-fold coverage in sample FI_Ves_14_38 (Vesanto; KX648533 and KX648534). Using a detection threshold of >0.1% of the Drosophila genome copy number, the most commonly detected viruses were Kallithea virus (30/48 of the pools) and Vesanto virus (25/48), followed by Linvill Road virus (7/48) and Viltain virus (5/48), with Esparto virus being the rarest (2/48). In some samples, the viruses reached strikingly high titers: on 13 occasions the virus genome copy number in the pool exceeded the host genome copy number, reaching a maximum of nearly 20-fold in Vesanto.
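The detection threshold can be expressed as a ratio of sequencing depths. The sketch below assumes that genome copy number is approximated by mean read depth (reads × read length / genome length) and that the virus-to-host depth ratio is compared against 0.001 (0.1%). The function names and all numbers are hypothetical and are not taken from Table S13.

```python
def mean_depth(n_reads, read_length, genome_length):
    """Approximate mean sequencing depth of a genome."""
    return n_reads * read_length / genome_length


def relative_copy_number(virus_depth, host_depth):
    """Virus genome copies per host genome copy (depth ratio)."""
    return virus_depth / host_depth


def detected(virus_depth, host_depth, threshold=0.001):
    """True if the virus exceeds 0.1% of the host genome copy number."""
    return relative_copy_number(virus_depth, host_depth) > threshold


# Hypothetical pool with 50-fold host coverage.
host_depth = 50.0
high_titer = relative_copy_number(900.0, host_depth)
trace_detected = detected(0.02, host_depth)
example_depth = mean_depth(n_reads=1_000, read_length=150,
                           genome_length=150_000)
```

A ratio above 1, as in the hypothetical `high_titer` case, corresponds to the situation described in the text in which virus genome copies outnumber host genome copies in the pool.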

This continent-wide analysis of the microbiota associated with fruit flies suggests that natural populations of European D. melanogaster differ greatly in the composition and relative abundance of microbes and viruses.

Discussion

In recent years, large-scale population resequencing projects have shed light on the biology of both model (Mackay et al. 2012; Langley et al. 2012; Consortium 2015; Lack et al. 2015; Alonso-Blanco et al. 2016; Lack et al. 2016) and non-model organisms (e.g., Hohenlohe et al. 2010; Wolf et al. 2010). Such massive datasets contribute greatly to our growing understanding of the processes that create and maintain genetic variation in natural populations. However, the relevant spatio-temporal scales for population genomic analyses remain largely unknown. Here we have applied, for the first time, a comprehensive sampling and sequencing strategy to European populations of D. melanogaster, allowing us to uncover previously unknown aspects of this species’ population biology.

A main result from our analyses of SNPs located in short introns and presumably evolving neutrally (Parsch et al. 2010) is that European D. melanogaster populations exhibit very pronounced longitudinal differentiation, a pattern that – to the best of our knowledge – has not been observed before for the European continent (for patterns of longitudinal differentiation in Africa see e.g. Michalakis & Veuille 1996; Aulard et al. 2002; Fabian et al. 2015). Genetic differentiation was greatest between populations from eastern and western Europe (Figure 14). The eastern populations included those from the Ukraine, Russia, and Turkey, as well as one from eastern Austria, suggesting that there may be a region of restricted gene flow in south-central Europe. However, populations from Finland and Cyprus are more similar to western populations than to eastern populations, possibly as a result of migration along shipping routes in the Baltic and Mediterranean seas. More data from populations in the unsampled, intermediate regions are needed to better delineate the geographic limits of the eastern and western population groups. Consistent with the strong differentiation between eastern and western populations, our PCA analysis revealed that longitude was the major factor associated with among-population divergence, with no significant effect of latitude (Figure 14C; Table 2). Thus, the patterns of neutral genetic differentiation in Europe contrast with those previously reported for North America, where latitude impacts neutral differentiation (Machado et al. 2016; Kapun et al. 2016a). However, our present analysis does not exclude the existence of clinally varying polymorphisms in European populations outside short introns: for example, we detected latitudinal frequency clines both for TEs and inversion polymorphisms. A detailed analysis of genome-wide patterns of clinal variation in the 2014 DrosEU data is beyond the scope of this paper and currently under way.

The mitochondrial genome and several chromosomal inversions and TEs showed similar patterns of differentiation as the rest of the genome, with the main axis of differentiation being longitudinal. Uncovering the extent to which this pattern is driven by demography and/or selection, and identifying the underlying environmental correlates (including any potential role of co-migration with human populations), will be an important task for future analyses. Due to the high density of samples and the large number of SNP markers examined, our results reveal that European populations of D. melanogaster exhibit much more differentiation and structure than previously thought (e.g., Baudry et al. 2004; Dieringer et al. 2005; Schlötterer et al. 2006; Nunes et al. 2008; Mateo et al. 2018).

Within the eastern and western population groups there was a low – but detectable – level of genetic differentiation among populations, including those that are geographically close (Figure 14C). These population differences persisted over a timespan of at least 2–3 months, as there was less genetic differentiation between the summer and fall samples of the 13 locations sampled at multiple time points than between neighboring populations (Figure 14C). Thus, while the weak but significant signal of IBD suggests homogenizing gene flow across geography, there is seasonally stable differentiation among populations. The season in which samples were collected did not show a significant association with genetic differentiation, except when considered in conjunction with longitude or altitude (Table 2). However, the data analyzed here are from a single year only: demonstrating recurrent shifts in SNP frequencies due to temporally varying selection will require analysis of additional annual samples. For an extensive analysis of patterns of seasonal variation across a broad geographic scale see Machado et al. (2018).

Our Pool-Seq data also allowed us to characterize geographic patterns in both inversions and TEs. In marked contrast to putatively neutral SNPs, the frequencies of several chromosomal inversions, including In(3L)P, In(3R)C, and In(3R)Payne, showed a significant correlation with latitude (Table 3). For In(3L)P and In(3R)Payne, the latitudinal clines were in qualitative agreement with parallel clines reported from North America and Australia, with the inversions decreasing in frequency as distance from the equator increases (Mettler et al. 1977; Knibb et al. 1981; Fabian et al. 2012; Kapun et al. 2014; Rane et al. 2015; Kapun et al. 2016a). This suggests that these inversions may contain genetic variants that are better adapted to warmer environments than to temperate climates. The overall frequencies of these inversions are, however, low in Europe (<5%), indicating that they might play only a minor role in local adaptation to European habitats. Some euchromatic TE insertions also showed geographic or seasonal patterns of variation (Table S7), indicating that they might play a role in local adaptation, particularly as many of them are located in regions where they could affect gene regulation. Importantly, several inversions and TEs also showed longitudinal frequency gradients, thus supporting the notion that European populations exhibit marked longitudinal differentiation.

We also examined signatures of selective sweeps in our dataset. We found 144 genomic regions that showed signatures of hard sweeps in regions of normal recombination (cM/Mb ≥ 0.5), and with reduced variation and negative Tajima’s D (D ≤ −0.8) in all European populations (Figure 12, Table S6). Four of these regions were identified in previous studies as potential targets for positive selection.
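The two numeric criteria in this paragraph (recombination rate ≥ 0.5 cM/Mb and Tajima’s D ≤ −0.8 in all populations) can be sketched as a simple window filter. The record layout, window identifiers, and values are hypothetical, and the additional requirement of reduced variation (π) is omitted because no threshold for it is given here.

```python
def shared_sweep_windows(windows, min_rec=0.5, max_d=-0.8):
    """Windows with normal recombination in which every population
    has Tajima's D at or below `max_d`."""
    return [w["id"] for w in windows
            if w["rec_rate"] >= min_rec
            and all(d <= max_d for d in w["tajima_d"].values())]


# Hypothetical windows with per-population Tajima's D values.
windows = [
    {"id": "win_shared", "rec_rate": 2.1,
     "tajima_d": {"pop_a": -1.2, "pop_b": -0.9, "pop_c": -1.0}},
    {"id": "win_low_rec", "rec_rate": 0.2,
     "tajima_d": {"pop_a": -1.3, "pop_b": -1.1, "pop_c": -1.4}},
    {"id": "win_local", "rec_rate": 1.8,
     "tajima_d": {"pop_a": -1.5, "pop_b": -0.3, "pop_c": -0.2}},
]
candidates = shared_sweep_windows(windows)
```

The recombination filter matters because, as noted in the Results, low recombination alone can reduce both variation and Tajima’s D and thereby mimic a sweep.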

The first region, at the center of chromosome arm 2R (Figure 12A, Table S6), was previously found to be strongly differentiated between African and North American populations (Langley et al. 2012) and contains two genes, Cyp6g1 and Hen-1, that are associated with recent, strong selection. The cytochrome P450 gene Cyp6g1 has been linked to insecticide resistance (Daborn et al. 2002; Schmidt et al. 2010), shows evidence for recent selection independently in both D. melanogaster and D. simulans (Schlenke & Begun 2003; Catania et al. 2004), and is associated with a large differentiated region in the Australian latitudinal cline (Kolaczkowski et al. 2011a). Hen-1, a methyltransferase involved in maturation of small RNAs involved in virus and TE suppression, showed marginally non-significant evidence for selective sweeps in North American and African populations of D. melanogaster (Kolaczkowski et al. 2011b).

The second region previously implicated in a selective sweep is located on chromosome arm 3L (Figure 12B, Table S6) and centered around the chimeric gene CR18217, which formed from the fusion of a gene encoding a DNA-repair enzyme (CG4098) and a centriole gene (spd-2; Rogers & Hartl 2012). CR18217 appears to be unique to D. melanogaster, but – in spite of its recent origin – segregates at frequencies of around 90% (Rogers & Hartl 2012), consistent with a recent strong sweep in this region of the genome. This putative sweep region also spans Prosbeta6, which (like HDAC6) encodes a protein involved in proteolysis (Flybase v. FB2017_05; Gramates et al. 2017). Prosbeta6 also shows homology to genes involved in immune function (Lyne et al. 2007; Handu et al. 2015), which might explain why it has been a target of positive selection.

The third previously characterized sweep region, surrounding the wapl gene on the X chromosome (Table S6), was identified as showing evidence of strong selective sweeps in both African and European D. melanogaster populations (Beisswanger et al. 2006; Boitard et al. 2012). The genic targets of selection in this region are unclear, but most likely are ph-p in Europe and ph-p or ph-d in Africa (Beisswanger et al. 2006). These genes are tandem duplicates involved in the Polycomb response pathway, which functions as an epigenetic repressor of transcription (reviewed in Kassis et al. 2017).

The fourth previously observed sweep region, originally identified in African populations of D. melanogaster, is also located on the X chromosome (Table S6), but 30 cM closer to the telomere and thus not implicating the wapl region (Beisswanger et al. 2006; Boitard et al. 2012). Selection in this region has been attributed to the HDAC6 gene (Svetec et al. 2009). HDAC6, although nominally a histone deacetylase, actually functions as a central player in managing cytotoxic assaults, including in transport and degradation of misfolded protein aggregates (reviewed in Matthias et al. 2008; Svetec et al. 2009).

Our data support the widespread occurrence of these previously identified sweeps in many populations in Europe. Notably, practically all European populations examined showed reduced variation and negative Tajima’s D in these sweep regions. This is consistent with the sweeps either pre-dating the colonization of Europe (e.g., Beisswanger et al. 2006) or having swept across Europe more recently (also see Stephan 2010 for discussion). In addition, we also uncovered several novel genomic regions with tentative evidence for hard sweeps (Table S6) – these regions represent a valuable source for future analyses of signals of adaptive evolution in European Drosophila.

Finally, we used our Pool-Seq data to identify microbes and viruses and to quantify their presence in natural populations of D. melanogaster across the European continent. Wolbachia was the most abundant bacterial genus associated with the flies, but its relative abundance and load varied greatly among samples (Figure 18). The second most abundant bacterial taxon was acetic acid bacteria (Acetobacteraceae), a group previously found among the most abundant bacteria in natural D. melanogaster isolates (Chandler et al. 2011; Staubach et al. 2013). Other microbes were highly variable in relative abundance. For example, Serratia abundance was low in most populations, but very high in the Nicosia sample, which might indicate that some individuals in the Nicosia sample carried a systemic Serratia infection generating high bacterial loads. Future sampling may shed light on the temporal stability and/or population specificity of these patterns. Contrary to expectation, we found relatively few yeast sequences. This is surprising because yeasts are commonly found on rotting fruit, the main food substrate of D. melanogaster, and have been found in association with Drosophila before (Barata et al. 2012; Chandler et al. 2012). This suggests that, although yeasts can attract flies and play a role in food choice (Becher et al. 2012; Buser et al. 2014), they might not be highly prevalent in or on D. melanogaster bodies. While trypanosomatids have been reported in association with Drosophila before (Wilfert et al. 2011; Chandler & James 2013; Hamilton et al. 2015), our study provides the first systematic detection across a wide geographic range in D. melanogaster. Although Drosophilidae are host to a wide diversity of RNA viruses (Huszar & Imler 2008; Webster et al. 2015), only three DNA viruses have previously been reported in association with this family, and only one from D. melanogaster (Unckless 2011; Webster et al. 2015; 2016). Here, we have discovered four new DNA viruses in D. melanogaster. Although it is not possible to directly estimate viral prevalence from pooled sequencing data, we found that the DNA viruses of D. melanogaster can be very widespread, with Kallithea virus detectable at a low level in most populations.

A striking qualitative pattern in our microbiome data is the high level of variability among populations in the composition and relative amounts of different microbiota and viruses. Thus, an interesting open question is to what extent geographic differences in microbiota might contribute to phenotypic differences and local adaptation among fly populations, especially given that there might be tight and presumably local co-evolutionary interactions between fly hosts and their endosymbionts (e.g., Haselkorn et al. 2009; Richardson et al. 2012; Staubach et al. 2013; Kriesner et al. 2016).

In conclusion, our study demonstrates that extensive sampling on a continent-wide scale and pooled sequencing of natural populations can reveal new aspects of population biology, even for a well-studied species such as D. melanogaster. Such extensive population sampling is feasible due to the close cooperation and synergism within our international consortium. Our efforts in Europe are paralleled in North America by the Drosophila Real Time Evolution Consortium (Dros-RTEC), with whom we are currently collaborating to compare population genomic data across continents. In future years, our consortia will continue to sample and sequence European and North American Drosophila populations in order to study these populations with increasing spatial and temporal resolution and to provide an unprecedented resource for the Drosophila and population genetics communities.

Materials and Methods

The 2014 DrosEU dataset analyzed here consists of 48 samples of D. melanogaster collected from 32 geographical locations at different time-points across the European continent, through a joint effort of 18 European research groups (see Figure 2, Table 1). Field collections were performed with baited traps using a standardized protocol (see Supplementary file for details). Up to 40 males from each collection were pooled, and DNA extracted from each pool, using a standard phenol-chloroform based protocol. Each sample was processed in a single pool (Pool-Seq; Schlötterer et al. 2014), with each pool consisting of at least 33 wild-caught individuals. To exclude morphologically similar and co-occurring species, such as D. simulans, as potential contaminants from the samples, we only used wild-caught males and distinguished among species by examining genital morphology. Despite this precaution, we identified a low level of D. simulans contamination in our samples, and further steps were thus taken to exclude D. simulans sequences from our analysis (see below). The 2014 DrosEU dataset represents the most comprehensive spatio-temporal sampling of European D. melanogaster populations available to date (Table 1, Figure 3).

DNA extraction, library preparation and sequencing

DNA was extracted from pools of 33–40 males per sample after joint homogenization with bead beating and standard phenol/chloroform extraction. A detailed extraction protocol can be found in the Supporting Information file. In brief, 500 ng of DNA in a final volume of 55.5 μl were sheared with a Covaris instrument (Duty cycle 10, intensity 5, cycles/burst 200, time 30) for each sample separately. Library preparation was performed using NEBNext Ultra DNA Lib Prep-24 and NEBNext Multiplex Oligos for Illumina-24 following the manufacturer’s instructions. Each pool was sequenced as paired-end fragments on an Illumina NextSeq 500 sequencer at the Genomics Core Facility of Pompeu Fabra University (UPF; Barcelona, Spain). Samples were multiplexed in five batches of 10 samples each, except for one batch that contained only 8 samples (see Supplementary Table S1 for further information). Each multiplexed batch was sequenced on four lanes to obtain an approximate 50x raw coverage for each sample. Reads were sequenced to a length of 151 bp with a median insert size of 348 bp (ranging from 209 to 454 bp).

Mapping pipeline and variant calling

Prior to mapping, we trimmed and filtered raw FASTQ reads to remove low-quality bases (minimum base PHRED quality = 18; minimum sequence length = 75 bp) and sequencing adaptors using cutadapt (v. 1.8.3; Martin 2011). We only retained read pairs for which both reads fulfilled our quality criteria after trimming. FastQC analyses of trimmed and quality filtered reads showed overall high base-qualities (median ranging from 29 to 35 in all 48 samples) and indicated a loss of ∼1.36% of all bases after trimming relative to the raw data. We used bwa mem (v. 0.7.15; Li 2013) with default parameters to map trimmed reads against a compound reference genome consisting of the genomes from D. melanogaster (v.6.12) and genomes from common commensals and pathogens, including Saccharomyces cerevisiae (GCF_000146045.2), Wolbachia pipientis (NC_002978.6), Pseudomonas entomophila (NC_008027.1), Commensalibacter intestini (NZ_AGFR00000000.1), Acetobacter pomorum (NZ_AEUP00000000.1), Gluconobacter morbifer (NZ_AGQV00000000.1), Providencia burhodogranariea (NZ_AKKL00000000.1), Providencia alcalifaciens (NZ_AKKM01000049.1), Providencia rettgeri (NZ_AJSB00000000.1), Enterococcus faecalis (NC_004668.1), Lactobacillus brevis (NC_008497.1), and Lactobacillus plantarum (NC_004567.2), to avoid paralogous mapping. We used Picard (v.1.109; http://picard.sourceforge.net) to remove duplicate reads and reads with a mapping quality below 20. In addition, we re-aligned sequences flanking insertions-deletions (indels) with GATK (v3.4-46; McKenna et al. 2010).

After mapping, Pool-Seq samples were tested for DNA contamination from D. simulans. To do this, we used a set of SNPs known to be divergent between D. simulans and D. melanogaster and assessed the frequencies of D. simulans-specific alleles following the approach of Bastide et al. (2013). We combined the genomes of D. melanogaster (v.6.12) and D. simulans (Hu et al. 2013) and separated species-specific reads for samples with a contamination level > 1% via competitive mapping against the combined references using the pipeline described above. Custom software was used to remove reads uniquely mapping to D. simulans. In 9 samples, we identified contamination with D. simulans, ranging between 1.2% and 8.7% (Table S1). After applying our decontamination pipeline, contamination levels dropped below 0.4% in all 9 samples.

We used Qualimap (v. 2.2., Okonechnikov et al. 2016) to evaluate average mapping qualities per population and chromosome, which ranged from 58.3 to 58.8 (Table S1). We found heterogeneous sequencing depths among the 48 samples, ranging from 34x to 115x for autosomes and from 17x to 59x for X-chromosomes (Figure S1, Table S1). We then combined individual BAM files from all samples into a single mpileup file using samtools (v. 1.3; Li & Durbin 2009). Due to the large number of Pool-Seq datasets analyzed in parallel, we had to implement quality control criteria for all libraries jointly in order to call SNPs. To accomplish this, we implemented a novel custom SNP calling software to call SNPs with stringent heuristic parameters (PoolSNP; see Supplementary Information), available at Dryad (doi: https://doi.org/10.5061/dryad.rj1gn54). A site was considered polymorphic if (1) the minimum coverage from all samples was greater than 10x, (2) the maximum coverage from all samples was less than the 95th coverage percentile for a given chromosome and sample (to avoid paralogous regions duplicated in the sample but not in the reference), (3) the minimum read count for a given allele was greater than 20x across all samples pooled, and (4) the minimum read frequency of a given allele was greater than 0.001 across all samples pooled. The above threshold parameters were optimized based on simulated Pool-Seq data in order to maximize true positives and minimize false positives (see Figure S18 and Supporting Information). Additionally, we excluded SNPs (1) for which more than 20% of all samples did not fulfill the above-mentioned coverage thresholds, (2) which were located within 5 bp of an indel with a minimum count larger than 10x in all samples pooled, and (3) which were located within known transposable elements (TE) based on the D. melanogaster TE library v.6.10. We further annotated our final set of SNPs with SNPeff (v.4.2; Cingolani et al. 2012) using the Ensembl genome annotation version BDGP6.82 (Figure 3).
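The site-level criteria above can be illustrated with a minimal Python sketch. This is not the PoolSNP implementation; the function name, argument layout, and the way the 20% missing-sample rule is combined with the pooled allele counts are assumptions made for illustration only.

```python
def is_polymorphic(alt_counts, coverages, max_cov,
                   min_cov=10, min_count=20, min_freq=0.001, miss_frac=0.2):
    """Hypothetical check of one biallelic site against the heuristic thresholds.

    alt_counts[i], coverages[i]: alternative-allele reads and total reads in sample i.
    max_cov[i]: upper coverage bound for sample i (e.g. a 95th percentile
    computed per chromosome and sample elsewhere).
    """
    # A sample passes if its coverage lies strictly between the two bounds.
    ok = [min_cov < c < m for c, m in zip(coverages, max_cov)]
    # Reject the site if too many samples fail the coverage thresholds.
    if ok.count(False) / len(ok) > miss_frac:
        return False
    # Pool allele counts across the samples that passed (an assumption).
    total_cov = sum(c for c, keep in zip(coverages, ok) if keep)
    total_alt = sum(a for a, keep in zip(alt_counts, ok) if keep)
    if total_cov == 0:
        return False
    # Both alleles need enough pooled reads and a high enough pooled frequency.
    for count in (total_alt, total_cov - total_alt):
        if count <= min_count or count / total_cov <= min_freq:
            return False
    return True
```

For example, a site with 25 alternative reads out of 150 pooled reads passes, whereas a singleton read does not.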

Combined and population-specific site frequency spectra (SFS)

We quantified the amount of allelic variation with respect to different SNP classes. For this, we first combined the full dataset across all 48 samples and used the SNPeff annotation (see above) to classify the SNPs into four classes (intergenic, intronic, non-synonymous, and synonymous). For each class, we calculated the site frequency spectrum (SFS) based on minor allele frequencies for the X-chromosome and the autosomes, as well as for each sample and chromosomal arm separately, by counting alleles in 50 frequency bins of size 0.01.
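As a minimal sketch of this binning step, the following Python function counts SNPs in 50 minor-allele-frequency bins of width 0.01. The function name and the handling of monomorphic sites and of frequencies of exactly 0.5 are assumptions, not a description of the scripts used here.

```python
def folded_sfs(allele_freqs, n_bins=50, bin_size=0.01):
    """Count SNPs per minor-allele-frequency bin (folded SFS)."""
    counts = [0] * n_bins
    for f in allele_freqs:
        maf = min(f, 1.0 - f)          # fold the spectrum at 0.5
        if maf <= 0.0:
            continue                   # skip monomorphic sites
        # A frequency of exactly 0.5 is placed in the last bin.
        idx = min(int(maf / bin_size), n_bins - 1)
        counts[idx] += 1
    return counts
```

Because binning relies on floating-point division, frequencies lying exactly on a bin boundary may fall into either neighbouring bin.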

Genetic variation in Europe

We characterized patterns of genetic variation among the 48 samples by estimating three standard population genetic parameters: π, Watterson’s θ and Tajima’s D (Watterson 1975; Nei 1987; Tajima 1989). We focused on SNPs located on the five major chromosomal arms (X, 2L, 2R, 3L, 3R) and calculated sample-wise π, θ and Tajima’s D with corrections for Pool-Seq data (Kofler et al. 2011). Since PoPoolation, the most commonly used software for population genetics inference from Pool-Seq data, does not allow using predefined SNPs (which was desirable for our analyses), we implemented corrected population genetic estimators described in Kofler et al. (2011) in Python (PoolGen; available at Dryad; doi: https://doi.org/10.5061/dryad.rj1gn54). Before calculating the estimators, we subsampled the data to an even coverage of 40x for the autosomes and 20x for the X-chromosome to control for the sensitivity to coverage variation of Watterson’s θ and Tajima’s D (Korneliussen et al. 2013). At sites with greater than 40x coverage, we randomly subsampled reads to 40x without replacement; at sites with below 40x coverage, we sampled reads 40 times with replacement. Using R (R Development Core Team 2009), we calculated sample-wise chromosome-wide averages for autosomes and X chromosomes separately and tested for correlations of π, θ and Tajima’s D with latitude, longitude, altitude, and season using a linear regression model of the following form: y_i = Lat + Lon + Alt + Season + ε_i, where y_i is one of π, θ, or D. Here, latitude, longitude, and altitude are continuous predictors (Table 1), while ‘season’ is a categorical factor with two levels S (“summer”) and F (“fall”), corresponding to collection dates before and after September 1st, respectively. We chose this arbitrary threshold for consistency with previous studies (Bergland et al. 2014; Kapun et al. 2016a). To further test for residual spatio-temporal autocorrelation among the samples (Kühn & Dormann 2012), we calculated Moran’s I (Moran 1950) with the R package spdep (v.06-15., Bivand & Piras 2015). To do this, we used the residuals of the above-mentioned models, as well as matrices defining pairs of samples as neighbors weighted by geographical distances between the locations (samples within 10° latitude/longitude were considered neighbors). Whenever these tests revealed significant autocorrelation (indicating non-independence of the samples), we repeated the above-mentioned regressions using spatial error models as implemented in the R package spdep, which incorporate spatial effects through weighted error terms, as described above.
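The coverage subsampling described above can be sketched as follows. This is a simplified illustration for a biallelic site and is not taken from PoolGen; the function name, the read labels, and the treatment of sites with exactly the target coverage (which are returned unchanged) are assumptions.

```python
import random

def subsample_counts(ref, alt, target=40, rng=random):
    """Resample the read counts of a biallelic site to a fixed coverage.

    Sites at or above the target are subsampled without replacement;
    sites below the target are sampled `target` times with replacement.
    """
    reads = ["ref"] * ref + ["alt"] * alt
    if not reads:
        return 0, 0                      # nothing to sample from
    if len(reads) >= target:
        draw = rng.sample(reads, target)                    # without replacement
    else:
        draw = [rng.choice(reads) for _ in range(target)]   # with replacement
    n_alt = draw.count("alt")
    return target - n_alt, n_alt
```

Passing a seeded `random.Random` instance as `rng` makes the resampling reproducible; a target of 20 would correspond to the X chromosome.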

To test for confounding effects of variation in sequencing errors between runs, we extended the above-mentioned linear models including the run ID as a random factor using the R package lme4 (v.1.1-14; see Supporting Information). Preliminary analyses showed that this model was not significantly better than simpler models, so we did not include sequencing run in the final analysis (see Supporting Information and Table S4).

To investigate genome-wide patterns of variation, we averaged π, θ, and D in 200 kb non-overlapping windows for each sample and chromosomal arm separately and plotted the distributions in R. In addition, we calculated Tajima’s D in 50 kb sliding windows with a step size of 10 kb to investigate fine-scale deviations from neutral expectations. We applied heuristic parameters to identify genomic regions harboring potential candidates for selective sweeps. To identify candidate regions with sweep patterns across most of the 48 samples, we searched for windows with log-transformed recombination rates ≥ 0.5, pairwise nucleotide diversity (π ≤ 0.004), and average Tajima’s D across all populations ≤ −0.8 (5th percentile). To identify potential selective sweeps restricted to a few population samples only, we searched for regions characterized as above but allowing one or more samples with Tajima’s D being more than two standard deviations smaller than the window-wise average. To account for the effects of strong purifying selection in gene-rich genomic regions which can result in local negative Tajima’s D (Tajima 1989) and thus confound the detection of selective sweeps, we repeated the analysis based on silent sites (4-fold degenerate sites, SNPs in short introns of ≤ 60 bp lengths and SNPs in intergenic regions in ≥ 2000 bp distance to the closest gene) only. Despite the reduction in polymorphic sites available for this analysis, we found highly consistent sweep regions and therefore proceeded with the full SNP datasets, which provided better resolution (results not shown).
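The window scheme and the heuristic thresholds for continent-wide sweep candidates can be sketched in Python as below. The dictionary keys and function names are hypothetical, and the sketch assumes that recombination rate and π have already been summarized per window (with π averaged across samples).

```python
def sliding_windows(chrom_length, size=50_000, step=10_000):
    """Yield (start, end) for sliding windows fully inside the chromosome."""
    for start in range(0, chrom_length - size + 1, step):
        yield start, start + size

def sweep_candidates(windows, min_rec=0.5, max_pi=0.004, max_d=-0.8):
    """Return (chrom, start) of windows matching the sweep heuristics.

    Each window is a dict with hypothetical keys: 'chrom', 'start', 'rec'
    (recombination rate), 'pi' (diversity) and 'tajima_d' (one value per sample).
    """
    hits = []
    for w in windows:
        mean_d = sum(w["tajima_d"]) / len(w["tajima_d"])  # average across samples
        if w["rec"] >= min_rec and w["pi"] <= max_pi and mean_d <= max_d:
            hits.append((w["chrom"], w["start"]))
    return hits
```

Adjacent windows that pass the filter would then need to be merged into regions, a step not shown here.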

For statistical analysis, the diversity statistics were log-transformed to normalize the data. We then tested for correlations between π and recombination rate using R in 100 kb non-overlapping windows and plotted these data using the ggplot2 package (v.2.2.1., Wickham 2016). We used two different recombination rate measurements: (i) a fine-scale, high resolution genomic recombination rate map based on millions of SNPs in a small number of strains (Comeron et al. 2012), and (ii) the broad-scale Recombination Rate Calculator based on Marey maps generated by laboratory cross data fitting genetic and physical positions of 644 markers to a third-order polynomial curve for each chromosome arm (Fiston-Lavier et al. 2010). Both measurements were converted to version 6 of the D. melanogaster reference genome to match the genomic position of π estimates (see above).

SNP counts and overlap with other datasets

We used the panel of SNPs identified in the DrosEU dataset (available at Dryad; doi: https://doi.org/10.5061/dryad.rj1gn54) to describe the overlap in SNP calls with other published D. melanogaster population data: the Drosophila Population Genomics Project 3 (DPGP3) from Siavonga, Zambia (69 non-admixed lines; Lack et al. 2015; 2016) and the Drosophila Genetic Reference Panel (DGRP) from Raleigh, North Carolina, USA (205 inbred lines; Mackay et al. 2012; Huang et al. 2014). For these comparisons, we focused on biallelic SNPs on the 5 major chromosome arms. We used bwa mem for mapping and a custom pipeline for heuristic SNP calling (PoolSNP; Figure 3). To make the data from the 69 non-admixed lines from Zambia (Lack et al. 2015; 2016) comparable to our data, we reanalyzed these data using our pipeline for mapping and variant calling (Figure 3).

The VCF file of the DGRP data was downloaded from http://dgrp2.gnets.ncsu.edu/ and converted to coordinates according to the D. melanogaster reference genome v.6. We depicted the overlap of SNPs called in the three different populations using elliptic Venn diagrams with eulerAPE software (v3 3.0.0., Micallef & Rodgers 2014). While the DrosEU data were generated from sequencing pools of wild-caught individuals, both the DGRP and DPGP3 data are based on individual sequencing of inbred lines and haploid individuals, respectively.

Genetic differentiation and population structure in European populations

To estimate genome-wide pairwise genetic differences, we used custom software to calculate SNP-wise F_ST using the approach of Weir and Cockerham (1984). We estimated SNP-wise F_ST for all possible pairwise combinations among samples. For each sample, we then averaged F_ST across all SNPs for all pairwise combinations that include this particular sample and finally ranked the 48 population samples by overall differentiation.
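The final averaging and ranking step can be sketched as follows, assuming that F_ST has already been averaged across SNPs for every sample pair. The function name and the input layout (a dictionary keyed by sample pairs) are illustrative assumptions; the Weir and Cockerham estimator itself is not reproduced.

```python
from collections import defaultdict

def rank_by_mean_fst(pairwise_fst):
    """Rank samples by their mean F_ST over all pairs that include them.

    pairwise_fst maps (sample_a, sample_b) to the SNP-averaged F_ST of that pair.
    Returns (sample, mean F_ST) tuples, most differentiated sample first.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (a, b), fst in pairwise_fst.items():
        for sample in (a, b):            # each pair contributes to both samples
            sums[sample] += fst
            counts[sample] += 1
    means = {s: sums[s] / counts[s] for s in sums}
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)
```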

We inferred demographic patterns in European populations by focusing on 20,008 putatively neutrally evolving SNPs located in small introns (less than 60 bp length; Haddrill et al. 2005; Singh et al. 2009; Parsch et al. 2010; Clemente & Vogl 2012; Lawrie et al. 2013) that were at least 200 kb distant from the major chromosomal inversions (see below). To assess isolation by distance (IBD), we averaged F_ST values for each sample pair across all neutral markers and calculated geographic distances between samples using the haversine formula (Green & Smart 1985), which takes the spherical curvature of the planet into account. We tested for correlations between genetic differentiation and geographic distance with Mantel tests as implemented in the R package ade4 (v.1.7-8., Dray & Dufour 2007) with 1,000,000 iterations. In addition, we plotted the 5% smallest and 5% largest F_ST values from all 1,128 pairwise comparisons among the 48 population samples onto a map to visualize geographic patterns of genetic differentiation. From these putatively neutral SNPs, we used observed F_ST on the autosomes (F_aut) to calculate the expected F_ST on X chromosomes (F_X) as in Machado et al. (2016) using the equation F_X = 1 − 9(z + 1)(1 − F_aut) / [8(2z + 1) − (1 − F_aut)(7z − 1)], where z is the ratio of effective population sizes of males (N_m) and females (N_f), N_m/N_f (Ramachandran et al. 2004). For the purposes of this study we assume z = 1.
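Two of the calculations in this paragraph lend themselves to a short Python sketch: the haversine distance and the expected X-linked F_ST. The sketch assumes a mean Earth radius of 6371 km and the relationship F_X = 1 − 9(z + 1)(1 − F_aut) / [8(2z + 1) − (1 − F_aut)(7z − 1)], which reduces to 4F_aut / (3 + F_aut) for z = 1; function names are hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def expected_fx(f_aut, z=1.0):
    """Expected X-linked F_ST from autosomal F_ST; z is the ratio N_m / N_f."""
    num = 9 * (z + 1) * (1 - f_aut)
    den = 8 * (2 * z + 1) - (1 - f_aut) * (7 * z - 1)
    return 1 - num / den
```

With z = 1 the X chromosome has three quarters of the autosomal effective population size, so the expected F_X is always somewhat larger than F_aut.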

We further investigated genetic variation in our dataset by principal component analysis (PCA) based on allele frequencies of the neutral marker SNPs described above. We used the R package LEA (v. 1.2.0., Frichot et al. 2013) and performed PCA on unscaled allele frequencies as suggested by Menozzi et al. (1978) and Novembre and Stephens (2008). We focused on the first three principal components (PCs) and employed a model-based approach as implemented in the R package mclust (v. 5.2., Fraley & Raftery 2012) to identify the most likely number of clusters based on maximum likelihood and assigned population samples to clusters by k-means clustering in R (R Development Core Team 2009). Finally, we examined the first three PCs for correlations with latitude, longitude, altitude, and season using general linear models and tested for spatial autocorrelation as described above. A Bonferroni-corrected α threshold (α’= 0.05/3 = 0.017) was used to account for multiple testing.

Mitochondrial DNA

To obtain consensus mitochondrial sequences for each of the 48 European populations, reads from individual FASTQ files were aligned and minor variants replaced by the major variant using Coral (Salmela & Schröder 2011). This way, ambiguities that might prevent the growth of contigs from reads during the assembly process can be eliminated. For each population, a genome assembly was obtained from the corrected FASTQ files using SPAdes with standard parameters and k-mers of size 21, 33, 55, and 77 (Bankevich et al. 2012). Mitochondrial contigs were retrieved by blastn, using the D. melanogaster NC_024511 sequence as a query and each genome assembly as the database. To avoid nuclear mitochondrial DNA segments (numts), we ensured that only contigs with a much higher coverage than the average coverage of the genome were retrieved. When multiple contigs were available for the same region, the one with the highest coverage was selected. Possible contamination with D. simulans was assessed by looking for two or more consecutive sites that show the same variant as D. simulans and looking for alternative contigs for that region with similar coverage. As an additional quality control measure, we also examined the presence of pairs of sites showing four gametic types using DnaSP 6 (Rozas et al. 2017) – given that there is no recombination in mitochondrial DNA, no such sites are expected. The very few sites presenting such features were rechecked by looking for alternative contigs for that region and were corrected if needed. The uncorrected raw reads for each population were mapped on top of the different consensus haplotypes using eXpress as implemented in Trinity (Grabherr et al. 2011). If most reads for a given population mapped to the consensus sequence derived for that population, the consensus sequence was retained; otherwise it was discarded as a possible chimera between different mitochondrial haplotypes. The repetitive mitochondrial hypervariable region is difficult to assemble and was therefore not used; the mitochondrial region was thus analyzed as in Cooper et al. (2015). Mitochondrial genealogy was estimated from the surviving mitochondrial haplotypes using statistical parsimony (TCS network; Clement et al. 2000), as implemented in PopArt (http://popart.otago.ac.nz).

Frequencies of the different mitochondrial haplotypes were estimated from FPKM values using the surviving mitochondrial haplotypes and eXpress as implemented in Trinity (Grabherr et al. 2011).

Transposable elements

To quantify the transposable element (TE) abundance in each sample, we assembled and quantified the repeats from unassembled sequenced reads using dnaPipeTE (v.1.2., Goubert et al. 2015). The vast majority of high-quality trimmed reads were longer than 135 bp. We thus discarded reads shorter than 135 bp before sampling. Reads matching mtDNA were filtered out by mapping to the D. melanogaster reference mitochondrial genome (NC_024511.2. 1) with bowtie2 (v. 2.1.0., Langmead & Salzberg 2012). Prokaryotic sequences, including reads from symbiotic bacteria such as Wolbachia, were filtered out from the reads using the implementation of blastx (translated nucleic vs. protein database) vs. the non-redundant protein database (nr) using DIAMOND (v. 0.8.7., Buchfink et al. 2015). To quantify TE content, we subsampled a proportion of the raw reads (after filtering) corresponding to a genome coverage of 0.1X (assuming a genome size of 175 Mb), and then assembled these reads with the Trinity assembler (Grabherr et al. 2011). Due to the low coverage of the genome obtained with the subsampled reads, only repetitive DNA present in multiple copies should be fully assembled (Goubert et al. 2015). We repeated this process with three iterations per sample, as recommended by the program guidelines, to assess the repeatability of the estimates.
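As a worked example of the subsampling arithmetic, the number of reads corresponding to a given genome coverage can be computed as below. The helper name and the mean read length of 150 bp are illustrative assumptions; dnaPipeTE performs this calculation internally from its own parameters.

```python
def reads_for_coverage(coverage, genome_size_bp, read_length_bp):
    """Number of reads needed to reach a target coverage of a genome."""
    return round(coverage * genome_size_bp / read_length_bp)

# 0.1X of a 175 Mb genome with reads of about 150 bp (assumed mean length)
n_reads = reads_for_coverage(0.1, 175_000_000, 150)
```

Under these assumptions, roughly 117,000 reads are sampled per iteration.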

We further estimated frequencies of previously characterized TEs present in the reference genome with T-lex2 (v. 2.2.2., Fiston-Lavier et al. 2015), using all annotated TEs (5,416 TEs) in version 6.04 of the D. melanogaster genome from flybase.org (Gramates et al. 2017). For 108 of these TEs, we used the corrected coordinates as described in Fiston-Lavier et al. (2015), based on the identification of target site duplications at the site of the insertion. We excluded TEs nested or flanked by other TEs (<100 bp on each side of the TE), and TEs which are part of segmental duplications, since T-lex2 does not provide accurate frequency estimates in complex regions (Fiston-Lavier et al. 2015). We additionally excluded the INE-1 TE family, as this TE family is ancient, with thousands of insertions in the reference genome, which appear to be mostly fixed (2,234 TEs; Kapitonov & Jurka 2003).

After applying these filters, we were able to estimate frequencies of 1,630 TE insertions from 113 families from the three main orders, LTR, non-LTR, and DNA across all DrosEU samples. T-lex2 contains three main modules: (i) the presence detection module, (ii) the absence detection module, and (iii) the combine module, which joins the results from the former two detection modules. In the presence module, T-lex2 uses Maq (v. 0.7.1., Li et al. 2008) for the mapping of reads. As Maq only accepts reads 127 bp or shorter, we cut the trimmed reads following the general pipeline (Figure 3) and then used Trimmomatic (v. 0.35; Bolger et al. 2014) to cut trimmed reads longer than 100 bp into two equally sized fragments using CROP and HEADCROP parameters. Only the presence module was run with the cut reads.

To avoid inaccurate TE frequency estimates due to very low numbers of reads, we only considered frequency estimates based on at least 3 reads. Despite the stringency of T-lex2 to select only high-quality reads, we additionally discarded frequency estimates supported by more than 90 reads, i.e. 3 times the average coverage of the sample with the lowest coverage (CH_Cha_14_43, Table 1), in order to avoid non-unique mapping reads.
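The two read-count thresholds can be expressed as a simple filter. This sketch is independent of T-lex2; the function name, the input layout, and the example identifiers are hypothetical.

```python
def filter_te_estimates(estimates, min_reads=3, max_reads=90):
    """Keep TE frequency estimates supported by 3 to 90 reads.

    estimates maps a TE identifier to (frequency, number_of_supporting_reads).
    """
    return {
        te: freq
        for te, (freq, n_reads) in estimates.items()
        if min_reads <= n_reads <= max_reads
    }

# Hypothetical identifiers and values, for illustration only
example = {"te_1": (0.25, 40), "te_2": (1.0, 2), "te_3": (0.5, 120)}
kept = filter_te_estimates(example)
```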

This filtering allowed us to estimate TE frequencies for ∼96% (92.9% to 97.8%) of the TEs in each population. For 85% of the TEs, we were able to estimate their frequencies in more than 44 out of 48 DrosEU samples.

We tested for correlations between TE insertion frequencies and recombination rates using Spearman’s rank correlations as implemented in R. As for SNPs, we used recombination rates from Comeron et al. (2012) and from the Recombination Rate Calculator (Fiston-Lavier et al. 2010) in non-overlapping 100 kb windows, and assigned to each TE insertion the recombination rate of the corresponding 100 kb genomic window.

To test for spatio-temporal variation of TE insertions, we excluded TEs with an interquartile range (IQR) < 10. There were 141 TE insertions with variable population frequencies among the DrosEU samples. We tested the population frequencies of these insertions for correlations with latitude, longitude, altitude, and season using generalized linear models (ANCOVA) following the method used for SNPs but with a binomial error structure in R.
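The interquartile-range filter can be sketched with the Python standard library. The sketch assumes that frequencies are expressed in percent and that quartiles are computed with linear interpolation between order statistics (the `inclusive` method of `statistics.quantiles`); both the function name and these choices are assumptions.

```python
from statistics import quantiles

def variable_tes(freqs_by_te, min_iqr=10.0):
    """Return TE identifiers whose frequency IQR across samples is at least min_iqr.

    freqs_by_te maps a TE identifier to its per-sample frequencies (in percent).
    """
    keep = []
    for te, freqs in freqs_by_te.items():
        q1, _, q3 = quantiles(freqs, n=4, method="inclusive")
        if q3 - q1 >= min_iqr:           # TEs with IQR < 10 are excluded
            keep.append(te)
    return keep
```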

We also tested for residual spatio-temporal autocorrelations with Moran’s I test (Moran 1950; Kühn & Dormann 2012). We used Bonferroni corrections to account for multiple testing (α’= 0.05/141 = 0.00035) and only considered Bonferroni-corrected p-values < 0.001 to be significant. TEs located in regions with a recombination rate that differed from 0 cM/Mb according to both measures used (see above) were considered to be in high-recombination regions. To test for TE family enrichment among the significant TEs, we performed a χ² test and applied Yates’ correction to account for the low counts in some of the cells.

Inversion polymorphisms

Since Pool-Seq data preclude a direct assessment of the presence and frequencies of chromosomal inversions, we indirectly estimated inversion frequencies using a panel of approximately 400 inversion-specific marker SNPs (Kapun et al. 2014) for six cosmopolitan inversions (In(2L)t, In(2R)NS, In(3L)P, In(3R)C, In(3R)Mo, In(3R)Payne). We averaged allele frequencies of these markers in each sample separately. To test for clinal variation in the frequencies of inversions, we tested for correlations with latitude, longitude, altitude and season using generalized linear models with a binomial error structure in R to account for the biallelic nature of karyotype frequencies. In addition, we tested for residual spatio-temporal autocorrelations as described above and Bonferroni-corrected the α threshold (α’= 0.05/7 = 0.007) to account for multiple testing.
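The marker-based frequency estimate amounts to averaging the frequencies of the inversion-specific alleles within a sample. A minimal sketch, assuming that markers without sufficient coverage are represented as `None` (a convention introduced here for illustration):

```python
def inversion_frequency(marker_freqs):
    """Average the inversion-specific allele frequencies of one sample.

    marker_freqs: frequencies of the inversion-specific allele at each marker SNP;
    None marks a marker without usable coverage in this sample.
    """
    usable = [f for f in marker_freqs if f is not None]
    if not usable:
        return None                      # no informative markers
    return sum(usable) / len(usable)
```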

Microbiome

Raw sequences were trimmed and quality filtered as described for the genomic data analysis. The remaining high quality sequences were mapped against the D. melanogaster genome (v.6.04) including mitochondria using bbmap (v. 35; Bushnell 2016) with standard settings. The unmapped sequences were submitted to the online classification tool MG-RAST (Meyer et al. 2008) for annotation. Taxonomy information was downloaded and analyzed in R (v. 3.2.3; R Development Core Team 2009) using the matR (v. 0.9; Braithwaite & Keegan) and RJSONIO (v. 1.3; Lang) packages. Metazoan sequence features were removed. For microbial load comparisons, the number of protein features identified by MG-RAST for each taxon and sample was divided by the number of sequences that mapped to D. melanogaster chromosomes X, Y, 2L, 2R, 3L, 3R and 4.

We also surveyed the datasets for the presence of novel DNA viruses by performing de novo assembly of the non-fly reads using SPAdes 3.9.0 (Bankevich et al. 2012), and using conceptual translations to query virus proteins from Genbank using DIAMOND ‘blastp’ (Buchfink et al. 2015). In three cases (Kallithea virus, Vesanto virus, Viltain virus), reads from a single sample pool were sufficient to assemble a (near) complete genome. In two other cases, fragmentary assemblies allowed us to identify additional publicly available datasets that contained sufficient reads to complete the genomes (Linvill Road virus, Esparto virus; completed using SRA datasets SRR2396966 and SRR3939042, respectively). Novel viruses were provisionally named based on the localities where they were first detected, and the corresponding novel genome sequences were submitted to Genbank (KX130344, KY608910, KY457233, KX648533-KX648536). To assess the relative amount of viral DNA, unmapped (non-fly) reads from each sample pool were mapped to repeat-masked Drosophila DNA virus genomes using bowtie2, and coverage normalized relative to virus genome length and the number of mapped Drosophila reads.
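
One way to read the normalization in the last sentence is as reads per base of virus genome, divided by the number of mapped Drosophila reads. The sketch below follows that reading; the exact formula is an assumption, since the text does not spell it out, and the function name and numbers are hypothetical.

```python
def normalized_viral_coverage(virus_mapped_reads, virus_genome_length,
                              fly_mapped_reads):
    """Reads per base of virus genome, per mapped fly read.

    Assumed form of the normalization; the source only states that coverage
    was normalized by genome length and by the number of mapped fly reads.
    """
    if virus_genome_length <= 0 or fly_mapped_reads <= 0:
        raise ValueError("genome length and fly read count must be positive")
    reads_per_base = virus_mapped_reads / virus_genome_length
    return reads_per_base / fly_mapped_reads


# Hypothetical values for one virus in one sample pool.
coverage = normalized_viral_coverage(1500, 150_000, 1_000_000)
```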

Additional information

Author contributions

Martin Kapun, Visualization, Writing-original draft preparation, Formal analysis, Conceptualization, Writing-review & editing, Supervision, Methodology, Investigation, Data curation, Project administration, Validation, Resources, Software; Maite G. Barrón, Visualization, Writing-original draft preparation, Formal analysis, Conceptualization, Writing-review & editing, Methodology, Investigation, Data curation, Project administration, Validation, Resources, Software; Fabian Staubach, Visualization, Writing-original draft preparation, Formal analysis, Conceptualization, Writing-review & editing, Supervision, Funding acquisition, Methodology, Investigation, Data curation, Validation, Resources, Software; Jorge Vieira, Visualization, Writing-original draft preparation, Formal analysis, Conceptualization, Writing-review & editing, Methodology, Investigation, Validation, Resources; Darren J. Obbard, Writing-original draft preparation, Formal analysis, Conceptualization, Writing-review & editing, Methodology, Investigation, Validation, Resources; Clément Goubert, Visualization, Writing-original draft preparation, Formal analysis, Conceptualization, Writing-review & editing, Investigation, Resources; Omar Rota-Stabelli, Visualization, Writing-original draft preparation, Formal analysis, Conceptualization, Writing-review & editing, Methodology, Investigation, Resources; Maaria Kankare, Writing-original draft preparation, Conceptualization, Writing-review & editing, Methodology, Investigation, Resources; Annabelle Haudry, Writing-original draft preparation, Formal analysis, Conceptualization, Writing-review & editing, Investigation, Validation, Resources; R. Axel W. Wiberg, Writing-original draft preparation, Formal analysis, Conceptualization, Writing-review & editing, Methodology, Investigation, Resources, Software; Lena Waidele, Svitlana Serga, Patricia Gibert, Damiano Porcelli, Sonja Grath, Eliza Argyridou, Lain Guio, Mads Fristrup Schou, Conceptualization, Writing-review & editing, Investigation, Resources; Iryna Kozeretska, Conceptualization, Writing-review & editing, Methodology, Investigation, Resources; Elena G. Pasyukova, Marta Pascual, Alan O. Bergland, Conceptualization, Writing-review & editing, Funding acquisition, Methodology, Investigation, Resources; Volker Loeschcke, Catherine Montchamp-Moreau, Jessica Abbott, Nico Posnien, Maria Pilar Garcia Guerreiro, Banu Sebnem Onder, Conceptualization, Writing-review & editing, Funding acquisition, Investigation, Resources; Cristina P. Vieira, Visualization, Formal analysis, Conceptualization, Writing-review & editing, Investigation, Resources; Élio Sucena, Conceptualization, Writing-review & editing, Methodology, Investigation, Project administration, Resources; Cristina Vieira, Michael G. Ritchie, Thomas Flatt, Josefa González, Writing-original draft preparation, Conceptualization, Writing-review & editing, Supervision, Funding acquisition, Methodology, Investigation, Project administration, Validation, Resources; Bart Deplancke, Conceptualization, Writing-review & editing, Funding acquisition, Investigation; Bas J. Zwaan, Visualization, Writing-original draft preparation, Conceptualization, Writing-review & editing, Supervision, Funding acquisition, Methodology, Investigation, Project administration; Eran Tauber, Writing-original draft preparation, Conceptualization, Writing-review & editing, Funding acquisition, Methodology, Investigation, Resources; Dorcas J. Orengo, Eva Puerma, Conceptualization, Writing-review & editing, Investigation, Validation, Resources; Montserrat Aguadé, Writing-original draft preparation, Conceptualization, Writing-review & editing, Methodology, Investigation, Validation, Resources; Paul S. Schmidt, John Parsch, Writing-original draft preparation, Conceptualization, Writing-review & editing, Funding acquisition, Methodology, Investigation, Validation, Resources; Andrea J. Betancourt, Writing-original draft preparation, Formal analysis, Conceptualization, Writing-review & editing, Supervision, Funding acquisition, Methodology, Investigation, Project administration, Validation, Resources

Acknowledgments

We are grateful to all members of the DrosEU and Dros-RTEC consortia and to Dmitri Petrov (Stanford University) for support and discussion. DrosEU is funded by a Special Topic Networks (STN) grant from the European Society for Evolutionary Biology (ESEB). Computational analyses were partially executed at the Vital-IT bioinformatics facility of the University of Lausanne (Switzerland) and at the computing facilities of the CC LBBE/PRABI in Lyon (France).

Footnotes

§ Members of the Drosophila Real Time Evolution (Dros-RTEC) Consortium
Competing interests: The authors declare that no competing interests exist.

References

Adrian AB, Comeron JM (2013) The Drosophila early ovarian transcriptome provides insight to the molecular causes of recombination rate variation across genomes. BMC Genomics, 14, 794.
Adrion JR, Hahn MW, Cooper BS (2015) Revisiting classic clines in Drosophila melanogaster in the age of genomics. Trends in Genetics, 31, 434–444.
Alonso-Blanco C, Andrade J, Becker C et al. (2016) 1,135 Genomes Reveal the Global Pattern of Polymorphism in Arabidopsis thaliana. Cell, 166, 481–491.
Anderson AR, Hoffmann AA, McKechnie SW, Umina PA, Weeks AR (2005) The latitudinal cline in the In(3R)Payne inversion polymorphism has shifted in the last 20 years in Australian Drosophila melanogaster populations. Molecular Ecology, 14, 851–858.
Anderson PR, Knibb WR, Oakeshott JG (1987) Observations on the extent and temporal stability of latitudinal clines for alcohol dehydrogenase allozymes and four chromosome inversions in Drosophila melanogaster. Genetica, 75, 81–88.
Anderson WW, Arnold J, Baldwin DG et al. (1991) Four decades of inversion polymorphism in Drosophila pseudoobscura. Proceedings of the National Academy of Sciences of the United States of America, 88, 10367–10371.
Andolfatto P, Wall JD, Kreitman M (1999) Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics, 153, 1297–1311.
Ashburner M, Lemeunier F (1976) Relationships within the melanogaster Species Subgroup of the Genus Drosophila (Sophophora). I. Inversion Polymorphisms in Drosophila melanogaster and Drosophila simulans. Proceedings of the Royal Society of London. Series B: Biological Sciences, 193, 137–157.
Aulard S, David JR, Lemeunier F (2002) Chromosomal inversion polymorphism in Afrotropical populations of Drosophila melanogaster. Genetical Research, 79, 49–63.
Bankevich A, Nurk S, Antipov D et al. (2012) SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing. Journal of Computational Biology, 19, 455–477.
Barata A, Santos SC, Malfeito-Ferreira M, Loureiro V (2012) New insights into the ecological interaction between grape berry microorganisms and Drosophila flies during the development of sour rot. Microbial ecology, 64, 416–430.
Bartolomé C, Maside X, Charlesworth B (2002) On the Abundance and Distribution of Transposable Elements in the Genome of Drosophila melanogaster. Molecular Biology and Evolution, 19, 926–937.
Bastide H, Betancourt A, Nolte V et al. (2013) A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster. PLoS Genetics, 9, e1003534.
Baudry E, Viginier B, Veuille M (2004) Non-African populations of Drosophila melanogaster have a unique origin. Molecular Biology and Evolution, 21, 1482–1491.
Becher PG, Flick G, Rozpędowska E et al. (2012) Yeast, not fruit volatiles mediate Drosophila melanogaster attraction, oviposition and development. Functional Ecology, 26, 822–828.
Begun DJ (2015) Parallel Gene Expression Differences between Low and High Latitude Populations of Drosophila melanogaster and D. simulans. PLoS Genetics, 11, e1005184.
Begun DJ, Aquadro CF (1993) African and North American populations of Drosophila melanogaster are very different at the DNA level. Nature, 365, 548–550.
Begun DJ, Holloway AK, Stevens K et al. (2007) Population Genomics: Whole-Genome Analysis of Polymorphism and Divergence in Drosophila simulans. PLoS Biology, 5, e310.
Behrman EL, Howick VM, Kapun M et al. (2018) Rapid seasonal evolution in innate immunity of wild Drosophila melanogaster. Proceedings of the Royal Society B: Biological Sciences, 285, 20172599.
Beisswanger S, Stephan W, De Lorenzo D (2006) Evidence for a Selective Sweep in the wapl Region of Drosophila melanogaster. Genetics, 172, 265–274.
Bergland AO, Behrman EL, O’Brien KR, Schmidt PS, Petrov DA (2014) Genomic Evidence of Rapid and Stable Adaptive Oscillations over Seasonal Time Scales in Drosophila. PLoS Genetics, 10, e1004775.
Bergland AO, Tobler R, González J, Schmidt P, Petrov D (2016) Secondary contact and local adaptation contribute to genome-wide patterns of clinal variation in Drosophila melanogaster. Molecular Ecology, 25, 1157–1174.
Bivand R, Piras G (2015) Comparing Implementations of Estimation Methods for Spatial Econometrics. Journal of Statistical Software, 63, 1–36.
Black WC IV, Baer CF, Antolin MF, DuTeau NM (2001) Population genomics: genome-wide sampling of insect populations. Annual Review of Entomology, 46, 441–469.
Blumenstiel JP, Chen X, He M, Bergman CM (2014) An Age-of-Allele Test of Neutrality for Transposable Element Insertions. Genetics, 196, 523–538.
Boitard S, Schlötterer C, Nolte V, Pandey RV, Futschik A (2012) Detecting Selective Sweeps from Pooled Next-Generation Sequencing Samples. Molecular Biology and Evolution, 29, 2177–2186.
Bolger AM, Lohse M, Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics, 30, 2114–2120.
Boussy IA, Itoh M, Rand D, Woodruff RC (1998) Origin and decay of the P element-associated latitudinal cline in Australian Drosophila melanogaster. Genetica, 104, 45–57.
Božičević V, Hutter S, Stephan W, Wollstein A (2016) Population genetic evidence for cold adaptation in European Drosophila melanogaster populations. Molecular Ecology, 25, 1175–1191.
Braithwaite DP, Keegan KP matR: Metagenomics Analysis Tools for R. https://CRAN.R-project.org/package=matR.
Buchfink B, Xie C, Huson DH (2015) Fast and sensitive protein alignment using DIAMOND. Nature Methods, 12, 59–60.
Buser CC, Newcomb RD, Gaskett AC, Goddard MR (2014) Niche construction initiates the evolution of mutualistic interactions. Ecology Letters, 17, 1257–1264.
Bushnell B (2016) BBMap short read aligner. URL http://sourceforge.net/projects/bbmap.
Camus MF, Wolff JN, Sgrò CM, Dowling DK (2017) Experimental Support That Natural Selection Has Shaped the Latitudinal Distribution of Mitochondrial Haplotypes in Australian Drosophila melanogaster. Molecular Biology and Evolution, 34, 2600–2612.
Caracristi G, Schlötterer C (2003) Genetic Differentiation Between American and European Drosophila melanogaster Populations Could Be Attributed to Admixture of African Alleles. Molecular Biology and Evolution, 20, 792–799.
Casillas S, Barbadilla A (2017) Molecular Population Genetics. Genetics, 205, 1003–1035.
Catania F, Kauer MO, Daborn PJ et al. (2004) World-wide survey of an Accord insertion and its association with DDT resistance in Drosophila melanogaster. Molecular Ecology, 13, 2491–2504.
Cavalli-Sforza LL (1966) Population Structure and Human Evolution. Proceedings of the Royal Society of London. Series B: Biological Sciences, 164, 362–379.
Chandler JA, James PM (2013) Discovery of trypanosomatid parasites in globally distributed Drosophila species. PLoS ONE, 8, e61937.
Chandler JA, Eisen JA, Kopp A (2012) Yeast communities of diverse Drosophila species: comparison of two symbiont groups in the same hosts. Applied and Environmental Microbiology, 78, 7327–7336.
Chandler JA, Lang JM, Bhatnagar S, Eisen JA, Kopp A (2011) Bacterial communities of diverse Drosophila species: ecological context of a host-microbe model system. PLoS Genetics, 7, e1002272.
Charlesworth B (2010) Molecular population genomics: a short history. Genetical Research, 92, 397–411.
Charlesworth B (2015) Causes of natural variation in fitness: evidence from studies of Drosophila populations. Proceedings of the National Academy of Sciences of the United States of America, 112, 1662–1669.
Charlesworth B, Sniegowski P, Stephan W (1994) The evolutionary dynamics of repetitive DNA in eukaryotes. Nature, 371, 215–220.
Cheng C, White BJ, Kamdem C et al. (2012) Ecological genomics of Anopheles gambiae along a latitudinal cline: a population-resequencing approach. Genetics, 190, 1417–1432.
Cingolani P, Platts A, Wang LL et al. (2012) A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin), 6, 80–92.
Clement M, Posada D, Crandall KA (2000) TCS: a computer program to estimate gene genealogies. Molecular Ecology, 9, 1657–1659.
Clemente F, Vogl C (2012) Unconstrained evolution in short introns? – An analysis of genome-wide polymorphism and divergence data from Drosophila. Journal of Evolutionary Biology, 25, 1975–1990.
Comeron JM, Ratnappan R, Bailin S (2012) The many landscapes of recombination in Drosophila melanogaster. PLoS Genetics, 8, e1002905.
Consortium T (2015) A global reference for human genetic variation. Nature, 526, 68–74.
Cooper BS, Burrus CR, Ji C, Hahn MW, Montooth KL (2015) Similar Efficacies of Selection Shape Mitochondrial and Nuclear Genes in Both Drosophila melanogaster and Homo sapiens. G3 (Bethesda, Md.), 5, 2165–2176.
Corbett-Detig RB, Hartl DL (2012) Population Genomics of Inversion Polymorphisms in Drosophila melanogaster. PLoS Genetics, 8, e1003056.
Corbett-Detig RB, Cardeno C, Langley CH (2012) Sequence-based detection and breakpoint assembly of polymorphic inversions. Genetics, 192, 131–137.
Cridland JM, Macdonald SJ, Long AD, Thornton KR (2013) Abundance and distribution of transposable elements in two Drosophila QTL mapping resources. Molecular Biology and Evolution, 30, 2311–2327.
Daborn PJ, Yen JL, Bogwitz MR et al. (2002) A single p450 allele associated with insecticide resistance in Drosophila. Science, 297, 2253–2256.
Das A, Singh BN (1990) Chromosome inversions in Indian Drosophila melanogaster. Genetica, 81, 85–88.
Das A, Singh BN (1991) Genetic differentiation and inversion clines in Indian natural populations of Drosophila melanogaster. Genome / National Research Council Canada = Génome / Conseil national de recherches Canada, 34, 618–625.
David JR, Capy P (1988) Genetic variation of Drosophila melanogaster natural populations. Trends in genetics: TIG, 4, 106–111.
de Jong G, Bochdanovits Z (2003) Latitudinal clines in Drosophila melanogaster: body size, allozyme frequencies, inversion frequencies, and the insulin-signalling pathway. Journal of Genetics, 82, 207–223.
Dieringer D, Nolte V, Schlötterer C (2005) Population structure in African Drosophila melanogaster revealed by microsatellite analysis. Molecular Ecology, 14, 563–573.
Dobzhansky T (1970) Genetics of the Evolutionary Process. Columbia University Press.
Dobzhansky T, Sturtevant AH (1938) Inversions in the Chromosomes of Drosophila Pseudoobscura. Genetics, 23, 28–64.
Dray S, Dufour A-B (2007) The ade4 Package: Implementing the Duality Diagram for Ecologists. Journal of Statistical Software, 22.
Duchen P, Zivkovic D, Hutter S, Stephan W, Laurent S (2013) Demographic inference reveals African and European admixture in the North American Drosophila melanogaster population. Genetics, 193, 291–301.
Ellegren H (2014) Genome sequencing and population genomics in non-model organisms. Trends in Ecology & Evolution, 29, 51–63.
Endler JA (1977) Geographic Variation, Speciation, and Clines. Princeton University Press.
Fabian DK, Kapun M, Nolte V et al. (2012) Genome-wide patterns of latitudinal differentiation among populations of Drosophila melanogaster from North America. Molecular Ecology, 21, 4748–4769.
Fabian DK, Lack JB, Mathur V et al. (2015) Spatially varying selection shapes life history clines among populations of Drosophila melanogaster from sub-Saharan Africa. Journal of Evolutionary Biology, 28, 826–840.
Fiston-Lavier A-S, Barrón MG, Petrov DA, González J (2015) T-lex2: genotyping, frequency estimation and re-annotation of transposable elements using single or pooled next-generation sequencing data. Nucleic Acids Research, 43, e22–e22.
Fiston-Lavier A-S, Singh ND, Lipatov M, Petrov DA (2010) Drosophila melanogaster recombination rate calculator. Gene, 463, 18–20.
Flatt T, Amdam GV, Kirkwood TBL, Omholt SW (2013) Life-history evolution and the polyphenic regulation of somatic maintenance and survival. The quarterly review of biology, 88, 185–218.
Fraley C, Raftery AE (2012) mclust Version 4 for R: Normal Mixture Modeling for Model-Based Clustering, Classification, and Density Estimation. Seattle.
Francalacci P, Sanna D (2008) History and geography of human Y-chromosome in Europe: a SNP perspective. Journal of anthropological sciences, 86, 59–89.
Frichot E, Schoville SD, Bouchard G, François O (2013) Testing for associations between loci and environmental gradients using latent factor mixed models. Molecular Biology and Evolution, 30, 1687–1699.
González J, Karasov TL, Messer PW, Petrov DA (2010) Genome-Wide Patterns of Adaptation to Temperate Environments Associated with Transposable Elements in Drosophila. PLoS Genetics, 6, e1000905.
González J, Lenkov K, Lipatov M, Macpherson JM, Petrov DA (2008) High Rate of Recent Transposable Element–Induced Adaptation in Drosophila melanogaster. PLoS Biology, 6, e251.
Goubert C, Modolo L, Vieira C et al. (2015) De Novo Assembly and Annotation of the Asian Tiger Mosquito (Aedes albopictus) Repeatome with dnaPipeTE from Raw Genomic Reads and Comparative Analysis with the Yellow Fever Mosquito (Aedes aegypti). Genome Biology and Evolution, 7, 1192–1205.
Grabherr MG, Haas BJ, Yassour M et al. (2011) Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nature Biotechnology, 29, 644–652.
Gramates LS, Marygold SJ, Santos GD et al. (2017) FlyBase at 25: looking to the future. Nucleic Acids Research, 45, D663–D671.
Green RM, Smart WM (1985) Textbook on Spherical Astronomy. Cambridge University.
Haddrill PR, Charlesworth B, Halligan DL, Andolfatto P (2005) Patterns of intron sequence evolution in Drosophila are dependent upon length and GC content. Genome Biology, 6, R67.
Hales KG, Korey CA, Larracuente AM, Roberts DM (2015) Genetics on the Fly: A Primer on the Drosophila Model System. Genetics, 201, 815–842.
Hamilton PT, Votýpka J, Dostálová A et al. (2015) Infection Dynamics and Immune Response in a Newly Described Drosophila-Trypanosomatid Association. mBio, 6, e01356–15.
Handu M, Kaduskar B, Ravindranathan R et al. (2015) SUMO-Enriched Proteome for Drosophila Innate Immune Response. G3 (Bethesda, Md.), 5, 2137–2154.
Harpur BA, Kent CF, Molodtsova D et al. (2014) Population genomics of the honey bee reveals strong signatures of positive selection on worker traits. Proceedings of the National Academy of Sciences of the United States of America, 111, 2614–2619.
Haselkorn TS, Markow TA, Moran NA (2009) Multiple introductions of the Spiroplasma bacterial endosymbiont into Drosophila. Molecular Ecology, 18, 1294–1305.
Hohenlohe PA, Bassham S, Etter PD et al. (2010) Population Genomics of Parallel Adaptation in Threespine Stickleback using Sequenced RAD Tags. PLoS Genetics, 6, e1000862.
Hu TT, Eisen MB, Thornton KR, Andolfatto P (2013) A second-generation assembly of the Drosophila simulans genome provides new insights into patterns of lineage-specific divergence. Genome Research, 23, 89–98.
Huang W, Massouras A, Inoue Y et al. (2014) Natural variation in genome architecture among 205 Drosophila melanogaster Genetic Reference Panel lines. Genome Research, 24, 1193–1208.
Hudson RR, Kreitman M, Aguadé M (1987) A test of neutral molecular evolution based on nucleotide data. Genetics, 116, 153–159.
Huszar T, Imler J-L (2008) Drosophila viruses and the study of antiviral host-defense. Advances in virus research, 72, 227–265.
Hutter S, Li H, Beisswanger S, De Lorenzo D, Stephan W (2007) Distinctly Different Sex Ratios in African and European Populations of Drosophila melanogaster Inferred From Chromosomewide Single Nucleotide Polymorphism Data. Genetics, 177, 469–480.
Inoue Y, Watanabe TK (1979) Inversion polymorphisms in Japanese natural populations of Drosophila melanogaster. Japanese Journal of Genetics, 54, 69–82.
Inoue Y, Watanabe T, Watanabe TK (1984) Evolutionary Change of the Chromosomal Polymorphism in Drosophila melanogaster Populations. Evolution, 38, 753.
Jorde LB, Watkins WS, Bamshad MJ (2001) Population genomics: a bridge from evolutionary history to genetic medicine. Human Molecular Genetics, 10, 2199–2207.
Kao JY, Zubair A, Salomon MP, Nuzhdin SV, Campo D (2015) Population genomic analysis uncovers African and European admixture in Drosophila melanogaster populations from the south-eastern United States and Caribbean Islands. Molecular Ecology, 24, 1499–1509.
Kapitonov VV, Jurka J (2003) Molecular Paleontology of Transposable Elements in the Drosophila melanogaster Genome. Proceedings of the National Academy of Sciences of the United States of America, 100, 6569–6574.
Kapun M, Fabian DK, Goudet J, Flatt T (2016a) Genomic Evidence for Adaptive Inversion Clines in Drosophila melanogaster. Molecular Biology and Evolution, 33, 1317–1336.
Kapun M, Schmidt C, Durmaz E, Schmidt PS, Flatt T (2016b) Parallel effects of the inversion In(3R)Payne on body size across the North American and Australian clines in Drosophila melanogaster. Journal of Evolutionary Biology, 29, 1059–1072.
Kapun M, van Schalkwyk H, McAllister B, Flatt T, Schlötterer C (2014) Inference of chromosomal inversion dynamics from Pool-Seq data in natural and laboratory populations of Drosophila melanogaster. Molecular Ecology, 23, 1813–1827.
Kassis JA, Kennison JA, Tamkun JW (2017) Polycomb and Trithorax Group Genes in Drosophila. Genetics, 206, 1699–1725.
Keller A (2007) Drosophila melanogaster’s history as a human commensal. Current Biology, 17, R77–R81.
Kennington WJ, Hoffmann AA (2013) Patterns of genetic variation across inversions: geographic variation in the In(2L)t inversion in populations of Drosophila melanogaster from eastern Australia. BMC evolutionary biology, 13, 100.
Kennington WJ, Hoffmann AA, Partridge L (2007) Mapping Regions Within Cosmopolitan Inversion In(3R)Payne Associated With Natural Variation in Body Size in Drosophila melanogaster. Genetics, 177, 549–556.
Kennison J (2008) Dissection of Larval Salivary Glands and Polytene Chromosome Preparation. Cold Spring Harbor Protocols, 2008, pdb.prot4708–pdb.prot4708.
Kimura M (1984) The Neutral Theory of Molecular Evolution. Cambridge University Press.
Kirkpatrick M (2010) How and why chromosome inversions evolve. PLoS Biology, 8, e1000501.
Knibb WR (1982) Chromosome inversion polymorphisms in Drosophila melanogaster II. Geographic clines and climatic associations in Australasia, North America and Asia. Genetica, 58, 213–221.
Knibb WR (1986) Temporal variation of Drosophila melanogaster Adh allele frequencies, inversion frequencies, and population sizes. Genetica, 71, 175–190.
Knibb WR, Oakeshott JG, Gibson JB (1981) Chromosome Inversion Polymorphisms in Drosophila melanogaster. I. Latitudinal Clines and Associations between Inversions in Australasian Populations. Genetics, 98, 833–847.
Kofler R, Betancourt AJ, Schlötterer C (2012) Sequencing of pooled DNA samples (Pool-Seq) uncovers complex dynamics of transposable element insertions in Drosophila melanogaster. PLoS Genetics, 8, e1002487.
Kofler R, Orozco-terWengel P, De Maio N et al. (2011) PoPoolation: A Toolbox for Population Genetic Analysis of Next Generation Sequencing Data from Pooled Individuals. PLoS ONE, 6, e15925.
Kolaczkowski B, Hupalo DN, Kern AD (2011a) Recurrent adaptation in RNA interference genes across the Drosophila phylogeny. Molecular Biology and Evolution, 28, 1033–1042.
Kolaczkowski B, Kern AD, Holloway AK, Begun DJ (2011b) Genomic Differentiation Between Temperate and Tropical Australian Populations of Drosophila melanogaster. Genetics, 187, 245–260.
Korneliussen TS, Moltke I, Albrechtsen A, Nielsen R (2013) Calculation of Tajima’s D and other neutrality test statistics from low depth next-generation sequencing data. BMC Bioinformatics, 14, 289.
Kreitman M (1983) Nucleotide polymorphism at the alcohol dehydrogenase locus of Drosophila melanogaster. Nature, 304, 412–417.
Kriesner P, Conner WR, Weeks AR, Turelli M, Hoffmann AA (2016) Persistence of a Wolbachia infection frequency cline in Drosophila melanogaster and the possible role of reproductive dormancy. Evolution, 70, 979–997.
Kunze-Mühl E, Müller E (1957) Weitere Untersuchungen über die chromosomale Struktur und die natürlichen Strukturtypen von Drosophila subobscura coll. Chromosoma, 9, 559–570.
Kühn I, Dormann CF (2012) Less than eight (and a half) misconceptions of spatial analysis. Journal of Biogeography, 39, 995–998.
Lachaise D, Cariou M-L, David JR et al. (1988) Historical Biogeography of the Drosophila melanogaster Species Subgroup. In: Evolutionary Biology, pp. 159–225. Springer, Boston, MA.
Lack JB, Cardeno CM, Crepeau MW et al. (2015) The Drosophila genome nexus: a population genomic resource of 623 Drosophila melanogaster genomes, including 197 from a single ancestral range population. Genetics, 199, 1229–1241.
Lack JB, Lange JD, Tang AD, Corbett-Detig RB, Pool JE (2016) A Thousand Fly Genomes: An Expanded Drosophila Genome Nexus. Molecular Biology and Evolution, 33, 3308–3313.
Lang DT RJSONIO: Serialize R objects to JSON, JavaScript Object Notation. https://CRAN.R-project.org/package=RJSONIO.
Langley CH, Stevens K, Cardeno C et al. (2012) Genomic variation in natural populations of Drosophila melanogaster. Genetics, 192, 533–598.
Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nature Methods, 9, 357–359.
Lavington E, Kern AD (2017) The Effect of Common Inversion Polymorphisms In(2L)t and In(3R)Mo on Patterns of Transcriptional Variation in Drosophila melanogaster. G3 (Bethesda, Md.), 7, 3659–3668.
Lawrie DS, Messer PW, Hershberg R, Petrov DA (2013) Strong Purifying Selection at Synonymous Sites in D. melanogaster. arXiv.org, q-bio.PE, e1003527.
Lemeunier F, Aulard S (1992) Inversion polymorphism in Drosophila melanogaster. In: Drosophila Inversion Polymorphism (eds Krimbas CB, Powell JR), p. 576. CRC Press.
Levins R (1968) Evolution in Changing Environments. Princeton University Press.
Lewontin RC (1974) The Genetic Basis of Evolutionary Change. Columbia University Press.
Li H (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv.org.
Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754–1760.
Li H, Ruan J, Durbin R (2008) Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Research, 18, 1851–1858.
Luikart G, England PR, Tallmon D, Jordan S, Taberlet P (2003) The power and promise of population genomics: from genotyping to genome typing. Nature Reviews Genetics, 4, 981–994.
Lyne R, Smith R, Rutherford K et al. (2007) FlyMine: an integrated database for Drosophila and Anopheles genomics. Genome Biology, 8, R129.
Machado HE, Bergland AO, O’Brien KR et al. (2016) Comparative population genomics of latitudinal variation in Drosophila simulans and Drosophila melanogaster. Molecular Ecology, 25, 723–740.
Machado H, Bergland AO, Taylor R et al. (2018) Broad geographic sampling reveals predictable and pervasive seasonal adaptation in Drosophila. bioRxiv. https://doi.org/10.1101/337543
Mackay TFC, Richards S, Stone EA et al. (2012) The Drosophila melanogaster Genetic Reference Panel. Nature, 482, 173–178.
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal, 17, 10–12.
Martino ME, Ma D, Leulier F (2017) Microbial influence on Drosophila biology. Current Opinion in Microbiology, 38, 165–170.
Mateo L, Rech GE, González J (2018) Genome-wide patterns of local adaptation in Drosophila melanogaster: adding intra European variability to the map. bioRxiv.
Matthias P, Yoshida M, Khochbin S (2008) HDAC6 a new cellular stress surveillance factor. Cell Cycle, 7, 7–10.
Matzkin LM, Merritt TJS, Zhu C-T, Eanes WF (2005) The structure and population genetics of the breakpoints associated with the cosmopolitan chromosomal inversion In(3R)Payne in Drosophila melanogaster. Genetics, 170, 1143–1152.
McDonald JH, Kreitman M (1991) Adaptive protein evolution at the Adh locus in Drosophila. Nature, 351, 652–654.
McKenna A, Hanna M, Banks E et al. (2010) The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research, 20, 1297–1303.
Menozzi P, Piazza A, Cavalli-Sforza L (1978) Synthetic maps of human gene frequencies in Europeans. Science, 201, 786–792.
Messer PW, Petrov DA (2013) Population genomics of rapid adaptation by soft selective sweeps. Trends in Ecology & Evolution, 28, 659–669.
Mettler LE, Voelker RA, Mukai T (1977) Inversion Clines in Populations of Drosophila melanogaster. Genetics, 87, 169–176.
Meyer F, Paarmann D, D’Souza M et al. (2008) The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics, 9, 386.
Micallef L, Rodgers P (2014) eulerAPE: drawing area-proportional 3-Venn diagrams using ellipses. PLoS ONE, 9, e101717.
Michalakis Y, Veuille M (1996) Length variation of CAG/CAA trinucleotide repeats in natural populations of Drosophila melanogaster and its relation to the recombination rate. Genetics, 143, 1713–1725.
Moran PAP (1950) Notes on Continuous Stochastic Phenomena. Biometrika, 37, 17.
Navarro A, Faria R (2014) Pool and conquer: new tricks for (c)old problems. Molecular Ecology, 23, 1653–1655.
Nei M (1987) Molecular Evolutionary Genetics. Columbia University Press.
Nolte V, Pandey RV, Kofler R, Schlötterer C (2013) Genome-wide patterns of natural variation reveal strong selective sweeps and ongoing genomic conflict in Drosophila mauritiana. Genome Research, 23, 99–110.
Novembre J, Stephens M (2008) Interpreting principal component analyses of spatial population genetic variation. Nature Genetics, 40, 646–649.
Nunes MDS, Neumeier H, Schlötterer C (2008) Contrasting patterns of natural variation in global Drosophila melanogaster populations. Molecular Ecology, 17, 4470–4479.
Okonechnikov K, Conesa A, García-Alcalde F (2016) Qualimap 2: advanced multi-sample quality control for high-throughput sequencing data. Bioinformatics, 32, 292–294.
Parsch J, Novozhilov S, Saminadin-Peter SS, Wong KM, Andolfatto P (2010) On the utility of short intron sequences as a reference for the detection of positive and negative selection in Drosophila. Molecular Biology and Evolution, 27, 1226–1234.
Peel MC, Finlayson BL, McMahon TA (2007) Updated world map of the Köppen-Geiger climate classification. Hydrology and Earth System Sciences, 11, 1633–1644.
Petrov DA, Fiston-Lavier AS, Lipatov M, Lenkov K, González J (2011) Population Genomics of Transposable Elements in Drosophila melanogaster. Molecular Biology and Evolution, 28, 1633–1644.
Pimpinelli S, Bonaccorsi S, Fanti L, Gatti M (2010) Preparation and Orcein Staining of Mitotic Chromosomes from Drosophila Larval Brain. Cold Spring Harbor Protocols, 2010, pdb.prot5389–pdb.prot5389.
Pool JE (2015) The Mosaic Ancestry of the Drosophila Genetic Reference Panel and the D. melanogaster Reference Genome Reveals a Network of Epistatic Fitness Interactions. Molecular Biology and Evolution, 32, 3236–3251.
Pool JE, Braun DT, Lack JB (2016) Parallel Evolution of Cold Tolerance Within Drosophila melanogaster. Molecular Biology and Evolution, msw232.
Pool JE, Corbett-Detig RB, Sugino RP et al. (2012) Population Genomics of Sub-Saharan Drosophila melanogaster: African Diversity and Non-African Admixture. PLoS Genetics, 8, e1003080.
Powell JR (1997) Progress and Prospects in Evolutionary Biology: The Drosophila Model. Oxford University Press.
R Development Core Team (2009) R: A Language and Environment for Statistical Computing. R-project.org.
Rako L, Anderson AR, Sgrò CM, Stocker AJ, Hoffmann AA (2006) The association between inversion In(3R)Payne and clinally varying traits in Drosophila melanogaster. Genetica, 128, 373–384.
Ramachandran S, Rosenberg NA, Zhivotovsky LA, Feldman MW (2004) Robustness of the inference of human population structure: a comparison of X-chromosomal and autosomal microsatellites. Human genomics, 1, 87–97.
Rane RV, Rako L, Kapun M, Lee SF (2015) Genomic evidence for role of inversion 3RP of Drosophila melanogaster in facilitating climate change adaptation. Molecular Ecology, 24, 2423–2432.
Richardson JL, Urban MC, Bolnick DI, Skelly DK (2014) Microgeographic adaptation and the spatial scale of evolution. Trends in Ecology & Evolution, 29, 165–176.
Richardson MF, Weinert LA, Welch JJ et al. (2012) Population Genomics of the Wolbachia Endosymbiont in Drosophila melanogaster. PLoS Genetics, 8, e1003129.
Roberts DB (Ed.) (1998) Drosophila: A Practical Approach. Practical Approach Series.
Rogers RL, Hartl DL (2012) Chimeric genes as a source of rapid evolution in Drosophila melanogaster. Molecular Biology and Evolution, 29, 517–529.
Rozas J, Ferrer-Mata A, Sánchez-DelBarrio JC et al. (2017) DnaSP 6: DNA Sequence Polymorphism Analysis of Large Datasets. Molecular Biology and Evolution, 34, 3299–3302.
Salmela L, Schröder J (2011) Correcting errors in short reads by multiple alignments. Bioinformatics, 27, 1455–1461.
Schlenke TA, Begun DJ (2003) Natural selection drives Drosophila immune system evolution. Genetics, 164, 1471–1480.
Schlötterer C, Neumeier H, Sousa C, Nolte V (2006) Highly structured Asian Drosophila melanogaster populations: a new tool for hitchhiking mapping? Genetics, 172, 287–292.
Schlötterer C, Tobler R, Kofler R, Nolte V (2014) Sequencing pools of individuals - mining genome-wide polymorphism data without big funding. Nature Reviews Genetics, 15, 749–763.
Schmidt JM, Good RT, Appleton B et al. (2010) Copy number variation and transposable elements feature in recent, ongoing adaptation at the Cyp6g1 locus. PLoS Genetics, 6, e1000998.
Schmidt PS, Paaby AB (2008) Reproductive Diapause and Life-History Clines in North American Populations of Drosophila melanogaster. Evolution, 62, 1204–1215.
Schmidt PS, Zhu CT, Das J et al. (2008) An amino acid polymorphism in the couch potato gene forms the basis for climatic adaptation in Drosophila melanogaster. Proceedings of the National Academy of Sciences of the United States of America, 105, 16207–16211.
Sella G, Petrov DA, Przeworski M, Andolfatto P (2009) Pervasive Natural Selection in the Drosophila Genome? PLoS Genetics, 5, e1000495.
Sezgin E, Duvernell DD, Matzkin LM et al. (2004) Single-locus latitudinal clines and their relationship to temperate adaptation in metabolic genes and derived alleles in Drosophila melanogaster. Genetics, 168, 923–931.
Singh AK (2018) Chromosomal variability due to inversions in Indian populations of Drosophila. Indian J Sci Res.
Singh BN, Das A (1992) Further evidence for latitudinal inversion clines in natural populations of Drosophila melanogaster from India. Journal of Heredity, 83, 227–230.
Singh ND, Arndt PF, Clark AG, Aquadro CF (2009) Strong evidence for lineage and sequence specificity of substitution rates and patterns in Drosophila. Molecular Biology and Evolution, 26, 1591–1605.
Stalker HD (1976) Chromosome studies in wild populations of D. melanogaster. Genetics, 82, 323–347.
Stalker HD (1980) Chromosome Studies in Wild Populations of Drosophila melanogaster. II. Relationship of Inversion Frequencies to Latitude, Season, Wing-Loading and Flight Activity. Genetics, 95, 211–223.
Staubach F, Baines JF, Künzel S, Bik EM, Petrov DA (2013) Host species and environmental effects on bacterial communities associated with Drosophila in the laboratory and in the natural environment. PLoS ONE, 8, e70749.
Stephan W (2010) Genetic hitchhiking versus background selection: the controversy and its implications. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 365, 1245–1253.
Svetec N, Pavlidis P, Stephan W (2009) Recent strong positive selection on Drosophila melanogaster HDAC6, a gene encoding a stress surveillance factor, as revealed by population genomic analysis. Molecular Biology and Evolution, 26, 1549–1556.
Tajima F (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics, 123, 585–595.
Trinder M, Daisley BA, Dube JS, Reid G (2017) Drosophila melanogaster as a High-Throughput Model for Host-Microbiota Interactions. Frontiers in microbiology, 8, 751.
Turner TL, Levine MT, Eckert ML, Begun DJ (2008) Genomic analysis of adaptive differentiation in Drosophila melanogaster. Genetics, 179, 455–473.
Umina PA, Weeks AR, Kearney MR, McKechnie SW, Hoffmann AA (2005) A rapid shift in a classic clinal pattern in Drosophila reflecting climate change. Science, 308, 691–693.
Unckless RL (2011) A DNA virus of Drosophila. PLoS ONE, 6, e26564.
Van ‘t Land J, Van Putten WF, Villarroel H, Kamping A, van Delden W (2000) Latitudinal variation for two enzyme loci and an inversion polymorphism in Drosophila melanogaster from Central and South America. Evolution, 54, 201–209.
Voelker RA, Cockerham CC, Johnson FM et al. (1978) Inversions fail to account for allozyme clines. Genetics, 88, 515–527.
Watterson GA (1975) On the number of segregating sites in genetical models without recombination. Theoretical Population Biology, 7, 256–276.
Webster CL, Longdon B, Lewis SH, Obbard DJ (2016) Twenty-Five New Viruses Associated with the Drosophilidae (Diptera). Evolutionary bioinformatics online, 12, 13–25.
Webster CL, Waldron FM, Robertson S et al. (2015) The Discovery, Distribution, and Evolution of Viruses Associated with Drosophila melanogaster. PLoS Biology, 13, e1002210.
Weir BS, Cockerham CC (1984) Estimating F-Statistics for the Analysis of Population Structure. Evolution, 38, 1358–1370.
Werren JH, Baldo L, Clark ME (2008) Wolbachia: master manipulators of invertebrate biology. Nature Reviews Microbiology, 6, 741–751.
Wesley CS, Eanes WF (1994) Isolation and analysis of the breakpoint sequences of chromosome inversion In(3L)Payne in Drosophila melanogaster. Proceedings of the National Academy of Sciences of the United States of America, 91, 3132–3136.
Wickham H (2016) ggplot2: Elegant Graphics for Data Analysis. Springer.
Wilfert L, Longdon B, Ferreira AGA, Bayer F, Jiggins FM (2011) Trypanosomatids are common and diverse parasites of Drosophila. Parasitology, 138, 858–865.
Wittmann MJ, Bergland AO, Feldman MW, Schmidt PS, Petrov DA (2017) Seasonally fluctuating selection can maintain polymorphism at many loci via segregation lift. Proceedings of the National Academy of Sciences of the United States of America, 114, E9932–E9941.
Wolf JBW, Bayer T, Haubold B et al. (2010) Nucleotide divergence vs. gene expression differentiation: comparative transcriptome sequencing in natural isolates from the carrion crow and its hybrid zone with the hooded crow. Molecular Ecology, 19, 162–175.
Wolff JN, Camus MF, Clancy DJ, Dowling DK (2016) Complete mitochondrial genome sequences of thirteen globally sourced strains of fruit fly (Drosophila melanogaster) form a powerful model for mitochondrial research. Mitochondrial DNA. Part A, DNA mapping, sequencing, and analysis, 27, 4672–4674.
Xiao F-X, Yotova V, Zietkiewicz E et al. (2004) Human X-chromosomal lineages in Europe reveal Middle Eastern and Asiatic contacts. European Journal of Human Genetics, 12, 301–311.
Yukilevich R, True JR (2008a) Incipient sexual isolation among cosmopolitan Drosophila melanogaster populations. Evolution, 62, 2112–2121.
Yukilevich R, True JR (2008b) African morphology, behavior and pheromones underlie incipient sexual isolation between US and Caribbean Drosophila melanogaster. Evolution, 62, 2807–2828.
Yukilevich R, Turner TL, Aoki F, Nuzhdin SV, True JR (2010) Patterns and processes of genome-wide divergence between North American and African Drosophila melanogaster. Genetics, 186, 219–239.
Zanini F, Brodin J, Thebo L et al. (2015) Population genomics of intrapatient HIV-1 evolution. eLife, 4, e11282.

Posted June 04, 2018.

Subject Area

Evolutionary Biology

Adrian AB, Comeron JM (2013) The Drosophila early ovarian transcriptome provides insight to the molecular causes of recombination rate variation across genomes. BMC Genomics, 14, 794.
Adrion JR, Hahn MW, Cooper BS (2015) Revisiting classic clines in Drosophila melanogaster in the age of genomics. Trends in Genetics, 31, 434–444.
Alonso-Blanco C, Andrade J, Becker C et al. (2016) 1,135 Genomes Reveal the Global Pattern of Polymorphism in Arabidopsis thaliana. Cell, 166, 481–491.
Anderson AR, Hoffmann AA, McKechnie SW, Umina PA, Weeks AR (2005) The latitudinal cline in the In(3R)Payne inversion polymorphism has shifted in the last 20 years in Australian Drosophila melanogaster populations. Molecular Ecology, 14, 851–858.
Anderson PR, Knibb WR, Oakeshott JG (1987) Observations on the extent and temporal stability of latitudinal clines for alcohol dehydrogenase allozymes and four chromosome inversions in Drosophila melanogaster. Genetica, 75, 81–88.
Anderson WW, Arnold J, Baldwin DG et al. (1991) Four decades of inversion polymorphism in Drosophila pseudoobscura. Proceedings of the National Academy of Sciences of the United States of America, 88, 10367–10371.
Andolfatto P, Wall JD, Kreitman M (1999) Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics, 153, 1297–1311.
Ashburner M, Lemeunier F (1976) Relationships within the melanogaster Species Subgroup of the Genus Drosophila (Sophophora). I. Inversion Polymorphisms in Drosophila melanogaster and Drosophila simulans. Proceedings of the Royal Society of London. Series B: Biological Sciences, 193, 137–157.
Aulard S, David JR, Lemeunier F (2002) Chromosomal inversion polymorphism in Afrotropical populations of Drosophila melanogaster. Genetical Research, 79, 49–63.
Bankevich A, Nurk S, Antipov D et al. (2012) SPAdes, a New Genome Assembly Algorithm and Its Applications to Single-cell Sequencing. Journal of Computational Biology, 19, 455–477.
Barata A, Santos SC, Malfeito-Ferreira M, Loureiro V (2012) New insights into the ecological interaction between grape berry microorganisms and Drosophila flies during the development of sour rot. Microbial ecology, 64, 416–430.
Bartolomé C, Maside X, Charlesworth B (2002) On the Abundance and Distribution of Transposable Elements in the Genome of Drosophila melanogaster. Molecular Biology and Evolution, 19, 926–937.
Bastide H, Betancourt A, Nolte V et al. (2013) A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster. PLoS Genetics, 9, e1003534.
Baudry E, Viginier B, Veuille M (2004) Non-African populations of Drosophila melanogaster have a unique origin. Molecular Biology and Evolution, 21, 1482–1491.
Becher PG, Flick G, Rozpędowska E et al. (2012) Yeast, not fruit volatiles mediate Drosophila melanogaster attraction, oviposition and development. Functional Ecology, 26, 822–828.
Begun DJ (2015) Parallel Gene Expression Differences between Low and High Latitude Populations of Drosophila melanogaster and D. simulans. PLoS Genetics, 11, e1005184.
Begun DJ, Aquadro CF (1993) African and North American populations of Drosophila melanogaster are very different at the DNA level. Nature, 365, 548–550.
Begun DJ, Holloway AK, Stevens K et al. (2007) Population Genomics: Whole-Genome Analysis of Polymorphism and Divergence in Drosophila simulans. PLoS Biology, 5, e310.
Behrman EL, Howick VM, Kapun M et al. (2018) Rapid seasonal evolution in innate immunity of wild Drosophila melanogaster. Proceedings of the Royal Society B: Biological Sciences, 285, 20172599.
Beisswanger S, Stephan W, De Lorenzo D (2006) Evidence for a Selective Sweep in the wapl Region of Drosophila melanogaster. Genetics, 172, 265–274.
Bergland AO, Behrman EL, O’Brien KR, Schmidt PS, Petrov DA (2014) Genomic Evidence of Rapid and Stable Adaptive Oscillations over Seasonal Time Scales in Drosophila. PLoS Genetics, 10, e1004775.
Bergland AO, Tobler R, González J, Schmidt P, Petrov D (2016) Secondary contact and local adaptation contribute to genome-wide patterns of clinal variation in Drosophila melanogaster. Molecular Ecology, 25, 1157–1174.
Bivand R, Piras G (2015) Comparing Implementations of Estimation Methods for Spatial Econometrics. Journal of Statistical Software, 63, 1–36.
Black WC IV, Baer CF, Antolin MF, DuTeau NM (2001) Population genomics: genome-wide sampling of insect populations. Annual Review of Entomology.
Blumenstiel JP, Chen X, He M, Bergman CM (2014) An Age-of-Allele Test of Neutrality for Transposable Element Insertions. Genetics, 196, 523–538.
Boitard S, Schlötterer C, Nolte V, Pandey RV, Futschik A (2012) Detecting Selective Sweeps from Pooled Next-Generation Sequencing Samples. Molecular Biology and Evolution, 29, 2177–2186.
Bolger AM, Lohse M, Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics, 30, 2114–2120.
Boussy IA, Itoh M, Rand D, Woodruff RC (1998) Origin and decay of the P element-associated latitudinal cline in Australian Drosophila melanogaster. Genetica, 104, 45–57.
Božičević V, Hutter S, Stephan W, Wollstein A (2016) Population genetic evidence for cold adaptation in European Drosophila melanogaster populations. Molecular Ecology, 25, 1175–1191.
Braithwaite DP, Keegan KP matR: Metagenomics Analysis Tools for R. https://CRAN.R-project.org/package=matR.

Buchfink B, Xie C, Huson DH (2015) Fast and sensitive protein alignment using DIAMOND. Nature Methods, 12, 59–60.
Buser CC, Newcomb RD, Gaskett AC, Goddard MR (2014) Niche construction initiates the evolution of mutualistic interactions. Ecology Letters, 17, 1257–1264.
Bushnell B (2016) BBMap short read aligner. URL http://sourceforge.net/projects/bbmap.

Camus MF, Wolff JN, Sgrò CM, Dowling DK (2017) Experimental Support That Natural Selection Has Shaped the Latitudinal Distribution of Mitochondrial Haplotypes in Australian Drosophila melanogaster. Molecular Biology and Evolution, 34, 2600–2612.
Caracristi G, Schlötterer C (2003) Genetic Differentiation Between American and European Drosophila melanogaster Populations Could Be Attributed to Admixture of African Alleles. Molecular Biology and Evolution, 20, 792–799.
Casillas S, Barbadilla A (2017) Molecular Population Genetics. Genetics, 205, 1003–1035.
Catania F, Kauer MO, Daborn PJ et al. (2004) World-wide survey of an Accord insertion and its association with DDT resistance in Drosophila melanogaster. Molecular Ecology, 13, 2491–2504.
Cavalli-Sforza LL (1966) Population Structure and Human Evolution. Proceedings of the Royal Society of London. Series B: Biological Sciences, 164, 362–379.
Chandler JA, James PM (2013) Discovery of trypanosomatid parasites in globally distributed Drosophila species. PLoS ONE, 8, e61937.
Chandler JA, Eisen JA, Kopp A (2012) Yeast communities of diverse Drosophila species: comparison of two symbiont groups in the same hosts. Applied and Environmental Microbiology, 78, 7327–7336.
Chandler JA, Lang JM, Bhatnagar S, Eisen JA, Kopp A (2011) Bacterial communities of diverse Drosophila species: ecological context of a host-microbe model system. PLoS Genetics, 7, e1002272.
Charlesworth B (2010) Molecular population genomics: a short history. Genetical Research, 92, 397–411.
Charlesworth B (2015) Causes of natural variation in fitness: evidence from studies of Drosophila populations. Proceedings of the National Academy of Sciences of the United States of America, 112, 1662–1669.
Charlesworth B, Sniegowski P, Stephan W (1994) The evolutionary dynamics of repetitive DNA in eukaryotes. Nature, 371, 215–220.
Cheng C, White BJ, Kamdem C et al. (2012) Ecological genomics of Anopheles gambiae along a latitudinal cline: a population-resequencing approach. Genetics, 190, 1417–1432.
Cingolani P, Platts A, Wang LL et al. (2012) A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin), 6, 80–92.
Clement M, Posada D, Crandall KA (2000) TCS: a computer program to estimate gene genealogies. Molecular Ecology, 9, 1657–1659.
Clemente F, Vogl C (2012) Unconstrained evolution in short introns? – An analysis of genome-wide polymorphism and divergence data from Drosophila. Journal of Evolutionary Biology, 25, 1975–1990.
Comeron JM, Ratnappan R, Bailin S (2012) The many landscapes of recombination in Drosophila melanogaster. PLoS Genetics, 8, e1002905.
The 1000 Genomes Project Consortium (2015) A global reference for human genetic variation. Nature, 526, 68–74.
Cooper BS, Burrus CR, Ji C, Hahn MW, Montooth KL (2015) Similar Efficacies of Selection Shape Mitochondrial and Nuclear Genes in Both Drosophila melanogaster and Homo sapiens. G3 (Bethesda, Md.), 5, 2165–2176.
Corbett-Detig RB, Hartl DL (2012) Population Genomics of Inversion Polymorphisms in Drosophila melanogaster. PLoS Genetics, 8, e1003056.
Corbett-Detig RB, Cardeno C, Langley CH (2012) Sequence-based detection and breakpoint assembly of polymorphic inversions. Genetics, 192, 131–137.
Cridland JM, Macdonald SJ, Long AD, Thornton KR (2013) Abundance and distribution of transposable elements in two Drosophila QTL mapping resources. Molecular Biology and Evolution, 30, 2311–2327.
Daborn PJ, Yen JL, Bogwitz MR et al. (2002) A single p450 allele associated with insecticide resistance in Drosophila. Science, 297, 2253–2256.
Das A, Singh BN (1990) Chromosome inversions in Indian Drosophila melanogaster. Genetica, 81, 85–88.
Das A, Singh BN (1991) Genetic differentiation and inversion clines in Indian natural populations of Drosophila melanogaster. Genome, 34, 618–625.
David JR, Capy P (1988) Genetic variation of Drosophila melanogaster natural populations. Trends in Genetics, 4, 106–111.
de Jong G, Bochdanovits Z (2003) Latitudinal clines in Drosophila melanogaster: body size, allozyme frequencies, inversion frequencies, and the insulin-signalling pathway. Journal of Genetics, 82, 207–223.
Dieringer D, Nolte V, Schlötterer C (2005) Population structure in African Drosophila melanogaster revealed by microsatellite analysis. Molecular Ecology, 14, 563–573.
Dobzhansky T (1970) Genetics of the Evolutionary Process. Columbia University Press.

Dobzhansky T, Sturtevant AH (1938) Inversions in the Chromosomes of Drosophila Pseudoobscura. Genetics, 23, 28–64.
Dray S, Dufour A-B (2007) The ade4 Package: Implementing the Duality Diagram for Ecologists. Journal of Statistical Software, 22.

Duchen P, Zivkovic D, Hutter S, Stephan W, Laurent S (2013) Demographic inference reveals African and European admixture in the North American Drosophila melanogaster population. Genetics, 193, 291–301.
Ellegren H (2014) Genome sequencing and population genomics in non-model organisms. Trends in Ecology & Evolution, 29, 51–63.
Endler JA (1977) Geographic Variation, Speciation, and Clines. Princeton University Press.

Fabian DK, Kapun M, Nolte V et al. (2012) Genome-wide patterns of latitudinal differentiation among populations of Drosophila melanogaster from North America. Molecular Ecology, 21, 4748–4769.
Fabian DK, Lack JB, Mathur V et al. (2015) Spatially varying selection shapes life history clines among populations of Drosophila melanogaster from sub-Saharan Africa. Journal of Evolutionary Biology, 28, 826–840.
Fiston-Lavier A-S, Barrón MG, Petrov DA, González J (2015) T-lex2: genotyping, frequency estimation and re-annotation of transposable elements using single or pooled next-generation sequencing data. Nucleic Acids Research, 43, e22–e22.
Fiston-Lavier A-S, Singh ND, Lipatov M, Petrov DA (2010) Drosophila melanogaster recombination rate calculator. Gene, 463, 18–20.
Flatt T, Amdam GV, Kirkwood TBL, Omholt SW (2013) Life-history evolution and the polyphenic regulation of somatic maintenance and survival. The quarterly review of biology, 88, 185–218.
Fraley C, Raftery AE (2012) mclust Version 4 for R: Normal Mixture Modeling for Model-Based Clustering, Classification, and Density Estimation. Seattle.

Francalacci P, Sanna D (2008) History and geography of human Y-chromosome in Europe: a SNP perspective. Journal of anthropological sciences, 86, 59–89.
Frichot E, Schoville SD, Bouchard G, François O (2013) Testing for associations between loci and environmental gradients using latent factor mixed models. Molecular Biology and Evolution, 30, 1687–1699.
González J, Karasov TL, Messer PW, Petrov DA (2010) Genome-Wide Patterns of Adaptation to Temperate Environments Associated with Transposable Elements in Drosophila. PLoS Genetics, 6, e1000905.
González J, Lenkov K, Lipatov M, Macpherson JM, Petrov DA (2008) High Rate of Recent Transposable Element–Induced Adaptation in Drosophila melanogaster. PLoS Biology, 6, e251.
Goubert C, Modolo L, Vieira C et al. (2015) De Novo Assembly and Annotation of the Asian Tiger Mosquito (Aedes albopictus) Repeatome with dnaPipeTE from Raw Genomic Reads and Comparative Analysis with the Yellow Fever Mosquito (Aedes aegypti). Genome Biology and Evolution, 7, 1192–1205.
Grabherr MG, Haas BJ, Yassour M et al. (2011) Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nature Biotechnology, 29, 644–652.
Gramates LS, Marygold SJ, Santos GD et al. (2017) FlyBase at 25: looking to the future. Nucleic Acids Research, 45, D663–D671.
Green RM, Smart WM (1985) Textbook on Spherical Astronomy. Cambridge University Press.
Haddrill PR, Charlesworth B, Halligan DL, Andolfatto P (2005) Patterns of intron sequence evolution in Drosophila are dependent upon length and GC content. Genome Biology, 6, R67.
Hales KG, Korey CA, Larracuente AM, Roberts DM (2015) Genetics on the Fly: A Primer on the Drosophila Model System. Genetics, 201, 815–842.

[87]
Hamilton PT, Votýpka J, Dostálová A et al. (2015) Infection Dynamics and Immune Response in a Newly Described Drosophila-Trypanosomatid Association. mBio, 6, e01356–15.

[88]
Handu M, Kaduskar B, Ravindranathan R et al. (2015) SUMO-Enriched Proteome for Drosophila Innate Immune Response. G3 (Bethesda, Md.), 5, 2137–2154.

[89]
Harpur BA, Kent CF, Molodtsova D et al. (2014) Population genomics of the honey bee reveals strong signatures of positive selection on worker traits. Proceedings of the National Academy of Sciences of the United States of America, 111, 2614–2619.

[90]
Haselkorn TS, Markow TA, Moran NA (2009) Multiple introductions of the Spiroplasma bacterial endosymbiont into Drosophila. Molecular Ecology, 18, 1294–1305.

[91]
Hohenlohe PA, Bassham S, Etter PD et al. (2010) Population Genomics of Parallel Adaptation in Threespine Stickleback using Sequenced RAD Tags. PLoS Genetics, 6, e1000862.

[92]
Hu TT, Eisen MB, Thornton KR, Andolfatto P (2013) A second-generation assembly of the Drosophila simulans genome provides new insights into patterns of lineage-specific divergence. Genome Research, 23, 89–98.

[93]
Huang W, Massouras A, Inoue Y et al. (2014) Natural variation in genome architecture among 205 Drosophila melanogaster Genetic Reference Panel lines. Genome Research, 24, 1193–1208.

[94]
Hudson RR, Kreitman M, Aguadé M (1987) A test of neutral molecular evolution based on nucleotide data. Genetics, 116, 153–159.

[95]
Huszar T, Imler J-L (2008) Drosophila viruses and the study of antiviral host-defense. Advances in Virus Research, 72, 227–265.

[96]
Hutter S, Li H, Beisswanger S, De Lorenzo D, Stephan W (2007) Distinctly Different Sex Ratios in African and European Populations of Drosophila melanogaster Inferred From Chromosomewide Single Nucleotide Polymorphism Data. Genetics, 177, 469–480.

[97]
Inoue Y, Watanabe TK (1979) Inversion polymorphisms in Japanese natural populations of Drosophila melanogaster. Japanese Journal of Genetics, 54, 69–82.

[98]
Inoue Y, Watanabe T, Watanabe TK (1984) Evolutionary Change of the Chromosomal Polymorphism in Drosophila melanogaster Populations. Evolution, 38, 753.

[99]
Jorde LB, Watkins WS, Bamshad MJ (2001) Population genomics: a bridge from evolutionary history to genetic medicine. Human Molecular Genetics, 10, 2199–2207.

[100]
Kao JY, Zubair A, Salomon MP, Nuzhdin SV, Campo D (2015) Population genomic analysis uncovers African and European admixture in Drosophila melanogaster populations from the south-eastern United States and Caribbean Islands. Molecular Ecology, 24, 1499–1509.

[101]
Kapitonov VV, Jurka J (2003) Molecular Paleontology of Transposable Elements in the Drosophila melanogaster Genome. Proceedings of the National Academy of Sciences of the United States of America, 100, 6569–6574.

[102]
Kapun M, Fabian DK, Goudet J, Flatt T (2016a) Genomic Evidence for Adaptive Inversion Clines in Drosophila melanogaster. Molecular Biology and Evolution, 33, 1317–1336.

[103]
Kapun M, Schmidt C, Durmaz E, Schmidt PS, Flatt T (2016b) Parallel effects of the inversion In(3R)Payne on body size across the North American and Australian clines in Drosophila melanogaster. Journal of Evolutionary Biology, 29, 1059–1072.

[104]
Kapun M, van Schalkwyk H, McAllister B, Flatt T, Schlötterer C (2014) Inference of chromosomal inversion dynamics from Pool-Seq data in natural and laboratory populations of Drosophila melanogaster. Molecular Ecology, 23, 1813–1827.

[105]
Kassis JA, Kennison JA, Tamkun JW (2017) Polycomb and Trithorax Group Genes in Drosophila. Genetics, 206, 1699–1725.

[106]
Keller A (2007) Drosophila melanogaster’s history as a human commensal. Current Biology, 17, R77–R81.

[107]
Kennington WJ, Hoffmann AA (2013) Patterns of genetic variation across inversions: geographic variation in the In(2L)t inversion in populations of Drosophila melanogaster from eastern Australia. BMC Evolutionary Biology, 13, 100.

[108]
Kennington WJ, Hoffmann AA, Partridge L (2007) Mapping Regions Within Cosmopolitan Inversion In(3R)Payne Associated With Natural Variation in Body Size in Drosophila melanogaster. Genetics, 177, 549–556.

[109]
Kennison J (2008) Dissection of Larval Salivary Glands and Polytene Chromosome Preparation. Cold Spring Harbor Protocols, 2008, pdb.prot4708–pdb.prot4708.

[110]
Kimura M (1984) The Neutral Theory of Molecular Evolution. Cambridge University Press.

[111]
Kirkpatrick M (2010) How and why chromosome inversions evolve. PLoS Biology, 8, e1000501.

[112]
Knibb WR (1982) Chromosome inversion polymorphisms in Drosophila melanogaster II. Geographic clines and climatic associations in Australasia, North America and Asia. Genetica, 58, 213–221.

[113]
Knibb WR (1986) Temporal variation of Drosophila melanogaster Adh allele frequencies, inversion frequencies, and population sizes. Genetica, 71, 175–190.

[114]
Knibb WR, Oakeshott JG, Gibson JB (1981) Chromosome Inversion Polymorphisms in Drosophila melanogaster. I. Latitudinal Clines and Associations between Inversions in Australasian Populations. Genetics, 98, 833–847.

[115]
Kofler R, Betancourt AJ, Schlötterer C (2012) Sequencing of pooled DNA samples (Pool-Seq) uncovers complex dynamics of transposable element insertions in Drosophila melanogaster. PLoS Genetics, 8, e1002487.

[116]
Kofler R, Orozco-terWengel P, De Maio N et al. (2011) PoPoolation: A Toolbox for Population Genetic Analysis of Next Generation Sequencing Data from Pooled Individuals. PLoS ONE, 6, e15925.

[117]
Kolaczkowski B, Hupalo DN, Kern AD (2011a) Recurrent adaptation in RNA interference genes across the Drosophila phylogeny. Molecular Biology and Evolution, 28, 1033–1042.

[118]
Kolaczkowski B, Kern AD, Holloway AK, Begun DJ (2011b) Genomic Differentiation Between Temperate and Tropical Australian Populations of Drosophila melanogaster. Genetics, 187, 245–260.

[119]
Korneliussen TS, Moltke I, Albrechtsen A, Nielsen R (2013) Calculation of Tajima’s D and other neutrality test statistics from low depth next-generation sequencing data. BMC Bioinformatics, 14, 289.

[120]
Kreitman M (1983) Nucleotide polymorphism at the alcohol dehydrogenase locus of Drosophila melanogaster. Nature, 304, 412–417.

[121]
Kriesner P, Conner WR, Weeks AR, Turelli M, Hoffmann AA (2016) Persistence of a Wolbachia infection frequency cline in Drosophila melanogaster and the possible role of reproductive dormancy. Evolution, 70, 979–997.

[122]
Kunze-Mühl E, Müller E (1957) Weitere Untersuchungen über die chromosomale Struktur und die natürlichen Strukturtypen von Drosophila subobscura coll. Chromosoma, 9, 559–570.

[123]
Kühn I, Dormann CF (2012) Less than eight (and a half) misconceptions of spatial analysis. Journal of Biogeography, 39, 995–998.

[124]
Lachaise D, Cariou M-L, David JR et al. (1988) Historical Biogeography of the Drosophila melanogaster Species Subgroup. In: Evolutionary Biology, pp. 159–225. Springer, Boston, MA.

[125]
Lack JB, Cardeno CM, Crepeau MW et al. (2015) The Drosophila genome nexus: a population genomic resource of 623 Drosophila melanogaster genomes, including 197 from a single ancestral range population. Genetics, 199, 1229–1241.

[126]
Lack JB, Lange JD, Tang AD, Corbett-Detig RB, Pool JE (2016) A Thousand Fly Genomes: An Expanded Drosophila Genome Nexus. Molecular Biology and Evolution, 33, 3308–3313.

[127]
Lang DT RJSONIO: Serialize R objects to JSON, JavaScript Object Notation. https://CRAN.R-project.org/package=RJSONIO.

[128]
Langley CH, Stevens K, Cardeno C et al. (2012) Genomic variation in natural populations of Drosophila melanogaster. Genetics, 192, 533–598.

[129]
Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nature Methods, 9, 357–359.

[130]
Lavington E, Kern AD (2017) The Effect of Common Inversion Polymorphisms In(2L)t and In(3R)Mo on Patterns of Transcriptional Variation in Drosophila melanogaster. G3 (Bethesda, Md.), 7, 3659–3668.

[131]
Lawrie DS, Messer PW, Hershberg R, Petrov DA (2013) Strong Purifying Selection at Synonymous Sites in D. melanogaster. arXiv.org, q-bio.PE, e1003527.

[132]
Lemeunier F, Aulard S (1992) Inversion polymorphism in Drosophila melanogaster. In: Drosophila Inversion Polymorphism (eds Krimbas CB, Powell JR), p. 576. CRC Press.

[135]
Levins R (1968) Evolution in Changing Environments. Princeton University Press.

[136]
Lewontin RC (1974) The Genetic Basis of Evolutionary Change. Columbia University Press.

[137]
Li H (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv.org.

[138]
Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754–1760.

[139]
Li H, Ruan J, Durbin R (2008) Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Research, 18, 1851–1858.

[140]
Luikart G, England PR, Tallmon D, Jordan S, Taberlet P (2003) The power and promise of population genomics: from genotyping to genome typing. Nature Reviews Genetics, 4, 981–994.

[141]
Lyne R, Smith R, Rutherford K et al. (2007) FlyMine: an integrated database for Drosophila and Anopheles genomics. Genome Biology, 8, R129.

[142]
Machado HE, Bergland AO, O’Brien KR et al. (2016) Comparative population genomics of latitudinal variation in Drosophila simulans and Drosophila melanogaster. Molecular Ecology, 25, 723–740.

[143]
Machado H, Bergland AO, Taylor R et al. (2018) Broad geographic sampling reveals predictable and pervasive seasonal adaptation in Drosophila. bioRxiv. https://doi.org/10.1101/337543

[144]
Mackay TFC, Richards S, Stone EA et al. (2012) The Drosophila melanogaster Genetic Reference Panel. Nature, 482, 173–178.

[145]
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal, 17, 10–12.

[146]
Martino ME, Ma D, Leulier F (2017) Microbial influence on Drosophila biology. Current Opinion in Microbiology, 38, 165–170.

[147]
Mateo L, Rech GE, González J (2018) Genome-wide patterns of local adaptation in Drosophila melanogaster: adding intra European variability to the map. bioRxiv.

[148]
Matthias P, Yoshida M, Khochbin S (2008) HDAC6 a new cellular stress surveillance factor. Cell Cycle, 7, 7–10.

[149]
Matzkin LM, Merritt TJS, Zhu C-T, Eanes WF (2005) The structure and population genetics of the breakpoints associated with the cosmopolitan chromosomal inversion In(3R)Payne in Drosophila melanogaster. Genetics, 170, 1143–1152.

[150]
McDonald JH, Kreitman M (1991) Adaptive protein evolution at the Adh locus in Drosophila. Nature, 351, 652–654.

[151]
McKenna A, Hanna M, Banks E et al. (2010) The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research, 20, 1297–1303.

[152]
Menozzi P, Piazza A, Cavalli-Sforza L (1978) Synthetic maps of human gene frequencies in Europeans. Science, 201, 786–792.

[153]
Messer PW, Petrov DA (2013) Population genomics of rapid adaptation by soft selective sweeps. Trends in Ecology & Evolution, 28, 659–669.

[154]
Mettler LE, Voelker RA, Mukai T (1977) Inversion Clines in Populations of Drosophila melanogaster. Genetics, 87, 169–176.

[155]
Meyer F, Paarmann D, D’Souza M et al. (2008) The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics, 9, 386.

[156]
Micallef L, Rodgers P (2014) eulerAPE: drawing area-proportional 3-Venn diagrams using ellipses. PLoS ONE, 9, e101717.

[157]
Michalakis Y, Veuille M (1996) Length variation of CAG/CAA trinucleotide repeats in natural populations of Drosophila melanogaster and its relation to the recombination rate. Genetics, 143, 1713–1725.

[158]
Moran PAP (1950) Notes on Continuous Stochastic Phenomena. Biometrika, 37, 17.

[159]
Navarro A, Faria R (2014) Pool and conquer: new tricks for (c)old problems. Molecular Ecology, 23, 1653–1655.

[160]
Nei M (1987) Molecular Evolutionary Genetics. Columbia University Press.

[161]
Nolte V, Pandey RV, Kofler R, Schlötterer C (2013) Genome-wide patterns of natural variation reveal strong selective sweeps and ongoing genomic conflict in Drosophila mauritiana. Genome Research, 23, 99–110.

[162]
Novembre J, Stephens M (2008) Interpreting principal component analyses of spatial population genetic variation. Nature Genetics, 40, 646–649.

[163]
Nunes MDS, Neumeier H, Schlötterer C (2008) Contrasting patterns of natural variation in global Drosophila melanogaster populations. Molecular Ecology, 17, 4470–4479.

[164]
Okonechnikov K, Conesa A, García-Alcalde F (2016) Qualimap 2: advanced multi-sample quality control for high-throughput sequencing data. Bioinformatics, 32, 292–294.

[165]
Parsch J, Novozhilov S, Saminadin-Peter SS, Wong KM, Andolfatto P (2010) On the utility of short intron sequences as a reference for the detection of positive and negative selection in Drosophila. Molecular Biology and Evolution, 27, 1226–1234.

[166]
Peel MC, Finlayson BL, McMahon TA (2007) Updated world map of the Köppen-Geiger climate classification. Hydrology and Earth System Sciences, 11, 1633–1644.

[167]
Petrov DA, Fiston-Lavier AS, Lipatov M, Lenkov K, González J (2011) Population Genomics of Transposable Elements in Drosophila melanogaster. Molecular Biology and Evolution, 28, 1633–1644.

[168]
Pimpinelli S, Bonaccorsi S, Fanti L, Gatti M (2010) Preparation and Orcein Staining of Mitotic Chromosomes from Drosophila Larval Brain. Cold Spring Harbor Protocols, 2010, pdb.prot5389–pdb.prot5389.

[169]
Pool JE (2015) The Mosaic Ancestry of the Drosophila Genetic Reference Panel and the D. melanogaster Reference Genome Reveals a Network of Epistatic Fitness Interactions. Molecular Biology and Evolution, 32, 3236–3251.

[170]
Pool JE, Braun DT, Lack JB (2016) Parallel Evolution of Cold Tolerance Within Drosophila melanogaster. Molecular Biology and Evolution, msw232.

[171]
Pool JE, Corbett-Detig RB, Sugino RP et al. (2012) Population Genomics of Sub-Saharan Drosophila melanogaster: African Diversity and Non-African Admixture. PLoS Genetics, 8, e1003080.

[172]
Powell JR (1997) Progress and Prospects in Evolutionary Biology: The Drosophila Model. Oxford University Press.

[173]
R Development Core Team (2009) R: A Language and Environment for Statistical Computing. R-project.org.

[174]
Rako L, Anderson AR, Sgrò CM, Stocker AJ, Hoffmann AA (2006) The association between inversion In(3R)Payne and clinally varying traits in Drosophila melanogaster. Genetica, 128, 373–384.

[175]
Ramachandran S, Rosenberg NA, Zhivotovsky LA, Feldman MW (2004) Robustness of the inference of human population structure: a comparison of X-chromosomal and autosomal microsatellites. Human Genomics, 1, 87–97.

[176]
Rane RV, Rako L, Kapun M, Lee SF (2015) Genomic evidence for role of inversion 3RP of Drosophila melanogaster in facilitating climate change adaptation. Molecular Ecology, 24, 2423–2432.

[177]
Richardson JL, Urban MC, Bolnick DI, Skelly DK (2014) Microgeographic adaptation and the spatial scale of evolution. Trends in Ecology & Evolution, 29, 165–176.

[178]
Richardson MF, Weinert LA, Welch JJ et al. (2012) Population Genomics of the Wolbachia Endosymbiont in Drosophila melanogaster (A Kopp, Ed.). PLoS Genetics, 8, e1003129.

[179]
Roberts DB (Ed.) (1998) Drosophila: A Practical Approach. Practical Approach Series.

[181]
Rogers RL, Hartl DL (2012) Chimeric genes as a source of rapid evolution in Drosophila melanogaster. Molecular Biology and Evolution, 29, 517–529.

[182]
Rozas J, Ferrer-Mata A, Sánchez-DelBarrio JC et al. (2017) DnaSP 6: DNA Sequence Polymorphism Analysis of Large Datasets. Molecular Biology and Evolution, 34, 3299–3302.

[183]
Salmela L, Schröder J (2011) Correcting errors in short reads by multiple alignments. Bioinformatics, 27, 1455–1461.

[184]
Schlenke TA, Begun DJ (2003) Natural selection drives Drosophila immune system evolution. Genetics, 164, 1471–1480.

[185]
Schlötterer C, Neumeier H, Sousa C, Nolte V (2006) Highly structured Asian Drosophila melanogaster populations: a new tool for hitchhiking mapping? Genetics, 172, 287–292.

[186]
Schlötterer C, Tobler R, Kofler R, Nolte V (2014) Sequencing pools of individuals - mining genome-wide polymorphism data without big funding. Nature Reviews Genetics, 15, 749–763.

[187]
Schmidt JM, Good RT, Appleton B et al. (2010) Copy number variation and transposable elements feature in recent, ongoing adaptation at the Cyp6g1 locus. PLoS Genetics, 6, e1000998.

[188]
Schmidt PS, Paaby AB (2008) Reproductive Diapause and Life-History Clines in North American Populations of Drosophila melanogaster. Evolution, 62, 1204–1215.

[189]
Schmidt PS, Zhu CT, Das J et al. (2008) An amino acid polymorphism in the couch potato gene forms the basis for climatic adaptation in Drosophila melanogaster. Proceedings of the National Academy of Sciences of the United States of America, 105, 16207–16211.

[190]
Sella G, Petrov DA, Przeworski M, Andolfatto P (2009) Pervasive Natural Selection in the Drosophila Genome? PLoS Genetics, 5, e1000495.

[191]
Sezgin E, Duvernell DD, Matzkin LM et al. (2004) Single-locus latitudinal clines and their relationship to temperate adaptation in metabolic genes and derived alleles in Drosophila melanogaster. Genetics, 168, 923–931.

[192]
Singh AK (2018) Chromosomal variability due to inversions in Indian populations of Drosophila. Indian J Sci Res.

[193]
Singh BN, Das A (1992) Further evidence for latitudinal inversion clines in natural populations of Drosophila melanogaster from India. Journal of Heredity, 83, 227–230.

[194]
Singh ND, Arndt PF, Clark AG, Aquadro CF (2009) Strong evidence for lineage and sequence specificity of substitution rates and patterns in Drosophila. Molecular Biology and Evolution, 26, 1591–1605.

[195]
Stalker HD (1976) Chromosome studies in wild populations of D. melanogaster. Genetics, 82, 323–347.

[196]
Stalker HD (1980) Chromosome Studies in Wild Populations of Drosophila melanogaster. II. Relationship of Inversion Frequencies to Latitude, Season, Wing-Loading and Flight Activity. Genetics, 95, 211–223.

[197]
Staubach F, Baines JF, Künzel S, Bik EM, Petrov DA (2013) Host species and environmental effects on bacterial communities associated with Drosophila in the laboratory and in the natural environment. PLoS ONE, 8, e70749.

[198]
Stephan W (2010) Genetic hitchhiking versus background selection: the controversy and its implications. Philosophical Transactions of the Royal Society of London Series B-Biological Sciences, 365, 1245–1253.

[199]
Svetec N, Pavlidis P, Stephan W (2009) Recent strong positive selection on Drosophila melanogaster HDAC6, a gene encoding a stress surveillance factor, as revealed by population genomic analysis. Molecular Biology and Evolution, 26, 1549–1556.

[200]
Tajima F (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics, 123, 585–595.

[201]
Trinder M, Daisley BA, Dube JS, Reid G (2017) Drosophila melanogaster as a High-Throughput Model for Host-Microbiota Interactions. Frontiers in Microbiology, 8, 751.

[202]
Turner TL, Levine MT, Eckert ML, Begun DJ (2008) Genomic analysis of adaptive differentiation in Drosophila melanogaster. Genetics, 179, 455–473.

[203]
Umina PA, Weeks AR, Kearney MR, McKechnie SW, Hoffmann AA (2005) A rapid shift in a classic clinal pattern in Drosophila reflecting climate change. Science, 308, 691–693.

[204]
Unckless RL (2011) A DNA virus of Drosophila. PLoS ONE, 6, e26564.

[205]
Van ’t Land J, Van Putten WF, Villarroel H, Kamping A, van Delden W (2000) Latitudinal variation for two enzyme loci and an inversion polymorphism in Drosophila melanogaster from Central and South America. Evolution, 54, 201–209.

[206]
Voelker RA, Cockerham CC, Johnson FM et al. (1978) Inversions Fail to Account for Allozyme Clines. Genetics, 88, 515–527.

[207]
Watterson GA (1975) On the number of segregating sites in genetical models without recombination. Theoretical Population Biology, 7, 256–276.

[208]
Webster CL, Longdon B, Lewis SH, Obbard DJ (2016) Twenty-Five New Viruses Associated with the Drosophilidae (Diptera). Evolutionary Bioinformatics Online, 12, 13–25.

[209]
Webster CL, Waldron FM, Robertson S et al. (2015) The Discovery, Distribution, and Evolution of Viruses Associated with Drosophila melanogaster. PLoS Biology, 13, e1002210.

[210]
Weir BS, Cockerham CC (1984) Estimating F-Statistics for the Analysis of Population Structure. Evolution, 38, 1358–1370.

[211]
Werren JH, Baldo L, Clark ME (2008) Wolbachia: master manipulators of invertebrate biology. Nature Reviews Microbiology, 6, 741–751.

[212]
Wesley CS, Eanes WF (1994) Isolation and analysis of the breakpoint sequences of chromosome inversion In(3L)Payne in Drosophila melanogaster. Proceedings of the National Academy of Sciences of the United States of America, 91, 3132–3136.

[213]
Wickham H (2016) ggplot2: Elegant Graphics for Data Analysis. Springer.

[214]
Wilfert L, Longdon B, Ferreira AGA, Bayer F, Jiggins FM (2011) Trypanosomatids are common and diverse parasites of Drosophila. Parasitology, 138, 858–865.

[215]
Wittmann MJ, Bergland AO, Feldman MW, Schmidt PS, Petrov DA (2017) Seasonally fluctuating selection can maintain polymorphism at many loci via segregation lift. Proceedings of the National Academy of Sciences of the United States of America, 114, E9932–E9941.

[216]
Wolf JBW, Bayer T, Haubold B et al. (2010) Nucleotide divergence vs. gene expression differentiation: comparative transcriptome sequencing in natural isolates from the carrion crow and its hybrid zone with the hooded crow. Molecular Ecology, 19, 162–175.

[217]
Wolff JN, Camus MF, Clancy DJ, Dowling DK (2016) Complete mitochondrial genome sequences of thirteen globally sourced strains of fruit fly (Drosophila melanogaster) form a powerful model for mitochondrial research. Mitochondrial DNA. Part A, DNA mapping, sequencing, and analysis, 27, 4672–4674.

[218]
Xiao F-X, Yotova V, Zietkiewicz E et al. (2004) Human X-chromosomal lineages in Europe reveal Middle Eastern and Asiatic contacts. European Journal of Human Genetics, 12, 301–311.

[219]
Yukilevich R, True JR (2008a) Incipient sexual isolation among cosmopolitan Drosophila melanogaster populations. Evolution, 62, 2112–2121.

[220]
Yukilevich R, True JR (2008b) African morphology, behavior and phermones underlie incipient sexual isolation between US and Caribbean Drosophila melanogaster. Evolution, 62, 2807–2828.

[221] ↵
Yukilevich R, Turner TL, Aoki F, Nuzhdin SV, True JR (2010) Patterns and processes of genome-wide divergence between North American and African Drosophila melanogaster. Genetics, 186, 219–239.
OpenUrl Abstract/FREE Full Text

[222] ↵
AK Chakraborty
Zanini F, Brodin J, Thebo L et al. (2015) Population genomics of intrapatient HIV-1 evolution. ( AK Chakraborty, Ed,). eLife, 4, e11282.
OpenUrl PubMed

[223] AK Chakraborty
