Widespread selection and gene flow shape the genomic landscape during a radiation of monkeyflowers

Sean Stankowski; Madeline A. Chase; Allison M. Fuiten; Murillo F. Rodrigues; Peter L. Ralph; Matthew A. Streisfeld

doi:10.1101/342352

Abstract

Speciation genomic studies aim to interpret patterns of genome-wide variation in light of the processes that give rise to new species. However, interpreting the genomic ‘landscape’ of speciation is difficult, because many evolutionary processes can impact levels of variation. Facilitated by the first chromosome-level assembly for the group, we use whole-genome sequencing and simulations to shed light on the processes that have shaped the genomic landscape during a recent radiation of monkeyflowers. After inferring the phylogenetic relationships among the nine taxa in this radiation, we show that highly similar diversity (π) and differentiation (F_ST) landscapes have emerged across the group. Variation in these landscapes was strongly predicted by the local density of functional elements and the recombination rate, suggesting that the landscapes have been shaped by widespread natural selection. Using the varying divergence times between pairs of taxa, we show that the correlations between F_ST and genome features arose almost immediately after a population split and have become stronger over time. Simulations of genomic landscape evolution suggest that background selection (i.e., selection against deleterious mutations) alone is too subtle to generate the observed patterns, but scenarios that involve positive selection and genetic incompatibilities are plausible alternative explanations. Finally, tests for introgression among these taxa reveal widespread evidence of heterogeneous selection against gene flow during this radiation. Thus, combined with existing evidence for adaptation in this system, we conclude that the correlation in F_ST among these taxa informs us about the genomic basis of adaptation and speciation in this system.

Author summary What can patterns of genome-wide variation tell us about the speciation process? The answer to this question depends upon our ability to infer the evolutionary processes underlying these patterns. This, however, is difficult, because many processes can leave similar footprints, but some have nothing to do with speciation per se. For example, many studies have found highly heterogeneous levels of genetic differentiation when comparing the genomes of emerging species. These patterns are often referred to as differentiation ‘landscapes’ because they appear as a rugged topography of ‘peaks’ and ‘valleys’ as one scans across the genome. It has often been argued that selection against deleterious mutations, a process referred to as background selection, is primarily responsible for shaping differentiation landscapes early in speciation. If this hypothesis is correct, then it is unlikely that patterns of differentiation will reveal much about the genomic basis of speciation. However, using genome sequences from nine emerging species of monkeyflower coupled with simulations of genomic divergence, we show that it is unlikely that background selection is the primary architect of these landscapes. Rather, differentiation landscapes have probably been shaped by adaptation and gene flow, which are processes that are central to our understanding of speciation. Therefore, our work has important implications for our understanding of what patterns of differentiation can tell us about the genetic basis of adaptation and speciation.

Introduction

The primary goal of speciation genomics is to interpret patterns of genome-wide variation in light of the ecological and evolutionary processes that contribute to the origin of new species (Ravinet et al. 2017, Wolf and Ellegren 2017, Campbell et al. 2018). Advances in DNA sequencing now allow us to capture patterns of genome-wide variation from organisms across the tree of life, but inferring the processes underlying these patterns remains a formidable challenge (Ravinet et al. 2017). This is because speciation is highly complex, involving a range of factors and processes that shape genomes through time and across different spatial and ecological settings (Abbott et al. 2013).

The difficulty of inferring process from pattern is illustrated by recent efforts to characterize the genomic basis of reproductive isolation using patterns of genome-wide variation (Ravinet et al. 2017, Wolf and Ellegren 2017, Campbell et al. 2018). Numerous studies have revealed highly heterogeneous differentiation ‘landscapes’ between pairs of taxa at different stages in the speciation process (Turner et al. 2005, Hohenlohe et al. 2010, Ellegren et al. 2012, Martin et al. 2013, Renaut et al. 2013, Poelstra et al. 2014, Soria-Carrasco et al. 2014, Lamichhaney et al. 2015, Malinsky et al. 2015). This pattern, which is characterized by peaks and valleys of relative differentiation (i.e., F_ST) across the genome, was initially thought to provide insight into the genomic architecture of porous species boundaries (Wu 2001). Peaks in the differentiation landscape were interpreted as genomic regions containing loci underlying reproductive barriers, while valleys were thought to reflect regions that were homogenized by ongoing gene flow (Turner et al. 2005, Nosil et al. 2009, Feder et al. 2012). However, as the field of speciation genomics has matured, it has become clear that heterogeneous differentiation landscapes can be influenced by factors that have nothing to do with speciation per se.

For example, it is now clear that levels of genome-wide differentiation (F_ST) can be influenced by the genomic distribution of intrinsic properties, including the recombination rate and the local density of functional sites (Cruickshank and Hahn 2014). This is because these properties affect the way that natural selection impacts levels of variation across the genome. First, regions enriched for functional sequence are more likely to be subject to selection, because they provide a larger target size for mutations with fitness effects. Second, when positive or negative directional selection acts on these mutations, it can indirectly reduce levels of genetic variation at statistically associated sites (Maynard-Smith and Haigh 1974, Charlesworth et al. 1993, Hudson and Kaplan 1995, Charlesworth 1998, Gillespie 2000, Coop and Ralph 2012, Cruickshank and Hahn 2014). Because these indirect effects of selection are mediated by linkage disequilibrium (LD) between selected and neutral sites, stronger reductions in diversity (and corresponding increases in F_ST) are expected in genomic regions with low recombination, as this is where LD breaks down most gradually (Baird 2015). This is also why these indirect effects of selection are referred to as ‘linked selection’, even though physical linkage is not actually the cause. In this paper, we refer to the indirect effects of selection, as ‘indirect selection’ for short.

Although genomic features were not initially expected to play a major role in shaping patterns of between species variation (Kimura 1968, Ohta 1973) (but see (Charlesworth 1998, Gillespie 2000)), recent empirical studies indicate that they can have a strong impact on the topography of the differentiation landscape (Hahn 2008, Cruickshank and Hahn 2014, Burri 2017a, b, Wolf and Ellegren 2017). The most compelling evidence comes from studies that compare not one pair of taxa, but several closely related taxa that share a similar distribution of genomic elements due to their recent common ancestry (Burri 2017a, b, Wolf and Ellegren 2017). For example, Burri et al. (Burri et al. 2015) examined multiple episodes of divergence in Ficedula flycatchers and found strikingly similar differentiation landscapes between distinct pairs of taxa, presumably due to a shared pattern of indirect selection across the genome. Highly correlated differentiation landscapes have also been found in other groups of taxa, including sunflowers (Renaut et al. 2013), Heliconius butterflies (Kronforst et al. 2013, Martin et al. 2013), Darwin’s finches (Han et al. 2017), and other birds (Van Doren et al. 2017, Delmore et al. 2018).

But what, if anything, do these parallel signatures of selection reveal about the genomic basis of adaptation and speciation? Part of the answer lies in understanding which forms of selection cause these patterns. In a recent article, Burri (Burri 2017b) argued that selection against deleterious mutations (i.e., background selection, or BGS) is primarily responsible for the evolution of differentiation landscapes that are correlated both among taxa and with the distribution of intrinsic features. This argument was based on two premises: first, deleterious mutations are far more common than beneficial ones, so BGS has a greater opportunity to generate a genome-wide correlation between levels of variation and the distributions of intrinsic properties; second, although all functional elements are potential targets of BGS, the same loci are not expected to be repeatedly involved in adaptation or speciation across multiple taxa (Burri 2017b). Although this interpretation implies that correlated differentiation landscapes are unlikely to inform us about adaptation or speciation directly, it has been argued that this shared pattern of differentiation can be used to control for the effects of BGS when attempting to identify regions of the genome that have been affected by positive selection (Burri 2017b).

Even though it is often argued that correlated differentiation landscapes reflect the action of recurrent background selection, it stands to reason that they also could arise as a direct consequence of adaptation and speciation. For example, when adaptation occurs from standing variation, the rate of adaptation is not limited by the mutation rate, meaning that heterogeneous differentiation landscapes can evolve rapidly (Barrett et al. 2008, Bassham et al. 2018). Moreover, if adaptation is highly polygenic, positive selection will inevitably impact most of the genome (Rockman 2012). This could cause levels of differentiation to become correlated with the distribution of intrinsic genomic features, even across multiple taxa that are adapting to different environments. Also, correlated differentiation landscapes may arise across multiple closely related taxa due to a common basis of reproductive isolation among taxa. This is because incomplete isolating barriers generate a heterogeneous pattern of selection against gene flow across the genome. The loci underlying these porous barriers, including genetic incompatibilities and locally adapted alleles, are expected to accumulate in gene rich regions and will have stronger barrier effects in genomic regions where the recombination rate is low (Schumer et al. 2018, Martin et al. 2019). Thus, speciation also may result in genome-wide correlations between levels of differentiation and genome features, especially if reproductive isolation is highly polygenic.

Facilitated by a new chromosome-level genome assembly, genetic map, and annotation, we combine analyses of whole-genome sequencing with simulations to understand how different processes have contributed to the evolution of correlated genomic landscapes during a recent radiation. The bush monkeyflower radiation consists of eight taxa of Mimulus aurantiacus distributed mainly throughout California. (Fig. 1; (Chase et al. 2017). Together with their sister species M. clevelandii, they span a range of divergence times over the past approximately one million years. The plants inhabit a range of environments, including temperate coastal regions, mountain ranges, semi-arid habitats, and offshore islands (Thompson 2005). Crossing experiments have shown that all of these taxa are at least partially inter-fertile (McMinn 1951), and many hybridize in nature in narrow regions where their distributions overlap (Streisfeld and Kohn 2005, 2007, Sobel and Streisfeld 2015, Stankowski et al. 2015, 2017). Despite their close evolutionary relationships and opportunities for gene flow, these taxa show striking phenotypic differentiation (Chase et al. 2017). The most conspicuous trait differences are associated with their flowers, which show heritable variation in color, size, shape, and the placement of the reproductive organs (Streisfeld and Kohn 2005, Stankowski et al. 2015). In his seminal works on plant speciation, Grant (Grant 1981, 1993b, a) postulated that these floral trait differences were due to pollinator-mediated selection by different avian and insect pollinators. Detailed studies in one pair of taxa support this hypothesis and have shown that pollinator-mediated selection can generate strong premating reproductive isolation in the face of extensive gene flow (Streisfeld and Kohn 2005, 2007, Sobel and Streisfeld 2015, Stankowski et al. 2015, 2017).

Figure 1. Evolutionary relationships and patterns of genome-wide variation across the radiation.

A) The black tree was constructed from a concatenated alignment of genome-wide SNPs and is rooted using M. clevelandii. The 387 gray trees were constructed from 500 kb genomic windows. The first number associated with each node or taxon is the bootstrap support for that clade in the whole genome tree, and the second number is the percentage of window-based trees in which that clade is present. B) Levels of differentiation (F_ST), C) divergence (d_xy), and D) diversity (π) within and among taxa based on the same 500 kb windows. For simplicity, F_ST and d_xy are shown only for comparisons with the red ecotype of subspecies puniceus. See Fig. S7 for the distributions of F_ST and d_xy across all pairs of taxa.

After inferring the phylogenetic relationships among these taxa, we show that highly similar diversity (π) and differentiation (F_ST) landscapes have emerged across the group. Variation in these landscapes was strongly predicted by the local density of functional elements and the recombination rate, suggesting that they have been shaped by widespread selection. Using the varying divergence times between pairs of taxa, we show that the correlations between F_ST and genome features arose almost immediately after a population split and have become stronger over the course of time. Simulations of genomic landscape evolution suggest that background selection (i.e., selection against deleterious mutations) alone is too subtle to generate the observed patterns, but scenarios that involve positive selection and genetic incompatibilities are plausible alternative explanations. Finally, tests for introgression among these taxa reveal widespread evidence of heterogeneous selection against gene flow during this radiation. We discuss the implications of these results for our general understanding of genomic landscape evolution, particularly in light of recent efforts to reveal the genomic basis of adaptation and speciation.

Results and Discussion

A chromosome-level genome assembly, map, and annotation for the bush monkeyflower

To facilitate the analysis of genome-wide variation in this group, we constructed the first chromosome-level reference genome for the bush monkeyflower using a combination of long-read Single Molecule Real Time (SMRT) sequencing (PacBio), overlapping and mate-pair short-reads (Illumina), and a high-density genetic map (7,589 segregating markers across 10 linkage groups; Fig. S1; Table S1). Contig building and scaffolding yielded 1,547 scaffolds, with an N50 size of 1.6 Mbp, and a total length of 207 Mbp. The high-density map allowed us to anchor and orient 94% of the assembled genome onto 10 linkage groups, which is the number of chromosomes inferred from karyotypic analyses in all subspecies of M. aurantiacus and M. clevelandii (Vickery 1995). Analysis of assembly completeness based on conserved gene space (Simao et al. 2015) revealed that 93% of 1440 universal single copy orthologous genes were completely assembled, with a further 2% partially assembled (Table S2). Subsequent annotation yielded 23,018 predicted genes.

View this table:

Table S1. Summary of the genetic linkage map constructed using an F2 intercross between the red and yellow ecotypes of subspecies puniceus.

The table includes map length in cM for each linkage group (LG), the number of markers associated with each LG, the number of unique map positions, and the average genetic distance in cM between each unique map position. Standard deviations are given in parentheses.

View this table:

Table S2. Analysis of gene space completeness in the M. aurantiacus genome using CEGMA and BUSCO.

The number and percent of core genes found in the final assembly are shown for each analysis (CEGMA, n = 248; BUSCO, n = 1440).

Figure. S1. Map distance (cM/Mbp) vs. physical distance across the 10 linkage groups.

Recombination for each marker is estimated relative to the start of the linkage group and plotted at its physical location on each chromosome in the reference assembly.

Phylogenetic relationships among taxa and patterns of discordance across the genome

To infer phylogenetic relationships among the taxa in this radiation, we sequenced 37 whole genomes from the seven subspecies and two ecotypes of Mimulus aurantiacus (n = 4-5 per taxon) and its sister taxon M. clevelandii (n = 3) (Fig. S2; Table S3). Close sequence similarity allowed us to align reads from all samples to the reference assembly with high confidence (average 91.7% reads aligned; Table S3). After mapping, we identified 13.2 million variable sites that were used in subsequent analyses (average sequencing depth of 21x per individual, Table S3). Relationships were then inferred among the nine taxa from a concatenated alignment of genome-wide SNPs using maximum-likelihood (ML) phylogenetic analysis in RAxML (Stamatakis 2014).

View this table:

Table S3. Sample information for the 37 sequenced individuals.

Includes their taxon identity, sampling location, percent read alignment, and average sequencing depth.

Figure S2. Geographic distribution of sampling locations for each sample sequenced in this study.

Detailed position information for each population can be found in Table S3.

The tree topology obtained from this analysis (Fig. 1) confirmed the same phylogenetic relationships as previous analyses based on reduced-representation sequencing and five different methods of phylogenetic reconstruction (Stankowski and Streisfeld 2015, Chase et al. 2017), and was supported by patterns of clustering from principal components analysis (Fig. S3). Individuals of each of the seven subspecies formed monophyletic groups with 100% bootstrap support (Fig. 1). Relationships within subspecies puniceus were more complex, as the red ecotype formed a monophyletic sub-clade within the paraphyletic yellow ecotype. This is consistent with the recent origin of red flowers from a yellow-flowered ancestor (Stankowski and Streisfeld 2015).

Figure S3. Genome-wide Principal Components Analysis (PCA).

Each plot is a separate PCA performed using different sets of taxa. The legend to the right describes the set of taxa included in each analysis, with the capital letter in parentheses and the color representing the specific taxon. A) All taxa; B) all subspecies of M. aurantiacus, but excluding M. clevelandii; C) only subspecies aurantiacus, longiflorus, calycinus, and the red and yellow ecotypes of subspecies puniceus. The percent variation explained by each principal component is given in parentheses.

Although the whole genome phylogeny provides a well-supported summary of the relationships among these taxa, concatenated phylogenies can obscure phylogenetic discordance in more defined genomic regions (Pease et al. 2016) (Fig. 1A). To test for fine-scale phylogenetic discordance, we next constructed ML phylogenies for 500 kb and 100 kb genomic windows. We then calculated a ‘concordance score’ for each tree by computing the correlation between the distance matrix generated from each window-based tree and the whole-genome tree, with a stronger correlation indicating that two trees have a more similar topology.

At the 500 kb scale, only 22 (6%) trees showed the same taxon branching order as the whole-genome tree. However, concordance scores tended to be very high for all of the trees (mean = 0.964, s.d. 0.039; min 0.719), suggesting that variation in the topologies was due primarily to minor differences in branching order. This was confirmed by quantifying how often each node in the genome-wide tree was recovered in the set of window-based trees (Fig. 1). Specifically, we found that differences in branching order were associated with the most recent splits, which included pairs of closely related taxa. For example, even though subspecies puniceus was monophyletic in the majority of trees (72.6%), individuals from the red ecotype only formed a monophyletic group in 27% of the trees. Similarly, the closely related subspecies longiflorus and calycinus were monophyletic in only 45.7% and 24.1% of trees, respectively. Higher discordance among these closely related, geographically proximate taxa is likely due to two factors. First, only a short time has passed since they shared a common ancestor, meaning that ancestral polymorphisms have had little time to sort among lineages. Second, ongoing gene flow between some taxa may have opposed sorting, prolonging the retention of ancestral variants among them.

Next, we examined how the level of phylogenetic discordance varied across the bush monkeyflower genome. If the discordance was due to the stochastic effects of neutral processes, variation in tree concordance scores should be distributed randomly across the genome (Hudson 1990). To test this prediction, we plotted the tree concordance scores across the 10 linkage groups (Fig. 2A; Fig. S4 for results from 100 kb windows and Fig. S5 for plots along each chromosome). Rather than being randomly distributed, trees with lower concordance scores tended to cluster in relatively narrow regions of all 10 chromosomes (Fig. 2A; autocorrelation analysis permutation tests p = 0.001 - 0.023; Fig. S6). This non-random pattern indicates that the rate of sorting varies along chromosomes, which could be due to variation in the strength of indirect selection across the bush monkeyflower genome.

Figure S4. Common patterns of genome-wide variation mirror variation in the local properties of the genome.

Plots are the same as in Fig. 2 of the main text, but for 100 kb windows (step size 10 kbp). A) Tree concordance; B-D) Z-transformed PC1 for F_ST, d_xy and π, respectively; E) gene count; and F) recombination rate (cM/Mbp) are plotted across the 10 monkeyflower LGs.

Figure S5. Patterns of variation plotted across each bush monkeyflower linkage group.

Z-transformed F_ST, d_xy, and π in overlapping 500 kb windows (step size = 50 kbp). The gray lines are z-transformed scores for each of the 36 pairwise comparisons (F_STand d_xy) or nine taxa (π), and the blue line is the z-transformed score for the first principal component (PC1). Estimates of ree concordance, gene count and recombination rate (cM/Mbp) are also shown.

Figure S6. Patterns of variation are non-randomly distributed across the genome.

A) Strong autocorrelation of tree concordance scores on LG1 over a Mbp scale. B) Levels of tree concordance, F_ST, d_xy, and π all show significant autocorrelation at the 2 Mbp scale. The dashed vertical lines show the observed autocorrelation coefficients for each LG with a 2 Mbp lag. The histogram shows the null distribution of autocorrelation coefficients (same lag) generated from 1000 random permutations of the genome-wide values. The observed data are significant at p = 0.001 unless stated otherwise.

Figure 2. Common genomic landscapes mirror variation in the local properties of the genome.

A) Tree concordance scores for 500 kb non-overlapping genomic windows plotted across the 10 bush monkeyflower chromosomes. B – D) Plots of the first principal component (PC1) for F_ST, d_xy, and π in overlapping 500 kb windows (step size = 50 kb). PC1 explains 66%, 70%, and 85% of the variation in F_ST, d_xy and π, respectively, and is Z-transformed such that above average values have positive values and below average values have negative values. E – F) Gene count and recombination rate (cM/Mbp) in overlapping 500 kb windows. G), Mean f_d(admixture proportion) in 500 kb non-overlapping genomic windows. See Fig. S4 for the same plot made for 100 kb windows.

Correlated patterns of genome-wide variation across the radiation

To gain deeper insight into the evolutionary processes that have shaped patterns of genome-wide variation during this radiation, we used three summary statistics to quantify patterns of diversity (π), divergence (d_xy), and differentiation (F_ST) within and between these taxa (the same 500 kb and 100 kb sliding windows as above). π, or nucleotide diversity, was used to quantify the level of genetic variation within each taxon. It is defined as the average number of nucleotide differences per site between two sequences drawn from the population (Nei and Li 1979). Second, we calculated d_xy as a measure of sequence divergence between all 36 pairs of taxa. In principle, π and d_xy are the same measure, except that the former is calculated between sequences within a taxon and the latter between pairs of taxa; they have the same units and both are proportional to the average coalescence time of the sequences being compared multiplied by mutation rate. Third, we calculated F_STbetween each pair of taxa. Unlike d_xy, which is a measure of divergence between sequences drawn from different taxa, F_ST measures differentiation between the taxa relative to the total diversity in the sample. Between samples x and y, F_ST is roughly equal to 1 - π/d_xy, where π is the mean diversity in the two taxa (Slatkin 1991). F_ST is strongly influenced by current levels of within-taxon diversity, while d_xy is strongly influenced by the level of ancestral variation when divergence times are small. The use of the term ‘divergence’ to describe both d_xy and F_ST has caused some confusion in the literature, leading to alternative naming schemes. Here, we follow a recently proposed convention, referring to d_xy as ‘divergence’ and F_ST as ‘differentiation’ (Ravinet et al. 2017).

The variation in F_ST among all 36 pairs of taxa highlights the continuous nature of differentiation across the group (Fig. 1B; Fig. S7), with mean window-based estimates ranging from 0.06 (red vs. yellow ecotypes of puniceus) to more than 0.70. Distributions of divergence (d_xy) show a similar pattern (Fig. 1C), with mean values ranging from 0.54% (red vs. yellow ecotypes) to 1.6% (yellow ecotype vs. M. clevelandii). Broad distributions of window-based estimates indicate high variability in levels of differentiation and divergence among genomic regions (Fig. 1B & 1C). Window-based estimates of nucleotide diversity also varied markedly (π; Fig. 1D), ranging from 0.09% to 1.26%, even though mean estimates were very similar among the ingroup taxa (0.37% to 0.53%) and were only slightly lower in M. clevelandii (0.26%).

Figure S7. Patterns of differentiation and divergence for all 36 pairs of taxa.

Box plots for each of the 36 pairwise taxonomic comparisons reveal the range of variation in F_ST and d_xy across the radiation. Moreover, the data show extensive variance among genomic windows within each comparison. Vertical black lines indicate the median, boxes represent the lower and upper quartiles, and whiskers extend to 1.5 times the interquartile range. Taxon abbreviations: cal, calycinus; lon, longiflorus; aur, aurantiacus; par, parviflorus; ari, aridus; gra, grandiflorus; clv, M. clevelandii.

As with tree concordance, variation in these summary statistics was non-randomly distributed across broad regions of each chromosome (p < 0.005; Fig. 2; Fig. S4; Fig. S5; Fig. S6). To account for the magnitude of variation in these statistics across all nine taxa (for π) or among the 36 pairs of taxa (for d_xy and F_ST), we normalized the window-based estimates using Z-transformation and plotted them across the genome (Fig. S5). Visual inspection of these data revealed that patterns of genome-wide variation in each statistic were qualitatively similar in all comparisons. We therefore used principal components analysis to quantify their similarity and extracted a single variable (PC1) that summarized the common pattern (Fig. S5).

These analyses confirmed that patterns of genome-wide variation were highly correlated across this group of taxa. Indeed, PC1 explained 65.9% of the variation in F_ST across the 36 pairwise comparisons. Further, all comparisons loaded positively onto PC1 (mean loading = 0.78 s.d. 0.18; Table S5 for all loadings), indicating that peaks and troughs of F_ST tended to occur in the same genomic regions across all comparisons. Patterns of genome-wide divergence (d_xy) and diversity (π) also were highly correlated across comparisons, with PC1 explaining 69.5% and 84.7% of the variation among the window-based estimates, respectively. Again, all taxa (for π) and taxon comparisons (for d_xy) loaded positively onto the first principal component (mean loading for d_xy= 0.78 s.d. 0.18; for π, 0.91 s.d. 0.07). PC1 therefore provides a summary of the original landscapes and is effectively the same as taking the mean window-based scores for each statistic (r² between PC1 and mean scores > 0.995 for all three statistics).

View this table:

Table S4.

Loadings for principal component 1 calculated across all 36 pairwise comparisons (for F_ST and d_xy) or all nine taxa (for π).

View this table:

Table S5.

Details for the linear regressions presented in Figure 5 of the main text.

Common genomic landscapes have been shaped by heterogeneous indirect selection

Observing highly similar genomic landscapes suggests that a common pattern of heterogeneous selection has shaped variation in all nine taxa. Indeed, if a region experiences recurrent indirect selection across the phylogenetic tree, then it should show lower diversity (π) within species and lower divergence (d_xy) between species, because d_xyis influenced by levels of diversity in the common ancestor (Charlesworth et al. 1993, Charlesworth 1998, Burri 2017b). In agreement with this prediction, we observed a strong positive correlation between PC1 d_xy and PC1 π (r = 0.84), indicating that regions of the genome with lower diversity tended to be less diverged between these taxa (and thus had lower ancestral diversity) (Fig. 3, Fig. S8 for scatterplots and Fig. S9 for results at 100 kb scale). Regions with reduced diversity also tended to show higher levels of differentiation (F_ST) (r = −0.84) and tree concordance (r = −0.69). These relationships are also predicted by models of recurrent indirect selection, because local reductions in diversity decrease the amount of genetic variation in future generations, similar to a local reduction in N_e. As a result, ancestral variants in impacted regions sort more rapidly than under neutrality (Pease and Hahn 2013).

Figure S8. Bivariate plots among measures of variation and genomic features across 500 kb genomic windows.

Note that this is the same as Figure 3 but with axes units. Also note that the axes are different across rows and columns of the matrix.

Figure S9. Bivariate plots among measures of variation and genomic features across 100 kb genomic windows.

The number is the correlation coefficient. Positive correlation coefficients are colored blue and negative coefficients are colored red.

Figure 3. Correlations between measures of diversity and intrinsic features reveal the impact of heterogeneous indirect selection.

Matrix of pairwise correlations between PC1 F_ST, PC1 d_xy, PC1 π, tree concordance, mean f_d, gene density, and recombination rate, all estimated in 500 kb non-overlapping windows. The heat map indicates the strength of the correlation and its sign. All correlations are statistically significant at p < 0.001. (See Fig. S8 for a more detailed correlation matrix and Fig. S9 for the correlation matrix from 100 kb windows.

In models of recurrent indirect selection, variation in its impact on associated sites is determined by the distribution of genomic features, including the local density of functional elements and the recombination rate (Maynard-Smith and Haigh 1974, Charlesworth et al. 1993). To test these theoretical predictions, we used our genome annotation and genetic map to calculate the number of protein coding genes and the average recombination rate (cM/Mbp) in each 500 kb window (Fig. 2E-F; Fig. S4; Fig. S5). Regions of the genome with more functional elements tended to have a lower recombination rate (r = −0.40; p < 0.0001), leading to large variation in the predicted strength of indirect selection among regions. Consistent with the theoretical predictions outlined above, we found strong correlations between PC1 π and gene count (r = −0.84; p < 0.0001) and PC1 π and recombination rate (r = 0.44; p < 0.0001; Fig. 3; Figs. S8 & S9), indicating that diversity is indeed lower in regions where selection is predicted to have stronger indirect effects. However, we did not observe a significant interactive effect of gene count and recombination rate on diversity (p = 0.057). This may be because the distribution of these features is correlated, making it difficult to tease apart their relative impacts.

Taken together, our results indicate that a common pattern of indirect selection has caused correlated genomic landscapes to evolve across this radiation. As predicted by theory (Maynard-Smith and Haigh 1974, Charlesworth et al. 1993, Langley et al. 2012) (Hudson and Kaplan 1995) and observed in diverse taxa (Pease and Hahn 2013, Burri et al. 2015, Corbett-Detig et al. 2015), genome-wide variation in the strength of indirect selection is caused by the heterogeneous distributions of intrinsic genome features, namely the density of functional elements and the local recombination rate.

A known adaptive locus shows a strong deviation from the common pattern of differentiation

Because widespread signatures of indirect selection are often assumed to reflect the impact of heterogeneous background selection (BGS) across the genome, several authors have proposed that the correlation in F_ST across multiple pairs of taxa can be used as a baseline for detecting genomic regions that have been affected by positive selection (Berner and Salzburger 2015, Burri 2017b). More specifically, loci that have contributed to adaptation or speciation should be detectable as a positive deviation from the common pattern of differentiation (i.e., PC1), which is considered to reflect processes that are unrelated to ecologically relevant positive selection (Burri 2017b).

We have a unique opportunity to test the effectiveness of this comparative genomic approach by examining patterns of differentiation around a locus that is known to contribute to adaptation and speciation in subspecies puniceus. Using a candidate gene approach, Streisfeld et al. (Streisfeld et al. 2013) showed that the shift from yellow to red flowers in puniceus was caused by a cis-regulatory mutation in the R2R3-MYB transcription factor MaMyb2. Patterns of haplotype variation within the gene indicate that the red allele was subject to strong positive selection and rapidly swept to fixation in what is now the geographic range of the red ecotype (Stankowski and Streisfeld 2015).

To test for a linage-specific signature of differentiation at this locus, we examined patterns of Z-F_ST in a relatively narrow region (∼3 Mbp) surrounding MaMyb2. (Fig. 4; Fig. S10). At the 500 kb scale, the comparison between the red and yellow ecotypes shows a sharp peak centered near the flower color locus. The peak is strongly elevated above PC1 Z-F_ST(3.33 standard deviations), indicating that differentiation is indeed accentuated relative to the level observed between other taxon pairs. Further, other comparisons, including those with the red ecotype, do not show accentuated differentiation in this region, indicating that the signal is specific to this taxon pair. The peak is even more pronounced in the analysis at the 100 kb scale, rising 9.5 standard deviations above PC1 Z-F_ST. At this scale, the signature of positive selection is visible in other comparisons that include the red ecotype, though the signal is less pronounced than in the comparison with the yellow ecotype (Fig. S10).

Figure S10. A large-effect adaptive locus shows a lineage-specific signature of positive selection.

Plots of Z-transformed F_ST across the genome, estimated in 100 kb sliding windows (step size 10 kb). The red line shows values between the red and yellow ecotypes of subspecies puniceus. The orange lines show other comparisons with the red ecotype of punicues, and the gray lines show the values of all other comparisons. The dashed blue line shows the first PC calculated across all of the comparisons. The triangle marks the position of the gene MaMyb2. A cis-regulatory mutation that is tightly linked to this gene is responsible for the shift from yellow to red flowers (Corresponds to Figure 4 in the main text).

Figure 4. A large-effect adaptive locus shows a lineage-specific signature of positive selection.

Plots of Z-transformed F_STacross the genome, estimated in 500 kb sliding windows (step size 50 kb). The red line shows values between the red and yellow ecotypes of subspecies puniceus, while the gray lines show the values of all other comparisons. The dashed blue line shows the first PC calculated across all of the comparisons. The triangle marks the position of the gene MaMyb2. A cis-regulatory mutation that is tightly linked to this gene is responsible for the shift from yellow to red flowers. See Fig. S10 for the same plot made for 100 kb windows.

Correlated differentiation landscapes emerge rapidly following a population split

Although correlated landscapes may evolve due to the indirect effects of widespread background selection, it is also conceivable that they could evolve due to other evolutionary processes. To gain insight into the role that background selection may have played in shaping these patterns, we next tested several hypotheses about how BGS is expected to shape the genomic landscape over time. To do this, we used the level of genetic distance between each pair of taxa as a proxy for their divergence time and constructed a temporal picture of genomic landscape evolution that spans the first million years following a population split.

In a verbal model, Burri (Burri 2017b) made several predictions about how correlations between measures of variation and genome features should be impacted by BGS following a population split. First, the correlations between diversity (π) and intrinsic features should be relatively consistent over time, as heterogeneous BGS will continue to constrain patterns of diversity in each daughter population after the split. Second, π and d_xy should remain highly correlated with one another, because d_xy is largely influenced by ancestral diversity at these short timescales. Third, levels of genome-wide differentiation (F_ST) initially should be low and vary stochastically across the genome, due mainly to the sampling effect that accompanies a split. Therefore, F_ST should not be correlated with the distribution of genomic properties early in divergence. However, as time passes, and the indirect effects of ancestral and lineage-specific BGS accumulate, levels of differentiation should gradually become more correlated with underlying genome features.

To test these hypotheses, we computed correlations between relevant window-based measures of variation and genomic features between all 36 pairs of taxa. We then plotted these correlations on corresponding measures of between-taxon d_a, which ranged from very low (0.05%) between the parapatric red and yellow ecotypes of subspecies punicieus, up to 1.3% in allopatric comparisons that included M. clevelandii. We used d_a (d_xy minus mean π), because it corrects for sequence variation present in the common ancestor, making it a better proxy of recent divergence time than d_xy.

This analysis revealed very clear temporal signatures of genomic landscape evolution, some of which were consistent with the above predictions (Fig. 5; See Fig. S11 for results with d_xy as a proxy for time). First, the presence of strong correlations between π (mean of the two taxa) and the distribution of genomic features barely changed with increasing divergence time between populations (Fig. 5A & 5B; Table S6). This is consistent with the prediction that diversity landscapes are inherited from the ancestor and are then maintained by the indirect effects of selection in each daughter population following the split. Second, the relationships between F_ST and gene count, F_STand recombination rate, and F_ST and π all become stronger as d_a increases (Fig. 5C & 5D; Table S6). Thus, as predicted, the differentiation landscape increasingly reflects the distribution of intrinsic features as divergence time increases.

View this table:

Table S6. Parameters used in SLiM to simulate genomic landscape evolution. (N=10,000).

Figure S11. The range of divergence times reveals static and dynamic signatures of recurrent indirect selection.

Correlations between variables (500 kb windows) for all 36 taxonomic comparisons (gray dots) plotted against the average d_xy as a measure of divergence time. The left panels show how the relationships between π (each window averaged across a pair of taxa) and (A) gene count and (B) recombination rate vary with increasing divergence time. The middle panels (C & D) show the same relationships, but with F_ST. The right panels show the relationships between (E) d_xy and π and (F) F_ST and π. The regressions (dashed lines) in each plot are fitted to the eight independent contrasts (colored points) obtained using a phylogenetic correction, with the regression equation and strength of the correlation given in each panel. The color gradient shows the strength of the correlation.

Figure 5. The range of divergence times reveals static and dynamic signatures of recurrent indirect selection.

Correlations between variables (500 kb windows) for all 36 taxonomic comparisons (gray dots) plotted against the average d_a as a measure of divergence time. The left panels show how the relationships between π (each window averaged across a pair of taxa) and (A) gene count and (B) recombination rate vary with increasing divergence time. The middle panels (C & D) show the same relationships, but with F_ST. The right panels show the relationships between (E) d_xy and π and (F) F_STand π. The regressions (dashed lines) in each plot are fitted to the eight independent contrasts (colored points) obtained using a phylogenetic correction. The color gradient shows the strength of the correlation.

However, two of the observed patterns differed markedly from existing predictions (Burri 2017b). First, π and d_xy did not remain highly correlated over this relatively short timeframe. Although the correlation is almost perfect at the earliest time points (r > 0.99), it decays rapidly and is not significantly different from zero in the most diverged taxon pairs (Fig. 5E). In addition, the correlations between F_STand proxies of the strength of indirect selection are much higher than expected at these early divergence times. For example, for the red and yellow ecotypes, the correlations between F_STand gene density (r = 0.415), recombination rate (r = −0.208), and mean π (r = −0.452) are substantial and highly significant (p < 0.0001) at a very early stage of divergence. The strength of these correlations also increases rapidly, with the diversity and differentiation landscapes almost perfectly mirroring one another in the most divergent comparisons (Fig. 5F; r = −0.94; Fig. 6; Fig. S12). As our results only partially match the above predictions, it is possible that other forces aside from BGS are responsible for the rapid emergence of differentiation landscapes in this system. Therefore, we next considered the potential for other evolutionary processes to generate these patterns.

Figure S12. Negative correlation between nucleotide diversity and differentiation becomes stronger with increasing divergence time.

Bivariate plots of the correlation between F_ST and π at varying levels of sequence divergence (d_xy).

Figure 6. Emergence of a heterogeneous differentiation landscape across one million years of divergence.

A) Plots of F_ST(500 kb windows) across the genome for pairs of taxa at early (red vs. yellow), intermediate (red vs. parviflorus), and late stages (red vs. M. clevelandii) of divergence. B) Average nucleotide diversity (for the red ecotype of subspecies puniceus, yellow ecotype of subspecies puniceus, subspecies parviflorus, and M. clevelandii) across the genome in 500 kb windows. The geographic distribution (parapatric or allopatric), sequence divergence (d_a × 10^-2), and correlation between F_ST and mean π are provided next to each taxon pair.

Simulations suggest that adaptation has played an important role in genomic landscape evolution

To assess the plausibility of different modes of selection for generating the observed genomic landscapes and temporal patterns, we used individual-based simulations in SLiM (Haller et al. 2019, Haller and Messer 2019). Our basic model consisted of an ancestral population (N = 10,000) that evolved for 10N generations and then split into two daughter populations that diverged for a further 10N generations. Each individual carried a 21 Mbp chromosome (similar to the physical size of a bush monkeyflower chromosome), partitioned into three equally sized regions. In the central third, all mutations were neutral, while some mutations in the two distal ends could affect fitness, depending on the scenario. While simple, this partitioning scheme generates broad-scale variation in the strength of indirect selection across the chromosome and roughly approximates the distributions of genomic features in this system (e.g., LG 7 in Fig. 2).

We implemented six modifications to this basic model: (i) Neutral divergence: no mutations affect fitness; (ii) Background selection (BGS): non-neutral mutations are deleterious; (iii) Bateson-Dobzhansky-Muller incompatibility (BDMI): As ii, except that after the split, a fraction of non-neutral mutations is deleterious in one population and neutral in the other (randomly chosen). (iv) Positive selection: non-neutral mutations have positive fitness effects. (v) BGS and positive selection: ii and iv combined; (vi) Local adaptation: As iv, but after the split some non-neutral mutations are beneficial in one population and neutral in the other. Migration between populations was allowed only in scenarios iii and vi. See Methods for more details and Table S7 for parameter values. To summarize the results, measures of variation (π, d_xy, and F_ST) were calculated in 500 kb regions at 10 time points for each simulation.

View this table:

Table S7. Genome-wide estimates of admixture across the bush monkeyflower radiation.

Measures of Patterson’s D and the admixture proportion, f, are given for all 48 possible pairs of four-taxon tests, with M. clevelandii as the outgroup for each test. All values of Patterson’s D are statistically significant (P < 0.0001) based on a block jackknife approach.

View this table:

Table S8. Correlation coefficients between the window-based admixture proportion (f_d) and PC1 F_ST.

Pearson’s r was calculated between PC1 F_ST and f_d for each of the 48 four-taxon tests, measured in 500kb non-overlapping windows. M. clevelandii is the outgroup for each test.

This broad range of scenarios generated quite different patterns of variation, both through time and across the chromosome (Fig. 7; Fig. S13). As expected, the neutral model did not produce heterogeneous genomic landscapes. Background selection produced a diversity landscape characterized by higher diversity in the unconstrained central third of the chromosome (Fig 7; Fig S13). Consistent with the predictions of Burri (2017a), π and d_xyremained highly correlated over the course of the simulations. However, even with a high proportion of deleterious mutations (10%) and substantial fitness effects (1%), the variation in π and d_xy was modest, as was variation in the differentiation landscape (Fig. 7).

Figure S13. Genomic landscapes simulated under different different divergence histories.

Each row of plots shows patterns of within- and between-population variation (π, d_xy, and F_ST) across a 21Mb chromosome (500 kb windows) at ten timepoints (in N generations, where N = 10,000) for one parameter combination of six scenarios: neutral divergence, background selection, Bateson-Dobzhansky-Muller incompatibilities, positive selection, background selection and positive selection, and local adaptation. The grey boxes in the first column show the areas of the chromosome that are constrained by selection. Mean centered (above line) and raw values (below line) of π, d_xy. The parameter Ns modulates the average selective coefficient (where s = Ns/N) while Prop is the proportion of new mutations that are not neutral, Nm is the average number of migrants per generation.

Figure 7. Genomic landscapes simulated under different divergence histories.

Each row of plots shows patterns of within- and between-population variation (π, d_xy, and F_ST) across the chromosome (500 kb windows) at five time points (N generations, where N = 10000) during one of scenarios The selection parameter (Ns, where s = Ns/N), proportion of deleterious (-) and positive mutations (+), and number of migrants per generation (Nm; 0 unless stated) for these simulations are as follows: (i) neutral divergence (no selection), (ii) background selection (-Ns = 100; -prop = 0.01), (iii) Bateson-Dobzhansky-Muller incompatibilities (BDMI) (-Ns = 100, -prop = 0.05, Nm = 0.1), (iv) positive selection (+Ns = 100, +prop = 0.01), (v) background selection and positive selection (-Ns = 100, - prop = 0.01; +Ns = 100, +prop = 0.005), and (vi) local adaptation (+Ns = 100, +prop = 0.001, Nm = 0.1). The gray boxes in the first column show the areas of the chromosome that are experiencing selection, while the white central area evolves neutrally. Note that π (in populations a and b) and d_xy have been mean centered so they can be viewed on the same scale. Un-centered values and additional simulations with different parameter combinations and more time points can be found in Fig. S13.

Under some conditions, the BDM incompatibility model produced patterns of variation that were more similar to our empirical findings than the BGS model (Fig. 7, S13). In all simulations, patterns of diversity and divergence were relatively homogenous at the time of the split. However, selection against gene flow induced by incompatibility loci on the chromosome ends caused patterns of variation to become highly structured over time. Indeed, higher π, lower d_xy, and lower F_ST were observed in the neutrally evolving chromosome center due to the homogenizing effects of higher gene flow in this region. As in our empirical data, the correlation between π and d_xy decayed over the course of some simulations, and even became negative depending on rates of gene flow and selection. Although we simulated this scenario starting with undifferentiated populations (i.e., primary contact), widespread BDMIs should produce the same patterns for populations that diverged in allopatry and came back into secondary contact. This is because gene flow will have a stronger homogenizing effect in regions of the genome that are not associated with incompatibilities (Durrett et al. 2000).

The three models with positive selection (Positive, BGS & Positive, and Local adaptation) all produced highly heterogeneous genomic landscapes across the range of parameter values explored (Fig. 7; Fig. S13). In all cases, positive selection in the ancestor created a highly heterogeneous diversity landscape that was inherited by each daughter population and was maintained for the duration of the simulation. As in our data, π and d_xy were perfectly correlated at early divergence times, but the correlation decayed rapidly and even became negative in scenarios with stronger selection. Unlike in the BDMI scenario, where the correlation between π and d_xy decayed due to higher gene flow in the chromosome center, it broke down in these simulations because positive selection caused d_xy to increase more rapidly in the chromosome ends (Fig. S13). Highly heterogeneous differentiation landscapes, characterized by lower F_STin the chromosome center, emerged rapidly in all three scenarios. This pattern appeared irrespective of the parameter values examined and was not heavily influenced by other factors, like gene flow, the inclusion of BGS, or whether adaptation occurred from standing variation. However, some of these factors may have a larger influence in simulations with different non-neutral mutation rates and selection strengths.

Overall, the results of these simulations suggest that background selection (BGS) is not primarily responsible for heterogeneous genomic landscapes. Although our simulations should be interpreted cautiously, because we have not thoroughly explored parameter space associated with each model, the results of other recent simulation studies also support this conclusion. For example, Matthey-Doret & Whitlock (Matthey-Doret and Whitlock 2018) simulated population divergence with BGS under different scenarios with simulations using parameters estimated from humans and stickleback. They found that BGS was unable to generate heterogeneous differentiation landscapes over short timescales, with or without gene flow. Similarly, Rettelbach et al. (Rettelbach et al. 2019) simulated the evolution of diversity landscapes using empirical estimates of recombination rate and functional densities from the collared flycatcher genome. They found that BGS was able to generate modest variation in levels of diversity across chromosomes, but they concluded that it was not sufficient to explain the pronounced dips observed in empirical studies of flycatcher genomes (e.g., (Burri et al. 2015)). Both of these studies suggest that other processes (probably positive selection) are responsible for generating the observed patterns (Matthey-Doret and Whitlock 2018, Rettelbach et al. 2019).

Our simulations suggest that divergence histories involving positive selection and/or incompatibilities are plausible explanations for heterogeneous genomic landscapes that emerge rapidly after a population split. However, in order to account for the presence of strong correlations between measures of variation and genomic features across multiple comparisons, these scenarios assume that the genomic basis of adaptation and/or reproductive isolation is diffuse, shared across multiple taxa, and evolves rapidly. Although this was once considered unlikely, recent studies suggest that these conditions may be common. For example, adaptation and speciation are now thought to be highly polygenic (Rockman 2012, Jiggins and Martin 2017, Martin et al. 2019), and recent evidence suggests that adaptation draws heavily on ancestral mutations, rather than those arising independently in multiple, related descendent lineages (Barrett et al. 2008, Nelson and Cresko 2018, Marques et al. 2019). Similarly, the interaction between widespread hybrid incompatibilities (intrinsic or extrinsic) and intrinsic genomic features is thought to have caused similar patterns of genome-wide variation to evolve between different pairs of hybridizing taxa (Martin et al. 2019).

Given the rapid and extensive trait diversification that has occurred during the bush monkeyflower radiation (Stankowski et al. 2015, Chase et al. 2017, Sobel et al. 2019), positive selection has probably played an important role in shaping patterns of genome-wide variation across the group. This includes the striking divergence of floral traits (Fig. 1A), which is thought to be due to divergent pollinator-mediated selection across southern California (Grant 1981, 1993b, a). In the taxa that have been studied in detail, crossing experiments have shown that most of these traits are polygenic, and there is evidence that they are targets of selection (Sobel and Streisfeld 2015, Stankowski et al. 2015). Intrinsic genetic incompatibilities may also influence patterns of variation between some of the taxa, but they are unlikely to explain the correlated landscapes that have evolved across the whole group, as most of these taxa show little or no evidence for intrinsic incompatibilities between them (McMinn 1951, Sobel and Streisfeld 2015). However, local adaptation also can generate a porous isolating barrier (Wu 2001), so it is likely that gene flow has contributed to genomic landscape evolution among some taxa, particularly those that are geographically proximate and/or hybridize in areas where their distributions overlap.

Radiation-wide evidence for selection against gene flow

To determine if gene flow has contributed to the evolution of these correlated genomic landscapes, we first tested for evidence of introgression among these taxa using D-statistics (Green et al. 2010). Patterson’s D measures asymmetry between the numbers of sites with ABBA and BABA patterns (where A and B are ancestral and derived alleles, respectively) across three in-group taxa and an out-group (in our case, M. clevelandii) that have the relationship (((P1, P2), P3) O). A significant excess of either pattern gives a non-zero value of D, which is taken as evidence that gene flow has occurred between P3 and one of the in-group taxa (Green et al. 2010). Overall, these tests provide evidence that gene flow has occurred during this radiation. All 48 of the appropriate four-taxon combinations yielded significant non-zero values of Patterson’s D (p < 0.0001; Table S7; Fig. S14). The admixture proportions (f) for each test indicate that an average of 2.5% of the genome has been transferred between pairs of in-group taxa, but the proportion varies among the tests, ranging from less than 1% to more than 10% (Fig. S4.

Figure S14. Evidence for widespread gene flow across the bush monkeyflower radiation.

Genome-wide estimates of Patterson’s D (blue circles) and the admixture proportion (f; red crosses) are shown for all 48 possible four taxon comparisons, with M. clevelandii as the outgroup in each test. The tests are ordered by increasing values of D, and each value is significant based on a block jackknife approach. The taxa included in each test are shown in the order P1, P2, P3 and D is always positive, meaning gene flow occurs between P2 and P3.

Figure S15. Variation in levels of admixture across the genome.

The grey lines show measures of a modified test of the admixture proportion, f_d, estimated in 500kb non-overlapping windows for 48 different four-taxon comparisons, plotted across the 10 linkage groups of the bush monkeyflower genome. The blue line gives the mean value of f_d, calculated by taking the average value across all 48 tests in each genomic window.

Although gene flow initially involves the exchange of whole genomes between populations, selection against gene flow reduces effective migration (m_e) in regions of the genome that are associated with barrier loci (Jiggins and Martin 2017, Ravinet et al. 2017). Therefore, porous isolating barriers should result in a heterogeneous pattern of introgression across the genome (Barton and Hewitt 1985, Wu 2001). To test this hypothesis, we calculated the f_d statistic, a version of the admixture proportion modified for application to genomic windows (Martin et al. 2015). As predicted, estimates of f_d, varied markedly among genomic regions (Fig. 2; Fig. S14). Most windows showed levels of f_dat or near zero, indicating that foreign alleles have been purged from these regions by selection. However, a proportion of windows showed substantial admixture proportions. In some tests, values of f_d exceeded 0.5, indicating that more than half of the sites in some windows have been shared between taxa.

Because gene flow opposes differentiation, we would expect to see lower F_ST in regions with a higher proportion of introgressed variants. Consistent with this prediction, we observed a substantial negative correlation (r = −0.57; p < 0.0001) between mean f_d (for each window, averaged over the 48 tests) and PC1 F_ST. This result is not driven by a limited number of the four-taxon tests, as 44 of the 48 comparisons showed the same significant negative relationship (Table S15). Also consistent with models of widespread selection against gene flow, regions of the genome with higher f_dscores tended to have higher diversity (r = 0.45; p < 0.0001), lower tree concordance scores (r = −0.59; p < 0.0001), fewer functional genes (r = −0.36; p < 0.0001), and a higher recombination rate (r = −0.18; p < 0.0001).

Overall, these results indicate that widespread gene flow has played a key role in the formation of the genomic landscape in this system. In addition to reductions in diversity and increased differentiation owing to selection against gene flow, the persistence of introgressed variants has probably resulted in higher levels of diversity in regions with fewer genes and higher recombination rate. The four-taxon tests show that the impact of gene flow is widespread across the radiation, though some caution should be exercised when interpreting the specific pattern of individual introgression events.

Given that there is the potential for gene flow between so many taxa and ancestral lineages, it is difficult to infer the source and timing of individual admixture events. For example, rather than reflecting recent gene flow between all pairs of taxa, some introgression events may have occurred deeper in history, and their consequences inherited by multiple taxa. Although more sophisticated methods will be needed to model gene flow across this group, these results clearly show that it has contributed to the rapid evolution of correlated genomic landscapes during this radiation.

Conclusions and implications for understanding genomic landscape evolution and the basis of adaptation and speciation

Facilitated by a new chromosome-level genome assembly, we have shed light on the causes of correlated genomic landscapes across a radiation of monkeyflowers. Adaptive divergence and gene flow are hallmarks of rapid radiations (Schluter 2000), and our data suggest that the indirect effects of selection resulting from these processes have contributed to a common pattern of differentiation among these taxa. Our ‘time-course’ analysis shows that the common landscape emerged rapidly after populations split and has become more correlated with the distribution of genomic features as divergence time has increased. Although background selection may play some role, its effects are probably too subtle to have made a strong contribution to the correlations during over the timeframe of this radiation.

In addition to enhancing our understanding of the processes that have shaped the genomic landscape during this radiation, our study contributes toward a more general understanding of the role that natural selection plays in shaping genome-wide variation. In line with the findings of other recent studies, our results indicate that little, if any, of the genome evolves free of the effects of natural selection (Hahn 2008, Corbett-Detig et al. 2015). Moreover, our ‘time-course’ analysis shows that between-taxon signatures of selection can emerge very rapidly after a population split and can be substantial, even between populations at the early stages of divergence. Overall, these results suggest that patterns of between-population variation cannot be understood without taking the effects of natural selection into account.

An important point arising from this work is that multiple divergence histories can generate heterogeneous differentiation landscapes that are correlated both among taxa and with the distribution of intrinsic genomic properties. When divergence is recent, possible explanations include polygenic adaptation across multiple populations and porous barriers to gene flow arising from divergent ecological selection and/or intrinsic incompatibilities. Although it has often been assumed that recurrent background selection is the primary cause of correlated landscapes, it is important to remember that all forms of selection can indirectly impact levels of variation at associated sites. In fact, our simulations show that alternative explanations may be more likely when divergence times are short and there is opportunity for gene flow among taxa. Thus, we advocate for a more nuanced approach when interpreting correlated differentiation landscapes, rather than assuming that they are caused by a single evolutionary process.

Finally, our results have important practical implications for studies attempting to identify the genomic basis of adaptation and speciation from patterns of genome-wide variation. Indeed, detecting these loci is a major goal of adaptation and speciation studies, and genome scans are now commonly used to identify promising candidate regions (Ravinet et al. 2017). In an effort to correct for the potentially confounding effects of background selection (BGS), it has been proposed that the correlation in F_STamong multiple population pairs can be used as a baseline for outlier detection (Berner and Salzburger 2015, Burri 2017b). A core premise of this method is that correlated differentiation landscapes are caused primarily by BGS, so removing this signature should expose the loci relevant to adaptation and/or speciation. Although this approach may be successful in identifying large-effect loci targeted by lineage-specific positive selection, as we illustrated for the MaMyb2 locus, we caution against treating the common pattern of differentiation simply as background noise. Rather than parsing out the effects of BGS, studies that use this approach may actually be discarding the genomic signature of polygenic adaptation and speciation.

Data Accessibility

Raw sequencing reads used for the genome assembly, linkage map construction, and genome resequencing and are available on the Short-Read Archive (SRA) under the bioproject ID xxx. The genetic map, annotation, VCF file, and analysis scripts have been deposited on DRYAD. The reference assembly is available at mimubase.org. Code used to run the simulations is available on Github at: https://github.com/mufernando/mimulus_sims.

Materials and Methods

Genome Assembly

We used a combination of short-read Illumina and long-read Single Molecule, Real Time (SMRT) sequencing to assemble the genome of a single individual from the red ecotype of M. aurantiacus subspecies puniceus (population UCSD; Table S1). Genomic DNA was isolated from leaf tissue using either ZR plant/seed DNA miniprep kits (Zymo Research) or GeneJet Plant Genomic DNA purification kits (Thermo Fisher). Illumina libraries were generated following the Allpaths-LG assembly pipeline (Gnerre et al. 2011), which included a single fragment library with average 180 bp insert size and three mate pair libraries (average insert sizes: 3.5-5 kb, 5-7 kb, and 7-13 kb). Libraries were sequenced on the Illumina HiSeq 2500 using paired-end 100 bp reads. An initial scaffold-level assembly was performed with Allpaths-LG using default parameters and the haploidify function enabled. This assembly yielded 11,123 contigs (N50 = 40.5 kb) and 2,299 scaffolds (N50 = 1,310 kb), for a total assembly size of 193.3 Mbp. Long-read sequencing was performed from the same individual using 12 SMRT cells sequenced on the Pacific Biosystems RS II machine at Duke University. We obtained a total of 6.4 Gbp of sequence, which corresponds to ∼21× coverage of the genome. The PacBio reads were used to re-scaffold the Allpaths-LG scaffolds using Opera-LG (Gao et al. 2016). This reduced the number of scaffolds to 1,547 (N50 = 1,578 kb).

We then manually improved the scaffold containing the flower color gene MaMyb2 (Streisfeld et al. 2013). We first aligned this scaffold to a previously published draft sequence assembly from this same individual (Stankowski et al. 2017), which was generated using Illumina short-reads and the Velvet assembler (Zerbino and Birney 2008). We used long range PCR and cloning to generate Sanger sequences across three regions within 20 kb of MaMyb2 that did not assemble well. Genomic DNA was amplified using Phusion high fidelity polymerase (NEB). PCR products were cloned into the pCR2.1 TOPO-TA vector (Life Technologies), and purified plasmids were sequenced with Sanger technology. Resulting sequences were aligned to the scaffold containing MaMyb2, and new PCR primers were designed to sequence internal fragments until the entire insert was sequenced. Using this approach, we sequenced a total of 9,824 bp across the three regions. The reference sequence in the assembly was corrected manually to match the Sanger data.

Finally, we gap filled the assembly using the PacBio data and the program PBJelly (English et al. 2012). Resulting scaffolds were assembled into pseudomolecules using Chromonomer (http://catchenlab.life.illinois.edu/chromonomer/), according to the online manual. This software anchored and oriented scaffolds based on the order of markers in a high-density linkage map (see below) and made corrections to scaffolds when differences occurred between the genetic and physical positions of markers in the map. A final round of gap filling with PBJelly was performed to fill any gaps that were created by broken scaffolds in Chromonomer. To assess the completeness of the gene space in the assembly, we used both the BUSCO and CEGMA pipelines to estimate the proportion of 956 single copy plant genes (BUSCO) or 248 core eukaryotic genes (CEGMA) that were completely or partially assembled (Parra et al. 2007, Simao et al. 2015). The proportion of these genes present in an assembly has been shown to be correlated with the total proportion of assembled gene space, and thus serves as a good predictor of assembly completeness.

Construction of high-density linkage map

We generated an outbred F₂ mapping population by crossing two F₁ individuals, each the product of crosses between different greenhouse-raised red and yellow ecotype plants collected from one red ecotype and one yellow ecotype population (populations UCSD and LO, respectively; Table S1). We then used restriction-site associated DNA sequencing (RADseq) to genotype F₁ and F₂ individuals. DNA was extracted from leaf material using Zymo ZR plant/seed DNA miniprep kits, and RAD library preparation followed the protocol outlined in Sobel and Streisfeld (Sobel and Streisfeld 2015). Libraries were sequenced on the Illumina HiSeq 2000 platform using single-end 100 bp reads at the Genomics Core Facility, University of Oregon.

Reads were filtered based on quality, and errors in the barcode sequence or RAD site were corrected using the process_radtags script in Stacks v. 1.35 (Catchen et al. 2011, Catchen et al. 2013). Loci were created using the denovo_map.pl function of Stacks, with three identical raw reads required to create a stack, two mismatches allowed between loci for an individual, and two mismatches allowed when processing the catalog. Single nucleotide polymorphisms (SNPs) were determined and genotypes called using a maximum-likelihood (ML) statistical model implemented in Stacks and a stringent χ² significance level of 0.01 to distinguish between heterozygotes and homozygotes. We then used the genotypes program implemented in Stacks to identify a set of 9,029 mappable markers. We specified a ‘CP’ cross design (F₁ individuals coded as the parents), requiring that a marker was present in at least 85% of progeny at a minimum depth of 12 reads per individual, and we allowed automated corrections to be made to the data.

Linkage map construction was performed using Lep-MAP2 (Rastas et al. 2016). The data were filtered using the Filtering module to include only individuals with less than 15% missing data and excluded markers that showed evidence for extreme segregation distortion (χ² test, P < 0.01). To assign markers to linkage groups, we used the SeparateChromosomes module with a logarithm of odds (LOD) score limit of 20 and no minimum size for linkage groups (LG). This assigned 7,217 markers to 10 linkage groups, which matches the number of chromosomes in M. aurantiacus. The JoinSingles module was executed again with a LOD limit of 10 to join an additional 877 ungrouped markers to the 10 previously formed LGs. Fifty-seven singles that were not joined at this stage were discarded from the dataset. Initial marker orders were determined using sex-averaged and sex-specific recombination rates using the OrderMarkers module. For each LG, we conducted 10 independent runs using the Kosambi mapping function (useKosambi=1), with the dataset split into seven pseudo-families to take advantage of parallel processing. When multiple markers had identical genotypes, only the duplicate marker with the least missing data was used in marker ordering. We retained the marker order from the run with the best likelihood. After removing markers with an error rate > 0.05, the ML order was re-evaluated using the evaluateOrder flag. The map contained 8,094 informative loci from 269 F2 individuals, with an average of 3.5% ± SD 3.86 missing data per individual.

After the integration of our assembly and genetic map using the Chromonomer software (Amores et al. 2014), we made corrections to the map order based on the physical position of markers within assembled scaffolds. Using the output of Chromonomer, we identified markers that were out of order in the map compared to their local assembly order and aligned these markers to the assembly from Chromonomer using Bowtie2 v. 2.2.5 (Langmead and Salzberg 2012) with the very_sensitive settings to obtain their physical order. We then re-estimated the map using the evaluateOrder flag in Lep-MAP2 as described above, but with the marker order constrained to the physical order (improveOrder=0) and with all duplicate markers included in the analysis (removeDuplicates=0). After initial map construction, we removed 17 markers with an estimated error rate greater than 5% and estimated the map one last time using the same settings. The final map contained 7,589 markers across the 10 linkage groups.

Genome annotation

Prior to genome annotation, the assembly was soft-masked for repetitive elements and areas of low complexity with RepeatMasker (RepeatMasker Open-4.0) using a custom Mimulus aurantiacus library created by RepeatModeler (RepeatModeler Open-1.0), Repbase repeat libraries (Jurka et al. 2005), and a list of known transposable elements provided by MAKER (Holt and Yandell 2011). In total, 30.99% of the genome assembly was masked by RepeatMasker. Repetitive elements were annotated with RepeatModeler. Hidden Markov Models for gene prediction were generated by SNAP (Korf 2004) and Augustus (Stanke and Waack 2003) and were trained iteratively to the assembly using MAKER, as described by Cantarel et al. (Cantarel et al. 2008). Training was performed on the 14.5 Mbp sequence from LG9. Evidence used by MAKER for annotation included protein sequences from Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum, Solanum tuberosum, Daucus carota, Vitis vinifera (all downloaded from EnsemblPlants on 9 August 2016), Salvia miltiorrhiza (downloaded from Herbal Medicine Omics Database on 9 August 2016), Mimulus guttatus v. 2 (downloaded from JGI Genome Portal on 9 August 2016), and all Uniprot/swissprot proteins (downloaded on 18 August 2016) (Goodstein et al. 2012, Nordberg et al. 2013, Kersey et al. 2016) (Herbal Medicine Omics Database; Uniprot). We filtered the annotations with MAKER to include: 1) only evidence-based information that also contained assembled protein support, and 2) those ab initio gene predictions that did not overlap with the evidence-based annotations and that contained protein family domains, as detected with InterProScan (Quevillon et al. 2005).

Genome re-sequencing and variant calling

We collected leaf tissue from each taxon (Collection locations available in Table S3 and Fig. S2) and extracted DNA from dried tissue using the Zymo Plant/Seed MiniPrep DNA kit following the manufacturer’s instructions. We prepared sequencing libraries using the Kapa Biosystems HyperPrep kit, and libraries with an insert size between 400-600 bp were sequenced on the Illumina HiSeq 4000 using paired-end 150 bp reads at the Genomics Core Facility, University of Oregon.

We filtered raw reads using the process_shortreads script in Stacks v1.46 to remove reads with uncalled bases or poor quality scores. We then aligned the retained reads to the reference assembly using the BWA-MEM algorithm in BWA v0.7.15 (Li 2013). An average of 91.7% of reads aligned (range: 82.6-96.0%), and the average sequencing depth was 21x (range: 15.16 – 30.86x). We then marked PCR duplicates using Picard (http://broadinstitute.github.io/picard). We performed an initial run of variant calling using the UnifiedGenotyper tool in GATK v3.8 (McKenna et al. 2010) and filtered the data to remove variants with a mapping quality < 50, a quality depth < 4, and a Fisher Strand score > 50. We then used these variants to perform base quality score recalibration for each individual, before performing another run of the UnifiedGenotyper to call final variants. After the second run of variant calling, we removed variants with a mapping quality < 40, a quality depth < 2, and a Fisher Strand score > 60. The final dataset contained 13,233,829 SNPs across the nine taxa. Finally, we ran UnifiedGenotyper with the EMIT_ALL_SITES option to output all variant and invariant genotyped sites.

Phylogenetic and principal component analyses

We used RAxML v8 to reconstruct the evolutionary relationships among the nine taxa by concatenating variant sites from across the genome. To investigate patterns of phylogenetic discordance across the genome, we also built trees from windows across the genome. We phased SNPs using BEAGLE v4.1 (Browning and Browning 2007), using a window size of 100 kb and an overlap of 10 kb. We incorporated information on recombination rate from the genetic map and did not impute missing genotypes. After phasing, we used MVFtools (https://www.github.com/jbpease/mvftools) to run RAxML from 100 kb and 500 kb nonoverlapping windows, with the M. clevelandii samples set as the outgroups. We then visualized the window trees in DensiTree v2.01 (Bouckaert 2010).

To assess concordance between the window-based trees and the whole-genome tree, we converted trees to distance matrices using the Ape package in R (Paradis et al. 2004). We then calculated the Pearson’s correlation coefficient between the distance matrix from each window and the whole-genome tree, with a stronger correlation indicating higher concordance with the whole-genome tree. We used one-dimensional autocorrelation analysis to determine if variation in tree concordance was randomly distributed across the genome. This involved estimating the autocorrelation between genomic position and tree concordance for each LG with a lag size of 2 Mbp. The significance of the observed value for each LG was determined from a null distribution of autocorrelation coefficients estimated from 1000 random permutations of the genome-wide data.

We also conducted a Principal Components Analysis (PCA) based on all variant sites from across the entire genome using Plink v. 1.90 (Chang et al. 2015). Initially, we ran the PCA with all 37 samples, but we consecutively re-ran the analysis by removing different taxa in order to assess clustering patterns among more closely related samples.

Population genomic analyses

To examine how genome-wide patterns of diversity, differentiation, and divergence varied among taxa, we calculated nucleotide diversity (π), between-taxon differentiation (F_ST), and between-taxon divergence (d_xy), respectively, in non-overlapping and overlapping 100 kb (step size = 10kb) and 500 kb (step size = 50 kb) windows using custom Python scripts downloaded from https://github.com/simonhmartin/genomics_general. We calculated measures of differentiation and divergence across all 36 pairwise comparisons among the nine taxa, and diversity was estimated separately for each taxon. These scripts estimated π and d_xy by dividing the number of sequence differences within a window (either within or between taxa) by the total number of sites in that window. To account for missing data, the script counted the number of differences between each sample, divided by the total number of variant sites that were genotyped within those samples, and then averaged across all pairs of samples. To provide an unbiased estimate of diversity and divergence, we incorporated invariant sites into the calculation by dividing the number of pairwise differences (within and between taxa, respectively) by the total number of genotyped sites (variant and invariant) within a window. F_ST was calculated following the measure of K_ST (Hudson et al. 1992), equation 9), but was modified to incorporate missing data using the same approach as π and d_xy. We filtered the data separately for each taxonomic comparison, so that each site was required to be genotyped in at least three individuals for comparisons within the M. aurantiacus complex or at least two individuals for each comparison involving M. clevelandii.

We summarized the variation in each statistic across comparisons using a Principal Components Analysis (PCA), with taxon or taxon pair as the variables. Thus, across each window, the first principal component of π, F_ST, and d_xy provided multivariate measures that explained the greatest covariance in the data. We used a one-dimensional autocorrelation analysis and permutation test to determine if the genome-wide patterns of PC1 π, F_ST, and d_xydeparted from a random expectation, as described above for tree concordance (see section ‘phylogenetic analyses’).

To examine the relationships among PC1 diversity, differentiation, and divergence, we estimated Pearson’s correlation coefficient among all three statistics across genomic windows. Further, we estimated correlations among these three statistics and tree concordance, gene density, and recombination rate. Recombination rate was estimated by comparing the genetic and physical distance (in cM/Mbp) between all pairs of adjacent markers on each LG from the genetic linkage map described above. We removed the top 5% of recombination rates, as these represented unrealistically high values of recombination. A minimum of three estimates was required to obtain an average recombination rate estimate within each window. Gene density was calculated from the number of predicted genes in each window, as determined from the annotation described above. Genes spanning two windows were counted in both.

To determine how the correlations among the statistics (diversity, differentiation, divergence, recombination rate, gene count) changed with increasing divergence time, we examined the correlation coefficient among all pairs of statistics individually for each of the 36 pairwise comparisons. Because diversity was measured within taxa, we calculated the mean value of π between each pair of taxa. Also, because many of the pairwise comparisons are non-independent, we applied the phylogenetic correction outlined by (Felsenstein 1985, Coyne and Orr 1989) to produce a statistically independent set of data points for this analysis.

We calculated the divergence time between M. clevelandii and M. aurantiacus, based on a corrected estimate of sequence divergence (d_a) between individuals of M. clevelandii and all subspecies of M. aurantiacus combined. We then converted this value into a divergence time T (in generations) using the equation: T = d_a/(2µ), where µ is the mutation rate, 1.5 x 10^-8 (Koch et al. 2001, Brandvain et al. 2014). This value was then converted into years by multiplying by a generation time of two years.

Simulations

To assess the plausibility of different scenarios in producing heterogeneous genomic landscapes, we implemented forward-time Wright-Fisher simulations using SLiM (Haller et al. 2019, Haller and Messer 2019). The basic model consisted of an ancestral population with a fixed population size of N=10,000 that split after 10N non-overlapping generations into two daughter populations, each with a fixed size of N. These populations were then allowed to evolve for a further 10N non-overlapping generations. We simulated a 21 Mbp chromosome with a recombination rate of 2×10^-8 and a mutation rate of 10^-8, both per base pair and per generation.

We explored the following six modifications of this basic model. (i) Neutral evolution: mutations did not impact fitness; (ii) Background selection (BGS): mutations in the ancestor and daughter populations were neutral in the middle third of the chromosome but can be deleterious in the chromosome ends; (iii) Bateson-Dobzhansky-Muller incompatibility (BDMI): Neutral and deleterious mutations occurred in the ancestor. After the split, we allowed migration between the two daughter populations. To simulate BDM incompatibilities, a fraction of the selected mutations was deleterious in each of the populations and neutral in the other. (iv) Positive selection: mutations in the ancestor and daughter population were neutral in the middle third of the chromosome but could be beneficial in the remaining regions. (v) Local adaptation: mutations in the ancestor and daughter population were neutral in the middle third of the chromosome but could be beneficial in the remainder. After the split, we allowed migration between the daughter populations. To simulate local adaptation, a fraction of the selected mutations was beneficial in each population and neutral in the other. Unlike the BDMI simulation, we allowed local adaptation from standing variation, so some of the mutations that were neutral in the ancestral population became beneficial after the split. BGS and positive selection: mutations in the ancestor and daughter population were neutral in the middle third of the chromosome but could be either deleterious or beneficial in the tails.

Each scenario was simulated with varying proportions of beneficial/deleterious to neutral mutations, and varying mean selective coefficients and migration rates, where applicable (Table S7). For each combination of parameters, we ran five replicate simulations. As described in Kelleher et al. (Kelleher et al. 2018), to improve simulation speed we recorded genealogies and non-neutral mutations in a tree sequence and added neutral mutations afterwards with msprime (Kelleher et al. 2016). We used scikit-allel (v1.1.8) (Miles and Harding 2017) to calculate π, F_ST, and d_xy in 500 kb regions. All of the SLiM code used is available on GitHub (https://github.com/mufernando/mimulus_sims).

Tests for genome-wide admixture

We tested for introgression and quantified levels of admixture by calculating Patterson’s D and the admixture proportion (Green et al. 2010, Durand et al. 2011). Patterson’s D takes four taxa with the relationship (((P1, P2), P3) O) and looks for an excess of either ABBA or BABA sites, where A and B represent the ancestral and derived alleles respectively. We calculated Patterson’s D for all possible groups of three ingroup taxa based on the set of relationships inferred from the genome-wide (concatenated) dataset. M. clevelandii was used as the outgroup for all tests. The D statistic was calculated from allele frequency data for biallelic sites, with the ancestral allele called as the allele fixed within M. clevelandii. Therefore, only sites fixed within. M. clevelandii were included in the analysis. We converted genotype data into allele frequency data using a Python script download from https://github.com/simonhmartin/genomics. Patterson’s D was calculated from the proportion of ABBA and BABA sites in R using a custom script. To evaluate the significance of each test, we performed block jack-knifing with a block size of 500 kb. We then calculated the genome-wide admixture proportion, f, for all tests with a significant value of Patterson’s D. This test compares the excess of ABBA sites shared between P2 and P3 to the expected excess of ABBA sites between two completely admixed populations. The expected excess is calculated by splitting the samples from P3 into two different groups (P3a and P3b) and calculating the excess of ABBA to BABA sites with one group taking the place of P2 and the other the place of P3.

For tests with significant values of Patterson’s D, we then estimated levels of admixture in each 500 kb genomic window. Although Patterson’s D is a powerful test for detecting introgression genome-wide, it is not suited for estimating admixture in defined genomic regions (Martin et al. 2015). Thus, we calculated the four-taxon statistic f_d, which is a modified version of the admixture proportion developed for estimating local admixture (Martin et al. 2015). We calculated f_din 500 kb non-overlapping windows, using Python scripts (https://github.com/simonhmartin/genomics_general). Because f_d is designed to detect an excess of ABBA sites (i.e., gene flow from P3 to P2), we switched the order of P1 and P2 in tests where D was negative (excess of BABA) in the genome-wide tests for introgression. Note that this does not affect the set of relationships in these comparisons. We summarized genome-wide variation in f_d across the different tests by taking the average value of f_d within each window and the maximum value of f_d from the 48 different tests in each window, and estimated the Pearson’s correlation coefficient between these values other statistics.

Acknowledgments

We thank William A. Cresko, Thomas Nelson, Roger K. Butlin, Anja M. Westram, Mark Ravinet, Sarah. P. Otto, and Michael C. Whitlock for stimulating discussion. John Willis performed the Illumina mate-pair library preps used for the genome assembly. Julian Catchen, Clay Small, Susan Bassham, Mark Rausher and Janna Fierst provided technical or analytical support or advice. Doug Turnbull and Maggie Weitzman conducted the Illumina sequencing at the University of Oregon Genomics Core facility. Thomas Nelson, Martin Garlovsky, Roger K. Butlin, Anja M. Westram and Jeff Ross-Ibarra provided comments on an earlier version of this manuscript. We thank Reto Burri, Stuart J.E. Baird, and two anonymous reviewers for providing insightful comments that greatly improved the paper. We also thank people who contributed to insightful discussions at the 2017 Speciation Genomics Meeting, Cambridge UK, the 2019 Gordon Speciation Conference, Ventura CA, USA, and on Twitter. Funding was provided by the National Science Foundation grant DEB-1258199 to MS and the Sloan Foundation to PR.

Footnotes

Taking into account reviewer comments and feedback from other people.

References

↵
Abbott, R., D. Albach, S. Ansell, J. W. Arntzen, S. J. E. Baird, N. Bierne, J. W. Boughman, A. Brelsford, C. A. Buerkle, R. Buggs, R. K. Butlin, U. Dieckmann, F. Eroukhmanoff, A. Grill, S. H. Cahan, J. S. Hermansen, G. Hewitt, A. G. Hudson, C. Jiggins, J. Jones, B. Keller, T. Marczewski, J. Mallet, P. Martinez-Rodriguez, M. Most, S. Mullen, R. Nichols, A. W. Nolte, C. Parisod, K. Pfennig, A. M. Rice, M. G. Ritchie, B. Seifert, C. M. Smadja, R. Stelkens, J. M. Szymura, R. Vainola, J. B. W. Wolf, and D. Zinner. 2013. Hybridization and speciation. Journal of Evolutionary Biology 26:229–246.
OpenUrl CrossRef PubMed
↵
Amores, A., J. Catchen, I. Nanda, W. Warren, R. Walter, M. Schartl, and J. H. Postlethwait. 2014. A RAD-tag genetic map for the platyfish (Xiphophorus maculatus) reveals mechanisms of karyotype evolution among teleost fish. Genetics 197:625–641.
OpenUrl Abstract/FREE Full Text
↵
Baird, S. J. E. 2015. Exploring linkage disequilibrium. Mol Ecol Resour 15:1017–1019.
OpenUrl
↵
Barrett, R. D. H., S. M. Rogers, and D. Schluter. 2008. Natural selection on a major armor gene in threespine stickleback. Science 322:255–257.
OpenUrl Abstract/FREE Full Text
↵
Barton, N. H. and G. M. Hewitt. 1985. Analysis of Hybrid Zones. Annual Review of Ecology and Systematics 16:113–148.
OpenUrl CrossRef Web of Science
↵
Bassham, S., J. Catchen, E. Lescak, F. A. von Hippel, and W. A. Cresko. 2018. Repeated Selection of Alternatively Adapted Haplotypes Creates Sweeping Genomic Remodeling in Stickleback. Genetics 209:921–939.
OpenUrl Abstract/FREE Full Text
↵
Berner, D. and W. Salzburger. 2015. The genomics of organismal diversification illuminated by adaptive radiations. Trends in Genetics 31:491–499.
OpenUrl CrossRef PubMed
↵
Bouckaert, R. R. 2010. DensiTree: making sense of sets of phylogenetic trees. Bioinformatics 26:1372–1373.
OpenUrl CrossRef PubMed Web of Science
↵
Brandvain, Y., A. M. Kenney, L. Flagel, G. Coop, and A. L. Sweigart. 2014. Speciation and Introgression between Mimulus nasutus and Mimulus guttatus. Plos Genetics 10.
↵
Browning, S. R. and B. L. Browning. 2007. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. American Journal of Human Genetics 81:1084–1097.
OpenUrl CrossRef PubMed Web of Science
↵
Burri, R. 2017a. Dissecting differentiation landscapes: a linked selection’s perspective. Journal of Evolutionary Biology 30:1501–1505.
OpenUrl
↵
Burri, R. 2017b. Interpreting differentiation landscapes in the light of long-term linked selection. Evolution Letters 1:118–131.
OpenUrl
↵
Burri, R., A. Nater, T. Kawakami, C. F. Mugal, P. I. Olason, L. Smeds, A. Suh, L. Dutoit, S. Bures, L. Z. Garamszegi, S. Hogner, J. Moreno, A. Qvarnstrom, M. Ruzic, S. A. Saether, G. P. Saetre, J. Torok, and H. Ellegren. 2015. Linked selection and recombination rate variation drive the evolution of the genomic landscape of differentiation across the speciation continuum of Ficedula flycatchers. Genome Research 25:1656–1665.
OpenUrl Abstract/FREE Full Text
↵
Campbell, C. R., J. W. Poelstra, and A. D. Yoder. 2018. What is Speciation Genomics? The roles of ecology, gene flow, and genomic architecture in the formation of species. Biological Journal of the Linnean Society 124:561–583.
OpenUrl
↵
Cantarel, B. L., I. Korf, S. M. C. Robb, G. Parra, E. Ross, B. Moore, C. Holt, A. S. Alvarado, and M. Yandell. 2008. MAKER: An easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Research 18:188–196.
OpenUrl Abstract/FREE Full Text
↵
Catchen, J., P. A. Hohenlohe, S. Bassham, A. Amores, and W. A. Cresko. 2013. Stacks: an analysis tool set for population genomics. Molecular Ecology 22:3124–3140.
OpenUrl CrossRef PubMed Web of Science
↵
Catchen, J. M., A. Amores, P. Hohenlohe, W. Cresko, and J. H. Postlethwait. 2011. Stacks: Building and Genotyping Loci De Novo From Short-Read Sequences. G3-Genes Genomes Genetics 1:171–182.
OpenUrl
↵
Chang, C. C., C. C. Chow, L. C. A. M. Tellier, S. Vattikuti, S. M. Purcell, and J. J. Lee. 2015. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4.
↵
Charlesworth, B. 1998. Measures of divergence between populations and the effect of forces that reduce variability. Molecular Biology and Evolution 15:538–543.
OpenUrl CrossRef PubMed Web of Science
↵
Charlesworth, B., M. T. Morgan, and D. Charlesworth. 1993. The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289–1303.
OpenUrl Abstract/FREE Full Text
↵
Chase, M. A., S. Stankowski, and M. A. Streisfeld. 2017. Genomewide variation provides insight into evolutionary relationships in a monkeyflower species complex (Mimulus sect. Diplacus). American Journal of Botany 104:1510–1521.
OpenUrl
↵
Coop, G. and P. Ralph. 2012. Patterns of neutral diversity under general models of selective sweeps. Genetics 192:205–224.
OpenUrl Abstract/FREE Full Text
↵
Corbett-Detig, R. B., D. L. Hartl, and T. B. Sackton. 2015. Natural selection constrains neutral diversity across a wide range of species. Plos Biology 13:e1002112.
OpenUrl CrossRef PubMed
↵
Coyne, J. A. and H. A. Orr. 1989. Patterns of Speciation in Drosophila. Evolution 43:362–381.
OpenUrl CrossRef Web of Science
↵
Cruickshank, T. E. and M. W. Hahn. 2014. Reanalysis suggests that genomic islands of speciation are due to reduced diversity, not reduced gene flow. Molecular Ecology 23:3133–3157.
OpenUrl CrossRef PubMed Web of Science
↵
Delmore, K. E., J. S. L. Ramos, B. M. Van Doren, M. Lundberg, S. Bensch, D. E. Irwin, and M. Liedvogel. 2018. Comparative analysis examining patterns of genomic differentiation across multiple episodes of population divergence in birds. Evolution Letters 2:76–87.
OpenUrl
↵
Durand, E. Y., N. Patterson, D. Reich, and M. Slatkin. 2011. Testing for ancient admixture between closely related populations. Molecular Biology and Evolution 28:2239–2252.
OpenUrl CrossRef PubMed Web of Science
↵
Durrett, R., L. Buttel, and R. Harrison. 2000. Spatial models for hybrid zones. Heredity (Edinb) 84 ( Pt 1):9–19.
OpenUrl
↵
Ellegren, H., L. Smeds, R. Burri, P. I. Olason, N. Backstrom, T. Kawakami, A. Kunstner, H. Makinen, K. Nadachowska-Brzyska, A. Qvarnstrom, S. Uebbing, and J. B. W. Wolf. 2012. The genomic landscape of species divergence in Ficedula flycatchers. Nature 491:756–760.
OpenUrl CrossRef PubMed Web of Science
↵
English, A. C., S. Richards, Y. Han, M. Wang, V. Vee, J. Qu, X. Qin, D. M. Muzny, J. G. Reid, K. C. Worley, and R. A. Gibbs. 2012. Mind the gap: upgrading genomes with Pacific Biosciences RS long-read sequencing technology. Plos One 7:e47768.
OpenUrl CrossRef PubMed
↵
Feder, J. L., S. P. Egan, and P. Nosil. 2012. The genomics of speciation-with-gene-flow. Trends in Genetics 28:342–350.
OpenUrl CrossRef PubMed Web of Science
↵
Felsenstein, J. 1985. Phylogenies and the comparative method. American Naturalist 125:1–15.
OpenUrl CrossRef Web of Science
↵
Gao, S., D. Bertrand, B. K. Chia, and N. Nagarajan. 2016. OPERA-LG: efficient and exact scaffolding of large, repeat-rich eukaryotic genomes with performance guarantees. Genome Biol 17:102.
OpenUrl CrossRef PubMed
↵
Gillespie, J. H. 2000. Genetic drift in an infinite population: The pseudohitchhiking model. Genetics 155:909–919.
OpenUrl Abstract/FREE Full Text
↵
Gnerre, S., I. Maccallum, D. Przybylski, F. J. Ribeiro, J. N. Burton, B. J. Walker, T. Sharpe, G. Hall, T. P. Shea, S. Sykes, A. M. Berlin, D. Aird, M. Costello, R. Daza, L. Williams, R. Nicol, A. Gnirke, C. Nusbaum, E. S. Lander, and D. B. Jaffe. 2011. High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc Natl Acad Sci U S A 108:1513–1518.
OpenUrl Abstract/FREE Full Text
↵
Goodstein, D. M., S. Shu, R. Howson, R. Neupane, R. D. Hayes, J. Fazo, T. Mitros, W. Dirks, U. Hellsten, N. Putnam, and D. S. Rokhsar. 2012. Phytozome: a comparative platform for green plant genomics. Nucleic Acids Res 40:D1178–1186.
OpenUrl CrossRef PubMed Web of Science
↵
Grant, V. 1981. Plant Speciation. 2 edition. Columbia University Press, New York.
Grant, V. 1993a. Effects of hybridization and selection on floral isolation. Proc Natl Acad Sci U S A 90:990–993.
OpenUrl Abstract/FREE Full Text
↵
Grant, V. 1993b. Origin of floral isolation between ornithophilous and sphingophilous plant species. Proc Natl Acad Sci U S A 90:7729–7733.
OpenUrl Abstract/FREE Full Text
↵
Green, R. E., J. Krause, A. W. Briggs, T. Maricic, U. Stenzel, M. Kircher, N. Patterson, H. Li, W. Zhai, M. H. Fritz, N. F. Hansen, E. Y. Durand, A. S. Malaspinas, J. D. Jensen, T. Marques-Bonet, C. Alkan, K. Prufer, M. Meyer, H. A. Burbano, J. M. Good, R. Schultz, A. Aximu-Petri, A. Butthof, B. Hober, B. Hoffner, M. Siegemund, A. Weihmann, C. Nusbaum, E. S. Lander, C. Russ, N. Novod, J. Affourtit, M. Egholm, C. Verna, P. Rudan, D. Brajkovic, Z. Kucan, I. Gusic, V.B. Doronichev, L. V. Golovanova, C. Lalueza-Fox, M. de la Rasilla, J. Fortea, A. Rosas, R. W. Schmitz, P. L. F. Johnson, E. E. Eichler, D. Falush, E. Birney, J. C. Mullikin, M. Slatkin, R. Nielsen, J. Kelso, M. Lachmann, D. Reich, and S. Paabo. 2010. A draft sequence of the Neandertal genome. Science 328:710–722.
OpenUrl Abstract/FREE Full Text
↵
Hahn, M. W. 2008. Toward a selection theory of molecular evolution. Evolution 62:255–265.
OpenUrl CrossRef PubMed Web of Science
↵
Haller, B. C., J. Galloway, J. Kelleher, P. W. Messer, and P. L. Ralph. 2019. Tree-sequence recording in SLiM opens new horizons for forward-time simulation of whole genomes. Mol Ecol Resour 19:552–566.
OpenUrl
↵
Haller, B. C. and P. W. Messer. 2019. SLiM 3: Forward Genetic Simulations Beyond the Wright-Fisher Model. Molecular Biology and Evolution 36:632–637.
OpenUrl
↵
Han, F., S. Lamichhaney, B. R. Grant, P. R. Grant, L. Andersson, and M. T. Webster. 2017. Gene flow, ancient polymorphism, and ecological adaptation shape the genomic landscape of divergence among Darwin’s finches. Genome Research 27:1004–1015.
OpenUrl Abstract/FREE Full Text
↵
Hohenlohe, P. A., S. Bassham, P. D. Etter, N. Stiffler, E. A. Johnson, and W. A. Cresko. 2010. Population Genomics of Parallel Adaptation in Threespine Stickleback using Sequenced RAD Tags. Plos Genetics 6.
↵
Holt, C. and M. Yandell. 2011. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 12:491.
OpenUrl CrossRef PubMed
↵
Hudson, R. R. 1990. Gene genealogies and the coalescent process. Oxford Surveys in Evolutionary Biology 7:44.
↵
Hudson, R. R., D. D. Boos, and N. L. Kaplan. 1992. A Statistical Test for Detecting Geographic Subdivision. Molecular Biology and Evolution 9:138–151.
OpenUrl PubMed Web of Science
↵
Hudson, R. R. and N. L. Kaplan. 1995. Deleterious Background Selection with Recombination. Genetics 141:1605–1617.
OpenUrl Abstract/FREE Full Text
↵
Jiggins, C. D. and S. H. Martin. 2017. Glittering gold and the quest for Isla de Muerta. Journal of Evolutionary Biology 30:1509–1511.
OpenUrl CrossRef
↵
Jurka, J., V. V. Kapitonov, A. Pavlicek, P. Klonowski, O. Kohany, and J. Walichiewicz. 2005. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res 110:462–467.
OpenUrl CrossRef PubMed Web of Science
↵
Kelleher, J., A. M. Etheridge, and G. McVean. 2016. Efficient Coalescent Simulation and Genealogical Analysis for Large Sample Sizes. Plos Computational Biology 12.
↵
Kelleher, J., K. R. Thornton, J. Ashander, and P. L. Ralph. 2018. Efficient pedigree recording for fast population genetics simulation. Plos Computational Biology 14.
↵
Kersey, P. J., J. E. Allen, I. Armean, S. Boddu, B. J. Bolt, D. Carvalho-Silva, M. Christensen, P. Davis, L. J. Falin, C. Grabmueller, J. Humphrey, A. Kerhornou, J. Khobova, N. K. Aranganathan, N. Langridge, E. Lowy, M. D. McDowall, U. Maheswari, M. Nuhn, C. K. Ong, B. Overduin, M. Paulini, H. Pedro, E. Perry, G. Spudich, E. Tapanari, B. Walts, G. Williams, M. Tello-Ruiz, J. Stein, S. Wei, D. Ware, D. M. Bolser, K. L. Howe, E. Kulesha, D. Lawson, G. Maslen, and D. M. Staines. 2016. Ensembl Genomes 2016: more genomes, more complexity. Nucleic Acids Res 44:D574–580.
OpenUrl CrossRef PubMed
↵
Kimura, M. 1968. Evolutionary rate at the molecular level. Nature 217:624–626.
OpenUrl CrossRef PubMed Web of Science
↵
Koch, M., B. Haubold, and T. Mitchell-Olds. 2001. Molecular systematics of Brassicaceae: evidence from plastidic matK and nuclear Chs sequences. American Journal of Botany 88:534–544.
OpenUrl Abstract/FREE Full Text
↵
Korf, I. 2004. Gene finding in novel genomes. BMC Bioinformatics 5:59.
OpenUrl CrossRef PubMed
↵
Kronforst, M. R., M. E. B. Hansen, N. G. Crawford, J. R. Gallant, W. Zhang, R. J. Kulathinal, D. D. Kapan, and S. P. Mullen. 2013. Hybridization Reveals the Evolving Genomic Architecture of Speciation. Cell Reports 5:666–677.
OpenUrl
↵
Lamichhaney, S., J. Berglund, M. S. Almen, K. Maqbool, M. Grabherr, A. Martinez-Barrio, M. Promerova, C. J. Rubin, C. Wang, N. Zamani, B. R. Grant, P. R. Grant, M. T. Webster, and L. Andersson. 2015. Evolution of Darwin’s finches and their beaks revealed by genome sequencing. Nature 518:371–375.
OpenUrl CrossRef PubMed
↵
Langley, C. H., K. Stevens, C. Cardeno, Y. C. Lee, D. R. Schrider, J. E. Pool, S. A. Langley, C. Suarez, R. B. Corbett-Detig, B. Kolaczkowski, S. Fang, P. M. Nista, A. K. Holloway, A. D. Kern, C. N. Dewey, Y. S. Song, M. W. Hahn, and D. J. Begun. 2012. Genomic variation in natural populations of Drosophila melanogaster. Genetics 192:533–598.
OpenUrl Abstract/FREE Full Text
↵
Langmead, B. and S. L. Salzberg. 2012. Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357–359.
OpenUrl CrossRef PubMed Web of Science
↵
Li, H. 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv 1303.3997v1.
↵
Malinsky, M., R. J. Challis, A. M. Tyers, S. Schiffels, Y. Terai, B. P. Ngatunga, E. A. Miska, R. Durbin, M. J. Genner, and G. F. Turner. 2015. Genomic islands of speciation separate cichlid ecomorphs in an East African crater lake. Science 350:1493–1498.
OpenUrl Abstract/FREE Full Text
↵
Marques, D. A., J. I. Meier, and O. Seehausen. 2019. A Combinatorial View on Speciation and Adaptive Radiation. Trends in Ecology & Evolution.
↵
Martin, S. H., K. K. Dasmahapatra, N. J. Nadeau, C. Salazar, J. R. Walters, F. Simpson, M. Blaxter, A. Manica, J. Mallet, and C. D. Jiggins. 2013. Genome-wide evidence for speciation with gene flow in Heliconius butterflies. Genome Research 23:1817–1828.
OpenUrl Abstract/FREE Full Text
↵
Martin, S. H., J. W. Davey, and C. D. Jiggins. 2015. Evaluating the use of ABBA-BABA statistics to locate introgressed loci. Molecular Biology and Evolution 32:244–257.
OpenUrl CrossRef PubMed
↵
Martin, S. H., J. W. Davey, C. Salazar, and C. D. Jiggins. 2019. Recombination rate variation shapes barriers to introgression across butterfly genomes. Plos Biology 17.
↵
Matthey-Doret, R. and M. C. Whitlock. 2018. Background selection and the statistics of population differentiation: consequences for detecting local adaptation. BiorXiv.
↵
Maynard-Smith, J. and J. Haigh. 1974. Hitch-Hiking Effect of a Favorable Gene. Genetics Research 23:23–35.
OpenUrl
↵
McMinn, H. E. 1951. Studies in the genus Diplacus, Scrophulariaceae. Madrono 11:33–128.
OpenUrl
↵
Miles, A. and N. Harding. 2017. cggh/scikit-allel: v1.1.8.
↵
Nei, M. and W. H. Li. 1979. Mathematical-Model for Studying Genetic-Variation in Terms of Restriction Endonucleases. Proc Natl Acad Sci U S A 76:5269–5273.
OpenUrl Abstract/FREE Full Text
↵
Nelson, T. C. and W. A. Cresko. 2018. Ancient genomic variation underlies repeated ecological adaptation in young stickleback populations. Evolution Letters 2:9–21.
OpenUrl CrossRef
↵
Nordberg, H., M. Cantor, S. Dusheyko, S. Hua, A. Poliakov, I. Shabalov, T. Smirnova, I. V. Grigoriev, and I. Dubchak. 2013. The genome portal of the Department of Energy Joint Genome Institute: 2014 updates. Nucleic Acids Res 42:D26–D31.
OpenUrl PubMed
↵
Nosil, P., D. J. Funk, and D. Ortiz-Barrientos. 2009. Divergent selection and heterogeneous genomic divergence. Molecular Ecology 18:375–402.
OpenUrl CrossRef PubMed Web of Science
↵
Ohta, T. 1973. Slightly Deleterious Mutant Substitutions in Evolution. Nature 246:96–98.
OpenUrl CrossRef PubMed Web of Science
↵
Paradis, E., J. Claude, and K. Strimmer. 2004. APE: Analyses of Phylogenetics and Evolution in R language. Bioinformatics 20:289–290.
OpenUrl CrossRef PubMed Web of Science
↵
Parra, G., K. Bradnam, and I. Korf. 2007. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genornes. Bioinformatics 23:1061–1067.
OpenUrl CrossRef PubMed Web of Science
↵
Pease, J. B., D. C. Haak, M. W. Hahn, and L. C. Moyle. 2016. Phylogenomics Reveals Three Sources of Adaptive Variation during a Rapid Radiation. Plos Biology 14.
↵
Pease, J. B. and M. W. Hahn. 2013. More Accurate Phylogenies Inferred from Low-Recombination Regions in the Presence of Incomplete Lineage Sorting. Evolution 67:2376–2384.
OpenUrl CrossRef PubMed
↵
Poelstra, J. W., N. Vijay, C. M. Bossu, H. Lantz, B. Ryll, I. Muller, V. Baglione, P. Unneberg, M. Wikelski, M. G. Grabherr, and J. B. W. Wolf. 2014. The genomic landscape underlying phenotypic integrity in the face of gene flow in crows. Science 344:1410–1414.
OpenUrl Abstract/FREE Full Text
↵
Quevillon, E., V. Silventoinen, S. Pillai, N. Harte, N. Mulder, R. Apweiler, and R. Lopez. 2005. InterProScan: protein domains identifier. Nucleic Acids Res 33:W116–W120.
OpenUrl CrossRef PubMed Web of Science
↵
Rastas, P., F. C. F. Calboli, B. C. Guo, T. Shikano, and J. Merila. 2016. Construction of Ultradense Linkage Maps with Lep-MAP2: Stickleback F-2 Recombinant Crosses as an Example. Genome Biol Evol 8:78–93.
OpenUrl CrossRef PubMed
↵
Ravinet, M., R. Faria, R. K. Butlin, J. Galindo, N. Bierne, M. Rafajlovic, M. A. F. Noor, B. Mehlig, and A. M. Westram. 2017. Interpreting the genomic landscape of speciation: a road map for finding barriers to gene flow. Journal of Evolutionary Biology 30:1450–1477.
OpenUrl CrossRef
↵
Renaut, S., C. J. Grassa, S. Yeaman, B. T. Moyers, Z. Lai, N. C. Kane, J. E. Bowers, J. M. Burke, and L. H. Rieseberg. 2013. Genomic islands of divergence are not affected by geography of speciation in sunflowers. Nature Communications 4.
↵
Rettelbach, A., A. Nater, and H. Ellegren. 2019. How Linked Selection Shapes the Diversity Landscape in Ficedula Flycatchers. Genetics 212:277–285.
OpenUrl Abstract/FREE Full Text
↵
Rockman, M. V. 2012. The Qtn Program and the Alleles That Matter for Evolution: All That’s Gold Does Not Glitter. Evolution 66:1–17.
OpenUrl CrossRef PubMed Web of Science
↵
Schluter, D. 2000. The Ecology of Adaptive Radiation. Oxford University Press, Oxford.
↵
Schumer, M., C. L. Xu, D. L. Powell, A. Durvasula, L. Skov, C. Holland, J. C. Blazier, S. Sankararaman, P. Andolfatto, G. G. Rosenthal, and M. Przeworski. 2018. Natural selection interacts with recombination to shape the evolution of hybrid genomes. Science 360:656–659.
OpenUrl Abstract/FREE Full Text
↵
Simao, F. A., R. M. Waterhouse, P. Ioannidis, E. V. Kriventseva, and E. M. Zdobnov. 2015. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31:3210–3212.
OpenUrl CrossRef PubMed
↵
Slatkin, M. 1991. Inbreeding Coefficients and Coalescence Times. Genetical Research 58:167–175.
OpenUrl CrossRef PubMed Web of Science
↵
Sobel, J. M., S. Stankowski, and M. A. Streisfeld. 2019. Variation in ecophysiological traits might contribute to ecogeographic isolation and divergence between parapatric ecotypes of Mimulus aurantiacus. J Evol Biol.
↵
Sobel, J. M. and M. A. Streisfeld. 2015. Strong premating reproductive isolation drives incipient speciation in Mimulus aurantiacus. Evolution 69:447–461.
OpenUrl CrossRef PubMed
↵
Soria-Carrasco, V., Z. Gompert, A. A. Comeault, T. E. Farkas, T. L. Parchman, J. S. Johnston, C. A. Buerkle, J. L. Feder, J. Bast, T. Schwander, S. P. Egan, B. J. Crespi, and P. Nosil. 2014. Stick Insect Genomes Reveal Natural Selection’s Role in Parallel Speciation. Science 344:738–742.
OpenUrl Abstract/FREE Full Text
↵
Stamatakis, A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312–1313.
OpenUrl CrossRef PubMed Web of Science
↵
Stanke, M. and S. Waack. 2003. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics 19:Ii215–Ii225.
OpenUrl CrossRef PubMed
↵
Stankowski, S., J. M. Sobel, and M. A. Streisfeld. 2015. The geography of divergence with gene flow facilitates multitrait adaptation and the evolution of pollinator isolation in Mimulus aurantiacus. Evolution 69:3054–3068.
OpenUrl
↵
Stankowski, S., J. M. Sobel, and M. A. Streisfeld. 2017. Geographic cline analysis as a tool for studying genome-wide variation: a case study of pollinator-mediated divergence in a monkeyflower. Molecular Ecology 26:107–122.
OpenUrl
↵
Stankowski, S. and M. A. Streisfeld. 2015. Introgressive hybridization facilitates adaptive divergence in a recent radiation of monkeyflowers. Proceedings of the Royal Society B-Biological Sciences 282:154–162.
OpenUrl
↵
Streisfeld, M. A. and J. R. Kohn. 2005. Contrasting patterns of floral and molecular variation across a cline in Mimulus aurantiacus. Evolution 59:2548–2559.
OpenUrl CrossRef PubMed Web of Science
↵
Streisfeld, M. A. and J. R. Kohn. 2007. Environment and pollinator-mediated selection on parapatric floral races of Mimulus aurantiacus. Journal of Evolutionary Biology 20:122–132.
OpenUrl CrossRef PubMed Web of Science
↵
Streisfeld, M. A., W. N. Young, and J. M. Sobel. 2013. Divergent Selection Drives Genetic Differentiation in an R2R3-MYB Transcription Factor That Contributes to Incipient Speciation in Mimulus aurantiacus. Plos Genetics 9.
↵
Thompson, D. M. 2005. Systematics of Mimulus subgenus Schizoplacus (Scrophulariaceae). Systematic Botany Monographs 75:1–213.
OpenUrl
↵
Turner, T. L., M. W. Hahn, and S. V. Nuzhdin. 2005. Genomic islands of speciation in Anopheles gambiae. Plos Biology 3:1572–1578.
OpenUrl CrossRef Web of Science
↵
Van Doren, B. M., L. Campagna, B. Helm, J. C. Illera, I. J. Lovette, and M. Liedvogel. 2017. Correlated patterns of genetic diversity and differentiation across an avian family. Molecular Ecology 26:3982–3997.
OpenUrl CrossRef PubMed
↵
Vickery, R. K. 1995. Speciation by Aneuploidy and Polyploidy in Mimulus (Scrophulariaceae). Great Basin Naturalist 55:174–176.
OpenUrl Web of Science
↵
Wolf, J. B. and H. Ellegren. 2017. Making sense of genomic islands of differentiation in light of speciation. Nature Reviews Genetics 18:87–100.
OpenUrl CrossRef
↵
Wu, C. I. 2001. The genic view of the process of speciation. Journal of Evolutionary Biology 14:851–865.
OpenUrl CrossRef Web of Science
↵
Zerbino, D. R. and E. Birney. 2008. Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Research 18:821–829.
OpenUrl Abstract/FREE Full Text

View the discussion thread.

Posted May 22, 2019.

Download PDF

Citation Tools

Subject Area

Evolutionary Biology

Subject Areas

All Articles

Animal Behavior and Cognition (5210)
Biochemistry (11736)
Bioengineering (8746)
Bioinformatics (29186)
Biophysics (14964)
Cancer Biology (12084)
Cell Biology (17401)
Clinical Trials (138)
Developmental Biology (9418)
Ecology (14176)
Epidemiology (2067)
Evolutionary Biology (18299)
Genetics (12235)
Genomics (16793)
Immunology (11863)
Microbiology (28066)
Molecular Biology (11580)
Neuroscience (60925)
Paleontology (451)
Pathology (1870)
Pharmacology and Toxicology (3238)
Physiology (4956)
Plant Biology (10422)
Scientific Communication and Education (1683)
Synthetic Biology (2883)
Systems Biology (7338)
Zoology (1650)

[1] ↵
Abbott, R., D. Albach, S. Ansell, J. W. Arntzen, S. J. E. Baird, N. Bierne, J. W. Boughman, A. Brelsford, C. A. Buerkle, R. Buggs, R. K. Butlin, U. Dieckmann, F. Eroukhmanoff, A. Grill, S. H. Cahan, J. S. Hermansen, G. Hewitt, A. G. Hudson, C. Jiggins, J. Jones, B. Keller, T. Marczewski, J. Mallet, P. Martinez-Rodriguez, M. Most, S. Mullen, R. Nichols, A. W. Nolte, C. Parisod, K. Pfennig, A. M. Rice, M. G. Ritchie, B. Seifert, C. M. Smadja, R. Stelkens, J. M. Szymura, R. Vainola, J. B. W. Wolf, and D. Zinner. 2013. Hybridization and speciation. Journal of Evolutionary Biology 26:229–246.
OpenUrl CrossRef PubMed

[2] ↵
Amores, A., J. Catchen, I. Nanda, W. Warren, R. Walter, M. Schartl, and J. H. Postlethwait. 2014. A RAD-tag genetic map for the platyfish (Xiphophorus maculatus) reveals mechanisms of karyotype evolution among teleost fish. Genetics 197:625–641.
OpenUrl Abstract/FREE Full Text

[3] ↵
Baird, S. J. E. 2015. Exploring linkage disequilibrium. Mol Ecol Resour 15:1017–1019.
OpenUrl

[4] ↵
Barrett, R. D. H., S. M. Rogers, and D. Schluter. 2008. Natural selection on a major armor gene in threespine stickleback. Science 322:255–257.
OpenUrl Abstract/FREE Full Text

[5] ↵
Barton, N. H. and G. M. Hewitt. 1985. Analysis of Hybrid Zones. Annual Review of Ecology and Systematics 16:113–148.
OpenUrl CrossRef Web of Science

[6] ↵
Bassham, S., J. Catchen, E. Lescak, F. A. von Hippel, and W. A. Cresko. 2018. Repeated Selection of Alternatively Adapted Haplotypes Creates Sweeping Genomic Remodeling in Stickleback. Genetics 209:921–939.
OpenUrl Abstract/FREE Full Text

[7] ↵
Berner, D. and W. Salzburger. 2015. The genomics of organismal diversification illuminated by adaptive radiations. Trends in Genetics 31:491–499.
OpenUrl CrossRef PubMed

[8] ↵
Bouckaert, R. R. 2010. DensiTree: making sense of sets of phylogenetic trees. Bioinformatics 26:1372–1373.
OpenUrl CrossRef PubMed Web of Science

[9] ↵
Brandvain, Y., A. M. Kenney, L. Flagel, G. Coop, and A. L. Sweigart. 2014. Speciation and Introgression between Mimulus nasutus and Mimulus guttatus. Plos Genetics 10.

[10] ↵
Browning, S. R. and B. L. Browning. 2007. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. American Journal of Human Genetics 81:1084–1097.
OpenUrl CrossRef PubMed Web of Science

[11] ↵
Burri, R. 2017a. Dissecting differentiation landscapes: a linked selection’s perspective. Journal of Evolutionary Biology 30:1501–1505.
OpenUrl

[12] ↵
Burri, R. 2017b. Interpreting differentiation landscapes in the light of long-term linked selection. Evolution Letters 1:118–131.
OpenUrl

[13] ↵
Burri, R., A. Nater, T. Kawakami, C. F. Mugal, P. I. Olason, L. Smeds, A. Suh, L. Dutoit, S. Bures, L. Z. Garamszegi, S. Hogner, J. Moreno, A. Qvarnstrom, M. Ruzic, S. A. Saether, G. P. Saetre, J. Torok, and H. Ellegren. 2015. Linked selection and recombination rate variation drive the evolution of the genomic landscape of differentiation across the speciation continuum of Ficedula flycatchers. Genome Research 25:1656–1665.
OpenUrl Abstract/FREE Full Text

[14] ↵
Campbell, C. R., J. W. Poelstra, and A. D. Yoder. 2018. What is Speciation Genomics? The roles of ecology, gene flow, and genomic architecture in the formation of species. Biological Journal of the Linnean Society 124:561–583.
OpenUrl

[15] ↵
Cantarel, B. L., I. Korf, S. M. C. Robb, G. Parra, E. Ross, B. Moore, C. Holt, A. S. Alvarado, and M. Yandell. 2008. MAKER: An easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Research 18:188–196.
OpenUrl Abstract/FREE Full Text

[16] ↵
Catchen, J., P. A. Hohenlohe, S. Bassham, A. Amores, and W. A. Cresko. 2013. Stacks: an analysis tool set for population genomics. Molecular Ecology 22:3124–3140.
OpenUrl CrossRef PubMed Web of Science

[17] ↵
Catchen, J. M., A. Amores, P. Hohenlohe, W. Cresko, and J. H. Postlethwait. 2011. Stacks: Building and Genotyping Loci De Novo From Short-Read Sequences. G3-Genes Genomes Genetics 1:171–182.
OpenUrl

[18] ↵
Chang, C. C., C. C. Chow, L. C. A. M. Tellier, S. Vattikuti, S. M. Purcell, and J. J. Lee. 2015. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4.

[19] ↵
Charlesworth, B. 1998. Measures of divergence between populations and the effect of forces that reduce variability. Molecular Biology and Evolution 15:538–543.
OpenUrl CrossRef PubMed Web of Science

[20] ↵
Charlesworth, B., M. T. Morgan, and D. Charlesworth. 1993. The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289–1303.
OpenUrl Abstract/FREE Full Text

[21] ↵
Chase, M. A., S. Stankowski, and M. A. Streisfeld. 2017. Genomewide variation provides insight into evolutionary relationships in a monkeyflower species complex (Mimulus sect. Diplacus). American Journal of Botany 104:1510–1521.
OpenUrl

[22] ↵
Coop, G. and P. Ralph. 2012. Patterns of neutral diversity under general models of selective sweeps. Genetics 192:205–224.
OpenUrl Abstract/FREE Full Text

[23] ↵
Corbett-Detig, R. B., D. L. Hartl, and T. B. Sackton. 2015. Natural selection constrains neutral diversity across a wide range of species. Plos Biology 13:e1002112.
OpenUrl CrossRef PubMed

[24] ↵
Coyne, J. A. and H. A. Orr. 1989. Patterns of Speciation in Drosophila. Evolution 43:362–381.
OpenUrl CrossRef Web of Science

[25] ↵
Cruickshank, T. E. and M. W. Hahn. 2014. Reanalysis suggests that genomic islands of speciation are due to reduced diversity, not reduced gene flow. Molecular Ecology 23:3133–3157.
OpenUrl CrossRef PubMed Web of Science

[26] ↵
Delmore, K. E., J. S. L. Ramos, B. M. Van Doren, M. Lundberg, S. Bensch, D. E. Irwin, and M. Liedvogel. 2018. Comparative analysis examining patterns of genomic differentiation across multiple episodes of population divergence in birds. Evolution Letters 2:76–87.
OpenUrl

[27] ↵
Durand, E. Y., N. Patterson, D. Reich, and M. Slatkin. 2011. Testing for ancient admixture between closely related populations. Molecular Biology and Evolution 28:2239–2252.
OpenUrl CrossRef PubMed Web of Science

[28] ↵
Durrett, R., L. Buttel, and R. Harrison. 2000. Spatial models for hybrid zones. Heredity (Edinb) 84 ( Pt 1):9–19.
OpenUrl

[29] ↵
Ellegren, H., L. Smeds, R. Burri, P. I. Olason, N. Backstrom, T. Kawakami, A. Kunstner, H. Makinen, K. Nadachowska-Brzyska, A. Qvarnstrom, S. Uebbing, and J. B. W. Wolf. 2012. The genomic landscape of species divergence in Ficedula flycatchers. Nature 491:756–760.
OpenUrl CrossRef PubMed Web of Science

[30] ↵
English, A. C., S. Richards, Y. Han, M. Wang, V. Vee, J. Qu, X. Qin, D. M. Muzny, J. G. Reid, K. C. Worley, and R. A. Gibbs. 2012. Mind the gap: upgrading genomes with Pacific Biosciences RS long-read sequencing technology. Plos One 7:e47768.
OpenUrl CrossRef PubMed

[31] ↵
Feder, J. L., S. P. Egan, and P. Nosil. 2012. The genomics of speciation-with-gene-flow. Trends in Genetics 28:342–350.
OpenUrl CrossRef PubMed Web of Science

[32] ↵
Felsenstein, J. 1985. Phylogenies and the comparative method. American Naturalist 125:1–15.
OpenUrl CrossRef Web of Science

[33] ↵
Gao, S., D. Bertrand, B. K. Chia, and N. Nagarajan. 2016. OPERA-LG: efficient and exact scaffolding of large, repeat-rich eukaryotic genomes with performance guarantees. Genome Biol 17:102.
OpenUrl CrossRef PubMed

[34] ↵
Gillespie, J. H. 2000. Genetic drift in an infinite population: The pseudohitchhiking model. Genetics 155:909–919.
OpenUrl Abstract/FREE Full Text

[35] ↵
Gnerre, S., I. Maccallum, D. Przybylski, F. J. Ribeiro, J. N. Burton, B. J. Walker, T. Sharpe, G. Hall, T. P. Shea, S. Sykes, A. M. Berlin, D. Aird, M. Costello, R. Daza, L. Williams, R. Nicol, A. Gnirke, C. Nusbaum, E. S. Lander, and D. B. Jaffe. 2011. High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc Natl Acad Sci U S A 108:1513–1518.
OpenUrl Abstract/FREE Full Text

[36] ↵
Goodstein, D. M., S. Shu, R. Howson, R. Neupane, R. D. Hayes, J. Fazo, T. Mitros, W. Dirks, U. Hellsten, N. Putnam, and D. S. Rokhsar. 2012. Phytozome: a comparative platform for green plant genomics. Nucleic Acids Res 40:D1178–1186.
OpenUrl CrossRef PubMed Web of Science

[37] ↵
Grant, V. 1981. Plant Speciation. 2 edition. Columbia University Press, New York.

[38] Grant, V. 1993a. Effects of hybridization and selection on floral isolation. Proc Natl Acad Sci U S A 90:990–993.
OpenUrl Abstract/FREE Full Text

[39] ↵
Grant, V. 1993b. Origin of floral isolation between ornithophilous and sphingophilous plant species. Proc Natl Acad Sci U S A 90:7729–7733.
OpenUrl Abstract/FREE Full Text

[40] ↵
Green, R. E., J. Krause, A. W. Briggs, T. Maricic, U. Stenzel, M. Kircher, N. Patterson, H. Li, W. Zhai, M. H. Fritz, N. F. Hansen, E. Y. Durand, A. S. Malaspinas, J. D. Jensen, T. Marques-Bonet, C. Alkan, K. Prufer, M. Meyer, H. A. Burbano, J. M. Good, R. Schultz, A. Aximu-Petri, A. Butthof, B. Hober, B. Hoffner, M. Siegemund, A. Weihmann, C. Nusbaum, E. S. Lander, C. Russ, N. Novod, J. Affourtit, M. Egholm, C. Verna, P. Rudan, D. Brajkovic, Z. Kucan, I. Gusic, V.B. Doronichev, L. V. Golovanova, C. Lalueza-Fox, M. de la Rasilla, J. Fortea, A. Rosas, R. W. Schmitz, P. L. F. Johnson, E. E. Eichler, D. Falush, E. Birney, J. C. Mullikin, M. Slatkin, R. Nielsen, J. Kelso, M. Lachmann, D. Reich, and S. Paabo. 2010. A draft sequence of the Neandertal genome. Science 328:710–722.
OpenUrl Abstract/FREE Full Text

[41] ↵
Hahn, M. W. 2008. Toward a selection theory of molecular evolution. Evolution 62:255–265.
OpenUrl CrossRef PubMed Web of Science

[42] ↵
Haller, B. C., J. Galloway, J. Kelleher, P. W. Messer, and P. L. Ralph. 2019. Tree-sequence recording in SLiM opens new horizons for forward-time simulation of whole genomes. Mol Ecol Resour 19:552–566.
OpenUrl

[43] ↵
Haller, B. C. and P. W. Messer. 2019. SLiM 3: Forward Genetic Simulations Beyond the Wright-Fisher Model. Molecular Biology and Evolution 36:632–637.
OpenUrl

[44] ↵
Han, F., S. Lamichhaney, B. R. Grant, P. R. Grant, L. Andersson, and M. T. Webster. 2017. Gene flow, ancient polymorphism, and ecological adaptation shape the genomic landscape of divergence among Darwin’s finches. Genome Research 27:1004–1015.
OpenUrl Abstract/FREE Full Text

[45] ↵
Hohenlohe, P. A., S. Bassham, P. D. Etter, N. Stiffler, E. A. Johnson, and W. A. Cresko. 2010. Population Genomics of Parallel Adaptation in Threespine Stickleback using Sequenced RAD Tags. Plos Genetics 6.

[46] ↵
Holt, C. and M. Yandell. 2011. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 12:491.
OpenUrl CrossRef PubMed

[47] ↵
Hudson, R. R. 1990. Gene genealogies and the coalescent process. Oxford Surveys in Evolutionary Biology 7:44.

[48] ↵
Hudson, R. R., D. D. Boos, and N. L. Kaplan. 1992. A Statistical Test for Detecting Geographic Subdivision. Molecular Biology and Evolution 9:138–151.
OpenUrl PubMed Web of Science

[49] ↵
Hudson, R. R. and N. L. Kaplan. 1995. Deleterious Background Selection with Recombination. Genetics 141:1605–1617.
OpenUrl Abstract/FREE Full Text

[50] ↵
Jiggins, C. D. and S. H. Martin. 2017. Glittering gold and the quest for Isla de Muerta. Journal of Evolutionary Biology 30:1509–1511.
OpenUrl CrossRef

[51] ↵
Jurka, J., V. V. Kapitonov, A. Pavlicek, P. Klonowski, O. Kohany, and J. Walichiewicz. 2005. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res 110:462–467.
OpenUrl CrossRef PubMed Web of Science

[52] ↵
Kelleher, J., A. M. Etheridge, and G. McVean. 2016. Efficient Coalescent Simulation and Genealogical Analysis for Large Sample Sizes. Plos Computational Biology 12.

[53] ↵
Kelleher, J., K. R. Thornton, J. Ashander, and P. L. Ralph. 2018. Efficient pedigree recording for fast population genetics simulation. Plos Computational Biology 14.

[54] ↵
Kersey, P. J., J. E. Allen, I. Armean, S. Boddu, B. J. Bolt, D. Carvalho-Silva, M. Christensen, P. Davis, L. J. Falin, C. Grabmueller, J. Humphrey, A. Kerhornou, J. Khobova, N. K. Aranganathan, N. Langridge, E. Lowy, M. D. McDowall, U. Maheswari, M. Nuhn, C. K. Ong, B. Overduin, M. Paulini, H. Pedro, E. Perry, G. Spudich, E. Tapanari, B. Walts, G. Williams, M. Tello-Ruiz, J. Stein, S. Wei, D. Ware, D. M. Bolser, K. L. Howe, E. Kulesha, D. Lawson, G. Maslen, and D. M. Staines. 2016. Ensembl Genomes 2016: more genomes, more complexity. Nucleic Acids Res 44:D574–580.
OpenUrl CrossRef PubMed

[55] ↵
Kimura, M. 1968. Evolutionary rate at the molecular level. Nature 217:624–626.
OpenUrl CrossRef PubMed Web of Science

[56] ↵
Koch, M., B. Haubold, and T. Mitchell-Olds. 2001. Molecular systematics of Brassicaceae: evidence from plastidic matK and nuclear Chs sequences. American Journal of Botany 88:534–544.
OpenUrl Abstract/FREE Full Text

[57] ↵
Korf, I. 2004. Gene finding in novel genomes. BMC Bioinformatics 5:59.
OpenUrl CrossRef PubMed

[58] ↵
Kronforst, M. R., M. E. B. Hansen, N. G. Crawford, J. R. Gallant, W. Zhang, R. J. Kulathinal, D. D. Kapan, and S. P. Mullen. 2013. Hybridization Reveals the Evolving Genomic Architecture of Speciation. Cell Reports 5:666–677.
OpenUrl

[59] ↵
Lamichhaney, S., J. Berglund, M. S. Almen, K. Maqbool, M. Grabherr, A. Martinez-Barrio, M. Promerova, C. J. Rubin, C. Wang, N. Zamani, B. R. Grant, P. R. Grant, M. T. Webster, and L. Andersson. 2015. Evolution of Darwin’s finches and their beaks revealed by genome sequencing. Nature 518:371–375.
OpenUrl CrossRef PubMed

[60] ↵
Langley, C. H., K. Stevens, C. Cardeno, Y. C. Lee, D. R. Schrider, J. E. Pool, S. A. Langley, C. Suarez, R. B. Corbett-Detig, B. Kolaczkowski, S. Fang, P. M. Nista, A. K. Holloway, A. D. Kern, C. N. Dewey, Y. S. Song, M. W. Hahn, and D. J. Begun. 2012. Genomic variation in natural populations of Drosophila melanogaster. Genetics 192:533–598.
OpenUrl Abstract/FREE Full Text

[61] ↵
Langmead, B. and S. L. Salzberg. 2012. Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357–359.
OpenUrl CrossRef PubMed Web of Science

[62] ↵
Li, H. 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv 1303.3997v1.

[63] ↵
Malinsky, M., R. J. Challis, A. M. Tyers, S. Schiffels, Y. Terai, B. P. Ngatunga, E. A. Miska, R. Durbin, M. J. Genner, and G. F. Turner. 2015. Genomic islands of speciation separate cichlid ecomorphs in an East African crater lake. Science 350:1493–1498.
OpenUrl Abstract/FREE Full Text

[64] ↵
Marques, D. A., J. I. Meier, and O. Seehausen. 2019. A Combinatorial View on Speciation and Adaptive Radiation. Trends in Ecology & Evolution.

[65] ↵
Martin, S. H., K. K. Dasmahapatra, N. J. Nadeau, C. Salazar, J. R. Walters, F. Simpson, M. Blaxter, A. Manica, J. Mallet, and C. D. Jiggins. 2013. Genome-wide evidence for speciation with gene flow in Heliconius butterflies. Genome Research 23:1817–1828.
OpenUrl Abstract/FREE Full Text

[66] ↵
Martin, S. H., J. W. Davey, and C. D. Jiggins. 2015. Evaluating the use of ABBA-BABA statistics to locate introgressed loci. Molecular Biology and Evolution 32:244–257.
OpenUrl CrossRef PubMed

[67] ↵
Martin, S. H., J. W. Davey, C. Salazar, and C. D. Jiggins. 2019. Recombination rate variation shapes barriers to introgression across butterfly genomes. Plos Biology 17.

[68] ↵
Matthey-Doret, R. and M. C. Whitlock. 2018. Background selection and the statistics of population differentiation: consequences for detecting local adaptation. BiorXiv.

[69] ↵
Maynard-Smith, J. and J. Haigh. 1974. Hitch-Hiking Effect of a Favorable Gene. Genetics Research 23:23–35.
OpenUrl

[70] ↵
McMinn, H. E. 1951. Studies in the genus Diplacus, Scrophulariaceae. Madrono 11:33–128.
OpenUrl

[71] ↵
Miles, A. and N. Harding. 2017. cggh/scikit-allel: v1.1.8.

[72] ↵
Nei, M. and W. H. Li. 1979. Mathematical-Model for Studying Genetic-Variation in Terms of Restriction Endonucleases. Proc Natl Acad Sci U S A 76:5269–5273.
OpenUrl Abstract/FREE Full Text

[73] ↵
Nelson, T. C. and W. A. Cresko. 2018. Ancient genomic variation underlies repeated ecological adaptation in young stickleback populations. Evolution Letters 2:9–21.
OpenUrl CrossRef

[74] ↵
Nordberg, H., M. Cantor, S. Dusheyko, S. Hua, A. Poliakov, I. Shabalov, T. Smirnova, I. V. Grigoriev, and I. Dubchak. 2013. The genome portal of the Department of Energy Joint Genome Institute: 2014 updates. Nucleic Acids Res 42:D26–D31.
OpenUrl PubMed

[75] ↵
Nosil, P., D. J. Funk, and D. Ortiz-Barrientos. 2009. Divergent selection and heterogeneous genomic divergence. Molecular Ecology 18:375–402.
OpenUrl CrossRef PubMed Web of Science

[76] ↵
Ohta, T. 1973. Slightly Deleterious Mutant Substitutions in Evolution. Nature 246:96–98.
OpenUrl CrossRef PubMed Web of Science

[77] ↵
Paradis, E., J. Claude, and K. Strimmer. 2004. APE: Analyses of Phylogenetics and Evolution in R language. Bioinformatics 20:289–290.
OpenUrl CrossRef PubMed Web of Science

[78] ↵
Parra, G., K. Bradnam, and I. Korf. 2007. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genornes. Bioinformatics 23:1061–1067.
OpenUrl CrossRef PubMed Web of Science

[79] ↵
Pease, J. B., D. C. Haak, M. W. Hahn, and L. C. Moyle. 2016. Phylogenomics Reveals Three Sources of Adaptive Variation during a Rapid Radiation. Plos Biology 14.

[80] ↵
Pease, J. B. and M. W. Hahn. 2013. More Accurate Phylogenies Inferred from Low-Recombination Regions in the Presence of Incomplete Lineage Sorting. Evolution 67:2376–2384.
OpenUrl CrossRef PubMed

[81] ↵
Poelstra, J. W., N. Vijay, C. M. Bossu, H. Lantz, B. Ryll, I. Muller, V. Baglione, P. Unneberg, M. Wikelski, M. G. Grabherr, and J. B. W. Wolf. 2014. The genomic landscape underlying phenotypic integrity in the face of gene flow in crows. Science 344:1410–1414.
OpenUrl Abstract/FREE Full Text

[82] ↵
Quevillon, E., V. Silventoinen, S. Pillai, N. Harte, N. Mulder, R. Apweiler, and R. Lopez. 2005. InterProScan: protein domains identifier. Nucleic Acids Res 33:W116–W120.
OpenUrl CrossRef PubMed Web of Science

[83] ↵
Rastas, P., F. C. F. Calboli, B. C. Guo, T. Shikano, and J. Merila. 2016. Construction of Ultradense Linkage Maps with Lep-MAP2: Stickleback F-2 Recombinant Crosses as an Example. Genome Biol Evol 8:78–93.
OpenUrl CrossRef PubMed

[84] ↵
Ravinet, M., R. Faria, R. K. Butlin, J. Galindo, N. Bierne, M. Rafajlovic, M. A. F. Noor, B. Mehlig, and A. M. Westram. 2017. Interpreting the genomic landscape of speciation: a road map for finding barriers to gene flow. Journal of Evolutionary Biology 30:1450–1477.
OpenUrl CrossRef

[85] ↵
Renaut, S., C. J. Grassa, S. Yeaman, B. T. Moyers, Z. Lai, N. C. Kane, J. E. Bowers, J. M. Burke, and L. H. Rieseberg. 2013. Genomic islands of divergence are not affected by geography of speciation in sunflowers. Nature Communications 4.

[86] ↵
Rettelbach, A., A. Nater, and H. Ellegren. 2019. How Linked Selection Shapes the Diversity Landscape in Ficedula Flycatchers. Genetics 212:277–285.
OpenUrl Abstract/FREE Full Text

[87] ↵
Rockman, M. V. 2012. The Qtn Program and the Alleles That Matter for Evolution: All That’s Gold Does Not Glitter. Evolution 66:1–17.
OpenUrl CrossRef PubMed Web of Science

[88] ↵
Schluter, D. 2000. The Ecology of Adaptive Radiation. Oxford University Press, Oxford.

[89] ↵
Schumer, M., C. L. Xu, D. L. Powell, A. Durvasula, L. Skov, C. Holland, J. C. Blazier, S. Sankararaman, P. Andolfatto, G. G. Rosenthal, and M. Przeworski. 2018. Natural selection interacts with recombination to shape the evolution of hybrid genomes. Science 360:656–659.
OpenUrl Abstract/FREE Full Text

[90] ↵
Simao, F. A., R. M. Waterhouse, P. Ioannidis, E. V. Kriventseva, and E. M. Zdobnov. 2015. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31:3210–3212.
OpenUrl CrossRef PubMed

[91] ↵
Slatkin, M. 1991. Inbreeding Coefficients and Coalescence Times. Genetical Research 58:167–175.
OpenUrl CrossRef PubMed Web of Science

[92] ↵
Sobel, J. M., S. Stankowski, and M. A. Streisfeld. 2019. Variation in ecophysiological traits might contribute to ecogeographic isolation and divergence between parapatric ecotypes of Mimulus aurantiacus. J Evol Biol.

[93] ↵
Sobel, J. M. and M. A. Streisfeld. 2015. Strong premating reproductive isolation drives incipient speciation in Mimulus aurantiacus. Evolution 69:447–461.
OpenUrl CrossRef PubMed

[94] ↵
Soria-Carrasco, V., Z. Gompert, A. A. Comeault, T. E. Farkas, T. L. Parchman, J. S. Johnston, C. A. Buerkle, J. L. Feder, J. Bast, T. Schwander, S. P. Egan, B. J. Crespi, and P. Nosil. 2014. Stick Insect Genomes Reveal Natural Selection’s Role in Parallel Speciation. Science 344:738–742.
OpenUrl Abstract/FREE Full Text

[95] ↵
Stamatakis, A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312–1313.
OpenUrl CrossRef PubMed Web of Science

[96] ↵
Stanke, M. and S. Waack. 2003. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics 19:Ii215–Ii225.
OpenUrl CrossRef PubMed

[97] ↵
Stankowski, S., J. M. Sobel, and M. A. Streisfeld. 2015. The geography of divergence with gene flow facilitates multitrait adaptation and the evolution of pollinator isolation in Mimulus aurantiacus. Evolution 69:3054–3068.
OpenUrl

[98] ↵
Stankowski, S., J. M. Sobel, and M. A. Streisfeld. 2017. Geographic cline analysis as a tool for studying genome-wide variation: a case study of pollinator-mediated divergence in a monkeyflower. Molecular Ecology 26:107–122.
OpenUrl

[99] ↵
Stankowski, S. and M. A. Streisfeld. 2015. Introgressive hybridization facilitates adaptive divergence in a recent radiation of monkeyflowers. Proceedings of the Royal Society B-Biological Sciences 282:154–162.
OpenUrl

[100] ↵
Streisfeld, M. A. and J. R. Kohn. 2005. Contrasting patterns of floral and molecular variation across a cline in Mimulus aurantiacus. Evolution 59:2548–2559.
OpenUrl CrossRef PubMed Web of Science

[101] ↵
Streisfeld, M. A. and J. R. Kohn. 2007. Environment and pollinator-mediated selection on parapatric floral races of Mimulus aurantiacus. Journal of Evolutionary Biology 20:122–132.
OpenUrl CrossRef PubMed Web of Science

[102] ↵
Streisfeld, M. A., W. N. Young, and J. M. Sobel. 2013. Divergent Selection Drives Genetic Differentiation in an R2R3-MYB Transcription Factor That Contributes to Incipient Speciation in Mimulus aurantiacus. Plos Genetics 9.

[103] ↵
Thompson, D. M. 2005. Systematics of Mimulus subgenus Schizoplacus (Scrophulariaceae). Systematic Botany Monographs 75:1–213.
OpenUrl

[104] ↵
Turner, T. L., M. W. Hahn, and S. V. Nuzhdin. 2005. Genomic islands of speciation in Anopheles gambiae. Plos Biology 3:1572–1578.
OpenUrl CrossRef Web of Science

[105] ↵
Van Doren, B. M., L. Campagna, B. Helm, J. C. Illera, I. J. Lovette, and M. Liedvogel. 2017. Correlated patterns of genetic diversity and differentiation across an avian family. Molecular Ecology 26:3982–3997.
OpenUrl CrossRef PubMed

[106] ↵
Vickery, R. K. 1995. Speciation by Aneuploidy and Polyploidy in Mimulus (Scrophulariaceae). Great Basin Naturalist 55:174–176.
OpenUrl Web of Science

[107] ↵
Wolf, J. B. and H. Ellegren. 2017. Making sense of genomic islands of differentiation in light of speciation. Nature Reviews Genetics 18:87–100.
OpenUrl CrossRef

[108] ↵
Wu, C. I. 2001. The genic view of the process of speciation. Journal of Evolutionary Biology 14:851–865.
OpenUrl CrossRef Web of Science

[109] ↵
Zerbino, D. R. and E. Birney. 2008. Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Research 18:821–829.
OpenUrl Abstract/FREE Full Text

Widespread selection and gene flow shape the genomic landscape during a radiation of monkeyflowers

Abstract

Introduction

Results and Discussion

A chromosome-level genome assembly, map, and annotation for the bush monkeyflower

Phylogenetic relationships among taxa and patterns of discordance across the genome

Correlated patterns of genome-wide variation across the radiation

Common genomic landscapes have been shaped by heterogeneous indirect selection

A known adaptive locus shows a strong deviation from the common pattern of differentiation

Correlated differentiation landscapes emerge rapidly following a population split

Simulations suggest that adaptation has played an important role in genomic landscape evolution

Radiation-wide evidence for selection against gene flow

Conclusions and implications for understanding genomic landscape evolution and the basis of adaptation and speciation

Data Accessibility

Materials and Methods

Genome Assembly

Construction of high-density linkage map

Genome annotation

Genome re-sequencing and variant calling

Phylogenetic and principal component analyses

Population genomic analyses

Simulations

Tests for genome-wide admixture

Acknowledgments

Footnotes

References

Citation Manager Formats

Subject Area