ABSTRACT
Maternally transmitted Wolbachia infect about half of insect species, yet the predominant mode(s) of Wolbachia acquisition remains uncertain. Species-specific associations could be old, with Wolbachia and hosts co-diversifying (i.e., cladogenic acquisition), or relatively young and acquired by horizontal transfer or introgression. The three Drosophila yakuba-clade hosts ((D. santomea, D. yakuba), D. teissieri) diverged about three million years ago and currently hybridize on Bioko and São Tomé, west African islands. Each species is polymorphic for nearly identical Wolbachia that cause weak cytoplasmic incompatibility (CI), i.e., reduced egg hatch when uninfected females mate with infected males. D. yakuba-clade Wolbachia are closely related to wMel, globally polymorphic in D. melanogaster. We use draft Wolbachia and mitochondrial genomes to demonstrate that D. yakuba-clade phylogenies for Wolbachia and mitochondria tend to follow host nuclear phylogenies. However, roughly half of D. santomea individuals, sampled both inside and outside of the São Tomé hybrid zone, have introgressed D. yakuba mitochondria. Both mitochondria and Wolbachia possess far more recent common ancestors than the bulk of the host nuclear genomes, precluding cladogenic Wolbachia acquisition. General concordance of Wolbachia and mitochondrial phylogenies suggests that horizontal transmission is rare, but varying relative rates of molecular divergence complicate chronogram-based statistical tests. Loci that cause CI in wMel are disrupted in D. yakuba-clade Wolbachia; however, a second set of loci predicted to cause CI is located in the same WO prophage region. These alternative CI loci seem to have been acquired horizontally from distantly related Wolbachia, with transfer mediated by flanking Wolbachia-specific ISWpi1 transposons.
INTRODUCTION
Endosymbiotic Wolbachia bacteria infect many arthropods (Bouchon et al. 1998; Hilgenboecker et al. 2008), including about half of all insect species (Werren and Windsor 2000; Weinert et al. 2015). Wolbachia often manipulate host reproduction, facilitating spread to high frequencies within host species (Laven 1951; Yen and Barr 1971; Turelli and Hoffmann 1991; Rousset et al. 1992; O’Neill et al. 1998; Weeks et al. 2007; Kriesner et al. 2016; Turelli et al. 2018). In Drosophila, reproductive manipulations include cytoplasmic incompatibility (CI) and male killing (Hoffmann et al. 1986; Hoffmann and Turelli 1997; Hurst and Jiggins 2000). CI reduces the egg hatch of uninfected females mated with Wolbachia-infected males, and recent work has demonstrated that WO prophage-associated loci cause CI (Beckmann and Fallon 2013; Beckmann et al. 2017; LePage et al. 2017; Beckmann et al. 2019). Although reproductive manipulations are common, some Wolbachia show little or no reproductive manipulation (e.g., wMel in D. melanogaster, Hoffmann 1988; Hoffmann et al. 1994; Kriesner et al. 2016; wMau in D. mauritiana, Giordano et al. 1995; Meany et al. 2019; wAu in D. simulans, Hoffmann et al. 1996; wSuz in D. suzukii and wSpc in D. subpulchrella, Hamm et al. 2014; Cattel et al. 2018). These Wolbachia presumably spread by enhancing host fitness in various ways, with some support for viral protection, fecundity enhancement, and supplementation of host nutrition (Weeks et al. 2007; Teixeira et al. 2008; Hedges et al. 2008; Brownlie et al. 2009; Martinez et al. 2014; Gill et al. 2014; Moriyama et al. 2015; Kriesner and Hoffmann 2018). Better understanding of Wolbachia effects, transmission and evolution should facilitate using Wolbachia for biocontrol of human diseases by either transforming vector populations with virus-blocking Wolbachia (e.g., McMeniman et al. 2009; Hoffmann et al. 2011; Schmidt et al. 2017; Ritchie 2018) or using male-only releases of CI-causing Wolbachia to suppress vector populations (Laven 1967; O’Connor et al. 2012).
There is a burgeoning literature on Wolbachia frequencies and dynamics in natural populations (e.g., Kriesner et al. 2013; Kriesner et al. 2016; Cooper et al. 2017; Bakovic et al. 2018; Meany et al. 2019), but fewer studies elucidate the modes and time scales of Wolbachia acquisition by host species (O’Neill et al. 1992; Rousset and Solignac 1995; Huigens et al. 2004; Baldo et al. 2008; Raychoudhury et al. 2009; Ahmed et al. 2015; Schuler et al. 2016; Turelli et al. 2018). Sister hosts could acquire Wolbachia from their most recent ancestors. Such cladogenic acquisition seems to be the rule for the obligate Wolbachia found in filarial nematodes (Bandi et al. 1998); and there are also examples in at least two insect clades, Nasonia wasps (Raychoudhury et al. 2009) and Nomada bees (Gerth and Bleidorn 2016). In contrast, infections can be relatively young and acquired through introgressive or horizontal transfer (Raychoudhury et al. 2009; Schuler et al. 2016; Conner et al. 2017; Turelli et al. 2018).
Comparisons of host nuclear and mitochondrial genomes with the associated Wolbachia genomes enable discrimination among cladogenic, introgressive, and horizontal acquisition (Raychoudhury et al. 2009; Turelli et al. 2018; see Figure S1). Concordant nuclear, mitochondrial, and Wolbachia cladograms—including consistent divergence-time estimates for all three genomes—support cladogenic acquisition and co-divergence of host and Wolbachia lineages. Concordant Wolbachia and mitochondrial phylogenies and consistent Wolbachia and mitochondrial divergence-time estimates that are more recent than nuclear divergence support introgressive acquisition. In this case, mitochondrial and Wolbachia relationships may or may not recapitulate the host phylogeny. Finally, if Wolbachia diverged more recently than either nuclear or mitochondrial genomes, horizontal transfer (or paternal transmission; Hoffmann and Turelli 1988; Turelli and Hoffmann 1995) is indicated. This is often associated with discordance between host and Wolbachia phylogenies (O’Neill et al. 1992; Rousset and Solignac 1995; Turelli et al. 2018).
Introgressive Wolbachia acquisition may be common in Drosophila. About half of all closely related Drosophila species have overlapping geographical ranges and show pervasive evidence of reinforcement (Coyne and Orr 1989, 1997; Yukilevich 2012; Nosil 2013), indicating that hybridization must be common (Turelli et al. 2014). Several instances of sporadic contemporary hybridization and interspecific gene flow have been documented in the genus (Carson et al. 1989; Shoemaker et al. 1999; Jaenike et al. 2006; Kulathinal et al. 2009; Garrigan et al. 2012; Brand et al. 2013; Matute and Ayroles 2014; Lohse et al. 2015); but only two stable hybrid zones have been well described, and both involve D. yakuba-clade species. In west Africa, D. yakuba hybridizes with endemic D. santomea on the island of São Tomé, and with D. teissieri on the island of Bioko (Lachaise et al. 2000; Llopart et al. 2005; Comeault et al. 2016; Cooper et al. 2018). The ranges of D. yakuba and D. teissieri overlap throughout continental Africa, but contemporary hybridization has not been observed outside of Bioko (Cooper et al. 2018). Genomic analyses support both mitochondrial and nuclear introgression in the D. yakuba clade (Lachaise et al. 2000; Bachtrog et al. 2006; Llopart et al. 2014; Turissini and Matute 2017; Cooper et al. 2018), and the Wolbachia infecting all three species (wSan, wTei, and wYak) are at intermediate frequencies and identical with respect to commonly used typing loci (Lachaise et al. 2000; Charlat et al. 2004; Cooper et al. 2017).
Horizontal Wolbachia transmission has been repeatedly demonstrated since its initial discovery by O’Neill et al. (1992) (e.g., Baldo et al. 2008; Schuler et al. 2016), and it may be common in some systems (e.g., Huigens et al. 2000; Huigens et al. 2004; Ahmed et al. 2015; Li et al. 2017). Phylogenomic analyses indicate recent horizontal Wolbachia transmission, on the order of 5,000–27,000 years, among relatively distantly related Drosophila species of the D. melanogaster species group (Turelli et al. 2018); but there is little evidence for non-maternal transmission within Drosophila species (Richardson et al. 2012; Turelli et al. 2018). Horizontal Wolbachia transfer could occur via a vector (Vavre et al. 2009; Ahmed et al. 2015) and/or via shared food sources during development (Huigens et al. 2000; Li et al. 2017). Rare paternal Wolbachia transmission has been documented in D. simulans (Hoffmann and Turelli 1988; Turelli and Hoffmann 1995). However, the most common mode of Wolbachia acquisition by Drosophila species (and most other hosts) remains unknown, and all three modes seem plausible in the D. yakuba clade (Lachaise et al. 2000; Bachtrog et al. 2006; Cooper et al. 2017).
Distinguishing acquisition via introgression versus horizontal or paternal transmission requires estimating phylogenies and relative divergence times of mtDNA and Wolbachia (Raychoudhury et al. 2009; Conner et al. 2017; Turelli et al. 2018). Converting sequence divergence to divergence-time estimates requires understanding relative rates of divergence for mtDNA, nuclear genomes, and Wolbachia. If each genome followed a constant-rate molecular clock, taxa with cladogenic Wolbachia transmission, such as Nasonia wasps (Raychoudhury et al. 2009) and Nomada bees (Gerth and Bleidorn 2016), would provide reference rates for calibration. However, like Langley and Fitch (1974), Turelli et al. (2018) found significantly varying relative rates of Wolbachia and mtDNA divergence. This variation, which we document across Drosophila, confounds attempts to understand Wolbachia acquisition, as discussed below.
The discovery of weak CI in the D. yakuba clade (Cooper et al. 2017) motivates detailed comparative analysis of loci associated with CI (CI factors or cifs) in WO prophage regions of Wolbachia genomes. Beckmann and Fallon (2013) first associated wPip_0282 and wPip_0283 proteins in wPip-infected Culex pipiens with Wolbachia-modified sperm. Later work confirmed that these proteins induce toxicity and produce rescue when expressed/co-expressed in Saccharomyces cerevisiae (Beckmann et al. 2017), lending support to a toxin-antidote model of CI (Beckmann et al. 2019; but see Shropshire et al. 2019). Homologs of these genes in wMel (WD0631 and WD0632) recapitulate CI when transgenically expressed in D. melanogaster (LePage et al. 2017), and transgenic expression of WD0631 in D. melanogaster rescues CI (Shropshire et al. 2018). A distantly related WO-prophage-associated pair present in wPip, wPip_0294 and wPip_0295, causes similar toxicity/rescue in S. cerevisiae (Beckmann et al. 2017), and causes CI when placed transgenically into a D. melanogaster background (unpublished data presented by Mark Hochstrasser at the 2018 Wolbachia meeting in Salem, MA). We adopt the nomenclature of Beckmann et al. (2019), which assigns names based on enzymatic activity of the predicted toxin [deubiquitylase (DUB) and nuclease (Nuc)], with superscripts denoting focal Wolbachia strains when needed. Specifically, we refer to the wPip_0282-wPip_0283 and wPip_0294-wPip_0295 pairs as cidA-cidBwPip and cinA-cinBwPip, respectively; and we refer to WD0631-WD0632 as cidA-cidBwMel. This distinguishes cid (CI-inducing DUB) from cin (CI-inducing Nuc) pairs, with the predicted antidote and toxin denoted “A” and “B”, respectively (Beckmann et al. 2019). We acknowledge ongoing disagreement in the literature and direct readers to Beckmann et al. (2019) and Shropshire et al. (2019) for details. However, none of these debates on terminology or mechanism affects our findings.
Here, we use host and Wolbachia genomes from the D. yakuba clade to demonstrate introgressive and horizontal Wolbachia acquisition. General concordance of mitochondrial and Wolbachia phylogenies indicates that horizontal acquisition is rare within this clade. However, tests involving divergence-time estimates are complicated by varying relative rates and patterns of Wolbachia, mtDNA, and nuclear sequence divergence, as illustrated by data from more distantly related Drosophila (Clark et al. 2007). Finally, we demonstrate that cid loci underlying CI in closely related wMel (LePage et al. 2017) are disrupted in all D. yakuba-clade Wolbachia. However, these Wolbachia also contain a set of cin loci, absent in wMel, but very similar to those found in wPip (Beckmann et al. 2017), a B-group Wolbachia strain that diverged 6–46 million years ago (mya) from A-group wYak and wMel (Werren et al. 1995; Meany et al. 2019). This is the first discovery of two sets of loci implicated in CI co-occurring within the same prophage region. Several analyses implicate Wolbachia-specific insertion sequence (IS) transposable elements, specifically ISWpi1, in the horizontal transfer of these loci (and surrounding regions) between distantly related Wolbachia. Horizontal movement of incompatibility factors between prophage regions of Wolbachia variants adds another layer to what is already known about horizontal movement of prophages within and between Wolbachia variants that themselves move horizontally between host species.
MATERIALS AND METHODS
Genomic data
The D. yakuba-clade isofemale lines included in our study were sampled over several years in west Africa (Comeault et al. 2016; Turissini and Matute 2017; Cooper et al. 2017). Each line within each species used in our analyses exhibits little nuclear introgression (< 1%) (Turissini and Matute 2017), and no hybrids were included. Reads from D. yakuba (N = 56), D. santomea (N = 11), and D. teissieri (N = 13) isofemale lines were obtained from the data archives of Turissini and Matute (2017) and aligned to the D. yakuba nuclear and mitochondrial reference genomes (Clark et al. 2007) with bwa 0.7.12 (Li and Durbin 2009), requiring alignment quality scores of at least 50. Because many of the read archives were single end, all alignments were completed using single-end mode for consistency.
mtDNA
Consensus mtDNA sequences for each of the 80 lines were extracted with samtools v. 1.3.1 and bcftools v. 1.3.1 (Li 2011). Coding sequences for the 13 protein-coding genes were extracted based on their positions in the D. yakuba reference. We also extracted the 13 protein-coding genes from additional unique D. yakuba (N = 28) and D. santomea (N = 15) mitochondrial genomes (Llopart et al. 2014), and from the D. melanogaster reference (Hoskins et al. 2015). Genes were aligned using MAFFT v. 7 (Katoh and Standley 2013) and concatenated. Lines identical across all 13 genes were represented by one sequence for the phylogenetic analyses.
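The final step above, representing lines with identical concatenated sequences by a single sequence, can be sketched as follows. This is a minimal illustration rather than the pipeline actually used; the function name, line names, and toy sequences are hypothetical.

```python
from typing import Dict, List


def collapse_identical(sequences: Dict[str, str]) -> Dict[str, List[str]]:
    """Group line names by identical concatenated sequence.

    Returns a mapping from one representative line name to the list of all
    lines (including the representative) that share its sequence.
    """
    groups: Dict[str, List[str]] = {}
    for name, seq in sequences.items():
        # Compare case-insensitively so 'acgt' and 'ACGT' count as identical.
        groups.setdefault(seq.upper(), []).append(name)
    # The first line seen for each distinct sequence acts as representative.
    return {names[0]: names for names in groups.values()}


# Hypothetical line names and toy sequences, for illustration only.
toy = {"yak_A": "ATGGCC", "yak_B": "ATGGCC", "san_A": "ATGGCT"}
representatives = collapse_identical(toy)
```

Keeping the full membership list for each representative makes it straightforward to map tips of the estimated tree back to every line they stand for.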
Wolbachia
Reads from each of the 80 Turissini and Matute (2017) lines were aligned using bwa 0.7.12 (Li and Durbin 2009) to the D. yakuba reference genome (Clark et al. 2007) combined with the wMel reference genome (Wu et al. 2004). We calculated the average depth of coverage across the wMel genome. We considered lines with < 1× coverage uninfected and lines with > 10× coverage infected. No lines had between 1× and 10× coverage. To test our genomic analyses of infection status, we used a polymerase chain reaction (PCR) assay on a subset of lines. We extracted DNA using a “squish” buffer protocol (Gloor et al. 1993), and infection status was determined using primers for the Wolbachia-specific wsp gene (Braig et al. 1998; Baldo et al. 2006). We amplified a Drosophila-specific region of chromosome 2L as a positive control (Kern et al. 2015) (primers are listed in Supplemental Material, Table S1). For each run, we also included a known Wolbachia-positive line and a water blank as controls.
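The coverage-based infection call described above reduces to a two-threshold rule. The sketch below assumes per-site depths over the wMel reference are already available as a list of integers (for example, parsed from a depth table); the function names are hypothetical and the thresholds are the 1× and 10× values stated in the text.

```python
from typing import Optional, Sequence


def mean_depth(per_site_depths: Sequence[int]) -> float:
    """Average depth of coverage over all reference sites."""
    if not per_site_depths:
        raise ValueError("no sites supplied")
    return sum(per_site_depths) / len(per_site_depths)


def classify_infection(depth: float,
                       uninfected_below: float = 1.0,
                       infected_above: float = 10.0) -> Optional[str]:
    """Return 'uninfected', 'infected', or None for the ambiguous band."""
    if depth < uninfected_below:
        return "uninfected"
    if depth > infected_above:
        return "infected"
    # Between the two thresholds the call is left open; the text reports
    # that no line fell in this band.
    return None
```

Returning `None` for the intermediate band, rather than forcing a call, mirrors the text's treatment of 1× to 10× as a range that would need separate confirmation such as the PCR assay.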
To produce a draft Wolbachia “pseudoreference” genome for each species, we first determined the isofemale line from each species that produced the greatest average coverage depth over wMel. We trimmed the reads with Sickle v. 1.33 (Joshi and Fass 2011) and assembled with ABySS v. 2.0.2 (Jackman et al. 2017). K values of 51, 61…91 were tried. Scaffolds with best nucleotide BLAST matches to known Wolbachia sequences with E-values less than 10^−10 were extracted as the draft Wolbachia assembly. For each species, the assembly with the highest N50 and fewest scaffolds was kept as our Wolbachia pseudoreference genome for that host species, denoted wYak, wSan, and wTei (Table S2). To assess the quality of these three draft assemblies, we used BUSCO v. 3.0.0 to search for homologs of the near-universal, single-copy genes in the BUSCO proteobacteria database (Simão et al. 2015). For comparison, we followed Conner et al. (2017) and performed the same search on the reference genomes for wMel (Wu et al. 2004), wRi (Klasson et al. 2009), wAu (Sutton et al. 2014), wHa and wNo (Ellegaard et al. 2013) (Table S3).
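The assembly-selection criterion can be made concrete with a short sketch. It computes N50 from scaffold lengths and picks the K value whose assembly has the highest N50, breaking ties by the fewest scaffolds. The tie-breaking order is our assumption, since the text does not say how the two criteria were combined, and the scaffold lengths are toy values.

```python
from typing import Dict, List


def n50(lengths: List[int]) -> int:
    """Length L such that scaffolds of length >= L hold at least half the bases."""
    if not lengths:
        raise ValueError("empty assembly")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def pick_assembly(assemblies: Dict[int, List[int]]) -> int:
    """Return the K whose assembly has the highest N50, then fewest scaffolds."""
    return max(assemblies,
               key=lambda k: (n50(assemblies[k]), -len(assemblies[k])))


# Toy scaffold lengths keyed by K; not real assembly statistics.
toy_assemblies = {51: [10, 8, 5, 3, 2], 61: [12, 9, 7], 71: [12, 9, 4, 3]}
best_k = pick_assembly(toy_assemblies)
```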
Using these draft Wolbachia pseudoreference genomes, reads from all other genotypes were aligned to the D. yakuba reference (nuclear and mitochondrial) plus the species-specific Wolbachia draft assembly with bwa 0.7.12 (Li and Durbin 2009). Consensus sequences were extracted with samtools v. 1.3.1 and bcftools v. 1.3.1 (Li 2011).
To test whether the choice of Wolbachia pseudoreference influenced the Wolbachia draft sequence obtained for each infected line, we arbitrarily selected three infected lines of each host species and aligned reads from each of those lines independently to our wYak, wSan, and wTei pseudoreferences. This resulted in 27 alignments (9 per host species). Among the three Wolbachia consensus sequences generated for each line, we assessed single-nucleotide variants between different mappings of the same line within loci used for downstream analyses. Irrespective of the pseudoreference used, we obtained identical Wolbachia sequences. This indicates the robustness of our approach to generating population samples of draft Wolbachia genomes, without having a high-quality Wolbachia reference genome from any of the host species. Because Llopart et al. (2014) released only assembled mitochondrial genomes, we could not assess the Wolbachia infection status of their lines.
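The robustness check above amounts to counting single-nucleotide differences among the consensus sequences obtained for one line under each pseudoreference. A minimal sketch follows, assuming the consensus sequences for a locus are already the same length; skipping positions with `N` or gap characters is our assumption, and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict


def count_differences(seq_a: str, seq_b: str, ignore: str = "Nn-") -> int:
    """Count sites where two equal-length sequences carry different bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        # Sites that are uncalled or gapped in either sequence are skipped.
        if a not in ignore and b not in ignore and a.upper() != b.upper()
    )


def max_pairwise_differences(consensus_by_reference: Dict[str, str]) -> int:
    """Largest difference count over all pairs of mappings of one line."""
    pairs = combinations(consensus_by_reference.values(), 2)
    return max((count_differences(a, b) for a, b in pairs), default=0)
```

A result of zero for every line and locus corresponds to the outcome reported in the text, where the pseudoreference did not change the sequence obtained.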
Loci for phylogenetic and comparative-rate analyses
Wolbachia genes
For each phylogenetic analysis of Wolbachia data, the draft genomes were annotated with Prokka v. 1.11 (Seemann 2014), which identifies homologs to known bacterial genes. To avoid pseudogenes and paralogs, we used only genes present in a single copy and with identical lengths in all of the sequences analyzed. Genes were identified as single copy if they uniquely matched a bacterial reference gene identified by Prokka v. 1.11. By requiring all homologs to have identical length in all of our draft Wolbachia genomes, we removed all loci with indels across any of our sequences.
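The locus filter described above can be sketched as a set intersection followed by a length check. The input here is a hypothetical mapping from genome name to a list of (gene name, length) pairs, such as might be parsed from annotation output; the gene names and lengths are toy values.

```python
from collections import Counter
from typing import Dict, List, Tuple


def single_copy_equal_length(
    annotations: Dict[str, List[Tuple[str, int]]]
) -> List[str]:
    """Genes annotated exactly once in every genome, with one shared length."""
    per_genome = []
    for genes in annotations.values():
        counts = Counter(name for name, _ in genes)
        # Keep only genes that occur once in this genome.
        per_genome.append(
            {name: length for name, length in genes if counts[name] == 1}
        )
    if not per_genome:
        return []
    shared = set(per_genome[0])
    for lengths in per_genome[1:]:
        shared &= set(lengths)
    # Any length difference implies an indel, so such loci are dropped.
    return sorted(
        name for name in shared
        if len({lengths[name] for lengths in per_genome}) == 1
    )


# Toy annotations, for illustration only.
toy_annotations = {
    "wYak": [("dnaA", 1380), ("ftsZ", 1197), ("wsp", 714), ("wsp", 714)],
    "wSan": [("dnaA", 1380), ("ftsZ", 1200), ("wsp", 714)],
    "wTei": [("dnaA", 1380), ("ftsZ", 1197), ("wsp", 714)],
}
kept = single_copy_equal_length(toy_annotations)
```

Because the filter requires agreement across every genome, adding more strains can only shrink the retained set, which is why the number of usable loci depends on the strains analyzed.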
Given that many loci accumulate indels over time, the number of loci included in our phylogenetic analyses depended on the number of strains included. We first estimated phylograms for the A-group Wolbachia from the D. yakuba clade, wMel from D. melanogaster (Wu et al. 2004), wInc from Drosophila incompta (Wallau et al. 2016), wSuz from D. suzukii (Siozios et al. 2013), wAna from D. ananassae (Choi et al. 2015), Wolbachia that infect Nomada bees (wNFe, wNPa, wNLeu, and wNFa; Gerth and Bleidorn 2016), and Wolbachia that infect D. simulans (wRi, wAu and wHa; Klasson et al. 2009; Sutton et al. 2014; Ellegaard et al. 2013). We also analyzed distantly related B-group Wolbachia: wNo from D. simulans (Ellegaard et al. 2013), wPip_Pel from Culex pipiens (Klasson et al. 2008), and wAlbB from Aedes albopictus (Mavingui et al. 2012). To increase our phylogenetic resolution within the D. yakuba Wolbachia clade (by increasing the number of loci), we estimated a phylogram that included only D. yakuba-clade Wolbachia and wMel. Our phylogram with both A- and B-group Wolbachia included 146 genes (containing 115,686 bp), and our phylogram that included only D. yakuba-clade Wolbachia and wMel included 643 genes (containing 644,586 bp).
We also constructed an absolute chronogram to estimate divergence between D. yakuba-clade Wolbachia strains. To illustrate how divergence-time estimates vary with changing patterns of molecular divergence, we estimated two additional Wolbachia chronograms: one that considered only the wMel reference plus the two most-diverged wMel lines included in Richardson et al. (2012), and one that included these three wMel variants and our D. yakuba-clade Wolbachia. Our chronogram that included only D. yakuba-clade Wolbachia included 678 genes (containing 695,118 bp), our chronogram that included only wMel Wolbachia included 692 genes (containing 709,599 bp), and our chronograms with wMel plus D. yakuba-clade Wolbachia included 621 genes (containing 624,438 bp). Independent estimates were obtained using relaxed-clock analyses with Γ(2,2) and Γ(7,7) branch-rate priors. We also estimated divergence between D. yakuba-clade Wolbachia and wMel using a strict-clock analysis, corresponding to Γ(n,n) as n → ∞.
mtDNA and nuclear genes from diverse Drosophila
To assess variation in the relative rates of divergence of mitochondrial and nuclear genes, we analyzed the canonical 12 Drosophila genomes (Clark et al. 2007) plus D. suzukii (Chiu et al. 2013). We excluded D. sechellia and D. persimilis, which show evidence of introgression with D. simulans and D. pseudoobscura, respectively (Kulathinal et al. 2009; Schrider et al. 2018; Brand et al. 2013). Coding sequences for 20 nuclear genes used in the analyses of Turelli et al. (2018) (aconitase, aldolase, bicoid, ebony, enolase, esc, g6pdh, glyp, glys, ninaE, pepck, pgi, pgm, pic, ptc, tpi, transaldolase, white, wingless, and yellow) were obtained from FlyBase for each species. The genes were aligned with MAFFT v. 7 (Katoh and Standley 2013). Coding sequences for the 13 protein-coding mitochondrial genes in the inbred reference strains were also obtained from FlyBase for each species and were aligned with MAFFT v. 7 and concatenated.
Phylogenetic analyses
All of our analyses used RevBayes v. 1.0.9 (Hohna et al. 2016), following the procedures of Turelli et al. (2018). For completeness, we summarize those methods below. For additional details on the priors and their justifications, consult Turelli et al. (2018). Four independent runs were performed for each phylogenetic tree we estimated; and in all cases, the runs converged to the same topologies. Nodes with posterior probability less than 0.95 were collapsed into polytomies.
Wolbachia phylograms
We estimated a phylogram for A- and B-group Wolbachia and for only D. yakuba-clade and wMel Wolbachia using the same methodology as Turelli et al. (2018). We used a GTR + Γ model with four rate categories, partitioning by codon position. Each partition had an independent rate multiplier with prior Γ(1,1) (i.e., Exp(1)), as well as stationary frequencies and exchangeability rates drawn from flat, symmetrical Dirichlet distributions (i.e., Dirichlet(1,1,1…)). The model used a uniform prior over all possible topologies. Branch lengths were drawn from a flat, symmetrical Dirichlet distribution; thus, they summed to 1. Since the expected number of substitutions along a branch equals the branch length times the rate multiplier, the expected number of substitutions across the entire tree for a partition is equal to the partition’s rate multiplier.
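The closing statement follows in one line from the constraint that branch lengths sum to 1. As a short worked derivation, write m_p for the rate multiplier of partition p and ℓ_b for the length of branch b; both symbols are introduced here only for illustration and do not appear in the original methods.

```latex
\[
\mathbb{E}\bigl[\text{substitutions per site in partition } p\bigr]
  \;=\; \sum_{b} m_p\,\ell_b
  \;=\; m_p \sum_{b} \ell_b
  \;=\; m_p ,
\qquad \text{because } \sum_{b} \ell_b = 1 .
\]
```

Under this parameterization the rate multiplier carries all information about the total amount of change in a partition, while the Dirichlet-distributed branch lengths describe only how that change is apportioned among branches.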
Wolbachia chronograms
We first created a relaxed-clock relative chronogram with the root age fixed to 1 using the GTR + Γ model, partitioned by codon position, using the same birth-death prior as Turelli et al. (2018). Each partition had an independent rate multiplier with prior Γ(1,1), as well as stationary frequencies and exchangeability rates drawn from flat, symmetrical Dirichlet distributions. The branch-rate prior for each branch was Γ(2,2), normalized to a mean of 1 across all branches (Table S4). We also tried a strict-clock tree and a relaxed-clock tree with branch-rate prior Γ(7,7), which produced no significant differences. We used the scaled distribution Γ(7,7) × 6.87 × 10^−9 to model substitutions per third-position site per year. This transforms the relative chronogram into an absolute chronogram. This scaled distribution was chosen to replicate the upper and lower credible intervals of the posterior distribution estimated by Richardson et al. (2012), assuming 10 Drosophila generations per year, normalized by their median substitution-rate estimate. Branch lengths in absolute time were calculated as the relative-branch length times the third-position rate multiplier divided by the substitutions per third-position site per year estimate above.
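The conversion from relative to absolute time described above is a single ratio, and the scaled gamma distribution for the substitution rate can be sampled directly. The sketch below is an illustration under stated assumptions, not the RevBayes analysis itself: the function names are hypothetical, and any branch length or rate multiplier passed in would be a made-up example value. Only the Γ(7,7) shape and the 6.87 × 10^−9 scale come from the text.

```python
import random
from typing import List

# Mean of the scaled rate distribution: Gamma(7,7) has mean 1, so the mean
# rate is the scale factor itself (substitutions per third-position site per
# year, as given in the text).
MEAN_RATE = 6.87e-9


def absolute_branch_length(relative_length: float,
                           third_pos_multiplier: float,
                           rate: float = MEAN_RATE) -> float:
    """Years = (relative length x third-position multiplier) / yearly rate."""
    return relative_length * third_pos_multiplier / rate


def sample_rates(n: int, shape: float = 7.0, seed: int = 0) -> List[float]:
    """Draw n rates from Gamma(shape, rate=shape) scaled by MEAN_RATE."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale); rate = shape means scale = 1/shape.
    return [rng.gammavariate(shape, 1.0 / shape) * MEAN_RATE for _ in range(n)]
```

Dividing a fixed numerator by each sampled rate gives a rough sense of how uncertainty in the substitution rate alone spreads the age estimate, independent of uncertainty in the tree.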
We illustrate how divergence-time estimates depend on changing patterns of Wolbachia molecular evolution observed over different time scales, specifically the relative rates of third-site substitutions versus first- and second-site substitutions. We compare divergence times estimated for variants within melanogaster-group host species, with a time scale of only hundreds or thousands of years, with the same divergence times estimated when more distantly related Wolbachia, with divergence times over tens of thousands of years, are included in the analyses. We present three separate analyses: one using only D. yakuba-clade Wolbachia variants, a second using only wMel variants analyzed by Richardson et al. (2012), and a third simultaneously analyzing both sets of data.
Drosophila phylogeny for the 11 reference species
We estimated a phylogram from our 20 nuclear loci using the GTR + Γ model, with partitioning by gene and codon position. The model was identical to the Wolbachia phylogram model above except for the partitioning (there are too few Wolbachia substitutions to justify partitioning by gene).
mtDNA phylogeny for the 11 reference species
We estimated a phylogram from the 13 protein-coding mitochondrial loci using the GTR + Γ model, with partitioning only by codon position. The model was identical to the Wolbachia phylogram model above.
mtDNA chronogram for the D. yakuba clade
We estimated a relative chronogram for the D. yakuba-clade mitochondria from the 13 protein-coding loci using the GTR + Γ model, partitioning only by codon position. To test the sensitivity of our results to priors, we ran a strict clock, a relaxed clock with a Γ(7,7) branch-rate prior, and a relaxed clock with a Γ(2,2) branch-rate prior, corresponding to increasing levels of substitution-rate variation across branches. The model was identical to the Wolbachia chronogram model above, except that we did not transform it into an absolute chronogram.
Ratios of divergence rates for mtDNA versus nuclear loci
To quantify ratios of mtDNA to nuclear substitution rates, we estimated relative substitution rates for host nuclear genes versus mtDNA using the GTR + Γ model. The (unrooted) topology was fixed to the consensus topology for the nuclear and mitochondrial data. The data from 20 nuclear loci were partitioned by locus and codon position; the mtDNA data were partitioned only by codon position. All partitions shared the same topology, but the nuclear partitions were allowed to have branch lengths different from the mtDNA partitions. The sum of the branch lengths for each partition was scaled to 1. Assuming concurrent nuclear-mtDNA divergence (because we used only species showing no evidence of introgression), we imposed the same absolute ages for all nodes of the nuclear and mtDNA chronograms. For each nuclear locus, the third-position rate ratio for the mtDNA versus nuclear genomes was calculated as: (mitochondrial branch length × mitochondrial 3rd position rate multiplier)/(nuclear branch length × nuclear 3rd position rate multiplier). We summarized the relative rates of mtDNA versus nuclear substitutions along each branch using the arithmetic average of the 20 ratios obtained from the individual nuclear loci.
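The per-branch ratio and its average over nuclear loci can be written out directly from the formula above. This is a minimal sketch with hypothetical function names; any numeric inputs would be example values rather than estimates from the data.

```python
from typing import Iterable, Tuple


def third_position_rate_ratio(mt_branch: float, mt_multiplier: float,
                              nuc_branch: float, nuc_multiplier: float) -> float:
    """(mt branch length x mt multiplier) / (nuclear branch length x nuclear multiplier)."""
    return (mt_branch * mt_multiplier) / (nuc_branch * nuc_multiplier)


def mean_ratio_over_loci(mt_branch: float, mt_multiplier: float,
                         nuclear_loci: Iterable[Tuple[float, float]]) -> float:
    """Arithmetic mean of the ratio across nuclear loci for one branch.

    nuclear_loci yields (branch length, third-position multiplier) per locus.
    """
    ratios = [
        third_position_rate_ratio(mt_branch, mt_multiplier, branch, multiplier)
        for branch, multiplier in nuclear_loci
    ]
    if not ratios:
        raise ValueError("at least one nuclear locus is required")
    return sum(ratios) / len(ratios)
```

Averaging the ratios locus by locus, as the text describes, weights every nuclear gene equally regardless of its length.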
Introgressive versus horizontal Wolbachia transfer–concordance of phylograms
Horizontal transfer of Wolbachia was initially detected by discordance between Wolbachia and host phylogenies (O’Neill et al. 1992). When considering samples within species and closely related species, a comparison of mitochondrial and Wolbachia phylogenies provides a natural test for horizontal transmission. We first look for significant discrepancies between strongly supported nodes in the mitochondrial versus Wolbachia phylogenies. To test for less obvious differences, we follow Richardson et al. (2012) and compute Bayes factors to compare a model in which mitochondria and Wolbachia share a single topology against a model in which they follow distinct topologies. For these calculations, we omit san-Quija37 due to its clear discordance.
We calculated the marginal likelihood in RevBayes for two models: one with a shared mitochondrial and Wolbachia phylogeny, and another with independent topologies. The mtDNA and Wolbachia data were each partitioned by codon position, for a total of six partitions. In the shared-phylogeny model, all six partitions share the same topology; in the independent model, mitochondria and Wolbachia have separate topologies. All priors were the same as in the Wolbachia phylogram model. Lines that were identical across the mtDNA and Wolbachia were collapsed into one sample. We ran two independent replicates of each model with 50 stepping stones per run. The Bayes factor is computed as the difference between the log marginal likelihoods of the two models.
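The model comparison can be illustrated with a few lines of arithmetic. Bayes factors are conventionally reported on the log scale, as the difference between the log marginal likelihoods of the two models. The sketch below assumes those log marginal likelihoods have already been estimated; the function names are hypothetical, and how the two replicates were combined is not stated in the text, so the replicate check is purely illustrative.

```python
from typing import Sequence


def log_bayes_factor(log_ml_shared: float, log_ml_independent: float) -> float:
    """ln BF in favor of the shared-topology model (positive favors shared)."""
    return log_ml_shared - log_ml_independent


def replicates_agree(log_mls: Sequence[float], tolerance: float = 1.0) -> bool:
    """True if replicate log marginal likelihoods fall within `tolerance`.

    The tolerance is an arbitrary illustrative value, not one from the text.
    """
    return max(log_mls) - min(log_mls) <= tolerance
```

A positive value supports a shared mitochondrial and Wolbachia topology, which is the pattern expected under strictly maternal co-transmission; a negative value supports independent topologies and hence some non-maternal transfer.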
Introgressive versus horizontal Wolbachia transfer–relative rates
Within species, the phylogenies of mitochondria and Wolbachia are fully resolved. Thus we consider an alternative approach for distinguishing between introgressive versus horizontal transfer that depends on estimating divergence times for Wolbachia genomes and host mtDNA. This is complicated by variation in the relative rates of mtDNA versus Wolbachia divergence and by systematic changes over time in the relative rates of substitutions at the three codon positions for Wolbachia. Following the procedures in Turelli et al. (2018), we estimated the mtDNA-to-Wolbachia third-position substitution-rate ratio for each branch in the D. yakuba clade. For each analysis, we computed the marginal likelihood of the model where all branches shared the same ratio and the same model except allowing different ratios on each branch. We then calculated Bayes factors (i.e. differences in the log of the marginal likelihoods) to determine which model was favored. We repeated these analyses after including three wMel-infected D. melanogaster lines with the D. yakuba-clade dataset. Because D. yakuba-clade species and D. melanogaster do not produce fertile hybrids (Sánchez and Santamaria 1997; Turissini et al. 2017), introgressive transfer of Wolbachia is not possible between these species. Hence, including wMel provides a control for our proposed test of introgressive transfer. For these controls, we calculated the mitochondrial versus Wolbachia substitution-rate ratio for the interspecific branches separately, regardless of whether the shared rate-ratio model was favored.
Wolbachia loci associated with CI
Wolbachia that infect all three D. yakuba-clade hosts cause weak intra- and interspecific CI (Cooper et al. 2017), but the genetic basis of CI in this clade remains unknown. We used tBLASTN to search for cif homologs in each of our D. yakuba-clade Wolbachia assemblies, querying cidA-cidB variants and the cinA-cinBwRi pair (wRi_006720 and wRi_006710) found in wRi and some wRi-like Wolbachia (Beckmann and Fallon 2013; Turelli et al. 2018), and the cinA-cinBwPip pair (Beckmann and Fallon 2013; Beckmann et al. 2017; LePage et al. 2017; Lindsey et al. 2018). We identified both cid and cin homologs, but we found no close matches to the cinA-cinBwRi pair (denoted Type II cif loci by LePage et al. 2017) in any of our genomes. For all of the samples, we extracted consensus sequences from our assemblies and alignments for the cidA-cidBwYak-clade and cinA-cinBwYak-clade gene pairs. The genes were aligned with MAFFT v. 7 (Katoh and Standley 2013). We examined variation in these regions relative to cidA-cidBwMel and cinA-cinBwPip, respectively.
Because we unexpectedly identified homologs of cinA-cinB in all wYak-clade Wolbachia, we took additional measures to understand the origin and placement of these loci in the genomes. The cinA-cinBwYak-clade open reading frames are located on a ∼11,500 bp scaffold in the fragmented wYak assembly (wYak scaffold “702380” from ABySS output––we use quotes to indicate names assigned by ABySS). The first ∼4,000 bp of this scaffold contain cinA and cinB genes that have ∼97% identity with those in wPip. We performed a BLAST search using all contigs in the wYak assembly as queries against the wMel genome (Camacho et al. 2009). The cinA-cinBwYak-clade scaffold was placed on the wMel genome (∼617,000–623,000) adjacent to and downstream of the wYak contig containing cidA-cidBwYak-clade loci (wYak contig “187383”). However, only ∼7,000 bp of the 11,500 bp of the scaffold containing cinA-cinBwYak-clade align to this region of wMel. The ∼4,000 bp sequence that contains the ORFs for cinA-cinBwYak-clade has no significant hit against the wMel genome.
To verify the placement of the scaffold containing cinA-cinBwYak-clade, we first performed targeted assembly using the cinA-cinBwYak-clade contig and the adjacent contigs as references (Hunter et al. 2015). Iterative targeted assembly often extends assembled scaffolds (Langmead and Salzberg 2012). Reads are mapped to the target of interest, and the mapped reads are then assembled using SPADES (Bankevich et al. 2012; Nurk et al. 2013). The newly assembled scaffolds serve as the target in a subsequent round of mapping. The procedure is repeated until no new reads are recruited to the assembled scaffold from the previous round. The extended scaffolds enabled us to merge flanking wYak contigs downstream of the cinA-cinBwYak-clade scaffold but failed to connect the fragment with the cidA-cidBwYak-clade contig. We used PCR to amplify the intervening region (Supplemental Information, Table S1), and Sanger sequencing to evaluate this product (Sanger et al. 1977).
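The iterative loop described above (map reads to the target, assemble the mapped reads, use the new scaffolds as the next target, stop when no new reads are recruited) can be sketched as below. This is a hedged illustration only: the mapper and assembler are passed in as callables rather than tied to any real tool's command line, the toy stand-ins and the read set are hypothetical, and reassembling all reads recruited so far in each round is an assumption, not a documented detail of the pipeline used here.

```python
from typing import Callable, Iterable, List, Set


def iterative_targeted_assembly(
    target: List[str],
    map_reads: Callable[[List[str]], Set[str]],
    assemble: Callable[[Iterable[str]], List[str]],
    max_rounds: int = 50,
) -> List[str]:
    """Repeat map -> assemble until a round recruits no new reads."""
    recruited: Set[str] = set()
    for _ in range(max_rounds):
        hits = map_reads(target)
        if hits <= recruited:  # no new reads recruited: stop
            break
        recruited |= hits
        # Assumption: reassemble every read recruited so far.
        target = assemble(sorted(recruited))
    return target


# Toy stand-ins (hypothetical): a read "maps" if it shares any 3-mer with the
# target, and "assembly" simply returns the recruited reads unchanged.
def kmers(seq: str, k: int = 3) -> Set[str]:
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


READS = ["ACGTAC", "TACGGA", "GGATTT", "CCCCCC"]


def toy_map(target: List[str]) -> Set[str]:
    target_kmers = set().union(*(kmers(t) for t in target))
    return {r for r in READS if kmers(r) & target_kmers}


def toy_assemble(reads: Iterable[str]) -> List[str]:
    return list(reads)


extended = iterative_targeted_assembly(["ACGT"], toy_map, toy_assemble)
```

In the toy run, the seed recruits two reads in the first round and a third in the second round through the newly added sequence; the unrelated read is never recruited and the loop stops in the third round.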
Subsequent mapping of paired-end reads to the merged scaffolds confirmed the correct order and orientation of the contigs containing the cidA-cidBwYak-clade and cinA-cinBwYak-clade genes. Observed, uncorrected, pairwise distances between focal regions in wYak, wMel, and wPip and between focal regions in wYak, wPip, wAlbB, and wNLeu were calculated using a sliding window (window size = 200 bp, step size = 25 bp).
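The sliding-window calculation can be sketched as follows, using the window size (200 bp) and step size (25 bp) given above. This is an illustrative sketch, not the code used for the analysis: the function name is hypothetical, the inputs are assumed to be two rows of an existing alignment of equal length, and skipping gapped or ambiguous columns is an assumption.

```python
from typing import List, Tuple

SKIP = set("-N?")  # assumption: gapped or ambiguous columns are not compared


def sliding_window_distance(
    seq_a: str, seq_b: str, window: int = 200, step: int = 25
) -> List[Tuple[int, float]]:
    """Uncorrected pairwise distance (proportion of differing sites) per window.

    Returns (window start, distance) pairs with 0-based start coordinates.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        compared = 0
        diffs = 0
        for a, b in zip(seq_a[start:start + window], seq_b[start:start + window]):
            if a in SKIP or b in SKIP:
                continue
            compared += 1
            if a != b:
                diffs += 1
        distance = diffs / compared if compared else float("nan")
        results.append((start, distance))
    return results


# Toy alignment: identical for the first 100 columns, different afterwards.
toy = sliding_window_distance("A" * 300, "A" * 100 + "C" * 200)
```

Because the step is smaller than the window, adjacent windows overlap, which smooths the distance profile along the region.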
Data Availability
The Wolbachia assemblies (BioProject PRJNA543889, Accession VCEF00000000) and Sanger sequences (MK950151 and MK950152) are archived in GenBank. All three assemblies were manually corrected using Sanger sequence to accurately represent variation in the cin region. All relevant code will be uploaded to Dryad prior to publication.
RESULTS
Unidirectional introgression of D. yakuba mitochondria into D. santomea
Previous analyses suggest that D. yakuba and D. santomea carry very similar mtDNA due to ongoing hybridization and introgression (Lachaise et al. 2000; Bachtrog et al. 2006; Llopart et al. 2014). However, our mtDNA relative chronogram (Figure 1), based on mitochondrial whole-proteome data sampled from throughout the ranges of all three species, supports three mtDNA clades that largely agree with the nuclear topology—D. teissieri mtDNA sequences are outgroup to sister D. yakuba and D. santomea mtDNA (Figure 1). Consistent with introgression, 12 out of 26 D. santomea isofemale lines have D. yakuba-like mtDNA (indicated by blue branches and letters in Figure 1). Yet, our samples showed no evidence of D. santomea mtDNA introgressed into D. yakuba and no mtDNA introgression involving D. teissieri. The 12 D. santomea isofemales with introgressed D. yakuba-like mtDNA were sampled from both within (N = 8) and outside (N = 4) the well-described Pico de São Tomé hybrid zone (Llopart et al. 2014; Turissini and Matute 2017). The D. santomea hybrid-zone samples are indicated by “HZ” in Figure 1 in blue, and they are found throughout the clade that includes all D. yakuba mtDNA. This suggests that hybridization occurs in other areas of the island or that introgressed D. santomea genotypes migrate outside of the hybrid zone.
Previous analysis of nuclear genomes (Turissini and Matute 2017) also found more introgression from D. yakuba into D. santomea when populations of D. yakuba from near the Gulf of Guinea, Cameroon, and Kenya were included; however, excluding Cameroon and Kenya indicated similar amounts of introgression in each direction. Matings between D. santomea females and D. yakuba males are rare relative to the reciprocal cross (Coyne et al. 2002, Matute 2010). When these matings do occur, F1 females produce fewer progeny than do F1 females produced by D. yakuba females and D. santomea males (Matute and Coyne 2010). F1 hybrids produced by D. santomea females also have a shortened lifespan (Matute and Coyne 2010), due to copulatory wounds inflicted by D. yakuba males during mating (Kamimura 2012). These observations are consistent with our finding of preferential introgression of D. yakuba mitochondria into D. santomea backgrounds.
Wolbachia frequencies and draft genomes
As expected, we found Wolbachia in all three D. yakuba-clade species sampled by Turissini and Matute (2017). For D. yakuba, 21 of 56 lines were infected, yielding an infection frequency of p = 0.38, with 95% binomial confidence interval (0.25, 0.51). For D. santomea, 10 of 11 were infected, p = 0.91 (0.59, 1.0); and for D. teissieri, 11 of 13 were infected, p = 0.85 (0.55, 0.98). Additional frequency estimates are reported in Cooper et al. (2017), which found that Wolbachia frequencies vary through time and space in west Africa. All PCR tests for Wolbachia matched our coverage-based genomic analyses of infection status.
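For readers who wish to recompute intervals of this kind, the sketch below estimates an infection frequency and an exact (Clopper-Pearson) binomial confidence interval using only the Python standard library. The text does not state which interval method was used, so the choice of Clopper-Pearson is an assumption, and the function names are hypothetical.

```python
import math
from typing import Callable, Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(func: Callable[[float], float], target: float) -> float:
    """Bisection on [0, 1] for a function that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact two-sided binomial confidence interval for k successes in n trials."""
    # Lower bound: P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2
    )
    # Upper bound: P(X <= k | p) = alpha / 2, i.e. P(X > k | p) = 1 - alpha / 2.
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1.0 - binom_cdf(k, n, p), 1 - alpha / 2
    )
    return lower, upper


# Counts taken from the text: infected lines out of lines screened.
counts = {"D. yakuba": (21, 56), "D. santomea": (10, 11), "D. teissieri": (11, 13)}
estimates = {sp: (k / n, clopper_pearson(k, n)) for sp, (k, n) in counts.items()}
```

Under this assumed method, the interval for 10 infected lines of 11 rounds to the (0.59, 1.0) reported for D. santomea.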
The average coverage across the genome, calculated as the total number of bases aligned to the wMel reference divided by its length, was 1,940 for our wYak pseudoreference genome (yak-CY17C), 40 for wSan (san-Quija630.39), and 489 for wTei (teis-cascade_4_2). These pseudoreference genomes were included in the phylogram that includes A- and B-group Wolbachia (Figure 2), and they cluster with the Wolbachia from their respective hosts (Figure 3), as expected if introgression is rare. In general, the wSan data are lower quality due to the relatively small size of the D. santomea libraries (Turissini and Matute 2017). The scaffold count, N50, and total assembly sizes are reported in Table S2.
Introgressive and horizontal Wolbachia transfer among hosts
Our phylogram that includes A- and B-group Wolbachia places the three weak-CI-causing D. yakuba-clade Wolbachia together and sister to wMel in the A group (Figure 2A). In contrast, D. simulans carries diverse Wolbachia that do (wRi, wHa, wNo) and do not (wAu) cause CI, spanning Wolbachia groups A (wRi, wHa, wAu) and B (wNo). All wSan, wYak, and wTei Wolbachia variants included in our analysis are identical at the MLST loci often used to type Wolbachia variants (Cooper et al. 2017; Baldo et al. 2006). Due to the relatively small number of genes that meet our inclusion criteria (146 genes across 115,686 bp), the resulting phylogram does not resolve relationships among wSan, wYak, and wTei (Figure 2A). Our phylogram that includes only D. yakuba-clade and wMel Wolbachia (based on 643 genes across 644,586 bp) resolves these relationships, placing wTei variants outgroup to sister wSan and wYak in distinct clades (Figure 2B). We observe 0.0039% third-position pairwise differences between wTei and wYak and between wTei and wSan, and 0.0017% between wSan and wYak. The third-position pairwise difference between any of the D. yakuba-clade Wolbachia and wMel is 0.11%. For reference, the third-position pairwise difference between wYak and wRi is 2.9%, whereas wRi and wSuz differ by only 0.014% (Turelli et al. 2018).
Our absolute chronogram that includes wMel and D. yakuba-clade Wolbachia also indicates three monophyletic groups (Figure 3), whose topology––(wTei, (wYak, wSan))––agrees with the nuclear topology of the hosts (Turissini and Matute 2017). However, one D. santomea line, D. santomea Quija 37, sampled from the south side of São Tomé (Matute 2010), carries Wolbachia identical to the other eight wSan samples across 695,118 bp, yet its mtDNA is nested in the D. yakuba mtDNA cluster (bold blue in Figure 1). This single example of Wolbachia-mtDNA phylogenetic discordance can be explained by either horizontal Wolbachia transfer or paternal transfer of Wolbachia or mtDNA. While rare, paternal Wolbachia transmission has been observed in other Drosophila (Hoffmann and Turelli 1988; Turelli and Hoffmann 1995), as has paternal transmission of mtDNA (Kondo et al. 1990). The only other Wolbachia-screened D. santomea isofemale line with introgressed mtDNA was Wolbachia uninfected.
In Figure 3, we present alternative estimates of the divergence times for Wolbachia within and between D. yakuba-clade species, within D. melanogaster, and between the D. yakuba clade and D. melanogaster. The estimates from the chronogram that includes all of the indicated Wolbachia variants are in bold. To demonstrate how these time estimates depend on the sequences included in the analyses (through estimates of relative divergence for each codon position), estimates that included only D. yakuba-clade Wolbachia or only wMel variants are superimposed and not in bold. These estimates were all obtained using a relaxed-clock analysis with a Γ(2,2) branch-rate prior (results using Γ(7,7) and a strict clock are mentioned below). When considering only the D. yakuba-clade Wolbachia (Figure 3), we estimate the root at 2,510 years (95% credible interval = 773 to 5,867 years) and the wYak-wSan split at 1,556 years (95% credible interval = 411 to 3,662 years). The wMel chronogram estimates the most recent common ancestor (MRCA) of wMel at 4,890 years ago (95% credible interval = 1,258 to 11,392 years). Our wMel result is consistent with the Richardson et al. (2012) estimate of 8,008 years (95% credible interval = 3,263 to 13,998 years). The alternative branch-rate Γ(7,7) prior has little effect on these divergence-time estimates, with credible intervals that overlap with those produced using a Γ(2,2) branch-rate prior (Figure 3 and Table S5). In contrast, Figure 3 shows that including more divergent variants influences our time estimates, although the credible intervals overlap in each case. Table S4 shows how estimates of the relative amounts of divergence for the three codon positions vary across the data sets. The underlying model for divergence times assumes constant relative rates at the three positions. Hence, changes in relative rates affect divergence-time estimates.
As noted above, our estimates of the time of the MRCA for Wolbachia within the D. yakuba clade or within D. melanogaster are relatively insensitive to whether the branch-rate prior is Γ(2,2) or Γ(7,7). The choice of genes used to estimate divergence also has little effect—maximal distance between D. yakuba-clade strains was identical for analyses that included only the D. yakuba-clade (N = 678 genes) or wYak-clade plus wMel (N = 621 genes) gene sets (wYak-wTei distance = 0.0035%). The estimated divergence time between wMel and the D. yakuba-clade Wolbachia is less robust, although again the credible intervals generated using alternative branch-rate priors are overlapping (Figure 3 and Table S5). A strict-clock analysis produces an estimate of 72,612, with 95% credible interval (23,412, 132,276), again overlapping with our relaxed-clock estimates. An alternative point estimate can be obtained from pairwise differences. Averaging over the D. yakuba-clade sequences, we observe an average third-position difference of 0.107% between the D. yakuba-clade Wolbachia and wMel, or 0.0535% from tip to root. The “short-term evolutionary rate” of divergence within wMel estimated by Richardson et al. (2012) produces a point estimate of 78,000 years, which overlaps with the credible interval of our relaxed-tree estimate with a Γ(7,7) branch-rate prior (Table S5). This molecular-clock estimate closely agrees with our strict-clock chronogram analysis (Figure S2). Given that these estimates are much shorter than the divergence time between their reproductively isolated hosts, horizontal Wolbachia transmission must have occurred. As discussed below, the horizontal transmission probably involved intermediate hosts.
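The pairwise-difference point estimate above amounts to a short calculation, sketched here. The per-year rate in this sketch is back-calculated from the two figures given in the text (0.0535% tip-to-root divergence and a point estimate of 78,000 years); it is not quoted from Richardson et al. (2012), and the function names are hypothetical.

```python
def tip_to_root(pairwise_percent: float) -> float:
    """Convert a pairwise difference (in percent) to tip-to-root divergence (proportion)."""
    return (pairwise_percent / 100.0) / 2.0


def divergence_time(divergence: float, rate_per_site_per_year: float) -> float:
    """Years needed to accumulate a given divergence at a constant rate."""
    return divergence / rate_per_site_per_year


# Average third-position difference between D. yakuba-clade Wolbachia and wMel.
d = tip_to_root(0.107)  # 0.107% pairwise, halved to reach the common ancestor

# Rate implied by the reported 78,000-year point estimate (an inference from the
# text, not a value taken from the cited study).
implied_rate = d / 78_000

# Round trip: applying the implied rate recovers the reported point estimate.
years = divergence_time(d, implied_rate)
```

The implied rate is on the order of 7 × 10^-9 substitutions per third-position site per year, and the calculation assumes that rate is constant over the whole interval.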
mtDNA and nuclear concordance across Drosophila
The topologies of the nuclear and mtDNA trees for the 11 reference Drosophila species are completely concordant (Figure 4) and agree with the neighbor-joining result presented in Clark et al. (2007). We find that the average ratio of mtDNA to nuclear substitution rates across branches varies from about 1× to 10×, consistent with the results of Havird and Sloan (2016) across Diptera.
mtDNA and Wolbachia concordance in the D. yakuba clade
The placement of D. santomea line Quija 37 is discordant: its mitochondria belong to the clade associated with D. yakuba, whereas its Wolbachia is clearly wSan. This discordance demonstrates horizontal or paternal transmission. Apart from this, we find no nodes in either the mitochondrial or Wolbachia tree with posterior probability at least 0.95 that are inconsistent with a strongly supported node (i.e., posterior probability at least 0.95) in the other tree (Figure S3). For a more refined test of concordance, we removed the Quija 37 line from our mitochondrial and Wolbachia data and computed Bayes factors. The shared-topology model was favored by a Bayes factor of e^55, indicating strong support for concordance of the mitochondrial and Wolbachia phylogenies.
Variation in the ratio of mtDNA to Wolbachia substitution rates
Following the methods of Turelli et al. (2018) to look for different mitochondrial versus Wolbachia branch lengths, we also found no evidence of variation in the ratio of mtDNA:Wolbachia substitution rates within the D. yakuba clade (median 3rd position ratio = 223, quartiles = 186 and 268); the model with all branches sharing the same ratio is favored over the model where each branch has its own ratio by a Bayes factor of e^7 (Figure S4). For comparison, Turelli et al. (2018) found a median ratio of 566 within D. suzukii and 406 within wRi-infected D. ananassae-subgroup species. These differences in relative rates are broadly compatible with the tenfold variation in relative rates noted above for nuclear versus mtDNA divergence across Drosophila species. Hence, our data seem compatible with Wolbachia transmission via introgression within the D. yakuba clade.
Although this result seems to strongly support purely maternal transmission of Wolbachia within species and introgressive transfer between species (with more recent introgression between the sister species D. yakuba and D. santomea), this interpretation is severely weakened by our “control” analysis that includes wMel-infected D. melanogaster (for which introgression is impossible). Including the D. melanogaster data, the constant-ratio model was still favored over variable ratios by a Bayes factor of e^54 (median 3rd position ratio = 297, quartiles = 276 and 323). We discuss the implications of this anomalous result below. The key observation is that significantly declining rates of substitution for Wolbachia (and mtDNA, see Ho et al. 2005) over time, together with heterogeneity of relative rates (as illustrated in Figure 4), limit the power of relative substitution ratios to differentiate between introgression and horizontal transmission.
Transposon-mediated transfer of CI factors independent of WO phage
In contrast to wMel, which contains only the cidA-cidB gene pair (LePage et al. 2017), the D. yakuba-clade Wolbachia, which also belong to Wolbachia group A, have both cidA-cidB and a cinA-cinB pair homologous to CI loci originally identified in group-B wPip (Beckmann et al. 2017). The A and B Wolbachia groups diverged about 6-46 mya (Werren et al. 1995; Meany et al. 2019). Among our wYak-clade sequences, there are no single-nucleotide variants within any of the cif loci. The cidBwYak-clade locus has an inversion from amino acids 37–103 relative to the same region in cidBwMel in every wYak-clade Wolbachia variant we analyzed. This introduces several stop codons, which might render this gene nonfunctional. On the other hand, RNA polymerase should still transcribe a complete polycistronic transcript. Therefore translation of an N-terminally truncated CidBwYak-clade protein cannot be ruled out. With the exception of a 236 bp tandem duplication in cinBwYak-clade (Figures 5 and 6), the sequence differences between cinAwYak-clade and cinBwYak-clade regions compared to cinAwPip and cinBwPip homologs are 1.35% and 0.59%, respectively. In contrast, the average difference between wYak and wPip genomes across all 194 genes (161,655 bp) present in single copy and with identical lengths in both genomes is 11.60%. Conversely, outside of the prophage regions, wMel and wYak differ by about 1%. This is consistent with data that indicate WO phage regions have a different evolutionary history than the bulk of the Wolbachia genome (LePage et al. 2017; Lindsey et al. 2018). The 236 bp tandem duplication in the cinBwYak-clade introduces a frame shift in the transcript at position 588. It is unclear whether the cinBwYak-clade protein retains functionality.
The absence of cin loci in wMel, combined with the similarity of this region between wYak and distantly related wPip, led us to assess how these Wolbachia acquired cin loci. Targeted assembly extended the scaffold wYak “702380” containing the cin loci (Figure 5) but did not definitively place it relative to the wMel genome. BLAST searches indicated that contig wYak “187383” was likely flanked by wYak “702380”. PCR primers designed in both contigs amplified the intervening region (labeled “A” in Figure 5), confirming that we have discovered the first WO prophage with two sets of cif loci. Subsequent Sanger sequencing revealed that this region contains an ISWpi1 element (Supplemental Information), found in many Wolbachia genomes, but not present at this location in the wMel reference sequence (Cordaux et al. 2008).
IS elements encode a transposase gene that mediates their movement (Chandler and Mahillon 2002). ISWpi1 elements are related to the IS5 family and seem to be restricted to Wolbachia (Duron et al. 2005; Cordaux et al. 2008). ISWpi1 elements occur in more than half of the Wolbachia strains that have been evaluated, including wYak and wMel (Cordaux et al. 2008). These elements occur at variable copy number, potentially facilitating horizontal transmission of ISWpi1 elements and their intervening sequence between Wolbachia variants/strains (Cordaux et al. 2008). Following placement of contig wYak “702380” relative to wMel, we aligned these regions and calculated pairwise differences along the chromosome using a sliding window.
In addition to containing the cinA-cinB loci, absent in wMel, contig wYak “702380” is on average about 10% different from wMel. In contrast, most of wYak is less than 1% different from wMel (for instance, across the 650,559 bp used in our phylogenetic analyses, wMel and wYak differ by only 0.09%). Downstream of the cinA-cinBwYak-clade region, targeted assembly enabled us to join several more contigs. The junctions were corroborated by mapping paired-end reads (Langmead and Salzberg 2012) and visually inspecting the resulting BAM files around joined contigs for reads spanning the new junctions and for concordant read-pair mappings. However, our attempts to fully bridge the gap downstream of the cinA-cinBwYak-clade genes via targeted assembly, scaffolding, and PCR were unsuccessful (see “wYak unassembled gap” in Figure 5). In wMel, the region corresponding to the unassembled wYak gap contains an ISWpi1 element, Mel #9 (labeled “B” in Figure 5; Cordaux et al. 2008). Although not part of our wYak assembly, homologs of this ISWpi1 element appear in assemblies of this region of the WO prophage in several other A-group Wolbachia (Cordaux et al. 2008; Bordenstein and Bordenstein 2016), specifically wInc from D. incompta (Wallau et al. 2016) and wRi from D. simulans (see Figure 2). In both, we find orthologs to Mel #9 ISWpi1 with corresponding flanking sequence assembled, indicating that this IS element probably occurs in this unassembled region of wYak, wSan, and wTei. The wYak sequence between these two ISWpi1 elements is highly diverged from wMel in comparison to the rest of the wYak genome (about 10% difference versus 1%). It therefore seems plausible that this region experienced a horizontal transfer event in the ancestor of wYak, wTei, and wSan, mediated by the flanking ISWpi1 elements.
We conjectured that horizontal transfer occurred via the excision of the two ISWpi1 elements and the intervening DNA from the donor followed by homologous recombination within the IS elements. To assess the plausibility of this scenario, we used BLAST to compare cinA-cinBwPip genes against all published Wolbachia genomes in the NCBI assembly database (https://www.ncbi.nlm.nih.gov/assembly/?term=Wolbachia). We found homologs of the cinA-cinBwPip in the group-A Wolbachia associated with Nomada bees (Gerth and Bleidorn 2016). An unrooted tree of the ORFs for cinA and cinB (Figure 6) indicates that these genes in wYak, wSan, and wTei are more similar to cinA-cinB from group-B wPip and wAlbB than to cinA-cinB found in the fellow group-A Wolbachia associated with Nomada bees (wNFe, wNPa, wNLeu, and wNFa), which each harbor two different cinA-cinB copies. This indicates that cinA-cinBwYak-clade were acquired via a horizontal transfer event across Wolbachia groups B and A that is independent of the event(s) that placed cinA-cinB in the Wolbachia associated with Nomada bees, suggesting repeated transfers of cin loci.
The two cinA-cinB copies (denoted wNLeu1 and wNLeu2 in Figure 6) in Nomada Wolbachia are nearly as distinct from each other as they are from the homologs in wPip, wAlbB, and wYak (∼7% diverged from these strains, Figure 6). However, among the four Wolbachia-infected Nomada species, the orthologs are very similar, with cinAwNLeu1 having only 0–0.15% pairwise differences among the four strains and cinAwNLeu2 having 0–0.56% pairwise differences. (Reconstructions for cinB gene copies were more complicated as the cinBwNLeu1 copy fails to assemble into a single contig on the 3-prime end.) This pattern suggests that wNLeu1 and wNLeu2 cin copies were acquired by the common ancestor of the four Nomada Wolbachia strains analyzed, followed by cladogenic transfer across Nomada species (Gerth and Bleidorn 2016). The highly fragmented assemblies of the four Nomada Wolbachia strains, with duplicate copies confounding assembly, make it difficult to determine the relative positions of the cinA and cinB copies and if they are likewise flanked by ISWpi1 elements.
To determine whether these genes were potentially moved by ISWpi1 elements into the Wolbachia of the D. yakuba clade and Nomada, we searched each genome using both the cinA-cinBwYak-clade contig and the flanking ISWpi1 elements. Long repeated elements like ISWpi1 (916 bp) break most short-read assemblies. Despite this, there is often a small fragment of the element, the length of the short read, on either end of the broken contig, indicating that the repeat element is responsible for terminating contig extension. We looked for the footprint of these elements at the edges of the contigs on which the cin genes were found. We found ISWpi1 elements in the region flanking both cinAwNLeu1 and cinAwNLeu2 copies, consistent with our wYak assembly in which we verified the ISWpi1 element with Sanger sequencing. These data support a role for ISWpi1 in the acquisition of the cinA-cinB genes by the Wolbachia in the D. yakuba clade and the Nomada bees. We conjecture that future work will fully confirm the role of ISWpi1 in the horizontal movement of incompatibility loci between Wolbachia.
DISCUSSION
Our results indicate introgressive and horizontal Wolbachia acquisition in the D. yakuba clade. Evidence for horizontal Wolbachia transfer here and elsewhere (Turelli et al. 2018) suggests that double infections must be common, even if ephemeral. Such double infections provide an opportunity for ISWpi1 transposable elements to mediate horizontal transfer of incompatibility loci among divergent Wolbachia. Importantly, our results highlight that incompatibility factors may move independently of prophage, as evidenced by our discovery of the first prophage documented to have two sets of cif loci. We discuss these conclusions below.
mtDNA Introgression
Our relative mitochondrial chronogram provides strong support for three mitochondrial clades, including a monophyletic D. teissieri clade that is outgroup to two sister clades: one consisting of mitochondria from 14 D. santomea individuals, and the other containing all D. yakuba mitochondria plus mitochondria from 12 D. santomea individuals. Our results suggest less mitochondrial introgression in the D. yakuba clade than a past report that used sequence data from only two mitochondrial loci (COII and ND5, 1777 bp) and reported a clade with mitochondria sampled from each species represented (Figure 3 in Bachtrog et al. 2006). (Using sequence data from only COII and ND5, we can replicate this result, indicating that data from additional loci are needed to add resolution.) Our results also agree with recent work that demonstrated little nuclear introgression in the D. yakuba clade (Turissini and Matute 2017).
Our results broadly agree with the results produced by Llopart et al. (2014), who assessed mitochondrial introgression between D. santomea and D. yakuba using whole mitochondrial genomes. They generated a neighbor-joining tree that produced a clade consisting of all D. yakuba individuals and 10 D. santomea individuals, and this clade is sister to a clade with 6 D. santomea individuals; one D. santomea haplotype is outgroup to all other haplotypes included in their analysis (Figure 3 in Llopart et al. 2014). Nested within the mixed D. yakuba clade, Llopart et al. (2014) identified a “hybrid zone clade” that includes D. yakuba individuals sampled from São Tomé and from continental Africa and D. santomea individuals sampled from both within and outside the Pico de São Tomé hybrid zone (HZ). The sister D. santomea clade also contains both HZ and non-HZ individuals. Thus, their analysis and ours provide support for hybridization within and outside of the HZ, leading us to question both the existence of a HZ clade and the claim that these species “share the same mitochondrial genome” (Llopart et al. 2014). Instead, both our results and theirs suggest unidirectional introgression of D. yakuba mitochondria into D. santomea; we find that 46% of D. santomea individuals (12 of 26) have D. yakuba-like mitochondria, and they found 59% (10 of 17).
Llopart et al. (2014) used a strict molecular clock to estimate the mitochondrial MRCA of D. santomea and D. yakuba at 10,792–17,888 years by calibrating their tree with the D. yakuba-D. erecta split, estimated at 10.4 mya (Tamura et al. 2004). A high level of mitochondrial saturation over time, with an expected value of 1.44 substitutions per synonymous site for the D. yakuba-D. erecta split, could influence this estimate (Llopart et al. 2014). Moreover, Ho et al. (2005) demonstrated that the mtDNA substitution rate resembles an exponential curve, with high short-term substitution rates that approach the mutation rate, then slowing to a long-term rate after about 1–2 million years of divergence, far younger than the inferred D. yakuba-D. erecta nuclear and mtDNA divergence. Hence, using the slow long-term D. yakuba-D. erecta calibration is likely to underestimate the more rapid rates of divergence experienced by D. yakuba-clade mtDNA, inflating divergence-time estimates (Ho et al. 2005). If we assume that Wolbachia and mitochondria were transferred by introgression, which our analyses support, our estimates in Figure 3 suggest that D. santomea and D. yakuba mtDNA diverged more recently, with our point estimates ranging from about 1,500 to 2,800 years.
Wolbachia placement, divergence, and acquisition
Wolbachia placement
Despite their similarity, wSan, wTei, and wYak form monophyletic clades with wTei outgroup to sisters wSan and wYak, recapitulating host relationships (Figure 1; Turissini and Matute 2017). wMel is sister to the D. yakuba-clade strains in the A group (Figure 2), which also includes D. simulans strains (wRi, wAu, and wHa), wAna, wSuz, and the Nomada Wolbachia (wNFa, wNLeu, wNPa, and wNFe). Our phylogram (Figure 2A) places wAlbB outgroup to wNo and wPip strains that diverged from A-group Wolbachia about 6–46 mya (Meany et al. 2019).
Wolbachia divergence
Our chronogram analyses (Figure 3) estimate that D. yakuba-clade Wolbachia and three wMel variants diverged about 29,000 years ago, and that wTei split from sisters wSan and wYak about 2,500–4,500 years ago, with wSan and wYak diverging about 1,600-2,800 years ago. We estimate that the two most divergent wMel variants from Richardson et al. (2012) and the reference wMel genome split about 4,900–7,200 years ago, indicating more divergence among wMel variants than among D. yakuba-clade Wolbachia strains. All of these results depend on the calibration provided by Richardson et al. (2012) and the relative accuracy of the underlying models of molecular evolution, which assume constant relative rates of change across data partitions. For the deepest divergence in Figure 3, between D. yakuba-clade Wolbachia and wMel, we find that the estimated time depends on the variance in our prior distribution for substitution rates across branches, with a strict clock putting the divergence at about 73,000 years rather than 29,000 years obtained with the most variable prior. Despite this uncertainty, the quantitative differences of our Wolbachia divergence-time estimates do not alter the qualitative conclusion that these Wolbachia did not co-diverge with these hosts, which split several million years ago.
Our findings here and in Turelli et al. (2018) suggest that for several Drosophila species, their current Wolbachia infections have been in residence for only hundreds to tens of thousands of years. Bailly-Bechet et al. (2017) estimated Wolbachia residence times using data from more than 10,000 arthropod specimens spanning over 1,000 species. However, they analyzed DNA sequences from only part of the fast-evolving fbpA Wolbachia locus and the host CO1 mtDNA locus. From an initial model-based meta-analysis, they concluded that “… most infections are very recent …” – consistent with our results. However, they also fit a more complex model with “short” and “long” time scales for acquisition and loss, conjecturing that short-term rates were associated with imperfect maternal transmission. Focusing on long-time rates, they concluded that Wolbachia infections persisted in lineages for approximately 7 million years on average, whereas lineages remained uninfected for about 9 million years. For Drosophila, such long infection durations would imply that Wolbachia-host associations often persist through speciation (see Coyne and Orr 1997 and Turelli et al. 2014 for estimates of speciation times in Drosophila, generally 10^5–10^6 years). Extrapolating from limited Wolbachia sequence data, Hamm et al. (2014) conjectured that cladogenic Wolbachia transmission might be common among Drosophila; but this extrapolation is refuted by our genomic analyses. The long Wolbachia durations proposed by Bailly-Bechet et al. (2017) depend on their conjecture that their long-term rate estimates accurately reflect acquisition and loss of Wolbachia infections across species. This is worth testing with additional analyses of Wolbachia, mitochondrial and nuclear genomes from a broad range of arthropods.
Wolbachia acquisition: introgression versus horizontal transfer
Our divergence-time estimates for the D. yakuba-clade Wolbachia versus their hosts preclude cladogenic acquisition. Unlike introgression, horizontal (or paternal) Wolbachia acquisition should produce discordance between phylogenies inferred for Wolbachia and the associated mitochondria. With the notable exception of D. santomea line Quija 37, which has mitochondria belonging to the clade associated with D. yakuba, but has wSan Wolbachia, we find no evidence of discordance between the estimated mitochondrial and Wolbachia phylogenies. Hence our data indicate that acquisition by introgression is far more common than horizontal transmission between closely related species, consistent with data on acquisition of wRi-like Wolbachia in the D. melanogaster species group (Turelli et al. 2018). Similarly, consistent with extensive data from D. melanogaster (Richardson et al. 2012) and D. simulans (Turelli et al. 2018) and smaller samples from D. suzukii and D. ananassae (Turelli et al. 2018), we find only one possible example of paternal transmission or horizontal transmission within D. yakuba-clade species.
We also investigated an alternative approach to distinguish between introgressive and horizontal Wolbachia acquisition by estimating substitution ratios for mtDNA versus Wolbachia (Turelli et al. 2018). Because we could estimate this ratio on each branch, we conjectured that this approach might have greater resolving power than our incompletely resolved mitochondrial and Wolbachia phylogenies. We expect higher ratios with horizontal transmission because mtDNA would have been diverging longer than recently transferred Wolbachia. However, this approach assumes that mtDNA and Wolbachia substitution rates remain relatively constant. This is contradicted by the finding of Ho et al. (2005) that mtDNA substitution rates decline substantially with increasing divergence time, reaching an asymptote after around 1–2 million years. To calibrate their Wolbachia substitution rate estimates, Richardson et al. (2012) used an experimentally observed mitochondrial mutation rate in D. melanogaster (6.2 × 10^-8 mutations per third-position site per generation) that extrapolates to 62% third-position divergence per million years. This extrapolation is nonsensical as a long-term substitution rate. As summarized by Ho et al. (2005), typical mtDNA substitution rates are 0.5–1.5% per coding site per million years. Nevertheless, using the ratio of short-term rates for mtDNA and Wolbachia, Richardson et al. (2012) produced an estimate of the long-term Wolbachia substitution rate that agrees with independent estimates from Nasonia wasps (Raychoudhury et al. 2009) and Nomada bees (Gerth and Bleidorn 2016) derived from much longer divergence times (assuming cladogenic Wolbachia acquisition) (Conner et al. 2017). This paradox is resolved if Wolbachia molecular evolution is not subject to the dramatic slowdown in rates seen for mtDNA substitution rates.
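The extrapolation discussed above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes 10 generations per year (a value not stated in the text, chosen here because it reproduces the quoted 62% figure) and a simple linear extrapolation with no correction for saturation. It also compares a third-position rate with per-coding-site rates, so the fold differences are rough indications rather than exact comparisons. All function and variable names are hypothetical.

```python
# Linear extrapolation of a per-generation mutation rate to a per-million-year
# divergence. GENERATIONS_PER_YEAR is an assumption made for illustration.

def divergence_per_million_years(rate_per_generation, generations_per_year):
    """Return the expected fraction of sites changed per million years,
    assuming the short-term rate applies unchanged over the whole period."""
    return rate_per_generation * generations_per_year * 1_000_000

MUTATION_RATE = 6.2e-8        # per third-position site per generation (from the text)
GENERATIONS_PER_YEAR = 10     # assumed, not taken from the text

extrapolated = divergence_per_million_years(MUTATION_RATE, GENERATIONS_PER_YEAR)

# Typical long-term mtDNA rates summarized by Ho et al. (2005), as fractions
# per coding site per million years.
TYPICAL_LOW, TYPICAL_HIGH = 0.005, 0.015

# How many times larger the extrapolated short-term rate is than the
# typical long-term range.
fold_vs_high = extrapolated / TYPICAL_HIGH
fold_vs_low = extrapolated / TYPICAL_LOW

print(f"Extrapolated divergence: {extrapolated:.0%} per million years")
print(f"Excess over typical long-term rates: {fold_vs_high:.0f}- to {fold_vs_low:.0f}-fold")
```

Under these assumptions, the short-term rate exceeds typical long-term rates by roughly two orders of magnitude, which is why it cannot serve as a long-term substitution rate.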
The apparent difference between mtDNA molecular evolution (dramatic slowdown over longer time scales, Ho et al. 2005) and Wolbachia molecular evolution (relative constancy, as inferred from similar rates of differentiation over very different time scales) suggests why our relative-rate test does not reject introgressive transmission of Wolbachia between D. melanogaster and the D. yakuba clade, even though it is clearly impossible. A roughly 50-fold slowdown in mtDNA substitution rates over the time scale of the divergence of D. melanogaster from the D. yakuba clade, relative to the rate of differentiation within the D. yakuba clade, produces comparable mtDNA-Wolbachia substitution ratios for comparisons within the D. yakuba clade and between D. melanogaster and the D. yakuba clade. Because of this complication and our conjecture that relative rates of mtDNA versus Wolbachia substitutions over longer periods are likely to mirror the ten-fold differences we see for mtDNA versus nuclear genes, phylogenetic discordance between mitochondria and Wolbachia is clearly a much more robust indicator of horizontal (or paternal) Wolbachia transmission. Nevertheless, additional examples of cladogenic Wolbachia acquisition are needed to better understand relative rates and patterns of Wolbachia, mtDNA and nuclear differentiation over different time scales.
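The confounding described above can be made concrete with a toy calculation. This is a minimal sketch, not the relative-rate test used in the analyses: it models expected substitutions as rate × time, and every numerical value except the 50-fold slowdown is hypothetical, chosen only so that the cancellation is easy to see.

```python
# Toy model: expected substitutions = rate * time for each genome, and the
# mtDNA:Wolbachia ratio is compared across acquisition scenarios.
# All rates and times are in arbitrary units and are hypothetical.

def substitution_ratio(mt_rate, wol_rate, mt_time, wol_time):
    """Ratio of expected mtDNA substitutions to expected Wolbachia substitutions."""
    return (mt_rate * mt_time) / (wol_rate * wol_time)

WOL_RATE = 1.0             # Wolbachia rate, assumed constant across time scales
MT_RATE_SHORT = 100.0      # hypothetical short-term mtDNA rate
SLOWDOWN = 50.0            # roughly 50-fold mtDNA slowdown, as in the text
MT_RATE_LONG = MT_RATE_SHORT / SLOWDOWN

# Introgression within a clade: mtDNA and Wolbachia diverge over the same time.
ratio_introgression = substitution_ratio(MT_RATE_SHORT, WOL_RATE, 1.0, 1.0)

# Horizontal transfer between distant hosts: mtDNA has diverged 50 times longer
# than the Wolbachia (hypothetical), but at the slowed long-term rate.
ratio_horizontal_slowed = substitution_ratio(MT_RATE_LONG, WOL_RATE, 50.0, 1.0)

# The same horizontal scenario if the mtDNA rate had stayed constant.
ratio_horizontal_constant = substitution_ratio(MT_RATE_SHORT, WOL_RATE, 50.0, 1.0)

print(ratio_introgression, ratio_horizontal_slowed, ratio_horizontal_constant)
```

In this toy setting, the slowed mtDNA rate makes the horizontal-transfer ratio indistinguishable from the introgression ratio, whereas a constant mtDNA rate would have separated them 50-fold. This illustrates why the ratio-based test loses power when the mtDNA slowdown roughly offsets the longer mtDNA divergence time.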
Our divergence-time estimates for the D. yakuba-clade Wolbachia versus their hosts preclude cladogenic transmission; and our phylogenetic analyses suggest that these species share very similar Wolbachia because of introgression, as originally argued by Lachaise et al. (2000). However, under either introgressive or horizontal transfer of Wolbachia, we expect that the donor species' Wolbachia sequences would appear paraphyletic when analyzed jointly with the Wolbachia from the recipient. Paraphyly allowed Turelli et al. (2018) to infer that D. simulans likely obtained its Wolbachia from D. ananassae and D. subpulchrella likely obtained its Wolbachia from D. suzukii. Paraphyly is generally expected soon after gene flow stops between populations. As noted by Hudson and Coyne (2002, Fig. 1), the time scale expected to produce reciprocal monophyly for mitochondria (and Wolbachia) under a neutral model of molecular evolution is on the order of the effective size of the species. Our results in Figure 3 indicate that, at least for our small samples, reciprocal monophyly for the Wolbachia in these three species has been achieved within a few thousand years. This suggests that reciprocal monophyly has been accelerated by species-specific selective sweeps within the Wolbachia or mitochondria of these species. This conjecture may be testable from estimates of host fitness using transinfected versus native Wolbachia.
IS transposable elements mediate horizontal transfer of incompatibility loci between divergent Wolbachia
Wolbachia in all three D. yakuba-clade hosts cause both intra- and interspecific CI (Cooper et al. 2017), despite originally being characterized as non-CI-causing (Zabalou et al. 2004; Charlat et al. 2004). CI is relatively weak, and its strength can vary among wTei variants and D. teissieri backgrounds (Table 3 in Cooper et al. 2017). Differences in CI among Wolbachia variants have also been demonstrated in interspecific backgrounds, where wTei caused stronger CI in a D. simulans background (97.2 ± 1.3 SE percent embryo mortality) than either wYak (26.5 ± 4.2 SE percent embryo mortality) or wSan (24.0 ± 4.1 SE percent embryo mortality) (Zabalou et al. 2008). Both wYak and wSan induced CI in D. simulans comparable in intensity to that found by Cooper et al. (2017, Figure 3) in their original hosts. Surprisingly, Zabalou et al. (2008) found that the strength of CI induced by wTei in D. simulans even eclipsed that of wRi (89.8 ± 4.5 SE percent embryo mortality). These results must be reconciled with the fact that loci known to underlie CI do not vary within or among the D. yakuba-clade Wolbachia variants we examined. The nearly complete CI induced by wTei in D. simulans may depend on CI-causing factors yet to be identified or on differences in gene expression.
In each D. yakuba-clade Wolbachia variant included in our analyses, we find a disruption of cidB^wYak-clade with an inversion from amino acids 37–103 relative to the same region in sister wMel. The inversion introduces multiple stop codons that could render this gene nonfunctional. Fixation of loss-of-function mutations in CI-causing loci is consistent with theoretical analyses showing that selection on Wolbachia within host lineages does not act to increase or maintain CI (Prout 1994; Turelli 1994; Haygood and Turelli 2009); indeed, we have also recently observed a single mutation that disrupts cidB in non-CI-causing wMau Wolbachia that infect D. mauritiana on the island of Mauritius (Meany et al. 2019). In both wMau and the D. yakuba-clade Wolbachia, we find fixation of defects in the putative toxin gene. We expect that future genomic analyses will produce additional examples.
All D. yakuba-clade Wolbachia genomes included in our analysis harbor cinA-cinB loci originally discovered in the wPip strain that diverged from A-group Wolbachia, including the D. yakuba-clade variants, about 6–46 mya (Meany et al. 2019). cin loci are also present in B-group wAlbB that infects Ae. albopictus and in A-group wNFe, wNPa, wNLeu, and wNFa Wolbachia that infect Nomada bees. cin loci are absent from wMel, but the wYak contig containing these loci is about 10% diverged from wMel, while observed divergence between wYak and wMel across the rest of the genome is less than 1%. The wYak-clade cin loci share about 97% similarity with the divergent B-group wPip strain. wYak-clade cin loci are more similar to cinA-cinB from the B-group Wolbachia wPip and wAlbB than to those in A-group Nomada Wolbachia strains, which have two sets of cin loci that are as diverged from each other as they are from these regions in wYak and in B-group Wolbachia. These observations suggest independent horizontal transfer of cin loci into wYak and Nomada Wolbachia.
Our results indicate that independent of prophage movement, ISWpi1-element paralogs can move incompatibility loci via the excision of flanking ISWpi1 elements, followed by homologous recombination within the elements. Horizontal Wolbachia acquisition is common in Drosophila (Turelli et al. 2018) and other species (O’Neill et al. 1992), suggesting that double infections, which could provide the opportunity for ISWpi1-mediated transfer of incompatibility loci, may be common, even if transient. (A second infection need not become stably transmitted for horizontal gene transfer via ISWpi1 elements to occur.) Alternatively, phage particles or virions could be introduced by a vector and provide the opportunity for ISWpi1-mediated transfer (Ahmed et al. 2015; Brown and Lloyd 2015), without the presence of a double Wolbachia infection. Whether the inserted loci were derived from a prophage region of a Wolbachia genome, or from a phage genome encapsulated in a phage particle, remains an open question. While the ISWpi1 element in wMel (Mel #9 labeled “B” in Figure 5; Cordaux et al. 2008) is not part of our wYak assembly, homologs of this element are present in assemblies of several other A-group Wolbachia including wInc and wRi (Cordaux et al. 2008; Bordenstein and Bordenstein 2016). We predict this element occurs in the unassembled region of our wYak assembly. Footprints of ISWpi1 elements in the region flanking the cinA genes for both copies of the gene in the Nomada Wolbachia provide further support for our hypothesis. Long-read-based Wolbachia assemblies from many infected host systems will elucidate the role of ISWpi1 elements in horizontal transfer of CI loci. Overall, the ecology of horizontal Wolbachia transmission is crucial to understanding Wolbachia acquisition; and the transfer and dynamics of CI loci are crucial to understanding Wolbachia evolution.