Convergent evolution of the army ant syndrome and congruence in big-data phylogenetics

Marek L. Borowiec

doi:10.1101/134064

Abstract

The evolution of the suite of morphological and behavioral adaptations underlying the ecological success of army ants has been the subject of considerable debate. This “army ant syndrome” has been argued to have arisen once or multiple times within the ant subfamily Dorylinae. To address this question I generated data from 2,166 loci and a comprehensive taxon sampling for a phylogenetic investigation. Most analyses show strong support for convergent evolution of the army ant syndrome in the Old and New World but certain relationships are sensitive to analytics. I examine the signal present in this data set and find that conflict is diminished when only loci less likely to violate common phylogenetic model assumptions are considered. I also provide a temporal and spatial context for doryline evolution with time-calibrated, biogeographic, and diversification rate shift analyses. This study underscores the need for cautious analysis of phylogenomic data and calls for more efficient algorithms employing better-fitting models of molecular evolution.

Significance

Recent interpretation of army ant evolution holds that army ant behavior and morphology originated only once within the subfamily Dorylinae. An inspection of phylogenetic signal in a large new data set shows that support for this hypothesis may be driven by bias present in the data. Convergent evolution of the army ant syndrome is consistently supported when sequences violating assumptions of a commonly used model of sequence evolution are excluded from the analysis. This hypothesis also fits with a simple scenario of doryline biogeography. These results highlight the importance of careful evaluation of signal and conflict within phylogenomic data sets, even when taxon sampling is comprehensive.

Introduction

Army ants (Figure 1) are a charismatic group of organisms that inspire research in such disparate fields as behavioral ecology (Schöning et al., 2005), biodiversity conservation (Peters et al., 2008), and computational biology (Garnier et al., 2013). These ants are distributed throughout warm temperate and tropical regions of the world and belong to a more inclusive clade known as the subfamily Dorylinae (Brady et al., 2014). They are characterized by a suite of morphological and behavioral adaptations, together dubbed the army ant syndrome (Brady, 2003). This syndrome includes obligate collective foraging, frequent colony relocation, and highly specialized wingless queens. In contrast to many other ant species, army ants never forage individually. Nests of army ants are temporary and in some species colonies undergo cycles of stationary and nomadic phases (Schneirla, 1945). Unable to fly, the queens must disperse on foot and are adapted to producing enormous quantities of brood (Raignier et al., 1955). Other peculiarities of army ants include colony reproduction by fission and highly derived male morphology. No army ant species are known to lack any of the components of the syndrome (Gotwald, 1995). Because of its antiquity and persistence, the army ant syndrome has been cited as an example of remarkable long-term evolutionary stasis (Brady, 2003). Several distantly related lineages of ants evolved one or more of the components of the army ant syndrome but did not reach the degree of social complexity or ecological dominance of the “true army ants” in the Dorylinae (Kronauer, 2009).

Figure 1:

Striking morphological similarity of Old World (left) and New World (right) army ants. A: Aenictus male, B: Neivamyrmex male, C: Dorylus soldier, D: Labidus soldier. Photographs courtesy of Alex Wild/alexanderwild.com.

A major question of army ant biology is whether the army ant syndrome originated only once or if it arose independently in New World and Old World army ants (Kronauer, 2009). The answer requires a robust phylogenetic framework, but Dorylinae are an example of an ancient rapid radiation and elucidating its phylogeny has proven difficult (Brady et al., 2014). Before the advent of quantitative phylogenetic methods, the army ants were thought to have arisen more than once within the subfamily (Brown, 1975; Gotwald, 1979; Bolton, 1990). This view was based on the observation that army ants are poor at dispersal and on the assumption that they evolved after the breakup of Gondwana around 100 Ma ago. More recent studies (Baroni Urbani et al., 1992; Brady, 2003), however, reported monophyly of army ant lineages, even though statistical support for this grouping was often low (Brady et al., 2014). Furthermore, the most recent divergence time estimates suggest that army ants indeed originated after the breakup of Gondwana, implying either independent origins or long-distance dispersal (Kronauer, 2009).

Here I reassess the phylogeny of the Dorylinae including New World and Old World army ants. I use a large phylogenomic data set and taxon sampling increased twofold compared to a previous phylogeny (Brady et al., 2014), including 155 taxa representing more than 22% of described doryline species diversity and all 27 currently recognized extant genera (Borowiec, 2016b). The sequence data comes from a total of 2,166 loci centered around Ultraconserved Elements or UCEs (Faircloth et al., 2012; Branstetter et al., 2017) distributed throughout the ant genomes, comprising 892,761 nucleotide sites with only 15% of missing data and gaps.

Analyses of the complete data set and of loci with high average bootstrap support or “high phylogenetic signal” (Salichos and Rokas, 2013) produce results that are sensitive to partitioning scheme and inference method. Analyses of data less prone to systematic bias, such as slow-evolving (Rodríguez-Ezpeleta et al., 2007; Betancur-R. et al., 2013; Goremykin et al., 2015) or compositionally homogeneous loci (Jermiin et al., 2004) and amino acid sequences (Hasegawa and Hashimoto, 1993), consistently support the hypothesis of independent origins of the army ant syndrome in the Old and New World.

Results

Relationships Among Doryline Lineages Inferred From Combined Data Matrix

The concatenated data set produces different topologies depending on the partitioning scheme and method used to infer the phylogeny (Figure 2A–D, P). The true army ants are not monophyletic in the maximum likelihood tree under partitioning by locus. The tree shows a clade that includes almost all New World dorylines, including New World army ants (hereafter called “New World Clade”; Figure 2A; Supplementary Figure 1). The latter include the genera Cheliomyrmex, Eciton, Labidus, Neivamyrmex, and Nomamyrmex. Apart from New World army ants, the New World Clade unites all exclusively New World lineages of the Dorylinae and includes Acanthostichus, Cylindromyrmex, Leptanilloides, Neocerapachys, and Sphinctomyrmex. Old World driver ants Dorylus and the poorly known genus Aenictogiton are sister to Aenictus. Other well-supported clades comprising multiple genera include a clade of South-East Asian and Malagasy lineages Cerapachys, Chrysapace, and Yunodorylus, and a well-resolved clade of several Old World genera (Lioponera, Lividopone, Parasyscia, Zasphinctus) that was also recovered by Brady et al. (2014). Another South-East Asian clade consists of the mostly subterranean genera Eusphinctus, Ooceraea, and Syscia. The backbone of the tree shows multiple short internodes and generally lower resolution, consistent with a previous study (Brady et al., 2014). The maximum likelihood tree inferred from the same data set but using k-means partitioning (Frandsen et al., 2015) shows a strongly supported clade of Old World Aenictus and New World army ants (Figure 2B; Supplementary Figure 2). Army ants are not monophyletic in this tree either, because Dorylus and Aenictogiton are not sister to the Aenictus plus New World army ants clade. Unpartitioned analysis results in a similar topology (Figure 2C; Supplementary Figure 3).

Supplementary Figure 1:

Maximum likelihood tree obtained from the combined data matrix partitioned by locus. Scale is in number of substitutions per site. Nodal support in percent bootstrap.

Supplementary Figure 2:

Maximum likelihood tree obtained from the combined data matrix under k-means partitions. Scale is in number of substitutions per site. Nodal support in percent bootstrap.

Supplementary Figure 3:

Maximum likelihood tree obtained from unpartitioned combined data matrix. Scale is in number of substitutions per site. Nodal support in percent bootstrap.

Figure 2:

Summary of phylogenetic position of army ant lineages recovered in different analyses. In each figure, letter A signifies Aenictus, D: Aenictogiton+Dorylus, E: New World army ants (Cheliomyrmex, Eciton, Labidus, Neivamyrmex, Nomamyrmex), N: all other New World dorylines (Acanthostichus, Cylindromyrmex, Leptanilloides, Neocerapachys, Sphinctomyrmex) except Syscia. Note increased incongruence in analysis of combined data set (A–D) and “high signal” data set (E–H) compared to slow-evolving (I–L) and compositionally homogeneous (M) loci. Asterisk indicates support ≤ 99%. Dashed lines signify polyphyly. See Supplementary Figures 1–18 for complete trees.

The coalescent species tree inferred from all loci is similar to the concatenation results with respect to poor resolution at deep nodes (Figure 2D; Supplementary Figure 4). The species tree resembles the unpartitioned and k-means partitioned concatenated trees because of moderate support for the Aenictus plus New World army ants clade, and contrasts most with both concatenated trees by supporting army ant monophyly, the hypothesis favored by previous studies (Brady, 2003; Brady and Ward, 2005; Brady et al., 2014). Overall support for the deep nodes is lower in species trees and this approach does not recover some of the clades present across all concatenated analyses, namely the New World or South-East Asian/Malagasy groups mentioned above.

Supplementary Figure 4:

Summary species tree obtained from all 2,166 loci using ASTRAL. Scale is in coalescent units. Nodal support in local posterior probabilities.

Evidence for Bias in Maximum Likelihood Tree Based on All and “High Signal” Loci

The conflict among ML topologies derived from different partitioning schemes and between concatenated and species trees warranted further investigation. I constructed five data sets in addition to the combined data matrix: a matrix of “high signal” loci (Salichos and Rokas, 2013) equivalent in length to 1/5 of the sites in the combined data matrix, a matrix composed of slow-evolving loci equivalent in length to 1/5 of the sites in the combined data matrix, a matrix composed of only compositionally homogeneous loci, and a matrix that excluded the long-branched Aenictus species but is otherwise identical to the combined data matrix (see Supplementary Table 2 for more information on these matrices). Additionally, I developed a workflow to extract putatively protein-coding regions from the UCE data set (see Extended Methods) and analyzed those regions separately, coded as amino acids.
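
Assembling a subset "equivalent in length to 1/5 of the sites" can be illustrated with a short sketch. It assumes that loci are ranked by a per-locus statistic (mean branch length for the slow-evolving set, mean bootstrap for the “high signal” set) and added until the target number of sites is reached. The function name, the dictionary keys, and the stopping rule are hypothetical illustrations, not the scripts used in this study.

```python
def select_loci_by_rank(loci, key, fraction=0.2, reverse=False):
    """Pick top-ranked loci until their summed length reaches a fraction of all sites.

    loci     -- list of dicts with 'name', 'length', and the statistic named by `key`
                (hypothetical record layout)
    key      -- statistic to rank by, e.g. 'rate' or 'bootstrap'
    reverse  -- False ranks ascending (slowest first), True descending (highest first)
    """
    total_sites = sum(locus["length"] for locus in loci)
    target = fraction * total_sites
    chosen, running = [], 0
    for locus in sorted(loci, key=lambda rec: rec[key], reverse=reverse):
        if running >= target:
            break  # the subset already spans the requested share of sites
        chosen.append(locus["name"])
        running += locus["length"]
    return chosen
```

Calling the function with `key="rate"` would give a slow-evolving subset, and with `key="bootstrap", reverse=True` a high-bootstrap subset. Because whole loci are added, the final subset can slightly exceed the target length.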

Analyses of loci whose trees have the highest mean bootstrap support (high phylogenetic signal sensu Salichos and Rokas (2013)) show even more discordance among analyses than those of the combined data set (Figure 2E–H; Supplementary Figures 5–8). Army ant lineages show different topologies in each of the three partitioning schemes used.

Using only the slowest evolving one-fifth of the data (“slow-evolving” hereafter), there is universal support for a clade that unites all exclusively New World lineages, including the New World army ants, under all partitioning schemes (Figure 2I–K; Supplementary Figures 9–11). Species tree analyses of slow-evolving loci also support the New World Clade and the Aenictus plus (Aenictogiton+Dorylus) clade (Figure 2L; Supplementary Figure 12).

Supplementary Figure 5:

Maximum likelihood tree obtained from high-bootstrap data matrix partitioned by locus. Scale is in number of substitutions per site. Nodal support in percent bootstrap.

Supplementary Figure 6:

Maximum likelihood tree obtained from high-bootstrap data matrix under k-means partitions. Scale is in number of substitutions per site. Nodal support in percent bootstrap.

Supplementary Figure 7:

Maximum likelihood tree obtained from unpartitioned high-bootstrap data matrix. Scale is in number of substitutions per site. Nodal support in percent bootstrap.

Supplementary Figure 8:

Summary species tree obtained from 271 highest bootstrap loci using ASTRAL. Scale is in coalescent units. Nodal support in local posterior probabilities.

Supplementary Figure 9:

Maximum likelihood tree obtained from slow-evolving data matrix partitioned by locus. Scale is in number of substitutions per site. Nodal support in percent bootstrap.

Supplementary Figure 10:

Maximum likelihood tree obtained from slow-evolving data matrix under k-means partitions. Scale is in number of substitutions per site. Nodal support in percent bootstrap.

Supplementary Figure 11:

Maximum likelihood tree obtained from unpartitioned slow-evolving data matrix. Scale is in number of substitutions per site. Nodal support in percent bootstrap.

Supplementary Figure 12:

Summary species tree obtained from 580 slow-evolving loci using ASTRAL. Scale is in coalescent units. Nodal support in local posterior probabilities.

Examining locus properties in different rate bins reveals that more rapidly evolving sequences and “high signal” loci exhibit qualities previously associated with systematic bias, namely higher compositional heterogeneity (Lockhart et al., 1994; Jermiin et al., 2004) and saturation (Philippe and Forterre, 1999). The most slowly evolving loci have lower overall among-taxon sequence heterogeneity than more rapidly evolving and high bootstrap loci (Supplementary Figure 28A; RCFV two-sample t-test slow-evolving vs. all other loci: t = 28.564, slow-evolving mean = 0.0358, all other loci mean = 0.0575, p ≪ 0.01; slow-evolving vs. high bootstrap: t = 27.538, high bootstrap mean = 0.0647, p ≪ 0.01). Saturation is also greater in more rapidly evolving and high bootstrap loci (Supplementary Figure 28B; slope of regression two-sample t-test slow-evolving vs. all other loci: t = 21.682, slow-evolving mean = 0.474, all other loci mean = 0.361, p ≪ 0.01; slow-evolving vs. high bootstrap: t = 10.129, high bootstrap mean = 0.393, p ≪ 0.01). Of the 379 loci that pass the homogeneity test (Foster, 2004), 244 are found among slow-evolving loci and zero are present among the high bootstrap loci.
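
The saturation measure used above, the slope of a regression of uncorrected p-distances against distances on the locus tree, can be sketched as follows. This is a minimal illustration that assumes the pairwise tree (patristic) distances have already been extracted with a phylogenetics library. The function names and the handling of missing characters are assumptions, not the custom script used in this study.

```python
def p_distance(seq_a, seq_b, missing="-?N"):
    """Proportion of differing sites among positions scored in both aligned sequences."""
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a not in missing and b not in missing]
    if not pairs:
        raise ValueError("no sites scored in both sequences")
    return sum(a != b for a, b in pairs) / len(pairs)


def regression_slope(tree_distances, p_distances):
    """Ordinary least-squares slope of p-distances (y) on tree distances (x)."""
    n = len(tree_distances)
    if n != len(p_distances) or n < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    mean_x = sum(tree_distances) / n
    mean_y = sum(p_distances) / n
    sxx = sum((x - mean_x) ** 2 for x in tree_distances)
    if sxx == 0:
        raise ValueError("tree distances are all identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(tree_distances, p_distances))
    return sxy / sxx
```

Each taxon pair contributes one point. When observed differences keep pace with inferred change the slope is high, and it falls as multiple substitutions at the same site hide change, which is why a lower slope is read as more potential for saturation.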

Analysis of loci that pass the phylogeny-corrected compositional homogeneity test (Foster, 2004) shows universal support for the New World Clade, regardless of partitioning scheme (Figure 2M; Supplementary Figures 13–15). Among-taxon compositional heterogeneity is a relatively well-understood source of bias in phylogenetic analysis (Jermiin et al., 2004) and is not accounted for by the general time-reversible model of sequence evolution used in RAxML, the program used here for maximum likelihood inference.

Supplementary Figure 13:

Maximum likelihood tree obtained from compositionally homogeneous data matrix partitioned by locus. Scale is in number of substitutions per site. Nodal support in percent bootstrap.

Amino acid alignments are also known to be more robust against compositional bias (Hasegawa and Hashimoto, 1993) and saturation, although not free from them when distantly related lineages are considered (Foster and Hickey, 1999). In this study the amino acid matrix of protein-coding sequences is highly conserved compared to the nucleotide matrix of combined data, with 14% of sites parsimony informative in the former compared to 48% in the latter. The amino acid matrix analyzed under maximum likelihood recovers a monophyletic New World Clade and monophyletic Old World army ants (Figure 2N; Supplementary Figure 16).
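
The proportion of parsimony informative sites compared above follows the standard definition: a site is informative when at least two character states each occur in at least two taxa. The sketch below illustrates that definition and is not the tool used for the values reported here. The function names and the default set of characters treated as missing are assumptions.

```python
from collections import Counter


def is_parsimony_informative(column, missing="-?"):
    """True if at least two states each occur in at least two taxa in this column."""
    counts = Counter(ch for ch in column if ch not in missing)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def proportion_informative(sequences, missing="-?"):
    """Share of alignment columns that are parsimony informative.

    sequences -- list of equally long aligned strings (nucleotides or amino acids)
    """
    if not sequences or len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    columns = ["".join(col) for col in zip(*(s.upper() for s in sequences))]
    informative = sum(is_parsimony_informative(col, missing) for col in columns)
    return informative / len(columns)
```

For nucleotide data one would typically pass `missing="-?N"`. For amino acids `N` is asparagine, so only gap and unknown symbols (for example `-?X`) should be excluded.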

Supplementary Figure 14:

Maximum likelihood tree obtained from compositionally homogeneous data matrix under k-means partitions. Scale is in number of substitutions per site. Nodal support in percent bootstrap.

Supplementary Figure 15:

Maximum likelihood tree obtained from unpartitioned compositionally homogeneous data matrix. Scale is in number of substitutions per site. Nodal support in percent bootstrap.

Supplementary Figure 16:

Maximum likelihood tree obtained from amino acid data matrix partitioned by locus. Scale is in number of substitutions per site. Nodal support in percent bootstrap.

More evidence for the artefactual nature of the grouping of Old World and New World army ants comes from an analysis where Aenictus is removed from the combined data matrix partitioned under the k-means scheme. If Aenictus were not affecting the position of New World army ants on the tree, one would expect to see no change in the position of the latter when the former is removed. This is not the case here; the phylogeny recovered from the alignment without Aenictus has New World army ants nested within the New World Clade (Figure 2O; Supplementary Figure 17).

Supplementary Figure 17:

Maximum likelihood tree obtained from the complete data matrix with Aenictus removed, under k-means partitioning. Scale is in number of substitutions per site. Nodal support in percent bootstrap.

Bayesian analysis of k-means partitioned combined data matrix strongly supports the New World Clade and monophyly of Aenictogiton+Dorylus, unlike the ML analysis of the same dataset under the same partitioning scheme and sequence evolution model (Figure 2P; Supplementary Figure 18).

Supplementary Figure 18:

Bayesian posterior consensus tree obtained from complete data matrix under k-means partitioning. Scale is in number of substitutions per site. Nodal support in posterior probability.

In summary, although concatenated ML and species tree analysis of combined data matrix and high-bootstrap loci support grouping of Old World army ants Aenictus with New World army ants under some conditions, analyses using more reliable data are congruent in their support for independent origins of the army ant syndrome in New World and Old World.

The Timeline of Doryline Evolution and Diversification

Fossilized birth-death (FBD) process divergence dating (Heath et al., 2014) employed here shows that crown doryline ants started diversifying in the Cretaceous, around 74 Ma (53–101 Ma 95% highest posterior density interval or HPD) ago according to Bayesian inference (97 Ma under penalized likelihood or PL; Supplementary Figure 19) (Figure 3). The FBD results near the root are characterized by high uncertainty but the mean age suggests a younger crown age than the 87 Ma recovered in a previous study (Brady et al., 2014). The difference is likely at least in part due to a different calibration approach used in the present study (FBD versus node dating) and a revised, younger age estimate of Baltic amber (Aleksandrova and Zaporozhets, 2008). Similar to concatenation, divergence dating shows a tree highly compressed during early evolution, with 16 splits occurring within the first 20 Ma. According to penalized likelihood this period of early diversification lasted longer, accounting for about 35 Ma. The most recent common ancestor of the New World Clade is resolved at 60 Ma (43–83 Ma 95% HPD interval; 60 Ma under PL). The split of Old World army ant lineages of Aenictus and Dorylus+Aenictogiton is apparently very ancient at 58 Ma (41–79 Ma 95% HPD; 54 Ma under PL) and occurred during the initial diversification period. The old age of this node and the long branch subtending extant Aenictus help explain why this relationship is difficult to recover. The five currently recognized genera of New World army ants share an ancestor at about 28 Ma (20–37 Ma 95% HPD; 27 Ma under PL) ago. These dates are younger than those inferred for several of the non-army ant doryline genera such as Eburopone or Leptanilloides. The conspicuous above-ground foraging seen in some Dorylus driver ants and in New World Eciton is remarkably young, as it appears that these groups diversified within the last 5–6 Ma, a result robust across different dating analyses (see Supplementary Figures 19–21). The driver ant genus Dorylus is shown to be young relative to earlier analyses (Brady, 2003; Kronauer et al., 2007) at ca. 16 Ma (11–22 Ma 95% HPD; 15 Ma under PL).

Supplementary Figure 19:

Summary tree with mean ages from 100 analyses under penalized likelihood in Chronos.

Figure 3:

Timeline of doryline evolution and biogeographic history. The highest relative probabilities of ancestral ranges are shown as inferred using BioGeoBEARS under DEC+J, averaged over 100 trees from the BEAST posterior. The tree is the BEAST consensus. Selected mean divergence time estimates are given at nodes. Stars signify rate shifts. All dates in Ma. See Supplementary Figures 20, 21, and 26 for average node dates, posterior probabilities from the BEAST analysis, and pie charts of relative probabilities of all possible ancestral ranges, respectively.

The Bayesian divergence time estimates, especially those early in the tree and outside the New World Clade, are associated with considerable uncertainty. This is likely due to both topological uncertainty and the fact that only seven fossil calibrations were available for the dorylines, all except one located within the New World Clade (Supplementary Table 4; Supplementary Figure 20).

Supplementary Figure 20:

Mean ages and their 95% confidence intervals on the consensus BEAST tree inferred under fossilized birth-death process. All ages in Ma.

Diversification shift analyses in BAMM (Rabosky, 2014) identified a three-shift scenario for dorylines, with shifts occurring separately on the branches subtending Aenictus, Dorylus, and Neivamyrmex (Figure 3). Rate shift configurations in which only two shifts were identified, however, were also common in the posterior sampling (Supplementary Figure 25), either on the branches leading to Aenictus and Neivamyrmex only, or on the branch subtending the clade of Aenictus, Aenictogiton, and Dorylus and the branch subtending Neivamyrmex (see Supplementary Figures 22–25).

Supplementary Figure 21:

Posterior probabilities on the consensus BEAST tree inferred under fossilized birth-death process. Red dots reflect monophyly constraints used in the dating analysis. All ages in Ma.

Supplementary Figure 22:

BAMM rate shift tree showing the overall best fit configuration. Red-filled circles signify placement of the shifts.

Supplementary Figure 23:

BAMM rate shift tree showing net diversification rates. A: Aenictus, D: Dorylus, L: Lioponera, N: Neivamyrmex.

Supplementary Figure 24:

BAMM plot showing nine most common shift configurations in the credible set. The “f” number corresponds to the proportion of the posterior samples in which this configuration is present. A: Aenictus, D: Dorylus, L: Lioponera, N: Neivamyrmex.

Supplementary Figure 25:

BAMM cohort plot. Blocks signify comparisons of shift regimes among species and clades, except across the diagonal which represents the comparison of a species to itself. A: Aenictus, D: Dorylus, L: Lioponera, N: Neivamyrmex.

Biogeographic History

Strong geographic affinities are apparent within the doryline phylogeny. Large clades are mostly confined to only one or two adjacent realms, although movement within both the Old and New World appears to have been common (Figure 3). There is strong evidence for Old World origins of the dorylines and the analyses summarized across a sample of trees from the posterior suggest an Afrotropical ancestral range (Figure 3; Supplementary Figure 26), although the analysis on the BEAST consensus tree alone results in high uncertainty, with combined Afrotropical-Malagasy-Oriental as the most likely ancestral range (Supplementary Figure 27). Two lineages are confined to the New World. One gave rise to the radiation of almost all extant New World forms, including New World army ants and their kin. The dates estimated for the origin of this New World Clade coincide with warm climatic conditions and multiple land bridges connecting the Old and New World in northern latitudes (Brikiatis, 2014). The other New World group is much younger and appears to have arrived some time after 28 Ma. The presumably SE Asian or Palearctic ancestor of the New World species of Syscia either crossed the Beringian land bridge or dispersed across the Pacific further south, since by that time North Atlantic connections were closed (Sanmartín et al., 2001). The Old World Aenictus and New World genus Neivamyrmex, although superficially similar in appearance and biology, illustrate different scenarios of biogeographic history for generic lineages within Dorylinae. Crown Aenictus is older at 23 Ma (34 Ma under PL) and originated in the Indomalayan region. It subsequently moved into the Afrotropics, dispersed back into Indomalaya, and moved into Australasia with possible movement back into Indomalaya. Some species also range into the Palearctic. In contrast, Neivamyrmex remained largely confined to the Neotropics where it originated around 13 Ma (20 Ma under PL) ago, with at least one clade moving into and diversifying in the southern Nearctic and with some species returning to the Neotropics or straddling the boundary of the two adjacent regions. Madagascar is a center of doryline diversity with seven genera overall, two of them endemic, but no true army ants (Figure 3).

Supplementary Figure 26:

Relative likelihoods of range estimations from BioGeoBEARS under DEC+J, averaged over 100 posterior BEAST trees. Pie charts at the nodes correspond to ancestral state estimations and pie charts on the corners correspond to ranges immediately following speciation. The region names are abbreviated as follows: Neotropical (T), Nearctic (N), Palearctic (P), Afrotropical (E), Malagasy (M), Indomalayan (O), and Australasian (A). All ages in Ma.

Supplementary Figure 27:

Relative likelihoods of ranges from BioGeoBEARS under DEC+J estimated on the BEAST consensus tree. Pie charts at the nodes correspond to ancestral state estimations and pie charts on the corners correspond to ranges immediately following speciation. The region names are abbreviated as follows: Neotropical (T), Nearctic (N), Palearctic (P), Afrotropical (E), Malagasy (M), Indomalayan (O), and Australasian (A). All ages in Ma.

Supplementary Figure 28:

Box plot comparison of properties of slow-evolving, compositionally homogeneous, and “high signal” or high average bootstrap loci. A: Relative composition frequency variability (RCFV), B: Slope of regression of p-distances against distances on the ML tree from a locus. Higher RCFV signifies more compositional heterogeneity and higher slope of regression signifies less potential for saturation.

Concluding Remarks

Doryline Biology and Evolution Need Further Study

The new phylogeny presented here reveals that the army ant syndrome can be viewed as both an example of long-term evolutionary stasis and a remarkable case of convergent evolution. Brady (2003) argued that the army ant syndrome originated only once around 100 Ma ago and has since persisted without loss in any descendant lineages. While the present study suggests that this set of behavioral and morphological traits evolved at least twice in the Dorylinae, it also shows that the syndrome has been conserved within Old and New World army ants. The alternative scenario of single origin requires multiple losses on lineages leading to both Old and New World army ants (Figure 3), an unlikely proposition given that no species are known to have lost any of the syndrome components in the large and diverse genera such as Aenictus, Dorylus, or Neivamyrmex.

Despite the improved resolution of the army ant tree, much work remains to be done with regard to doryline evolution. A particularly vexing matter is poor knowledge of the Afrotropical genus Aenictogiton. Although it has been assumed to be a subterranean army ant based on male morphology and its phylogenetic affinity to Dorylus, there is no direct evidence of army ant behavior or queen morphology (Borowiec, 2016b). If Aenictogiton is not an army ant, our views on the evolution of the army ant syndrome have to be adjusted. In general, the current knowledge of doryline ecology and behavior is mostly limited to the minority of species that are conspicuous above-ground foragers, although the clonal Ooceraea biroi is a notable exception (Tsuji and Yamauchi, 1995; Ravary and Jaisson, 2002; Oxley et al., 2014). Better understanding of doryline behavior and morphology will undoubtedly yield further insights into the evolution of the army ant syndrome. For example, independent evolution of army ants is perhaps less surprising given that certain components of the syndrome (e.g., derived queen and male morphologies) appear multiple times within the subfamily. Unfortunately, too little data exists for a rigorous study of these trends in a comparative framework (Borowiec, 2016b). Early anatomical research implied army ant polyphyly by emphasizing differences in sting and Dufour gland morphologies (Hermann, 1969; Billen, 1985; Billen and Gotwald, 1988) between the Old and New World army ants. The new phylogenetic and taxonomic (Borowiec, 2016b) framework should reinvigorate comparative work on doryline morphology and biology.

Densely Sampled Phylogenomic Data Sets Are Not Robust to Artefacts and Bias

Genome-scale data offers powerful tools for reconstructing phylogenies. This study, however, demonstrates that caution is necessary when evaluating hypotheses generated by these new data sets, even when taxon sampling is comprehensive. Empirical studies that emphasize the importance of bias often recommend improving taxon sampling but usually also deal with cases where sampling could be increased relative to first attempts, such as in broad-scale phylogenies of eukaryotes or Metazoa (Delsuc et al., 2005). The doryline phylogeny represents a case where the scope for improvement by additional sampling of currently known lineages is limited. This is because adding more species is unlikely to significantly shorten long branches in the tree (Borowiec, 2016b). Researchers should thus be wary of systematic bias whenever a combination of very short and very long branches is encountered, regardless of taxon sampling. Two general strategies are available for exploring sensitivity to model mis-specifications: using only data less prone to bias or applying better-fitting heterogeneous models (Rodríguez-Ezpeleta et al., 2007). The latter approach is ideal but computationally expensive, becoming prohibitive with data sets such as the one used here, including over 150 taxa and thousands of loci.

In this study, reduction of phylogenetic noise resulting from compositional heterogeneity and saturation increased congruence among topologies obtained using different analytics. In the case of the complete data set, mutually exclusive and strongly supported results were obtained depending on the statistical framework (e.g., maximum likelihood vs. Bayesian) or sequence evolution model (e.g., k-means partitioning vs. partitioning by locus) chosen for analysis. Using only “high signal” loci (Salichos and Rokas, 2013) with highest average bootstrap exacerbated incongruence among analyses. These loci also exhibited undesirable properties such as higher potential for saturation and violation of among-taxon compositional homogeneity. This indicates that additional measures of data quality are needed in phylogenomics, such as direct and indirect measures of model mis-specification (Brown, 2014). Other recent research suggests that analysis results can be strongly affected by a tiny proportion of highly biased loci or sites (Shen et al., 2017). In conclusion, phylogenomic studies should always perform sensitivity analyses to test the robustness of the result to different analytics.

Methods

Taxon Sampling and Data Generation

For a detailed description of methods and additional references see Supplementary Materials. I chose doryline species for the analyses based on a recent generic revision of the subfamily (Borowiec, 2016b) and other recent taxonomic and phylogenetic work. I maximized the breadth of sampling by including at least one representative from each biogeographic region in which a genus occurs and aiming to sample across morphologically disparate groups within genera. I extracted the genomic DNA, prepared libraries and enriched them with molecular probes designed as described in Faircloth et al. (2014), targeting 2,524 loci (Branstetter et al., 2017). Dual-indexed (Faircloth and Glenn, 2012), enriched libraries were sequenced on two lanes of an Illumina HiSeq 2500 platform. I cleaned demultiplexed reads using Illumiprocessor (Faircloth, 2011) and performed assembly with Trinity (Grabherr et al., 2011). I then mapped resulting contigs to probe sequences and mined published genomes for outgroup and one ingroup sequence using the Phyluce pipeline (Faircloth, 2015), aligned orthologous sequences with UPP (Nguyen et al., 2015), and trimmed using Gblocks (Talavera and Castresana, 2007) with stringency settings relaxed from the default. I discarded any loci with fewer than 114 taxa (70%) from further analyses. This resulted in 2,166 orthologous loci, 412 bases long on average.

Extraction of protein-coding sequences was done by blasting (Camacho et al., 2009) UCE loci against published proteins of three reference ant species (Acromyrmex echinatior, Harpegnathos saltator, and Ooceraea biroi), followed by collecting of protein queries and their nucleotide equivalents from best BLASTX hits for each locus using a custom bioinformatics pipeline. Separate alignment and trimming of resulting data set to include no fewer than 70% of all taxa produced an amino acid matrix of 1,103 loci 89.5 k amino acids long.

Phylogenetics

I first estimated a gene tree for each locus using RAxML (Stamatakis, 2014) under the GTR+4Γ model with 200 rapid bootstrap replicates. I then used individual locus characteristics to assess whether loci with different properties produce different phylogenies. Most basic statistics, such as alignment length, proportion of variable sites, and missing data were calculated using AMAS (Borowiec, 2016a). Using a custom R script I computed average branch lengths and average bootstrap support, used here as a proxy for rate of evolution and phylogenetic signal, respectively. With a custom Python script I calculated RCFV (Zhong et al., 2011), a measure of compositional heterogeneity, and performed a simulation-based compositional heterogeneity test with the Python p4 phylogenetics toolkit (Foster, 2004). Following this, I constructed a concatenated matrix of 271 loci with the highest average bootstrap support, equal in length to 1/5 of the sites present in the combined data set. I also prepared a matrix of 379 compositionally homogeneous loci only and an alignment identical to the combined data matrix but with Aenictus removed. I also constructed an amino acid matrix from extracted protein-coding sequences. For each of the concatenated nucleotide UCE data sets I used three partitioning schemes: 1) partitioning by UCE locus, 2) partitioning using the k-means algorithm (Frandsen et al., 2015) implemented in PartitionFinder2 (Lanfear et al., 2017), and 3) an unpartitioned model. The protein-coding sequences were analyzed as partitioned by locus. I estimated the phylogeny with RAxML under the partitioned GTR+4Γ model and 500 rapid bootstrap replicates (see Extended Methods) for all nucleotide matrices except for combined data partitioned by locus, which was analyzed with 100 bootstrap replicates. The amino acid matrix was analyzed under the JTT+4Γ model and 100 bootstrap replicates.
In addition to maximum likelihood analysis on all matrices, I also ran Bayesian analysis using ExaBayes (Aberer et al., 2014) on combined dataset. Consistent with the recent criticism of the k-means algorithm (Baca et al., 2017), this partitioning scheme appears to result in incorrect topology under ML analysis of the combined data and “high signal” loci matrices but not in analyses of slow-evolving or compositionally homogeneous loci or the Bayesian analysis of combined data. The concatenated analyses were done on CIPRES (Miller et al., 2010) and on the University of Rochester Center for Integrated Research Computing BlueHive computer cluster. I also used the statistical binning pipeline (Bayzid et al., 2015) and ASTRAL (Mirarab and Warnow, 2015) to estimate summary species trees on the combined data matrix, “high signal”, and slow-evolving loci.
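The construction of the “high signal” matrix can be illustrated with a minimal Python sketch. It assumes that loci were ranked by average bootstrap support and added until the cumulative alignment length reached one fifth of all sites; the function name, the tuple layout, and the toy values are hypothetical and do not come from the original scripts.

```python
def select_high_signal_loci(loci, fraction=0.2):
    """Pick loci with the highest mean bootstrap until a share of all sites is reached.

    loci: list of (name, alignment_length, mean_bootstrap) tuples (assumed layout).
    Returns the chosen locus names and the number of sites they contain.
    """
    total_sites = sum(length for _, length, _ in loci)
    target = fraction * total_sites
    # Rank loci from highest to lowest average bootstrap support.
    ranked = sorted(loci, key=lambda rec: rec[2], reverse=True)
    chosen, sites = [], 0
    for name, length, _ in ranked:
        if sites >= target:
            break
        chosen.append(name)
        sites += length
    return chosen, sites


# Toy values for illustration only; these are not data from the study.
toy_loci = [
    ("uce-1", 400, 62.0),
    ("uce-2", 300, 35.5),
    ("uce-3", 500, 71.2),
    ("uce-4", 450, 48.9),
    ("uce-5", 350, 28.4),
]
selected, n_sites = select_high_signal_loci(toy_loci, fraction=0.2)
```

Because whole loci are added, the selected matrix may slightly overshoot the target length.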

Divergence Time Estimation

I employed two different strategies to estimate the time-calibrated tree. The first approach used penalized likelihood and the fixed tree topology and branch lengths from the slow-evolving data. I used penalized likelihood as implemented in the chronos function of the R package ape (Paradis, 2013). I used an outgroup-rooted tree from the concatenated analysis of slow-evolving loci and included calibrations for nodes in the outgroup. Hard bounds were placed on node calibrations using fossil ages for the upper bound and previous estimates for node ages as lower bounds (for more details see Supplementary Table 3). The strict molecular clock could not be rejected for this tree, and I performed the analysis for 100 replicates in order to assess robustness to random starting values of the algorithm. The second approach utilized Bayesian inference with only ingroup taxa included, with several clades constrained as monophyletic and the root position and backbone relationships of the tree sampled from the posterior. Because Bayesian divergence time estimation is computationally very expensive, that approach was limited to 109 loci evenly distributed across the spectrum of rate of evolution. I used BEAST2 (Bouckaert et al., 2014) under the unpartitioned GTR+4Γ model, uncorrelated molecular clock with branch lengths drawn from a lognormal distribution, and fossilized birth-death process (Heath et al., 2014) conditioning on the root to obtain divergence time estimates and a posterior sample of trees for biogeographic analyses. Calibrations included seven doryline fossils (see Supplementary Table 4 for more details): a Neivamyrmex sp. from Chiapas Amber, five species from Dominican Amber (Acanthostichus hispaniolicus, Cylindromyrmex inopinatus, C. electrinus, C. antillanus, and Neivamyrmex ectopus), and Chrysapace sp. from Baltic Amber. Another Baltic Amber fossil, Procerapachys spp. was not used because of uncertainty in its placement among crown or stem dorylines.
I ran four independent MCMC chains for over 250 million generations until convergence was reached. Combined effective sample size (ESS) for all parameters was above 200.
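One way to obtain a subsample of loci spread evenly across the rate spectrum is to sort loci by average branch length and take every k-th one. The Python sketch below makes that assumption explicit; the step of 20 (about 5% of loci), the function name, and the toy rates are illustrative choices rather than the original procedure.

```python
def sample_evenly_by_rate(mean_branch_lengths, step=20):
    """Return every `step`-th locus after sorting loci from slowest to fastest.

    mean_branch_lengths: dict mapping locus name to average branch length,
    used here as a proxy for rate of evolution.
    """
    ranked = sorted(mean_branch_lengths, key=mean_branch_lengths.get)
    return ranked[::step]


# Toy rates for 2,166 hypothetical loci; not values from the study.
toy_rates = {f"uce-{i}": 0.001 * (i + 1) for i in range(2166)}
subset = sample_evenly_by_rate(toy_rates, step=20)
```

With 2,166 loci and a step of 20, this slicing returns 109 loci.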

Diversification Analyses

To assess diversification rate shifts on the doryline phylogeny I used BAMM v2.5 (Rabosky, 2014). For input I used the consensus tree from BEAST analyses and sampling probabilities based on extant species diversity estimates (Borowiec, 2016b) for each genus in order to correct for uneven sampling of extant taxa across the phylogeny. I ran the MCMC for 20 million generations, sampling every 2,000 generations.

Biogeographic History Estimation

I used the R package BioGeoBEARS (Matzke, 2013, 2014) for model selection and estimation of biogeographic history of the dorylines. As input I used the consensus tree from BEAST runs and a sample of 100 trees from the posterior of the same analysis. I compared six models commonly used for biogeographic inference on the consensus tree, with the DEC+J model emerging as the best-fitting. I then estimated biogeographic history on the 100 trees under this model to account for uncertainty in the deeper relationships and the timeline of early divergences. These results were then summarized on the consensus tree.

Extended Methods

Data availability

Trimmed reads generated for this study are available at the NCBI Sequence Read Archive (to be submitted upon acceptance for publication). Sequence files, alignments, configuration files, and output of analyses, including phylogenetic trees, are available on Zenodo: https://doi.org/10.5281/zenodo.569071. Custom scripts used in this study are available on GitHub: https://doi.org/10.5281/zenodo.571246. Pipeline for extracting protein-coding sequences is available in a separate GitHub repository at https://doi.org/10.5281/zenodo.571247.

Taxon sampling

Taxon sampling included 154 newly sequenced ingroup species from all 27 currently recognized genera of Dorylinae ants and was guided by previous taxonomic and phylogenetic studies (De Andrade, 1998a; Borgmeier, 1955; Borowiec and Longino, 2011; Brady, 2003; Brady and Ward, 2005; Brady et al., 2014; Jaitrong and Yamane, 2011; MacKay, 1996; Bolton and Fisher, 2012) and expertise acquired preparing the global genus-level taxonomic revision of the group (Borowiec, 2016b). In addition to sampling all doryline genera, species from all biogeographic regions (as defined in (Cox, 2001) but treating the Malagasy region separate from Afrotropical) were included for each major lineage. Nine outgroup and one ingroup species (Ooceraea biroi) were also included based on publicly available ant genomes: Atta cephalotes (Suen et al., 2011), Camponotus floridanus (Bonasio et al., 2010), Cardiocondyla obscurior (Schrader et al., 2014), Harpegnathos saltator (Bonasio et al., 2010), Linepithema humile (Smith et al., 2011a), Ooceraea biroi (Oxley et al., 2014), Pogonomyrmex barbatus (Smith et al., 2011b), Solenopsis invicta (Wurm et al., 2011), and Vollenhovia emeryi (Smith et al., 2015).

Molecular data collection and sequencing

I extracted DNA from all newly sequenced specimens (Supplementary Table 1) using a DNeasy Blood and Tissue Kit (Qiagen, Valencia, CA, USA). Most specimens were extracted non-destructively, with extraction voucher retained. For several extractions the DNA collection was done destructively and a voucher specimen from the same colony was kept. I quantified DNA for each extraction using a Qubit fluorometer (Thermo Fisher Scientific, Waltham, MA, USA) and sheared <5–50 ng of DNA to a target size of approximately 400–600 bp. The shearing was done by sonication on a Bioruptor machine (Diagenode Inc., Philadelphia, PA, USA).

The library preparation protocol that follows was slightly modified from Blaimer et al. (2015). I used a KAPA Hyper Prep Library Kit (Kapa Biosystems, Inc., Wilmington, MA, USA) with magnetic bead cleanup (Fisher et al., 2011) and a SPRI substitute (Rohland and Reich, 2012) as described in Faircloth et al. (2014). I used TruSeq adapters (Faircloth and Glenn, 2012) for ligation followed by PCR amplification of the library using a mix of HiFi HotStart polymerase reaction mix (Kapa Biosystems), Illumina TruSeq primers, and nuclease-free water. The following thermal cycler program was used for the PCR: 98 °C for 45 s; 13 or 14 cycles of 98 °C for 15 s, 65 °C for 30 s, 72 °C for 60 s; and a final extension at 72 °C for 5 min. After rehydrating in 23 μl pH 8 Elution Buffer (EB hereafter) and purifying reactions using 1.1–1.2× speedbeads, I pooled nine to eleven libraries at equimolar ratios for final concentrations of 132–212 ng/μl.

I enriched each pool with 9,446 custom-designed probes (MYcroarray, Inc.) targeting 2,524 UCE loci in Hymenoptera (Branstetter et al., 2017). I followed library enrichment procedures for the MYcroarray MYBaits kit (Blumenstiel et al., 2010), except that I used 0.1× of the standard MYBaits concentration and added 0.7 μl of 500 μmol custom blocking oligos designed against the custom sequence tags. I ran the hybridization reaction for 24 h at 65 °C, subsequently bound all pools to streptavidin beads (MyOne C1; Life Technologies), and washed bound libraries according to a standard target enrichment protocol (Blumenstiel et al., 2010). I used the with-bead approach for PCR recovery of enriched libraries as described in Faircloth (2015). I combined 15 μL of streptavidin bead-bound, enriched library with 25 μL HiFi HotStart Taq (Kapa Biosystems), 5 μL of Illumina TruSeq primer mix (5 μmol each) and 5 μL of nuclease-free water. Post-enrichment PCR had the following profile: 98 °C for 45 s; 18 cycles of 98 °C for 15 s, 60 °C for 30 s, 72 °C for 60 s; and a final extension of 72 °C for 5 min. I purified resulting reactions using 1.1–1.2× speedbeads, and rehydrated the enriched pools in 22 μL EB.

Following enrichment I quantified 2 μL of each pool using a Qubit fluorometer (broad range kit). I verified whether the enrichment was successful by amplifying four UCE loci targeted by the probe set. I set up a relative qPCR by amplifying two replicates of 1 ng of enriched DNA from each pool for the four loci and comparing those results to two replicates of 1 ng unenriched DNA from each pool. I performed qPCR using a SYBR FAST qPCR kit (Kapa Biosystems) on a CFX Connect Real-Time PCR Detection System (Bio-Rad). Following data collection, I calculated fold-enrichment values, assuming an efficiency of 1.78 and using the formula 1.78^abs(enrichedCp - unenrichedCp), that is, the efficiency raised to the power of the absolute difference in Cp values. I then performed qPCR quantification by creating dilutions of each pool (1:200,000, 1:800,000, 1:1,000,000, 1:10,000,000) and assuming an average library fragment length of 600 bp. Based on the size-adjusted concentrations estimated by qPCR, I pooled libraries at equimolar concentrations.
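The fold-enrichment calculation can be written as a short Python function. This sketch assumes that the amplification efficiency (1.78) is raised to the power of the absolute Cp difference, as is usual for efficiency-based qPCR calculations; the function name and the Cp values in the example are hypothetical.

```python
def fold_enrichment(enriched_cp, unenriched_cp, efficiency=1.78):
    """Fold enrichment from crossing-point (Cp) values of enriched and unenriched DNA.

    Assumes each PCR cycle multiplies the template by `efficiency`, so a Cp
    difference of d cycles corresponds to an efficiency**d fold difference.
    """
    return efficiency ** abs(enriched_cp - unenriched_cp)


# Hypothetical Cp values for one locus in one pool (not measurements from the study).
example = fold_enrichment(enriched_cp=14.0, unenriched_cp=22.0)
```

Identical Cp values give a fold enrichment of 1, meaning no enrichment relative to the unenriched library.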

The pooled libraries were then subjected to further quality control on Bioanalyzer and sequenced using one full and one partial lane of a HiSeq 125 Cycle Paired-End Sequencing v4 run. QC and sequencing were performed at the University of Utah High Throughput Genomics Core Facility. Quality-trimmed sequence reads generated as part of this study are available from the NCBI Sequence Read Archive (to be submitted upon acceptance for publication).

Processing of UCE data

Read cleaning, assembly, matching of contigs to probes, construction of the unaligned data set, and alignment trimming were done using the Phyluce pipeline scripts (Faircloth, 2015). I trimmed the FASTQ data using illumiprocessor, a wrapper around Trimmomatic (Bolger et al., 2014), with default settings (LEADING:5, TRAILING:15, SLIDINGWINDOW:4:15, MINLEN:40). Assemblies were done using Trinity v20140717 (Grabherr et al., 2011) with the phyluce_assembly_assemblo_trinity wrapper. The orthology assessment was then done by matching the assembled contigs to enrichment probe sequences with phyluce_assembly_match_contigs_to_probes (min_coverage=50, min_identity=80). This step generated a sqlite database which was then used to build FASTA files for the 2,524 orthologous loci with phyluce_assembly_get_match_counts, phyluce_assembly_get_fastas_from_match_counts, and phyluce_assembly_explode_get_fastas_file.

Extraction of protein-coding sequences

For the purpose of extracting protein-coding data from the sequenced UCE loci I developed a custom bioinformatics workflow that consists of three major components: 1) using NCBI BLASTX (Camacho et al., 2009) to match unaligned UCE sequences to reference proteins, 2) choosing the best hit for each sequence, and 3) extracting protein queries and their nucleotide equivalents from those hits. Using the makeblastdb program of the BLAST package I prepared a database from three publicly available collections of protein sequences of Acromyrmex echinatior (Nygaard et al., 2011), Harpegnathos saltator (Bonasio et al., 2010), and Ooceraea biroi (Oxley et al., 2014). Using BLASTX against this database resulted in one BLAST output file per UCE locus, containing multiple matches (hits) for each UCE sequence (taxon). Each hit, in turn, may be composed of one or more ranges that correspond to protein fragments (exons). I used BLAST scores for those ranges to identify best hits for each taxon and UCE. This was done with custom Python code using Biopython’s (Cock et al., 2009) module for parsing BLAST XML output and the following logic:

For each sequence (taxon) and each hit within it, both total and maximum scores are tallied. If a hit’s total score is equal to the maximum score of its ranges and it corresponds to the maximum score for the taxon, such a hit is considered best and is kept. This means that the hit was composed of a single range and its score was not exceeded by any other hit, whether composed of one or multiple ranges. If a hit’s total score is higher than the maximum score of any one of its ranges, and that hit’s total score is the best hit score for a species, the hit is kept unless it contains overlapping ranges. Finally, if the total hit score is equal to its maximum score but not to the best hit score for a species, its total score is checked to see if it corresponds to the highest individual range score for a species. If this is true, the hit is kept. Such a hit would be composed of a single range and considered best even if hits with higher total scores but overlapping ranges are present. If composed of multiple ranges, a best hit is concatenated into a query in the order based on its coordinates and its presence on either the forward or reverse strand. These protein queries are then matched to the corresponding input nucleotide sequence or its reverse complement.
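The selection rules above can be sketched in Python on plain data structures. This is an illustrative reading of the described logic, not the original pipeline code: the `Hit` class, the query-coordinate overlap test, and the tie handling (first qualifying hit wins) are assumptions, and parsing of BLAST XML is left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Hit:
    name: str
    # Each range is (query_start, query_end, score); coordinates are inclusive.
    ranges: List[Tuple[int, int, float]]

    @property
    def total_score(self) -> float:
        return sum(score for _, _, score in self.ranges)

    @property
    def max_score(self) -> float:
        return max(score for _, _, score in self.ranges)

    def has_overlap(self) -> bool:
        # After sorting by start, any overlap shows up between neighbours.
        ordered = sorted(self.ranges)
        return any(prev[1] >= nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


def best_hit(hits: List[Hit]) -> Optional[Hit]:
    """Choose the best hit for one taxon at one locus, or None if none qualifies."""
    if not hits:
        return None
    best_total = max(h.total_score for h in hits)
    best_range = max(h.max_score for h in hits)
    # Rules 1 and 2: a hit with the best total score is kept if it is a single
    # range, or if it has several ranges that do not overlap.
    for h in hits:
        if h.total_score == best_total:
            single_range = h.total_score == h.max_score
            if single_range or not h.has_overlap():
                return h
    # Rule 3: otherwise keep a single-range hit carrying the highest range score.
    for h in hits:
        if h.total_score == h.max_score and h.total_score == best_range:
            return h
    return None


# Hypothetical hits: A has the best total score but its ranges overlap.
hit_a = Hit("protein_A", [(1, 120, 80.0), (100, 200, 70.0)])
hit_b = Hit("protein_B", [(1, 150, 90.0)])
chosen = best_hit([hit_a, hit_b])
```

In this example `protein_B` is chosen, because the higher-scoring `protein_A` is rejected for its overlapping ranges.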

If introns that do not change the reading frame are present, translations of the query sequence may span across them. Because of this I performed additional trimming if long (4 sites or more) gaps in the subject protein sequence were found. All sites corresponding to those long gaps were trimmed from the protein query and its nucleotide equivalent. If at this point a stop codon was still present in the protein query, the record was discarded.

For each record the resulting protein queries and their corresponding nucleotides were considered ready for downstream analyses.

Alignment and trimming

Assembly and contig matching resulted in sequences of varying lengths across taxa, as evidenced by summaries produced with phyluce_assembly_get_fasta_lengths. Because of this I used UPP (Nguyen et al., 2015), a phylogeny-aware alignment tool designed to align fragmentary sequences to a backbone of longer sequences. Based on the performance (recovered final post-trimming alignment length) of different settings, for the backbone I chose the cutoff of 30% of the longest sequences present in each locus. Because the UPP version used did not have an option to specify a fixed number of longest sequences in the backbone, a locus-specific command was printed for each locus based on its fragment size distribution with UPP’s -M option set to longest sequence and threshold (option -T) calculated to encompass 30% of taxa. UPP was also set to filter sequences from the backbone if their branches were 5 times longer than the median for all backbone sequences (-l 5 argument):

run_upp.py -s [input_alignment] -M [longest_sequence_length] -T [locus_threshold] -d [alignment_output_directory] -o [output] -p [temporary_output_directory] -l 5

These custom commands were printed with print_upp_command.py, a custom script utilizing code from phyluce_assembly_get_fasta_lengths.

Although alignment trimming has been recently criticized (Tan et al., 2015), the untrimmed alignments contained on average more than 75% of gaps and missing data. Because of the substantial computational burden of gap-rich data analysis, I trimmed the alignments using Gblocks (Talavera and Castresana, 2007) under settings relaxed from the default (b1=0.5 b2=0.5 b3=12 b4=7). I calculated alignment statistics and manipulated the files using AMAS v0.98 (Borowiec, 2016a). All loci with fewer than 114 taxa (less than 70%) were discarded, resulting in 2,166 out of 2,524 loci for downstream analyses. These loci had on average 151.7 (92.5%) taxa, were 412.2 nucleotides long, and had 7.7% missing data (gaps). Due to computing time constraints, loci with protein-coding sequences extracted were aligned using MAFFT v7.300b with --leavegappyout setting turned on. These alignments were trimmed using Gblocks with settings as above and further trimmed for outlier taxa using a custom R script and AMAS, removing any ingroup sequences whose uncorrected p-distance was more than 3σ from the mean for a locus.
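The outlier-trimming step can be illustrated with a minimal Python sketch. It assumes that each sequence is summarized by its mean uncorrected p-distance to all other sequences in the locus and flagged when that value lies more than three standard deviations from the locus mean; the original R script may summarize distances differently, and the restriction to ingroup sequences is omitted here. All names and the toy alignment are hypothetical.

```python
from statistics import mean, pstdev

MISSING = {"-", "?"}


def p_distance(seq_a, seq_b):
    """Proportion of differing sites among sites where neither sequence is missing."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b)
             if a not in MISSING and b not in MISSING]
    if not pairs:
        return None
    return sum(a != b for a, b in pairs) / len(pairs)


def mean_distances(alignment):
    """Mean p-distance of each sequence to every other sequence in the alignment."""
    out = {}
    for name, seq in alignment.items():
        dists = [p_distance(seq, other_seq)
                 for other, other_seq in alignment.items() if other != name]
        dists = [d for d in dists if d is not None]
        if dists:
            out[name] = mean(dists)
    return out


def flag_outliers(alignment, n_sigma=3.0):
    """Names of sequences whose mean p-distance deviates by more than n_sigma SDs."""
    per_taxon = mean_distances(alignment)
    values = list(per_taxon.values())
    mu, sigma = mean(values), pstdev(values)
    return [name for name, d in per_taxon.items() if abs(d - mu) > n_sigma * sigma]


# Toy alignment: eleven identical sequences and one highly divergent sequence.
toy = {f"taxon_{i}": "ACGTACGTAC" for i in range(11)}
toy["divergent"] = "TTTTTTTTTT"
outliers = flag_outliers(toy)
```

Here only the sequence named `divergent` is flagged. A 3σ rule can only flag a single outlier when the locus contains enough sequences, which is the case for loci with more than 100 taxa.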

Partitioning

PartitionFinder 2 (Lanfear et al., 2017) was used to partition concatenated alignments using the k-means clustering of sites based on evolutionary rates (Frandsen et al., 2015). A starting tree for model fit and site clustering algorithm was generated with RAxML Pthreads v8.2.3 (Stamatakis, 2014) using the fast tree search algorithm (-f E flag). Because of recent criticism of the k-means algorithm (Baca et al., 2017), I performed additional ML analyses for the combined data set under unpartitioned and partitioned by locus schemes. Protein-coding sequences were analyzed as partitioned by locus. K-means tends to result in a relatively low number of partitions, whereas the partitioning by locus scheme was not computationally feasible for Bayesian inference.

Phylogenetic analyses using maximum likelihood

For each locus I estimated a gene tree with RAxML Pthreads v8.2.3 under a general time-reversible model of sequence evolution with rate modeled using a gamma distribution discretized into four bins (GTR+4Γ model). 200 rapid bootstraps were followed by a thorough search of the maximum likelihood tree:

raxml -T [no_cores] -m GTRGAMMA -f a -# 200 -p 12345 -x 12345 -s [input_alignment] -n [output_file_name]

The same mode of inference was applied to supergenes created by the statistical binning pipeline (see Species Tree analyses below) but each supergene was partitioned by constituent loci using a partitions file (-q) and the -M flag was added for a fully partitioned analysis with branch lengths optimized separately for each partition (Bayzid et al., 2015):

raxml -T [no_cores] -m GTRGAMMA -f a -# 200 -p 12345 -x 12345 -M -s [input_alignment] -q [partitions_file] -n [output_file_name]

I ran the analyses of concatenated matrices with RAxML Hybrid v8.2.4 and v8.2.9 on CIPRES Portal (Miller et al., 2010) using the same model and bootstrap settings but with a pre-defined partitioning scheme (see above), no -M option due to the high number of partitions, and either 100 (amino acid analysis and combined data partitioned by locus) or 500 bootstrap replicates (all other analyses). The amino acid analysis was run under the JTT+4Γ model.

Phylogenetic analyses using Bayesian inference

I used ExaBayes v1.4.1 (Aberer et al., 2014) to perform analysis on the k-means partitioned matrix of the combined data set under GTR+4Γ. The analysis consisted of two runs with four MCMC chains each, run for 5 million generations. I disabled the default parsimony starting tree such that the analysis was initiated with a random topology. Convergence and mixing of the MCMC were determined by monitoring the average standard deviation of split frequencies, considered acceptable below 5% (final value 1.75% for combined data set), and effective sample sizes (ESS) for all parameters, considered acceptable above 200 (min ESS: 900). I used 25% burn-in to construct consensus trees.

Species tree analyses

In addition to phylogenetic inference on concatenated loci I performed species tree analyses that attempt to reconcile gene tree incongruencies arising due to incomplete lineage sorting (Edwards, 2009). I used Accurate Species TRee Algorithm, ASTRAL v4.10.12 (Mirarab and Warnow, 2015; Sayyari and Mirarab, 2016). I used a weighted statistical binning pipeline (Bayzid et al., 2015) to create supergene alignments and trees. Locus trees used as input for the pipeline were considered under 75 bootstrap support threshold. Summary methods for species tree inference such as those used here have been shown to be negatively impacted by error in estimated input gene trees (Roch and Warnow, 2015). The binning approach was devised to alleviate this (Mirarab et al., 2014). The data sets analyzed with species tree approaches included binned supergenes of all loci (514 supergenes), supergenes identified in the high bootstrap loci (147 supergenes), and supergenes from the slow-evolving loci (100 supergenes).

Measures of compositional heterogeneity

Most models commonly used for phylogenetic inference, including the partitioned GTR+4Γ model used here, assume that alignments are compositionally homogeneous among taxa (Moran et al., 2015). To quantify among-taxon compositional heterogeneity of the data, I used two approaches: 1) statistical tests of compositional heterogeneity and 2) a continuous measure of relative composition frequency variability (RCFV) (Zhong et al., 2011). The former included a phylogeny-corrected statistical test that compares compositions in data sets simulated under a model (the null distribution) to the compositions in the observed alignment (Foster, 2004). For the purpose of this study the test was done on 200 simulated alignments for each observed alignment, assuming a GTR+4Γ model and a neighbor joining tree calculated using BioNJ (Gascuel, 1997). The often used but less appropriate χ² test for compositional heterogeneity was also performed for comparison. The two tests were carried out using the p4 program for phylogenetic inference (Foster, 2004). Relative composition frequency variability (RCFV) is the other measure used here to compare compositional heterogeneity among data (Zhong et al., 2011). RCFV is the sum of absolute values of differences observed among frequencies of all four nucleotides, divided by the number of taxa. The differences are calculated by subtracting the overall frequency of the character in a matrix from the frequency of that character in an individual sequence (taxon). The sum of the absolute differences is then divided by the number of taxa and this number in turn is summed for each sequence/taxon in the alignment:

$$\mathrm{RCFV} = \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{|A_{ij} - A_i|}{n}$$

where m is the number of distinct character states (four for nucleotide sequences), A_ij is the frequency of nucleotide i in taxon j and A_i is the frequency of character (nucleotide) i across n taxa. RCFV thus gives a relative measure of compositional heterogeneity for a data set, and as the sum of differences between frequencies is calculated for each sequence, it also allows for comparison among taxa within an alignment.
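As a minimal illustration of the RCFV definition given above, the Python sketch below computes the per-taxon contributions and their sum for a nucleotide alignment held in a dictionary. It assumes that the overall frequency of each nucleotide is pooled over all sequences and that gaps and ambiguity codes are ignored; the function name and toy alignment are hypothetical and this is not the original script.

```python
def rcfv(alignment, states="ACGT"):
    """Relative composition frequency variability of an alignment.

    alignment: dict mapping taxon name to an aligned sequence string.
    Returns (total RCFV, dict of per-taxon contributions).
    """
    counts = {}
    for name, seq in alignment.items():
        seq = seq.upper()
        taxon_counts = {s: seq.count(s) for s in states}
        # Skip sequences with no unambiguous characters at all.
        if sum(taxon_counts.values()) > 0:
            counts[name] = taxon_counts
    n = len(counts)
    grand_total = sum(sum(c.values()) for c in counts.values())
    # A_i: overall frequency of each state, pooled across the matrix.
    overall = {s: sum(c[s] for c in counts.values()) / grand_total for s in states}
    per_taxon = {}
    for name, c in counts.items():
        length = sum(c.values())
        # Sum over states of |A_ij - A_i|, divided by the number of taxa.
        per_taxon[name] = sum(abs(c[s] / length - overall[s]) for s in states) / n
    return sum(per_taxon.values()), per_taxon


# Toy alignment with maximally different compositions (illustration only).
total, by_taxon = rcfv({"taxon_a": "AAAA", "taxon_b": "CCCC"})
```

Sequences with identical composition yield an RCFV of zero, and the per-taxon values show which sequences contribute most to heterogeneity.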

Tree-based locus statistics

Following maximum likelihood estimation of gene trees, I computed average branch length for each of the 2,166 loci. The average branch length is a proxy for the rate of evolution of each locus. Saturation is another property that is potentially correlated with poor model fit. I assessed it by plotting p-distances of an alignment against distances on the tree from model-based maximum likelihood inference (Philippe and Forterre, 1999). These distances would show a perfect fit to simple linear regression in the absence of saturation. When there is a need for correction for multiple substitutions, however, the curve will depart from linearity. I ranked loci by the slope of this regression as a relative measure of saturation.
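The regression-based saturation measure can be sketched in Python as an ordinary least-squares slope of pairwise p-distances on pairwise tree distances. The sketch assumes the two distance lists are already paired by taxon pair; extracting them from an alignment and a tree is not shown, and the numbers are toy values, not data from this study.

```python
def regression_slope(tree_distances, p_distances):
    """Least-squares slope of p-distances (y) regressed on tree distances (x)."""
    n = len(tree_distances)
    mean_x = sum(tree_distances) / n
    mean_y = sum(p_distances) / n
    sxx = sum((x - mean_x) ** 2 for x in tree_distances)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(tree_distances, p_distances))
    return sxy / sxx


# Toy pairwise distances for two hypothetical loci.
tree = [0.1, 0.2, 0.4, 0.8]
unsaturated = regression_slope(tree, [0.1, 0.2, 0.4, 0.8])
saturated = regression_slope(tree, [0.1, 0.18, 0.3, 0.45])
```

A slope near 1 indicates that observed differences track the model-corrected distances, whereas a shallower slope indicates that p-distances level off as tree distances grow, which suggests saturation.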

I computed all the tree-based measures with a custom R (v3.0.2 (2013-09-25)) script leveraging packages ape v3.1-1 (Paradis et al., 2004) and seqinr v3.1-2 (Charif and Lobry, 2007), modified from code originally developed for Borowiec et al. (2015).

Divergence time estimation

To build a time-calibrated chronogram of the Dorylinae I used the R v3.2.3 (R Core Team, 2014) package ape v3.4 and its function chronos (Paradis et al., 2004; Paradis, 2013). Chronos uses the penalized likelihood method (Sanderson, 2002) and allows selection of the molecular clock model best fitting the data using an information criterion introduced in Paradis (2013). As input I used the maximum likelihood tree with branch lengths, estimated from slow-evolving loci partitioned under k-means and rooted with Harpegnathos saltator. The method requires that nodes are calibrated with hard bounds of minimum and maximum ages. The calibration scheme is given in Supplementary Table 3. The information criterion implemented in chronos identified the strict molecular clock as the best fitting. Because unknown dates are initialized with a random algorithm it is possible to assess the robustness of the node ages to these initial ages by running multiple independent analyses. I ran 100 chronos replicates and summarized output trees using the sumtrees.py script distributed as a part of the DendroPy package v4.0.3 (Sukumaran and Holder, 2010). The summarized tree has mean branch lengths and mean node ages, as well as uncertainty captured as node age ranges obtained across the 100 replicates. I visualized the tree with mean node ages and age ranges using FigTree v1.4.2. Because the information criterion implemented was shown to be poor at distinguishing the strict clock from a model with a small fixed number of rates (Paradis, 2013), I repeated the procedure for a discrete clock model with 10 categories. The results are presented in Supplementary Figure 19.

Because calibrations requiring hard minimum and maximum ages may be considered prone to bias and because penalized likelihood does not utilize sequence data, I also performed Bayesian divergence dating using the recently developed birth-death process (Heath et al., 2014). This method assumes no prior belief on calibrated node ages, instead relying on a single recovery age of a fossil that is assumed to be a descendant of the calibrated node. The method simulates tree topology via a birth-death process and treats fossils as a part of the diversification process with variable attachment points on the tree and fixed recovery ages. In addition to assuming no prior beliefs on calibrated node ages, this method is not limited to using only the oldest fossils known for a given node. For these analyses I used BEAST v2.3.2 (Bouckaert et al., 2014) with Sampled Ancestors package (Gavryushkina et al., 2014). Because Bayesian divergence time estimation is computationally expensive with 150+ taxa, these analyses were limited to 109 loci (5% of all loci) sampled at even intervals throughout the rate spectrum. I set up the BEAST analyses under unpartitioned GTR+4Γ model of sequence evolution and uncorrelated clock sampling rates from a lognormal distribution. The analysis was set up with four independent runs for >300 million generations. I determined convergence and adequate sample size (all parameters sampled at ESS > 200) using Tracer v1.5. The calibration scheme included seven fossils with fixed sampling times obtained by drawing a random number from a uniform distribution (runif function in R base) bounded by minimum and maximum ages of the deposit in which the fossil is found (deposit ages follow the Fossilworks website (Alroy, 2016); Supplementary Table 4). I used conditioning on the root age with a prior.
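Drawing a fixed sampling time for each fossil from a uniform distribution bounded by the deposit ages can be sketched as follows. The study used the runif function in R; this Python equivalent is illustrative only, and the fossil labels and age bounds are placeholders rather than the values in Supplementary Table 4.

```python
import random


def draw_fossil_ages(deposits, seed=None):
    """Draw one sampling time per fossil, uniformly between deposit age bounds.

    deposits: dict mapping fossil label to (min_age, max_age) in millions of years.
    """
    rng = random.Random(seed)
    return {fossil: rng.uniform(min_age, max_age)
            for fossil, (min_age, max_age) in deposits.items()}


# Placeholder bounds for illustration; not the calibration scheme of the study.
placeholder_deposits = {
    "fossil_A": (15.0, 20.0),
    "fossil_B": (34.0, 38.0),
}
ages = draw_fossil_ages(placeholder_deposits, seed=1)
```

Fixing the seed makes the drawn ages reproducible, which is useful when the same calibration scheme has to be used across independent runs.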

Diversification analyses

BAMM v2.5 (Rabosky, 2014) analyses used the consensus BEAST chronogram and a table of sampling probabilities based on extant species diversity estimates for each genus (Borowiec, 2016b). The sampling proportions were set as follows: Acanthostichus: 0.17, Aenictogiton: 0.2, Aenictus: 0.09, Cerapachys: 0.5, Cheliomyrmex: 0.6, Chrysapace: 0.8, Cylindromyrmex: 0.4, Dorylus: 0.2, Eburopone: 0.12, Eciton: 0.58, Eusphinctus: 0.5, Labidus: 0.6, Leptanilloides: 0.17, Lioponera: 0.08, Lividopone: 0.13, Neivamyrmex: 0.11, Neocerapachys: 0.4, Nomamyrmex: 1, Ooceraea: 0.2, Parasyscia: 0.08, Simopone: 0.18, Sphinctomyrmex: 0.4, Syscia: 0.13, Tanipone: 0.4, Vicinopone: 1, Yunodorylus: 0.4, Zasphinctus: 0.15. I used the “setBAMMpriors” function in the R package BAMMtools (Rabosky et al., 2014) to create priors used for the analysis. I ran the MCMC for 20 million generations, sampling every 2,000 generations. I checked the convergence and plotted the analysis results using BAMMtools and CODA (Plummer et al., 2006).

Biogeographic analyses

I used the maximum likelihood functions available in the BioGeoBEARS R package (Matzke, 2013, 2014) to compare fit and select from among models commonly used for estimation of biogeographic histories. I discretized species distributions into the six biogeographic regions of Cox (2001), treating Malagasy as a seventh, separate region. The regions included Neotropical, Nearctic, Palearctic, Afrotropical, Malagasy, Indomalayan, and Australasian. The boundary between Indomalayan and Australasian regions is the Wallace Line. I set up a time-stratified analysis that assumed different region adjacency and dispersal probabilities between 110–50 Ma and 50 Ma–present (M3_stratified-type model of Matzke (2014)) and compared likelihoods and AICc scores under six models: DEC, DEC+J, DIVALIKE, DIVALIKE+J, BAYAREALIKE, and BAYAREALIKE+J.
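The AICc comparison of biogeographic models can be illustrated with a short Python sketch. It uses the standard small-sample correction to AIC and Akaike weights; the log-likelihood values are invented for illustration and are not results of this study. The parameter counts assume two free parameters for the base models and three for the +J variants, and the number of observations is taken here to be the number of tips, which is an assumption.

```python
import math


def aicc(log_likelihood, n_params, n_obs):
    """Akaike information criterion with small-sample correction."""
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


def akaike_weights(scores):
    """Relative support for each model given a dict of AICc scores."""
    best = min(scores.values())
    rel = {model: math.exp(-0.5 * (score - best)) for model, score in scores.items()}
    total = sum(rel.values())
    return {model: value / total for model, value in rel.items()}


# Invented log-likelihoods for two hypothetical models fitted to 155 tips.
scores = {
    "DEC": aicc(log_likelihood=-310.0, n_params=2, n_obs=155),
    "DEC+J": aicc(log_likelihood=-300.0, n_params=3, n_obs=155),
}
weights = akaike_weights(scores)
best_model = min(scores, key=scores.get)
```

The model with the lowest AICc is preferred, and the weights express how strongly it is favored over the alternatives.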

Supplementary Table 1:

Voucher specimens used in this study. CASENT numbers correspond to records on AntWeb.org.

Supplementary Table 2:

Statistics of data matrices used in this study.

Supplementary Table 3:

Calibration scheme used for penalized likelihood analyses in chronos. The MRCA column refers to the most recent common ancestor of two tip names in the maximum likelihood tree obtained from the slow-evolving loci matrix.

Supplementary Table 4:

Calibration scheme used for fossilized birth-death process analyses in BEAST. “Total group” refers to a fossil that could be placed in either the stem or the crown group.

Acknowledgments

I would like to thank Phil Ward for guidance, support, and valuable conversations related to this project. Many thanks to Michael Branstetter who allowed me to work with an unpublished probe set and provided wet lab protocol training. Alex Wild, Corrie Moreau, and Daniel Kronauer gave important feedback during early stages of conceiving this study. Alex also provided photographs of live army ants used in Figure 1. Brian Johnson and Christian Rabeling kindly gave access to computing resources. Karl Kjer and Kimiora Ward provided comments that helped to improve this manuscript. Thank you to everyone who contributed specimens for sequencing or helped with field work: Fidèle Bemaeva, Brendon Boudinot, Júlio Chaul, Katsuyuki Eguchi, Flávia Esteves, Rodrigo Feitosa, Brian Fisher, Paco Hita-Garcia, Milan Janda, Jack Longino, Andrea Lucky, Sean McKenzie, Matt Prebus, Andry Rakotomalala, Caspar Schöning, Michael Staab, Andy Suarez, and Phil Ward. This research was funded by NSF Doctoral Dissertation Improvement Grant 1402432, a Young Explorers Grant from National Geographic, a Microsoft Azure Research Award, and Henry A. Jastro awards from the Department of Entomology and Nematology at UC Davis, and was facilitated by Ernst Mayr Travel Grants in Animal Systematics and SYNTHESYS awards FR-TAF-594 and GB-TAF-303.

References

Aberer, A. J., Kobert, K., and Stamatakis, A. (2014). ExaBayes: Massively parallel bayesian tree inference for the whole-genome era. Molecular Biology and Evolution, 31(10):2553–2556.
Aleksandrova, G. N. and Zaporozhets, N. I. (2008). Palynological characteristics of Upper Cretaceous and Paleogene deposits on the west of the Sambian Peninsula (Kaliningrad region), part Stratigraphy and Geological Correlations, 16(3):295–316.
Alroy, J. (2016). Gateway to the paleobiology database. 2015. Available: www.fossilworks.org.
Baca, S. M., Toussaint, E. F., Miller, K. B., and Short, A. E. (2017). Molecular phylogeny of the aquatic beetle family noteridae (coleoptera: Adephaga) with an emphasis on data partitioning strategies. Molecular Phylogenetics and Evolution, 107:282–292.
Baroni Urbani, C., Bolton, B., and Ward, P. S. (1992). The internal phylogeny of ants (Hymenoptera: Formicidae). Systematic Entomology, (17):301–329.
Bayzid, M. S., Mirarab, S., Boussau, B., and Warnow, T. (2015). Weighted statistical binning: Enabling statistically consistent genome-scale phylogenetic analyses. PLoS ONE, 10(6):e0129183.
Betancur-R., R., Naylor, G. J. P., and Orti, G. (2013). Conserved genes, sampling error, and phylogenomic inference. Systematic Biology, 63(2):257–262.
Billen, J. P. J. (1985). Comparative ultrastructure of the poison and Dufour glands in Old and New World army ants (Hymenoptera, Formicidae). Actes des Colloques Insectes Sociaux, (2):17–26.
Billen, J. P. J. and Gotwald, W. H. J. (1988). The crenellate lining of the Dufour gland in the genus Aenictus: a new character for interpreting the phylogeny of Old World army ants (Hymenoptera, Formicidae, dorylinae). Zoologica Scripta, (17):293–295.
Blaimer, B. B., Brady, S. G., Schultz, T. R., Lloyd, M. W., Fisher, B. L., and Ward, P. S. (2015). Phylogenomic methods outperform traditional multi-locus approaches in resolving deep evolutionary history: a case study of formicine ants. BMC Evolutionary Biology, 15(1):e271.
Blumenstiel, B., Cibulskis, K., Fisher, S., DeFelice, M., Barry, A., Fennell, T., Abreu, J., Minie, B., Costello, M., Young, G., et al. (2010). Targeted exon sequencing by in-solution hybrid selection. Current Protocols in Human Genetics, 66: Chapter 18: Unit 18.4.
Bolger, A. M., Lohse, M., and Usadel, B. (2014). Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics, 30(15):2114–2120.
Bolton, B. (1990). Army ants reassessed: the phylogeny and classification of the doryline section (Hymenoptera, Formicidae). Journal of Natural History, 24(6):1339–1364.
Bolton, B. and Fisher, B. L. (2012). Taxonomy of the cerapachyine ant genera Simopone forel, Vicinopone gen. n. and Tanipone gen. n. (Hymenoptera: Formicidae). Zootaxa, (3283):1–101.
Bonasio, R., Zhang, G., Ye, C., Mutti, N. S., Fang, X., Qin, N., Donahue, G., Yang, P., Li, Q., Li, C., Zhang, P., Huang, Z., Berger, S. L., Reinberg, D., Wang, J., and Liebig, J. (2010). Genomic comparison of the ants Camponotus floridanus and Harpegnathos saltator. Science, 329(5995):1068–1071.
Borgmeier, T. (1955). Die Wanderameisen der Neotropischen Region. Studia Entomologica, 3:1–720.
Borowiec, M. L. (2016a). AMAS: a fast tool for alignment manipulation and computing of summary statistics. PeerJ, 4:e1660.
Borowiec, M. L. (2016b). Generic revision of the ant subfamily Dorylinae (Hymenoptera, Formicidae). ZooKeys, 608:1–280.
Borowiec, M. L., Lee, E. K., Chiu, J. C., and Plachetzki, D. C. (2015). Extracting phylogenetic signal and accounting for bias in whole-genome data sets supports the Ctenophora as sister to remaining Metazoa. BMC Genomics, 16(1):e987.
Borowiec, M. L. and Longino, J. T. (2011). Three new species and reassessment of the rare neotropical ant genus Leptanilloides (Hymenoptera, Formicidae, leptanilloidinae). ZooKeys, 133:19–48.
Bouckaert, R., Heled, J., Kühnert, D., Vaughan, T., Wu, C.-H., Xie, D., Suchard, M. A., Rambaut, A., and Drummond, A. J. (2014). BEAST 2: A software platform for Bayesian evolutionary analysis. PLoS Computational Biology, 10(4):e1003537.
Brady, S. G. (2003). Evolution of the army ant syndrome: The origin and long-term evolutionary stasis of a complex of behavioral and reproductive adaptations. Proceedings of the National Academy of Sciences, 100(11):6575–6579.
Brady, S. G., Fisher, B. L., Schultz, T. R., and Ward, P. S. (2014). The rise of army ants and their relatives: diversification of specialized predatory doryline ants. BMC Evolutionary Biology, 14(1):93.
Brady, S. G. and Ward, P. S. (2005). Morphological phylogeny of army ants and other dorylomorphs (Hymenoptera: Formicidae). Systematic Entomology, 30(4):593–618.
Branstetter, M. G., Longino, J. T., Reyes-López, J., Schultz, T. R., and Brady, S. G. (2016). Into the tropics: phylogenomics and evolutionary dynamics of a contrarian clade of ants. bioRxiv.
Branstetter, M. G., Longino, J. T., Ward, P. S., and Faircloth, B. C. (2017). Enriching the ant tree of life: enhanced UCE bait set for genome-scale phylogenetics of ants and other Hymenoptera. Methods in Ecology and Evolution, On-line early.
Brikiatis, L. (2014). The De Geer, Thulean and Beringia routes: key concepts for understanding early Cenozoic biogeography. Journal of Biogeography, 41(6):1036–1054.
Brown, J. M. (2014). Predictive approaches to assessing the fit of evolutionary models. Systematic Biology, 63(3):289–292.
Brown, W. L. J. (1975). Contributions toward a reclassification of the Formicidae. v. Ponerinae, tribes Platythyreini, Cerapachyini, Cylindromyrmecini, Acanthostichini, and Aenictogitini. Search. Agriculture (Ithaca, New York), 5(1):1–115.
Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer, K., and Madden, T. L. (2009). BLAST+: architecture and applications. BMC Bioinformatics, 10(1):421.
Charif, D. and Lobry, J. R. (2007). SeqinR 1.0-2: A contributed package to the R project for statistical computing devoted to biological sequences retrieval and analysis. In Structural Approaches to Sequence Evolution, pages 207–232. Springer Science Business Media.
Cock, P. J. A., Antao, T., Chang, J. T., Chapman, B. A., Cox, C. J., Dalke, A., Friedberg, I., Hamelryck, T., Kauff, F., Wilczynski, B., and de Hoon, M. J. L. (2009). Biopython: freely available python tools for computational molecular biology and bioinformatics. Bioinformatics, 25(11):1422–1423.
Coty, D., Aria, C., Garrouste, R., Wils, P., Legendre, F., and Nel, A. (2014). The first ant-termite syninclusion in amber with CT-scan analysis of taphonomy. PLoS ONE, 9(8):e104410.
Cox, B. (2001). The biogeographic regions reconsidered. Journal of Biogeography, 28(4):511–523.
De Andrade, M. (1998a). Fossil and extant species of Cylindromyrmex (Hymenoptera: Formicidae). Revue Suisse de Zoologie, 105(3):581–664.
De Andrade, M. L. (1998b). First description of fossil Acanthostichus from Dominican amber (Hymenoptera: Formicidae). Mitteilungen der Schweizerischen Entomologischen Gesellschaft, 71:269–274.
De Andrade, M. L. (2001). A remarkable Dominican amber species of Cylindromyrmex with Brazilian affinities and additions to the generic revision (Hymenoptera: Formicidae). Beiträge zur Entomologie, (51):51–63.
Delsuc, F., Brinkmann, H., and Philippe, H. (2005). Phylogenomics and the reconstruction of the tree of life. Nature Reviews Genetics, 6(5):361–375.
Dlussky, G. M. (1996). Ants (Hymenoptera: Formicidae) from Burmese amber. Paleontological Journal., (30):449–454.
Edwards, S. V. (2009). Is a new and general theory of molecular systematics emerging? Evolution, 63(1):1–19.
Faircloth, B. C. (2011). Illumiprocessor-software for Illumina read quality filtering.
Faircloth, B. C. (2015). PHYLUCE is a software package for the analysis of conserved genomic loci. Bioinformatics, 32(5):786–788.
Faircloth, B. C., Branstetter, M. G., White, N. D., and Brady, S. G. (2014). Target enrichment of ultraconserved elements from arthropods provides a genomic perspective on relationships among Hymenoptera. Molecular Ecology Resources, 15(3):489–501.
Faircloth, B. C. and Glenn, T. C. (2012). Not all sequence tags are created equal: Designing and validating sequence identification tags robust to indels. PLoS ONE, 7(8):e42543.
Faircloth, B. C., McCormack, J. E., Crawford, N. G., Harvey, M. G., Brumfield, R. T., and Glenn, T. C. (2012). Ultraconserved elements anchor thousands of genetic markers spanning multiple evolutionary timescales. Systematic Biology, 61(5):717–726.
Fisher, S., Barry, A., Abreu, J., Minie, B., Nolan, J., Delorey, T. M., Young, G., Fennell, T. J., Allen, A., Ambrogio, L., Berlin, A. M., Blumenstiel, B., Cibulskis, K., Friedrich, D., Johnson, R., Juhn, F., Reilly, B., Shammas, R., Stalker, J., Sykes, S. M., Thompson, J., Walsh, J., Zimmer, A., Zwirko, Z., Gabriel, S., Nicol, R., and Nusbaum, C. (2011). A scalable, fully automated process for construction of sequence-ready human exome targeted capture libraries. Genome Biology, 12(1):R1.
Foster, P. (2004). Modeling compositional heterogeneity. Systematic Biology, 53(3):485–495.
Foster, P. G. and Hickey, D. A. (1999). Compositional bias may affect both DNA-based and protein-based phylogenetic reconstructions. Journal of Molecular Evolution, 48(3):284–290.
Frandsen, P. B., Calcott, B., Mayer, C., and Lanfear, R. (2015). Automatic selection of partitioning schemes for phylogenetic analyses using iterative k-means clustering of site rates. BMC Evolutionary Biology, 15(1):13.
Garnier, S., Murphy, T., Lutz, M., Hurme, E., Leblanc, S., and Couzin, I. D. (2013). Stability and responsiveness in a self-organized living architecture. PLoS Computational Biology, 9(3):e1002984.
Gascuel, O. (1997). BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Molecular Biology and Evolution, 14(7):685–695.
Gavryushkina, A., Welch, D., Stadler, T., and Drummond, A. J. (2014). Bayesian inference of sampled ancestor trees for epidemiology and fossil calibration. PLoS Computational Biology, 10(12):e1003919.
Goremykin, V. V., Nikiforova, S. V., Cavalieri, D., Pindo, M., and Lockhart, P. (2015). The root of flowering plants and total evidence. Systematic Biology, 64(5):879–891.
Gotwald, W. H. (1995). Army ants: the biology of social predation. Cornell University Press Ithaca.
Gotwald, W. H. J. (1979). Phylogenetic implications of army ant zoogeography (Hymenoptera: Formicidae). Annals of the Entomological Society of America, (72):462–467.
Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson, D. A., Amit, I., Adiconis, X., Fan, L., Raychowdhury, R., Zeng, Q., Chen, Z., Mauceli, E., Hacohen, N., Gnirke, A., Rhind, N., di Palma, F., Birren, B. W., Nusbaum, C., Lindblad-Toh, K., Friedman, N., and Regev, A. (2011). Full-length transcriptome assembly from RNA-seq data without a reference genome. Nature Biotechnology, 29(7):644–652.
Grimaldi, D. and Agosti, D. (2000). A formicine in new jersey cretaceous amber (Hymenoptera: Formicidae) and early evolution of the ants. Proceedings of the National Academy of Sciences of the United States of America, (97):13678–13683.
Hasegawa, M. and Hashimoto, T. (1993). Ribosomal RNA trees misleading? Nature, 361(6407):23–23.
Heath, T. A., Huelsenbeck, J. P., and Stadler, T. (2014). The fossilized birth-death process for coherent calibration of divergence-time estimates. Proceedings of the National Academy of Sciences, 111(29):E2957–E2966.
Hermann, H. R. (1969). The hymenopterous poison apparatus: evolutionary trends in three closely related subfamilies of ants (Hymenoptera: Formicidae). Journal of the Georgia Entomological Society, 4(3):123–141.
Jaitrong, W. and Yamane, S. (2011). Synopsis of Aenictus species groups and revision of the A. currax and A. laeviceps groups in the eastern Oriental, Indo-Australian, and Australasian regions (Hymenoptera: Formicidae: Aenictinae). Zootaxa, 3128:1–46.
Jermiin, L., Ho, S. Y., Ababneh, F., Robinson, J., and Larkum, A. W. (2004). The biasing effect of compositional heterogeneity on phylogenetic estimates may be underestimated. Systematic Biology, 53(4):638–643.
Kronauer, D. J., Schöning, C., Vilhelmsen, L. B., and Boomsma, J. J. (2007). A molecular phylogeny of Dorylus army ants provides evidence for multiple evolutionary transitions in foraging niche. BMC Evolutionary Biology, 7(1):56.
Kronauer, D. J. C. (2009). Recent advances in army ant biology (Hymenoptera: Formicidae). Myrmecological News, 12:51–65.
Lanfear, R., Frandsen, P. B., Wright, A. M., Senfeld, T., and Calcott, B. (2017). PartitionFinder 2: new methods for selecting partitioned models of evolution for molecular and morphological phylogenetic analyses. Molecular Biology And Evolution, 34(3):772–773.
LaPolla, J. S., Dlussky, G. M., and Perrichot, V. (2013). Ants and the fossil record. Annual Review of Entomology, 58:609–630.
Lockhart, P. J., Steel, M. A., Hendy, M. D., and Penny, D. (1994). Recovering evolutionary trees under a more realistic model of sequence evolution. Molecular Biology and Evolution, 11(4):605–612.
MacKay, W. P. (1996). A revision of the ant genus Acanthostichus. Sociobiology, 27(2):129–179.
Matzke, N. J. (2013). Probabilistic historical biogeography: new models for founder-event speciation, imperfect detection, and fossils allow improved accuracy and model-testing. University of California, Berkeley.
Matzke, N. J. (2014). Model selection in historical biogeography reveals that founder-event speciation is a crucial process in island clades. Systematic Biology, 63(6):951–970.
Miller, M. A., Pfeiffer, W., and Schwartz, T. (2010). Creating the CIPRES science gateway for inference of large phylogenetic trees. In 2010 Gateway Computing Environments Workshop (GCE). Institute of Electrical & Electronics Engineers (IEEE).
Mirarab, S., Bayzid, M. S., Boussau, B., and Warnow, T. (2014). Statistical binning enables an accurate coalescent-based estimation of the avian tree. Science, 346(6215):1250463–1250463.
Mirarab, S. and Warnow, T. (2015). ASTRAL-II: coalescent-based species tree estimation with many hundreds of taxa and thousands of genes. Bioinformatics, 31(12):i44–i52.
Moran, R., Morgan, C., and O’Connell, M. (2015). A guide to phylogenetic reconstruction using heterogeneous models—a case study from the root of the placental mammal tree. Computation, 3(2):177–196.
Moreau, C. S. and Bell, C. D. (2013). Testing the museum versus cradle tropical biological diversity hypothesis: phylogeny, diversification, and ancestral biogeographic range evolution of the ants. Evolution, 67(8):2240–2257.
Nguyen, N. D., Mirarab, S., Kumar, K., and Warnow, T. (2015). Ultra-large alignments using phylogeny-aware profiles. Genome Biology, 16(1):124.
Nygaard, S., Zhang, G., Schiøtt, M., Li, C., Wurm, Y., Hu, H., Zhou, J., Ji, L., Qiu, F., Rasmussen, M., Pan, H., Hauser, F., Krogh, A., Grimmelikhuijzen, C. J., Wang, J., and Boomsma, J. J. (2011). The genome of the leaf-cutting ant acromyrmex echinatior suggests key adaptations to advanced social life and fungus farming. Genome Research, 21(8):1339–1348.
Oxley, P. R., Ji, L., Fetter-Pruneda, I., McKenzie, S. K., Li, C., Hu, H., Zhang, G., and Kronauer, D. J. (2014). The genome of the clonal raider ant Cerapachys biroi. Current Biology, 24(4):451–458.
Paradis, E. (2013). Molecular dating of phylogenies by likelihood methods: A comparison of models and a new information criterion. Molecular Phylogenetics and Evolution, 67(2):436–444.
Paradis, E., Claude, J., and Strimmer, K. (2004). APE: Analyses of phylogenetics and evolution in R language. Bioinformatics, 20(2):289–290.
Peters, M. K., Likare, S., and Kraemer, M. (2008). Effects of habitat fragmentation and degradation on flocks of African ant-following birds. Ecological Applications, 18(4):847–858.
Philippe, H. and Forterre, P. (1999). The rooting of the universal tree of life is not reliable. J Mol Evol, 49(4):509–523.
Plummer, M., Best, N., Cowles, K., and Vines, K. (2006). Coda: Convergence diagnosis and output analysis for mcmc. R News, 6(1):7–11.
Rabosky, D. L. (2014). Automatic detection of key innovations, rate shifts, and diversity-dependence on phylogenetic trees. PLoS ONE, 9(2):e89543.
Rabosky, D. L., Grundler, M., Anderson, C., Title, P., Shi, J. J., Brown, J. W., Huang, H., and Larson, J. G. (2014). BAMMtools: an R package for the analysis of evolutionary dynamics on phylogenetic trees. Methods in Ecology and Evolution, 5(7):701–707.
Radchenko, A. G., Elmes, G. W., and Dlussky, G. (2007). The ants of the genus Myrmica (Hymenoptera, Formicidae) from Baltic and Saxonian amber (Late Eocene). Journal of Paleontology, 81(6):1494–1501.
Raignier, A., Van Boven, J., and du Congo Belge, M. R. (1955). Étude taxonomique, biologique et biométrique des Dorylus du sous-genre Anomma (Hymenoptera Formicidae), volume 2. Musée royal du Congo belge.
Ravary, F. and Jaisson, P. (2002). The reproductive cycle of thelytokous colonies of Cerapachys biroi forel (Formicidae, cerapachyinae). Insectes Sociaux, 49(2):114–119.
Roch, S. and Warnow, T. (2015). On the robustness to gene tree estimation error (or lack thereof) of coalescent-based species tree methods. Systematic Biology, 64(4):663–676.
Rodríguez-Ezpeleta, N., Brinkmann, H., Roure, B., Lartillot, N., Lang, B. F., and Philippe, H. (2007). Detecting and overcoming systematic errors in genome-scale phylogenies. Systematic Biol., 56(3):389–399.
Rohland, N. and Reich, D. (2012). Cost-effective, high-throughput DNA sequencing libraries for multiplexed target capture. Genome Research, 22(5):939–946.
Salichos, L. and Rokas, A. (2013). Inferring ancient divergences requires genes with strong phylogenetic signals. Nature, 497(7449):327–331.
Sanderson, M. J. (2002). Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Molecular Biology and Evolution, 19(1):101–109.
Sanmartín, I., Enghoff, H., and Ronquist, F. (2001). Patterns of animal dispersal, vicariance and diversification in the Holarctic. Biological Journal of the Linnean Society, 73(4):345–390.
Sayyari, E. and Mirarab, S. (2016). Fast coalescent-based computation of local branch support from quartet frequencies. Molecular Biology and Evolution, 33(7):1654–1668.
Schneirla, T. C. (1945). The army-ant behavior pattern: Nomad-statary relations in the swarmers and the problem of migration. Biological Bulletin, 88(2):166.
Schrader, L., Kim, J. W., Ence, D., Zimin, A., Klein, A., Wyschetzki, K., Weichselgartner, T., Kemena, C., Stökl, J., Schultner, E., Wurm, Y., Smith, C. D., Yandell, M., Heinze, J., Gadau, J., and Oettler, J. (2014). Transposable element islands facilitate adaptation to novel environments in an invasive species. Nature Communications, 5:5495.
Schöning, C., Njagi, W. M., and Franks, N. R. (2005). Temporal and spatial patterns in the emigrations of the army ant Dorylus (Anomma) molestus in the montane forest of Mt Kenya. Ecological Entomology, 30(5):532–540.
Shen, X.-X., Hittinger, C. T., and Rokas, A. (2017). Contentious relationships in phylogenomic studies can be driven by a handful of genes. Nature Ecology & Evolution, 1(5):0126.
Smith, C. D., Zimin, A., Holt, C., Abouheif, E., Benton, R., Cash, E., Croset, V., Currie, C. R., Elhaik, E., Elsik, C. G., Fave, M.-J., Fernandes, V., Gadau, J., Gibson, J. D., Graur, D., Grubbs, K. J., Hagen, D. E., Helmkampf, M., Holley, J.-A., Hu, H., Viniegra, A. S. I., Johnson, B. R., Johnson, R. M., Khila, A., Kim, J. W., Laird, J., Mathis, K. A., Moeller, J. A., Munoz-Torres, M. C., Murphy, M. C., Nakamura, R., Nigam, S., Overson, R. P., Placek, J. E., Rajakumar, R., Reese, J. T., Robertson, H. M., Smith, C. R., Suarez, A. V., Suen, G., Suhr, E. L., Tao, S., Torres, C. W., van Wilgenburg, E., Viljakainen, L., Walden, K. K. O., Wild, A. L., Yandell, M., Yorke, J. A., and Tsutsui, N. D. (2011a). Draft genome of the globally widespread and invasive Argentine ant (Linepithema humile). Proceedings of the National Academy of Sciences, 108(14):5673–5678.
Smith, C. R., Cahan, S. H., Kemena, C., Brady, S. G., Yang, W., Bornberg-Bauer, E., Eriksson, T., Gadau, J., Helmkampf, M., Gotzek, D., Miyakawa, M. O., Suarez, A. V., and Mikheyev, A. (2015). How do genomes create novel phenotypes? insights from the loss of the worker caste in ant social parasites. Molecular Biology and Evolution, 32(11):2919–2931.
Smith, C. R., Smith, C. D., Robertson, H. M., Helmkampf, M., Zimin, A., Yandell, M., Holt, C., Hu, H., Abouheif, E., Benton, R., Cash, E., Croset, V., Currie, C. R., Elhaik, E., Elsik, C. G., Fave, M.-J., Fernandes, V., Gibson, J. D., Graur, D., Gronenberg, W., Grubbs, K. J., Hagen, D. E., Viniegra, A. S. I., Johnson, B. R., Johnson, R. M., Khila, A., Kim, J. W., Mathis, K. A., Munoz-Torres, M. C., Murphy, M. C., Mustard, J. A., Nakamura, R., Niehuis, O., Nigam, S., Overson, R. P., Placek, J. E., Rajakumar, R., Reese, J. T., Suen, G., Tao, S., Torres, C. W., Tsutsui, N. D., Viljakainen, L., Wolschin, F., and Gadau, J. (2011b). Draft genome of the red harvester ant Pogonomyrmex barbatus. Proceedings of the National Academy of Sciences, 108(14):5667–5672.
Stamatakis, A. (2014). RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 30(9):1312–1313.
Suen, G., Teiling, C., Li, L., Holt, C., Abouheif, E., Bornberg-Bauer, E., Bouffard, P., Caldera, E. J., Cash, E., Cavanaugh, A., Denas, O., Elhaik, E., Favé, M.-J., Gadau, J., Gibson, J. D., Graur, D., Grubbs, K. J., Hagen, D. E., Harkins, T. T., Helmkampf, M., Hu, H., Johnson, B. R., Kim, J., Marsh, S. E., Moeller, J. A., Muñoz-Torres, M. C., Murphy, M. C., Naughton, M. C., Nigam, S., Overson, R., Rajakumar, R., Reese, J. T., Scott, J. J., Smith, C. R., Tao, S., Tsutsui, N. D., Viljakainen, L., Wissler, L., Yandell, M. D., Zimmer, F., Taylor, J., Slater, S. C., Clifton, S. W., Warren, W. C., Elsik, C. G., Smith, C. D., Weinstock, G. M., Gerardo, N. M., and Currie, C. R. (2011). The genome sequence of the leaf-cutter ant Atta cephalotes reveals insights into its obligate symbiotic lifestyle. PLoS Genetics, 7(2):e1002007.
Sukumaran, J. and Holder, M. T. (2010). DendroPy: a Python library for phylogenetic computing. Bioinformatics, 26(12):1569–1571.
Talavera, G. and Castresana, J. (2007). Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Systematic Biol., 56(4):564–577.
Tan, G., Muffato, M., Ledergerber, C., Herrero, J., Goldman, N., Gil, M., and Dessimoz, C. (2015). Current methods for automated filtering of multiple sequence alignments frequently worsen single-gene phylogenetic inference. Systematic Biology, 64(5):778–791.
Team, R. C. (2014). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. 2013.
Tsuji, K. and Yamauchi, K. (1995). Production of females by parthenogenesis in the ant, Cerapachys biroi. Insectes Sociaux, 42(3):333–336.
Ward, P. S., Brady, S. G., Fisher, B. L., and Schultz, T. R. (2015). The evolution of myrmicine ants: phylogeny and biogeography of a hyperdiverse ant clade (Hymenoptera: Formicidae). Systematic Entomology, 40(1):61–81.
Wilson, E. O. (1985). Ants of the Dominican amber (Hymenoptera: Formicidae). 2. The first fossil army ants. Psyche (Cambridge), (92):11–16.
Wurm, Y., Wang, J., Riba-Grognuz, O., Corona, M., Nygaard, S., Hunt, B. G., Ingram, K. K., Falquet, L., Nipitwattanaphon, M., Gotzek, D., Dijkstra, M. B., Oettler, J., Comtesse, F., Shih, C.-J., Wu, W.-J., Yang, C.-C., Thomas, J., Beaudoing, E., Pradervand, S., Flegel, V., Cook, E. D., Fabbretti, R., Stockinger, H., Long, L., Farmerie, W. G., Oakey, J., Boomsma, J. J., Pamilo, P., Yi, S. V., Heinze, J., Goodisman, M. A. D., Farinelli, L., Harshman, K., Hulo, N., Cerutti, L., Xenarios, I., Shoemaker, D., and Keller, L. (2011). The genome of the fire ant Solenopsis invicta. Proceedings of the National Academy of Sciences, 108(14):5679–5684.
Zhong, M., Hansen, B., Nesnidal, M., Golombek, A., Halanych, K. M., and Struck, T. H. (2011). Detecting the symplesiomorphy trap: a multigene phylogenetic analysis of terebelliform annelids. BMC Evolutionary Biology, 11(1):369.

Posted May 04, 2017.

Subject Area

Evolutionary Biology

Subject Areas

All Articles

Animal Behavior and Cognition (5215)
Biochemistry (11753)
Bioengineering (8752)
Bioinformatics (29201)
Biophysics (14974)
Cancer Biology (12100)
Cell Biology (17413)
Clinical Trials (138)
Developmental Biology (9422)
Ecology (14182)
Epidemiology (2067)
Evolutionary Biology (18309)
Genetics (12245)
Genomics (16804)
Immunology (11869)
Microbiology (28098)
Molecular Biology (11596)
Neuroscience (60975)
Paleontology (451)
Pathology (1871)
Pharmacology and Toxicology (3238)
Physiology (4959)
Plant Biology (10427)
Scientific Communication and Education (1683)
Synthetic Biology (2886)
Systems Biology (7340)
Zoology (1651)

[1] ↵
Aberer, A. J., Kobert, K., and Stamatakis, A. (2014). ExaBayes: Massively parallel bayesian tree inference for the whole-genome era. Molecular Biology and Evolution, 31(10):2553–2556.
OpenUrl CrossRef PubMed

[2] ↵
Aleksandrova, G. N. and Zaporozhets, N. I. (2008). Palynological characteristics of Upper Creta-ceous and Paleogene deposits on the west of the Sambian Peninsula (Kaliningrad region), part Stratigraphy and Geological Correlations, 16(3):295–316.
OpenUrl

[3] ↵
Alroy, J. (2016). Gateway to the paleobiology database. 2015. Available: www.fossilworks.org.

[4] ↵
Baca, S. M., Toussaint, E. F., Miller, K. B., and Short, A. E. (2017). Molecular phylogeny of the aquatic beetle family noteridae (coleoptera: Adephaga) with an emphasis on data partitioning strategies. Molecular Phylogenetics and Evolution, 107:282–292.
OpenUrl

[5] Baroni Urbani, C., Bolton, B., and Ward, P. S. (1992). The internal phylogeny of ants (Hy-menoptera: Formicidae). Systematic Entomology, (17):301–329.
OpenUrl

[6] ↵
Bayzid, M. S., Mirarab, S., Boussau, B., and Warnow, T. (2015). Weighted statistical binning: Enabling statistically consistent genome-scale phylogenetic analyses. PLoS ONE, 10(6):e0129183.
OpenUrl CrossRef PubMed

[7] Betancur-R., R., Naylor, G. J. P., and Orti, G. (2013). Conserved genes, sampling error, and phy-logenomic inference. Systematic Biology, 63(2):257–262.
OpenUrl PubMed

[8] ↵
Billen, J. P. J. (1985). Comparative ultrastructure of the poison and Dufour glands in Old and New World army ants (Hymenoptera, Formicidae). Actes des Colloques Insectes Sociaux, (2):17–26.
OpenUrl

[9] ↵
Billen, J. P. J. and Gotwald, W. H. J. (1988). The crenellate lining of the Dufour gland in the genus Aenictus: a new character for interpreting the phylogeny of Old World army ants (Hymenoptera, Formicidae, dorylinae). Zoologica Scripta, (17):293–295.
OpenUrl

[10] ↵
Blaimer, B. B., Brady, S. G., Schultz, T. R., Lloyd, M. W., Fisher, B. L., and Ward, P. S. (2015). Phylogenomic methods outperform traditional multi-locus approaches in resolving deep evolutionary history: a case study of formicine ants. BMC Evolutionary Biology, 15(1):e271.
OpenUrl

[11] ↵
Blumenstiel, B., Cibulskis, K., Fisher, S., DeFelice, M., Barry, A., Fennell, T., Abreu, J., Minie, B., Costello, M., Young, G., et al. (2010). Targeted exon sequencing by in-solution hybrid selection. Current Protocols in Human Genetics, 66: Chapter 18: Unit 18.4.

[12] ↵
Bolger, A. M., Lohse, M., and Usadel, B. (2014). Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics, 30(15):2114–2120.
OpenUrl CrossRef PubMed Web of Science

[13] Bolton, B. (1990). Army ants reassessed: the phylogeny and classification of the doryline section (Hymenoptera, Formicidae). Journal of Natural History, 24(6):1339–1364.

[14] Bolton, B. and Fisher, B. L. (2012). Taxonomy of the cerapachyine ant genera Simopone Forel, Vicinopone gen. n. and Tanipone gen. n. (Hymenoptera: Formicidae). Zootaxa, 3283:1–101.

[15] Bonasio, R., Zhang, G., Ye, C., Mutti, N. S., Fang, X., Qin, N., Donahue, G., Yang, P., Li, Q., Li, C., Zhang, P., Huang, Z., Berger, S. L., Reinberg, D., Wang, J., and Liebig, J. (2010). Genomic comparison of the ants Camponotus floridanus and Harpegnathos saltator. Science, 329(5995):1068–1071.

[16] Borgmeier, T. (1955). Die Wanderameisen der Neotropischen Region. Studia Entomologica, 3:1–720.

[17] Borowiec, M. L. (2016a). AMAS: a fast tool for alignment manipulation and computing of summary statistics. PeerJ, 4:e1660.

[18] Borowiec, M. L. (2016b). Generic revision of the ant subfamily Dorylinae (Hymenoptera, Formicidae). ZooKeys, 608:1–280.

[19] Borowiec, M. L., Lee, E. K., Chiu, J. C., and Plachetzki, D. C. (2015). Extracting phylogenetic signal and accounting for bias in whole-genome data sets supports the Ctenophora as sister to remaining Metazoa. BMC Genomics, 16(1):e987.

[20] Borowiec, M. L. and Longino, J. T. (2011). Three new species and reassessment of the rare Neotropical ant genus Leptanilloides (Hymenoptera, Formicidae, Leptanilloidinae). ZooKeys, 133:19–48.

[21] Bouckaert, R., Heled, J., Kühnert, D., Vaughan, T., Wu, C.-H., Xie, D., Suchard, M. A., Rambaut, A., and Drummond, A. J. (2014). BEAST 2: A software platform for Bayesian evolutionary analysis. PLoS Computational Biology, 10(4):e1003537.

[22] Brady, S. G. (2003). Evolution of the army ant syndrome: The origin and long-term evolutionary stasis of a complex of behavioral and reproductive adaptations. Proceedings of the National Academy of Sciences, 100(11):6575–6579.

[23] Brady, S. G., Fisher, B. L., Schultz, T. R., and Ward, P. S. (2014). The rise of army ants and their relatives: diversification of specialized predatory doryline ants. BMC Evolutionary Biology, 14(1):93.

[24] Brady, S. G. and Ward, P. S. (2005). Morphological phylogeny of army ants and other dorylomorphs (Hymenoptera: Formicidae). Systematic Entomology, 30(4):593–618.

[25] Branstetter, M. G., Longino, J. T., Reyes-López, J., Schultz, T. R., and Brady, S. G. (2016). Into the tropics: phylogenomics and evolutionary dynamics of a contrarian clade of ants. bioRxiv.

[26] Branstetter, M. G., Longino, J. T., Ward, P. S., and Faircloth, B. C. (2017). Enriching the ant tree of life: enhanced UCE bait set for genome-scale phylogenetics of ants and other Hymenoptera. Methods in Ecology and Evolution, On-line early.

[27] Brikiatis, L. (2014). The De Geer, Thulean and Beringia routes: key concepts for understanding early Cenozoic biogeography. Journal of Biogeography, 41(6):1036–1054.

[28] Brown, J. M. (2014). Predictive approaches to assessing the fit of evolutionary models. Systematic Biology, 63(3):289–292.

[29] Brown, W. L. J. (1975). Contributions toward a reclassification of the Formicidae. V. Ponerinae, tribes Platythyreini, Cerapachyini, Cylindromyrmecini, Acanthostichini, and Aenictogitini. Search. Agriculture (Ithaca, New York), 5(1):1–115.

[30] Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer, K., and Madden, T. L. (2009). BLAST+: architecture and applications. BMC Bioinformatics, 10(1):421.

[31] Charif, D. and Lobry, J. R. (2007). SeqinR 1.0-2: A contributed package to the R project for statistical computing devoted to biological sequences retrieval and analysis. In Structural Approaches to Sequence Evolution, pages 207–232. Springer Science Business Media.

[32] Cock, P. J. A., Antao, T., Chang, J. T., Chapman, B. A., Cox, C. J., Dalke, A., Friedberg, I., Hamelryck, T., Kauff, F., Wilczynski, B., and de Hoon, M. J. L. (2009). Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics, 25(11):1422–1423.

[33] Coty, D., Aria, C., Garrouste, R., Wils, P., Legendre, F., and Nel, A. (2014). The first ant-termite syninclusion in amber with CT-scan analysis of taphonomy. PLoS ONE, 9(8):e104410.

[34] Cox, B. (2001). The biogeographic regions reconsidered. Journal of Biogeography, 28(4):511–523.

[35] De Andrade, M. (1998a). Fossil and extant species of Cylindromyrmex (Hymenoptera: Formicidae). Revue Suisse de Zoologie, 105(3):581–664.

[36] De Andrade, M. L. (1998b). First description of fossil Acanthostichus from Dominican amber (Hymenoptera: Formicidae). Mitteilungen der Schweizerischen Entomologischen Gesellschaft, 71:269–274.

[37] De Andrade, M. L. (2001). A remarkable Dominican amber species of Cylindromyrmex with Brazilian affinities and additions to the generic revision (Hymenoptera: Formicidae). Beiträge zur Entomologie, 51:51–63.

[38] Delsuc, F., Brinkmann, H., and Philippe, H. (2005). Phylogenomics and the reconstruction of the tree of life. Nature Reviews Genetics, 6(5):361–375.

[39] Dlussky, G. M. (1996). Ants (Hymenoptera: Formicidae) from Burmese amber. Paleontological Journal, 30:449–454.

[40] Edwards, S. V. (2009). Is a new and general theory of molecular systematics emerging? Evolution, 63(1):1–19.

[41] Faircloth, B. C. (2011). Illumiprocessor-software for Illumina read quality filtering.

[42] Faircloth, B. C. (2015). PHYLUCE is a software package for the analysis of conserved genomic loci. Bioinformatics, 32(5):786–788.

[43] Faircloth, B. C., Branstetter, M. G., White, N. D., and Brady, S. G. (2014). Target enrichment of ultraconserved elements from arthropods provides a genomic perspective on relationships among Hymenoptera. Molecular Ecology Resources, 15(3):489–501.

[44] Faircloth, B. C. and Glenn, T. C. (2012). Not all sequence tags are created equal: Designing and validating sequence identification tags robust to indels. PLoS ONE, 7(8):e42543.

[45] Faircloth, B. C., McCormack, J. E., Crawford, N. G., Harvey, M. G., Brumfield, R. T., and Glenn, T. C. (2012). Ultraconserved elements anchor thousands of genetic markers spanning multiple evolutionary timescales. Systematic Biology, 61(5):717–726.

[46] Fisher, S., Barry, A., Abreu, J., Minie, B., Nolan, J., Delorey, T. M., Young, G., Fennell, T. J., Allen, A., Ambrogio, L., Berlin, A. M., Blumenstiel, B., Cibulskis, K., Friedrich, D., Johnson, R., Juhn, F., Reilly, B., Shammas, R., Stalker, J., Sykes, S. M., Thompson, J., Walsh, J., Zimmer, A., Zwirko, Z., Gabriel, S., Nicol, R., and Nusbaum, C. (2011). A scalable, fully automated process for construction of sequence-ready human exome targeted capture libraries. Genome Biology, 12(1):R1.

[47] Foster, P. (2004). Modeling compositional heterogeneity. Systematic Biology, 53(3):485–495.

[48] Foster, P. G. and Hickey, D. A. (1999). Compositional bias may affect both DNA-based and protein-based phylogenetic reconstructions. Journal of Molecular Evolution, 48(3):284–290.

[49] Frandsen, P. B., Calcott, B., Mayer, C., and Lanfear, R. (2015). Automatic selection of partitioning schemes for phylogenetic analyses using iterative k-means clustering of site rates. BMC Evolutionary Biology, 15(1):13.

[50] Garnier, S., Murphy, T., Lutz, M., Hurme, E., Leblanc, S., and Couzin, I. D. (2013). Stability and responsiveness in a self-organized living architecture. PLoS Computational Biology, 9(3):e1002984.

[51] Gascuel, O. (1997). BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Molecular Biology and Evolution, 14(7):685–695.

[52] Gavryushkina, A., Welch, D., Stadler, T., and Drummond, A. J. (2014). Bayesian inference of sampled ancestor trees for epidemiology and fossil calibration. PLoS Computational Biology, 10(12):e1003919.

[53] Goremykin, V. V., Nikiforova, S. V., Cavalieri, D., Pindo, M., and Lockhart, P. (2015). The root of flowering plants and total evidence. Systematic Biology, 64(5):879–891.

[54] Gotwald, W. H. (1995). Army ants: the biology of social predation. Cornell University Press, Ithaca.

[55] Gotwald, W. H. J. (1979). Phylogenetic implications of army ant zoogeography (Hymenoptera: Formicidae). Annals of the Entomological Society of America, 72:462–467.

[56] Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson, D. A., Amit, I., Adiconis, X., Fan, L., Raychowdhury, R., Zeng, Q., Chen, Z., Mauceli, E., Hacohen, N., Gnirke, A., Rhind, N., di Palma, F., Birren, B. W., Nusbaum, C., Lindblad-Toh, K., Friedman, N., and Regev, A. (2011). Full-length transcriptome assembly from RNA-seq data without a reference genome. Nature Biotechnology, 29(7):644–652.

[57] Grimaldi, D. and Agosti, D. (2000). A formicine in New Jersey Cretaceous amber (Hymenoptera: Formicidae) and early evolution of the ants. Proceedings of the National Academy of Sciences of the United States of America, 97:13678–13683.

[58] Hasegawa, M. and Hashimoto, T. (1993). Ribosomal RNA trees misleading? Nature, 361(6407):23–23.

[59] Heath, T. A., Huelsenbeck, J. P., and Stadler, T. (2014). The fossilized birth-death process for coherent calibration of divergence-time estimates. Proceedings of the National Academy of Sciences, 111(29):E2957–E2966.

[60] Hermann, H. R. (1969). The hymenopterous poison apparatus: evolutionary trends in three closely related subfamilies of ants (Hymenoptera: Formicidae). Journal of the Georgia Entomological Society, 4(3):123–141.

[61] Jaitrong, W. and Yamane, S. (2011). Synopsis of Aenictus species groups and revision of the A. currax and A. laeviceps groups in the eastern Oriental, Indo-Australian, and Australasian regions (Hymenoptera: Formicidae: Aenictinae). Zootaxa, 3128:1–46.

[62] Jermiin, L., Ho, S. Y., Ababneh, F., Robinson, J., and Larkum, A. W. (2004). The biasing effect of compositional heterogeneity on phylogenetic estimates may be underestimated. Systematic Biology, 53(4):638–643.

[63] Kronauer, D. J., Schöning, C., Vilhelmsen, L. B., and Boomsma, J. J. (2007). A molecular phylogeny of Dorylus army ants provides evidence for multiple evolutionary transitions in foraging niche. BMC Evolutionary Biology, 7(1):56.

[64] Kronauer, D. J. C. (2009). Recent advances in army ant biology (Hymenoptera: Formicidae). Myrmecological News, 12:51–65.

[65] Lanfear, R., Frandsen, P. B., Wright, A. M., Senfeld, T., and Calcott, B. (2017). PartitionFinder 2: new methods for selecting partitioned models of evolution for molecular and morphological phylogenetic analyses. Molecular Biology and Evolution, 34(3):772–773.

[66] LaPolla, J. S., Dlussky, G. M., and Perrichot, V. (2013). Ants and the fossil record. Annual Review of Entomology, 58:609–630.

[67] Lockhart, P. J., Steel, M. A., Hendy, M. D., and Penny, D. (1994). Recovering evolutionary trees under a more realistic model of sequence evolution. Molecular Biology and Evolution, 11(4):605–612.

[68] MacKay, W. P. (1996). A revision of the ant genus Acanthostichus. Sociobiology, 27(2):129–179.

[69] Matzke, N. J. (2013). Probabilistic historical biogeography: new models for founder-event speciation, imperfect detection, and fossils allow improved accuracy and model-testing. University of California, Berkeley.

[70] Matzke, N. J. (2014). Model selection in historical biogeography reveals that founder-event speciation is a crucial process in island clades. Systematic Biology, 63(6):951–970.

[71] Miller, M. A., Pfeiffer, W., and Schwartz, T. (2010). Creating the CIPRES science gateway for inference of large phylogenetic trees. In 2010 Gateway Computing Environments Workshop (GCE). Institute of Electrical & Electronics Engineers (IEEE).

[72] Mirarab, S., Bayzid, M. S., Boussau, B., and Warnow, T. (2014). Statistical binning enables an accurate coalescent-based estimation of the avian tree. Science, 346(6215):1250463–1250463.

[73] Mirarab, S. and Warnow, T. (2015). ASTRAL-II: coalescent-based species tree estimation with many hundreds of taxa and thousands of genes. Bioinformatics, 31(12):i44–i52.

[74] Moran, R., Morgan, C., and O’Connell, M. (2015). A guide to phylogenetic reconstruction using heterogeneous models—a case study from the root of the placental mammal tree. Computation, 3(2):177–196.

[75] Moreau, C. S. and Bell, C. D. (2013). Testing the museum versus cradle tropical biological diversity hypothesis: phylogeny, diversification, and ancestral biogeographic range evolution of the ants. Evolution, 67(8):2240–2257.

[76] Nguyen, N. D., Mirarab, S., Kumar, K., and Warnow, T. (2015). Ultra-large alignments using phylogeny-aware profiles. Genome Biology, 16(1):124.

[77] Nygaard, S., Zhang, G., Schiøtt, M., Li, C., Wurm, Y., Hu, H., Zhou, J., Ji, L., Qiu, F., Rasmussen, M., Pan, H., Hauser, F., Krogh, A., Grimmelikhuijzen, C. J., Wang, J., and Boomsma, J. J. (2011). The genome of the leaf-cutting ant Acromyrmex echinatior suggests key adaptations to advanced social life and fungus farming. Genome Research, 21(8):1339–1348.

[78] Oxley, P. R., Ji, L., Fetter-Pruneda, I., McKenzie, S. K., Li, C., Hu, H., Zhang, G., and Kronauer, D. J. (2014). The genome of the clonal raider ant Cerapachys biroi. Current Biology, 24(4):451–458.

[79] Paradis, E. (2013). Molecular dating of phylogenies by likelihood methods: A comparison of models and a new information criterion. Molecular Phylogenetics and Evolution, 67(2):436–444.

[80] Paradis, E., Claude, J., and Strimmer, K. (2004). APE: Analyses of phylogenetics and evolution in R language. Bioinformatics, 20(2):289–290.

[81] Peters, M. K., Likare, S., and Kraemer, M. (2008). Effects of habitat fragmentation and degradation on flocks of African ant-following birds. Ecological Applications, 18(4):847–858.

[82] Philippe, H. and Forterre, P. (1999). The rooting of the universal tree of life is not reliable. Journal of Molecular Evolution, 49(4):509–523.

[83] Plummer, M., Best, N., Cowles, K., and Vines, K. (2006). CODA: Convergence diagnosis and output analysis for MCMC. R News, 6(1):7–11.

[84] Rabosky, D. L. (2014). Automatic detection of key innovations, rate shifts, and diversity-dependence on phylogenetic trees. PLoS ONE, 9(2):e89543.

[85] Rabosky, D. L., Grundler, M., Anderson, C., Title, P., Shi, J. J., Brown, J. W., Huang, H., and Larson, J. G. (2014). BAMMtools: an R package for the analysis of evolutionary dynamics on phylogenetic trees. Methods in Ecology and Evolution, 5(7):701–707.

[86] Radchenko, A. G., Elmes, G. W., and Dlussky, G. (2007). The ants of the genus Myrmica (Hymenoptera, Formicidae) from Baltic and Saxonian amber (Late Eocene). Journal of Paleontology, 81(6):1494–1501.

[87] Raignier, A., Van Boven, J., and du Congo Belge, M. R. (1955). Étude taxonomique, biologique et biométrique des Dorylus du sous-genre Anomma (Hymenoptera Formicidae), volume 2. Musée royal du Congo belge.

[88] Ravary, F. and Jaisson, P. (2002). The reproductive cycle of thelytokous colonies of Cerapachys biroi Forel (Formicidae, Cerapachyinae). Insectes Sociaux, 49(2):114–119.

[89] Roch, S. and Warnow, T. (2015). On the robustness to gene tree estimation error (or lack thereof) of coalescent-based species tree methods. Systematic Biology, 64(4):663–676.

[90] Rodríguez-Ezpeleta, N., Brinkmann, H., Roure, B., Lartillot, N., Lang, B. F., and Philippe, H. (2007). Detecting and overcoming systematic errors in genome-scale phylogenies. Systematic Biology, 56(3):389–399.

[91] Rohland, N. and Reich, D. (2012). Cost-effective, high-throughput DNA sequencing libraries for multiplexed target capture. Genome Research, 22(5):939–946.

[92] Salichos, L. and Rokas, A. (2013). Inferring ancient divergences requires genes with strong phylogenetic signals. Nature, 497(7449):327–331.

[93] Sanderson, M. J. (2002). Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Molecular Biology and Evolution, 19(1):101–109.

[94] Sanmartín, I., Enghoff, H., and Ronquist, F. (2001). Patterns of animal dispersal, vicariance and diversification in the Holarctic. Biological Journal of the Linnean Society, 73(4):345–390.

[95] Sayyari, E. and Mirarab, S. (2016). Fast coalescent-based computation of local branch support from quartet frequencies. Molecular Biology and Evolution, 33(7):1654–1668.

[96] Schneirla, T. C. (1945). The army-ant behavior pattern: Nomad-statary relations in the swarmers and the problem of migration. Biological Bulletin, 88(2):166.

[97] Schrader, L., Kim, J. W., Ence, D., Zimin, A., Klein, A., Wyschetzki, K., Weichselgartner, T., Kemena, C., Stökl, J., Schultner, E., Wurm, Y., Smith, C. D., Yandell, M., Heinze, J., Gadau, J., and Oettler, J. (2014). Transposable element islands facilitate adaptation to novel environments in an invasive species. Nature Communications, 5:5495.

[98] Schöning, C., Njagi, W. M., and Franks, N. R. (2005). Temporal and spatial patterns in the emigrations of the army ant Dorylus (Anomma) molestus in the montane forest of Mt Kenya. Ecological Entomology, 30(5):532–540.

[99] Shen, X.-X., Hittinger, C. T., and Rokas, A. (2017). Contentious relationships in phylogenomic studies can be driven by a handful of genes. Nature Ecology & Evolution, 1(5):0126.

[100] Smith, C. D., Zimin, A., Holt, C., Abouheif, E., Benton, R., Cash, E., Croset, V., Currie, C. R., Elhaik, E., Elsik, C. G., Fave, M.-J., Fernandes, V., Gadau, J., Gibson, J. D., Graur, D., Grubbs, K. J., Hagen, D. E., Helmkampf, M., Holley, J.-A., Hu, H., Viniegra, A. S. I., Johnson, B. R., Johnson, R. M., Khila, A., Kim, J. W., Laird, J., Mathis, K. A., Moeller, J. A., Munoz-Torres, M. C., Murphy, M. C., Nakamura, R., Nigam, S., Overson, R. P., Placek, J. E., Rajakumar, R., Reese, J. T., Robertson, H. M., Smith, C. R., Suarez, A. V., Suen, G., Suhr, E. L., Tao, S., Torres, C. W., van Wilgenburg, E., Viljakainen, L., Walden, K. K. O., Wild, A. L., Yandell, M., Yorke, J. A., and Tsutsui, N. D. (2011a). Draft genome of the globally widespread and invasive Argentine ant (Linepithema humile). Proceedings of the National Academy of Sciences, 108(14):5673–5678.

[101] Smith, C. R., Cahan, S. H., Kemena, C., Brady, S. G., Yang, W., Bornberg-Bauer, E., Eriksson, T., Gadau, J., Helmkampf, M., Gotzek, D., Miyakawa, M. O., Suarez, A. V., and Mikheyev, A. (2015). How do genomes create novel phenotypes? Insights from the loss of the worker caste in ant social parasites. Molecular Biology and Evolution, 32(11):2919–2931.

[102] Smith, C. R., Smith, C. D., Robertson, H. M., Helmkampf, M., Zimin, A., Yandell, M., Holt, C., Hu, H., Abouheif, E., Benton, R., Cash, E., Croset, V., Currie, C. R., Elhaik, E., Elsik, C. G., Fave, M.-J., Fernandes, V., Gibson, J. D., Graur, D., Gronenberg, W., Grubbs, K. J., Hagen, D. E., Viniegra, A. S. I., Johnson, B. R., Johnson, R. M., Khila, A., Kim, J. W., Mathis, K. A., Munoz-Torres, M. C., Murphy, M. C., Mustard, J. A., Nakamura, R., Niehuis, O., Nigam, S., Overson, R. P., Placek, J. E., Rajakumar, R., Reese, J. T., Suen, G., Tao, S., Torres, C. W., Tsutsui, N. D., Viljakainen, L., Wolschin, F., and Gadau, J. (2011b). Draft genome of the red harvester ant Pogonomyrmex barbatus. Proceedings of the National Academy of Sciences, 108(14):5667–5672.

[103] Stamatakis, A. (2014). RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 30(9):1312–1313.

[104] Suen, G., Teiling, C., Li, L., Holt, C., Abouheif, E., Bornberg-Bauer, E., Bouffard, P., Caldera, E. J., Cash, E., Cavanaugh, A., Denas, O., Elhaik, E., Favé, M.-J., Gadau, J., Gibson, J. D., Graur, D., Grubbs, K. J., Hagen, D. E., Harkins, T. T., Helmkampf, M., Hu, H., Johnson, B. R., Kim, J., Marsh, S. E., Moeller, J. A., Muñoz-Torres, M. C., Murphy, M. C., Naughton, M. C., Nigam, S., Overson, R., Rajakumar, R., Reese, J. T., Scott, J. J., Smith, C. R., Tao, S., Tsutsui, N. D., Viljakainen, L., Wissler, L., Yandell, M. D., Zimmer, F., Taylor, J., Slater, S. C., Clifton, S. W., Warren, W. C., Elsik, C. G., Smith, C. D., Weinstock, G. M., Gerardo, N. M., and Currie, C. R. (2011). The genome sequence of the leaf-cutter ant Atta cephalotes reveals insights into its obligate symbiotic lifestyle. PLoS Genetics, 7(2):e1002007.

[105] Sukumaran, J. and Holder, M. T. (2010). DendroPy: a Python library for phylogenetic computing. Bioinformatics, 26(12):1569–1571.

[106] Talavera, G. and Castresana, J. (2007). Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Systematic Biology, 56(4):564–577.

[107] Tan, G., Muffato, M., Ledergerber, C., Herrero, J., Goldman, N., Gil, M., and Dessimoz, C. (2015). Current methods for automated filtering of multiple sequence alignments frequently worsen single-gene phylogenetic inference. Systematic Biology, 64(5):778–791.

[108] R Core Team (2014). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. 2013.

[109] Tsuji, K. and Yamauchi, K. (1995). Production of females by parthenogenesis in the ant, Cerapachys biroi. Insectes Sociaux, 42(3):333–336.

[110] Ward, P. S., Brady, S. G., Fisher, B. L., and Schultz, T. R. (2015). The evolution of myrmicine ants: phylogeny and biogeography of a hyperdiverse ant clade (Hymenoptera: Formicidae). Systematic Entomology, 40(1):61–81.

[111] Wilson, E. O. (1985). Ants of the Dominican amber (Hymenoptera: Formicidae). 2. The first fossil army ants. Psyche (Cambridge), 92:11–16.

[112] Wurm, Y., Wang, J., Riba-Grognuz, O., Corona, M., Nygaard, S., Hunt, B. G., Ingram, K. K., Falquet, L., Nipitwattanaphon, M., Gotzek, D., Dijkstra, M. B., Oettler, J., Comtesse, F., Shih, C.-J., Wu, W.-J., Yang, C.-C., Thomas, J., Beaudoing, E., Pradervand, S., Flegel, V., Cook, E. D., Fabbretti, R., Stockinger, H., Long, L., Farmerie, W. G., Oakey, J., Boomsma, J. J., Pamilo, P., Yi, S. V., Heinze, J., Goodisman, M. A. D., Farinelli, L., Harshman, K., Hulo, N., Cerutti, L., Xenarios, I., Shoemaker, D., and Keller, L. (2011). The genome of the fire ant Solenopsis invicta. Proceedings of the National Academy of Sciences, 108(14):5679–5684.

[113] Zhong, M., Hansen, B., Nesnidal, M., Golombek, A., Halanych, K. M., and Struck, T. H. (2011). Detecting the symplesiomorphy trap: a multigene phylogenetic analysis of terebelliform annelids. BMC Evolutionary Biology, 11(1):369.
