Abstract
Understanding the course of eco-morphological evolution in adaptive radiations is challenging as the phylogenetic relationships among the species involved are typically difficult to resolve. Newts of the genus Triturus (marbled and crested newts) are a well-studied case: they exhibit substantial variation in the number of trunk vertebrae (NTV) and a higher NTV corresponds to a longer annual aquatic period. Because the Triturus phylogeny is still unresolved, the evolutionary pathway for NTV and annual aquatic period is unclear. To resolve the phylogeny of Triturus, we generate a data set of c. 6,000 transcriptome-derived markers using a custom target enrichment probe set, and conduct phylogenetic analyses including: 1) data concatenation with RAxML, 2) gene tree summary with ASTRAL, and 3) species tree estimation with SNAPP. All analyses consistently result in the same, highly supported topology. Our new phylogenetic hypothesis requires only the minimum number of inferred changes in NTV count to explain the NTV radiation observed today. This suggests that, while diversification in body shape allowed ecological expansion in Triturus to encompass an increasingly aquatic lifestyle, body shape evolution was phylogenetically constrained.
1. Background
In adaptive radiations, reproductive isolating barriers between nascent species evolve in response to rapid ecological specialization [1], and this ecological speciation typically correlates with pronounced morphological differentiation [2, 3]. Adaptive radiations are known throughout the Tree of Life and illustrate the power of natural selection to drive speciation [4]. While adaptive radiations represent some of the best-known examples of evolution in action (most famously Darwin’s finches [5] and Lake Victoria cichlid fishes [6]), the phylogenetic relationships between the species involved are notoriously difficult to decipher [7–9]. Yet accurately retracing the evolution of phenotypic diversity in adaptive radiations requires well-established phylogenies.
Inferring the true branching order in adaptive radiations is hampered by the short time frame over which they unfold, which provides little opportunity between splitting events for phylogenetically informative substitutions to become established (resulting in low phylogenetic resolution [10, 11]) and fixed (resulting in incomplete lineage sorting and discordance among gene trees [12–14]). Resolving the phylogeny of rapidly multiplying lineages becomes more complicated the further back in time the radiation occurred, as the accumulation of uninformative substitutions along terminal branches leads to long-branch attraction [15, 16]. A final impediment is reticulation between closely related (not necessarily sister-) species through past or ongoing hybridization, resulting in additional gene-tree/species-tree discordance [17–19].
Phylogenomics can help. Consulting a large number of markers spread throughout the genome has proven successful in resolving both recent (e.g. [20–26]) and ancient (e.g. [27–31]) evolutionary radiations. Advances in laboratory and sequencing techniques, bioinformatics and tree-building methods facilitate phylogenetic reconstruction based on thousands of homologous loci for a large number of individuals, and promise to help reveal the evolution of eco-morphological characters involved in adaptive radiations [32, 33]. Here we conduct a phylogenomic analysis for an adaptive radiation that moderately sized multilocus nuclear DNA datasets [34–36] have consistently failed to resolve: the Eurasian newt genus Triturus (Amphibia: Urodela: Salamandridae; vernacularly known as the marbled and crested newts).
One of the most intriguing features of Triturus evolution is the correlation between ecology and the number of trunk vertebrae (NTV). Species characterized by a higher modal NTV (which translates into a more elongate body build with shorter limbs) are associated with a more aquatic lifestyle [37–44]; the number of months a Triturus species spends in the water (defined at the population level as the peak in emigration minus the peak in immigration) roughly equals NTV minus 10 (Fig. 1). The intrageneric variation in NTV shown by Triturus, ranging from 12 to 17, is unparalleled in the family Salamandridae [44, 45]. Although a causal relationship between NTV expansion and an increasingly aquatic lifestyle has been presumed [37–44], the evolutionary pathway of NTV and aquaticness in the adaptive radiation of Triturus is unclear. A resolved species tree is required to address this issue.
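The rule of thumb relating NTV to the annual aquatic period can be tabulated directly. The snippet below is only an illustration of the approximate relationship stated above (months in water roughly equal to NTV minus 10); the function name and the `offset` parameter are illustrative and not part of the original analysis.

```python
def expected_aquatic_months(ntv, offset=10):
    """Approximate months per year spent in water: roughly NTV minus 10."""
    return ntv - offset


# NTV in Triturus ranges from 12 to 17 according to the text.
for ntv in range(12, 18):
    print(f"NTV {ntv}: about {expected_aquatic_months(ntv)} months in water")
```

This is a rough population-level tendency, not a fitted model, so the values should be read as approximate.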
Our goal is to obtain a genome-enabled phylogeny for Triturus and use it to reconstruct the eco-morphological evolution of NTV and aquatic/terrestrial ecology across the genus. As the large size of salamander genomes hampers whole genome sequencing (but see [46–48]), we employ a genome reduction approach in which we capture and sequence a set of transcriptome-derived markers using target enrichment, a technique which affords extremely high resolution at multiple taxonomic levels [49–54]. Using data concatenation (with RAxML), gene tree summarization (with ASTRAL) and species tree estimation (with SNAPP), we fully resolve the Triturus phylogeny and place the extreme body shape and ecological variation observed in this adaptive radiation into an evolutionary context.
2. Methods
Target capture array design
Nine Triturus newts (seven crested and two marbled newt species) and one banded newt (Ommatotriton) were subjected to transcriptome sequencing. Transcriptome assemblies for each species were generated using Trinity v2.2.0 [55], clustered at 90% using usearch v9.1.13 [56], and subjected to reciprocal best blast hit analysis [57–59] to produce a set of T. dobrogicus transcripts (the species with the highest quality transcriptome assembly) that had putative orthologues present in the nine other transcriptome assemblies. These transcripts were then annotated using blastx to Xenopus tropicalis proteins, retaining one annotated transcript per protein. We attempted to discern splice sites in the transcripts, as probes spanning splice boundaries may perform poorly [60], by mapping transcripts iteratively to the genomes of Chrysemys picta [61], X. tropicalis [62], Nanorana parkerii [63] and Rana catesbeiana [64]. A single exon ≥ 200bp and ≤ 450bp was retained for each transcript target. To increase the utility of the target set to all Triturus species, orthologous sequences from multiple species were included for targets with > 5% sequence divergence from T. dobrogicus [49]. We generated a target set of 7,102 genomic regions for a total target length of approximately 2.3 million bp. A total of 39,143 unique RNA probes were synthesized as a MyBaits-II kit for this target set at approximately 2.6X tiling density by Arbor Biosciences (Ann Arbor, MI, Ref# 170210-32). A detailed outline of the target capture array design process is presented in Supplementary Text S1.
Sampling scheme
We sampled 23 individual Triturus newts (Fig. 2; Supplementary Table S1) for which tissues were available from previous studies [65–67]. Because the sister relationship between the two marbled and seven crested newts is well established, while the relationships among the crested newt species are unclear, we sampled the crested newt species more densely, including three individuals per species to capture intraspecific differentiation and avoid misleading phylogenies resulting from single exemplar sampling [68] (Fig. 1). As Triturus species show introgressive hybridization at contact zones [69], we aimed to reduce the impact of interspecific gene flow by only including individuals that originate away from hybrid zones and have previously been interpreted as unaffected by interspecific genetic admixture [65, 66]. A test for the phylogenetic utility of the transcripts used for marker design underscores the reality of phylogenetic distortion by interspecific gene flow (details in Supplementary Text S1).
Laboratory methods
DNA was extracted from samples using a salt extraction protocol [70], and 10,000ng per sample was sheared to approximately 200bp-500bp on a BioRuptor NGS (Diagenode) and dual-end size selected (0.8X-1.0X) with SPRI beads. Dual-indexed libraries were prepared from 375-2000ng of size selected DNA using KAPA LTP library prep kits [71]. These libraries were pooled (with samples from other projects) into batches of 16 samples at 250ng per sample (4,000ng total) and enriched in the presence of 30,000ng of c0t-1 repetitive sequence blocker [50] derived from T. carnifex (casualties from a removal action of an invasive population [72]) by hybridizing blockers with libraries for 30 minutes and probes with libraries/blockers for 30 hours. Enriched libraries were subjected to 14 cycles of PCR with KAPA HiFi HotStart ReadyMix and pooled at an equimolar ratio for 150bp paired-end sequencing across multiple Illumina HiSeq 4000 lanes (receiving an aggregate of 18% of one lane, for a multiplexing equivalent of 128 samples per lane).
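The pooling and lane-share figures quoted above can be checked with simple arithmetic. The sketch below assumes that the 18% lane share refers to the 23 Triturus libraries of this study taken together, which the text does not state explicitly; all variable names are illustrative.

```python
# Capture pools: 16 libraries per batch at 250 ng each.
samples_per_pool = 16
ng_per_sample = 250
pool_total_ng = samples_per_pool * ng_per_sample

# Sequencing share. Assumption (not stated explicitly in the text):
# the 23 Triturus libraries together received 18% of one lane.
n_samples = 23
lane_fraction = 0.18
samples_per_lane_equivalent = n_samples / lane_fraction

print(pool_total_ng)                       # total DNA per capture pool in ng
print(round(samples_per_lane_equivalent))  # multiplexing equivalent per lane
```

Under that assumption the numbers agree with the stated 4,000 ng per pool and the multiplexing equivalent of about 128 samples per lane.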
Processing of target capture data
Sequences from the sample receiving the greatest number of reads were used to de novo assemble target sequences for each target region using the assembly by reduced complexity (ARC) pipeline [73]. A single assembled contig was selected for each original target region by means of reciprocal best blast hit (RBBH) [74] and these were used as a reference assembly for all downstream analyses. Adapter contamination was removed from sample reads using skewer v0.2.2 [75] and reads were then mapped to the reference assembly using BWA-MEM v0.7.15-r1140 [76]. Picard tools v2.9.2 (https://broadinstitute.github.io/picard/) was used to add read group information and mark PCR duplicates, and HaplotypeCaller and GenotypeGVCFs from GATK v3.8 [77] were used to jointly genotype the relevant groups of samples (either crested newts or crested newts + marbled newts depending on the analysis; see below). SNPs that failed any of the following hard filters were removed: QD < 2, MQ < 40, FS > 60, MQRankSum < −12.5, ReadPosRankSum < −8, and QUAL < 30 [78]. We next attempted to remove paralogous targets from our dataset with a Hardy Weinberg Equilibrium (HWE) filter for heterozygote excess. Heterozygote excess p-values were calculated for every SNP using vcftools 0.1.15 [79], and any target containing at least one SNP with a heterozygote excess p-value < 0.05 was removed from downstream analysis. More detail on the processing of the target capture data can be found in Supplementary Text S2.
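The two filtering steps described above (per-SNP hard filters, then removal of whole targets showing heterozygote excess) can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the study: it assumes the annotations have already been parsed into plain dictionaries, the `target` and `p_het_excess` keys are hypothetical names, and treating a missing annotation as passing is an assumption.

```python
# A SNP is removed if ANY of these conditions holds (thresholds from the text).
FAILS = {
    "QD": lambda v: v < 2,
    "MQ": lambda v: v < 40,
    "FS": lambda v: v > 60,
    "MQRankSum": lambda v: v < -12.5,
    "ReadPosRankSum": lambda v: v < -8,
    "QUAL": lambda v: v < 30,
}


def passes_hard_filters(snp):
    """Return True if no hard filter is triggered.

    Assumption: a missing annotation (absent key or None) does not trigger a filter.
    """
    for name, fails in FAILS.items():
        value = snp.get(name)
        if value is not None and fails(value):
            return False
    return True


def targets_without_het_excess(snps, alpha=0.05):
    """Keep only targets in which no SNP has a heterozygote-excess p-value below alpha."""
    all_targets = {s["target"] for s in snps}
    flagged = {s["target"] for s in snps if s["p_het_excess"] < alpha}
    return all_targets - flagged


# Toy records with hypothetical values.
base = {"QD": 15.0, "MQ": 60.0, "FS": 1.2, "MQRankSum": 0.1,
        "ReadPosRankSum": 0.3, "QUAL": 500.0}
snps = [
    {**base, "target": "t1", "p_het_excess": 0.80},
    {**base, "target": "t1", "QD": 1.5, "p_het_excess": 0.60},  # fails QD < 2
    {**base, "target": "t2", "p_het_excess": 0.01},             # heterozygote excess
]

kept_snps = [s for s in snps if passes_hard_filters(s)]
kept_targets = targets_without_het_excess(kept_snps)
print(len(kept_snps), sorted(kept_targets))
```

Dropping the whole target rather than the single offending SNP mirrors the rationale in the text: heterozygote excess at any site is taken as a sign that the entire target may be paralogous.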
Phylogenetic analyses
For data concatenation, a maximum likelihood phylogeny was inferred with RAxML version 8.2.11 [80] based on an alignment of 133,601 SNPs across 5,866 different targets. We included all 23 Triturus individuals in this analysis. For gene tree summary, ASTRAL v5.6.1 [81] was used to estimate the crested newt species tree from 5,610 gene trees generated in RAxML. The 21 crested newt samples were assigned species membership and no marbled newts were included because estimating terminal branch lengths is not possible for species with a single representative. For species tree estimation, SNAPP v1.3.0 [82] within the BEAST v2.4.8 [83] environment was used to infer the crested newt species tree from biallelic SNPs randomly selected from each of 5,581 post-filtering targets. All three individuals per crested newt species were treated as a single terminal and marbled newts were again excluded because sampling one individual per species violates the Yule speciation prior assumption. We also estimated divergence times in SNAPP. A detailed description of our strategy for phylogenetic analyses is available in Supplementary Text S3.
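As an illustration of how a SNAPP input could be thinned, the sketch below draws one biallelic SNP at random from each target. It assumes a single SNP per target, which the text does not state explicitly, and it uses a hypothetical record layout (`target`, `pos`, `alleles`); it is not the procedure documented in Supplementary Text S3.

```python
import random


def pick_one_biallelic_snp_per_target(snps, seed=0):
    """Return {target: snp}, choosing one biallelic SNP at random per target.

    Targets without any biallelic SNP are omitted. A fixed seed keeps the
    draw reproducible.
    """
    rng = random.Random(seed)
    by_target = {}
    for snp in snps:
        if len(snp["alleles"]) == 2:  # keep biallelic sites only
            by_target.setdefault(snp["target"], []).append(snp)
    return {t: rng.choice(cands) for t, cands in sorted(by_target.items())}


# Toy records with hypothetical values.
snps = [
    {"target": "t1", "pos": 10, "alleles": ("A", "G")},
    {"target": "t1", "pos": 55, "alleles": ("C", "T")},
    {"target": "t2", "pos": 7, "alleles": ("A", "C", "T")},  # triallelic, skipped
    {"target": "t3", "pos": 99, "alleles": ("G", "T")},
]

chosen = pick_one_biallelic_snp_per_target(snps)
print(sorted(chosen))
```

Sampling a single site per target is a common way to reduce linkage among the SNPs passed to a method that treats sites as independent.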
3. Results
The concatenated analysis with RAxML supports a basal bifurcation in Triturus between the marbled and crested newts (Fig. 3), consistent with the prevailing view that they are reciprocally monophyletic [34–36]. RAxML recovers each of the crested newt species as monophyletic, validating our decision to collapse the three individuals sampled per species in a single terminal in ASTRAL and SNAPP. Furthermore, all five Triturus body builds are recovered as monophyletic (cf. [34–36]). The greatest intraspecific divergence is observed in T. carnifex (Supplementary Text S1; Supplementary Fig. S1; Supplementary Table S2).
Phylogenetic inference based on data concatenation with RAxML (Fig. 3), gene tree summary with ASTRAL (Fig. 4a) and species tree estimation with SNAPP (Fig. 4b) all recover the same crested newt topology, with a basal bifurcation between the T. karelinii-group (NTV = 13; T. ivanbureschi sister to T. anatolicus + T. karelinii) and the remaining taxa, which themselves are resolved into the species pairs T. carnifex + T. macedonicus (NTV=14; the T. carnifex-group), and T. cristatus (NTV=15) + T. dobrogicus (NTV=16/17). In addition, the bifurcation giving rise to the four crested newt species groups (cf. Fig. 1) occurred in a relatively short time frame (Supplementary Fig. S2), reflected by two particularly short, but resolvable internal branches (Fig. 3; Fig. 4).
The phylogenomic analyses suggest considerable gene tree/species tree discordance in Triturus. The normalized quartet score of the ASTRAL tree (Fig. 4a), which reflects the proportion of input gene tree quartets satisfied by the species tree, is 0.63, indicating a high degree of incomplete lineage sorting. Furthermore, the only node in the SNAPP tree with a posterior probability below 1.0 (i.e. 0.99) is subtended by a very short branch (Fig. 4b). We also observe highly supported topological incongruence with the full mtDNA-based phylogeny of Triturus (Supplementary Text S4; Supplementary Fig. S3) [38].
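To make the meaning of a normalized quartet score concrete, the sketch below counts, over all gene trees and all sets of four taxa, how often the gene tree displays the same quartet topology as the species tree. It is a simplified illustration rather than ASTRAL's implementation: trees are hypothetical nested tuples, every gene tree is assumed to be binary and to contain all taxa, and the mapping of multiple individuals to species is ignored.

```python
from itertools import combinations


def clades(tree):
    """Return (leaf_set, list_of_clades) for a rooted nested-tuple tree of taxon names."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
        found.append(child_leaves)
    return leaves, found


def quartet_topology(tree_clades, quartet):
    """Return the split ab|cd displayed for four taxa, or None if unresolved."""
    q = frozenset(quartet)
    for clade in tree_clades:
        inside = q & clade
        if len(inside) == 2:  # this clade separates the quartet two versus two
            return frozenset([inside, q - inside])
    return None


def normalized_quartet_score(species_tree, gene_trees):
    """Proportion of gene-tree quartets that match the species tree."""
    taxa, sp_clades = clades(species_tree)
    satisfied = total = 0
    for gene_tree in gene_trees:
        _, g_clades = clades(gene_tree)
        for quartet in combinations(sorted(taxa), 4):
            total += 1
            if quartet_topology(g_clades, quartet) == quartet_topology(sp_clades, quartet):
                satisfied += 1
    return satisfied / total


# Toy example with four hypothetical taxa: two gene trees agree, one conflicts.
species_tree = (("A", "B"), ("C", "D"))
gene_trees = [
    (("A", "B"), ("C", "D")),
    ((("A", "B"), "C"), "D"),
    (("A", "C"), ("B", "D")),
]
score = normalized_quartet_score(species_tree, gene_trees)
print(round(score, 2))
```

Because a randomly resolved quartet matches the species tree one time in three, a score of 0.63 lies well below full concordance while remaining clearly above the random expectation.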
Considering an NTV count of 12, as observed in the marbled newts as well as the most closely related newt genera, as the ancestral state for Triturus [44, 84], three sequential single-vertebral additions to NTV along internal branches, and one or two additions along the terminal branch leading to T. dobrogicus (in which NTV = 16 and NTV = 17 occur at approximately equal frequency [44, 85]), are required to explain the present-day variation in NTV observed in Triturus (Fig. 3). This is the minimum possible number of inferred changes in NTV count required to explain the NTV radiation observed today (Supplementary Fig. S4). No NTV deletions or reversals have to be inferred, implying a linear, single-addition progression rule for vertebral addition in Triturus.
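The count of inferred NTV changes can be reproduced with a small parsimony calculation. The sketch below treats NTV as an ordered character in which each added or lost vertebra costs one step (Wagner parsimony, computed with Farris's interval method); this is an assumption about how the changes are counted, not the authors' stated procedure. Trees are simplified to one tip per body build, and the alternative topology is a hypothetical example chosen for comparison.

```python
def wagner_cost(tree):
    """Return (low, high, cost) for a binary nested-tuple tree with integer tips.

    (low, high) is the interval of most parsimonious states at the node;
    cost is the minimum number of single-unit steps in the subtree.
    """
    if isinstance(tree, int):
        return tree, tree, 0
    left, right = tree
    l_lo, l_hi, l_cost = wagner_cost(left)
    r_lo, r_hi, r_cost = wagner_cost(right)
    lo, hi = max(l_lo, r_lo), min(l_hi, r_hi)
    if lo <= hi:
        # Child intervals overlap: no extra step is needed at this node.
        return lo, hi, l_cost + r_cost
    # Disjoint intervals: the gap between them must be bridged.
    return hi, lo, l_cost + r_cost + (lo - hi)


# Recovered topology, one tip per body build:
# marbled (12), T. karelinii-group (13), T. carnifex-group (14),
# T. cristatus (15), T. dobrogicus (16 or 17).
recovered_16 = (12, (13, (14, (15, 16))))
recovered_17 = (12, (13, (14, (15, 17))))

# Hypothetical alternative topology, for comparison only.
alternative = (12, ((13, 15), (14, 16)))

for name, tree in [("recovered, NTV 16", recovered_16),
                   ("recovered, NTV 17", recovered_17),
                   ("alternative", alternative)]:
    print(name, wagner_cost(tree)[2])
```

Under this ordered model the recovered topology needs four or five steps, depending on whether T. dobrogicus is scored as 16 or 17, whereas the hypothetical alternative needs six. Note that an unordered (Fitch) model would not distinguish topologies here, since five distinct states require four changes on any tree.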
4. Discussion
We use phylogenomic data to study the evolution of ecological and phenotypic diversity within the adaptive radiation of Triturus newts. In contrast to previous attempts to recover a multilocus species tree [34–36], we recover full phylogenetic resolution with strong support. Despite a high degree of gene tree/species tree discordance, independent phylogenetic approaches based on data concatenation (RAxML), gene tree summarization (ASTRAL) and species tree estimation (SNAPP) all recover the same, highly supported topology for Triturus (Fig. 3; Fig. 4). The Triturus case study underscores that sequence capture by target enrichment is a promising approach to resolve the phylogenetic challenges associated with adaptive radiations, particularly for taxa with large and complicated genomes where other genomic approaches are impractical, including salamanders [50].
Our new phylogenetic hypothesis allows us to place the eco-morphological differentiation shown by Triturus into a coherent evolutionary context. Over time, Triturus expanded its range of NTV to encompass higher counts (Fig. 3). The Triturus tree is consistent with a maximally parsimonious scenario, under which four to five character state changes are required to explain the radiation in NTV observed today. Any other possible topology would necessitate a higher number of NTV changes to be inferred (Supplementary Fig. S4). Three of these inferred changes are positioned on internal branches, of which two are particularly short, suggesting that changes in NTV count can evolve in a relatively short time. The fourth and fifth inferred changes are situated on the external branch leading to T. dobrogicus, the only Triturus species with substantial intraspecific variation in NTV count [44, 85].
Newts annually alternate between an aquatic and a terrestrial habitat and the functional trade-off between adaptation to life in water or on land likely poses contrasting demands on body build [86–89]. If the observed relationship between one additional trunk vertebra and an extra month annually spent in the water (Fig. 1) is causal, then the NTV flexibility expressed by Triturus suggests enhanced ecological opportunities through an ability to exploit a wider range in hydroperiod (i.e. the annual availability of standing water) more efficiently. Despite the evolvability of NTV count [44], NTV evolution has been phylogenetically constrained in Triturus; apparently the change in NTV count was directional and involved the addition of one trunk vertebra at a time (Fig. 3; Supplementary Fig. S4). Species with a more derived body build have a relatively prolonged aquatic period and, because species with transitional NTV counts remain extant, the end result is an eco-morphological radiation.
Triturus newts show a certain degree of intraspecific variation in NTV today. Such variation is partially explained by interspecific hybridization (emphasizing the genetic basis of NTV count) [69], but there is standing variation in NTV count within all Triturus species [42]. This suggests that, during Triturus evolution, there has always been intraspecific NTV count polymorphism for natural selection to work with. Whether the relationship between the directional, parsimonious evolution of higher NTV and the equally parsimonious evolutionary increase in aquatic lifestyle is causal, and which of these two may be the actual target of selection, remain important open questions. A proper understanding of the functional relationship between body build and aquaticness in Triturus is still lacking [86]. The recent availability of the first salamander genomes [46–48] offers the prospect of sequencing the genome of each Triturus species and exploring the developmental basis for NTV and its functional consequences in the diversification of the genus.
Ethics
Permits for sampling for transcriptome sequencing were provided by the Italian Ministry of the Environment (DPN-2009-0026530), the Environment Protection Agency of Montenegro (no. UPI-328/4), the Ministry of Energy, Development and Environmental Protection of Republic of Serbia (no. 353-01-75/2014-08), and TÜBİTAK, Turkey (no. 113Z752). RAVON & Natuurbalans-Limes Divergens provided the T. carnifex used to create c0t-1.
Data availability
Raw sequence read data for the sequence capture libraries of the 23 Triturus samples and the transcriptome libraries are available at SRA (PRJNA498336). Transcriptome assemblies and genotype calls (VCF) for the 21- and 23-sample datasets are available at Zenodo (https://doi.org/10.5281/zenodo.1470914).
Author contributions
BW, EMM, JWA, RKB, HBS designed the research; BW and EMM performed the research; BW and EMM wrote the paper with input from JWA, RKB and HBS. All authors gave final approval for publication.
Competing interests
We have no competing interests.
Funding
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 655487.
Acknowledgements
Andrea Chiocchio, Daniele Canestrelli, Michael Fahrbach, Ana Ivanović, Raymond van der Lans, and Kurtuluş Olgun helped obtain samples for transcriptome sequencing. Tara Luckau helped in the lab. Peter Scott provided valuable suggestions on methodology. This work used the Vincent J. Coates Genomics Sequencing Laboratory at UC Berkeley, supported by NIH S10 Instrumentation Grants S10RR029668 and S10RR027303. Computing resources were provided by XSEDE [90] and the Texas Advanced Computing Center (TACC) Stampede2 cluster at The University of Texas at Austin.
References
- [1].
- [2].
- [3].
- [4].
- [5].
- [6].
- [7].
- [8].
- [9].
- [10].
- [11].
- [12].
- [13].
- [14].
- [15].
- [16].
- [17].
- [18].
- [19].
- [20].
- [21].
- [22].
- [23].
- [24].
- [25].
- [26].
- [27].
- [28].
- [29].
- [30].
- [31].
- [32].
- [33].
- [34].
- [35].
- [36].
- [37].
- [38].
- [39].
- [40].
- [41].
- [42].
- [43].
- [44].
- [45].
- [46].
- [47].
- [48].
- [49].
- [50].
- [51].
- [52].
- [53].
- [54].
- [55].
- [56].
- [57].
- [58].
- [59].
- [60].
- [61].
- [62].
- [63].
- [64].
- [65].
- [66].
- [67].
- [68].
- [69].
- [70].
- [71].
- [72].
- [73].
- [74].
- [75].
- [76].
- [77].
- [78].
- [79].
- [80].
- [81].
- [82].
- [83].
- [84].
- [85].
- [86].
- [87].
- [88].
- [89].
- [90].