Abstract
Tip-dating methods are becoming popular alternatives to traditional “node-dating.” However, they have not been extensively tested. We “ground-truth” the most popular methods against a dated tree of fossil Canidae derived from monographs by Wang and Tedford. Using a revised canid morphology dataset we compare MrBayes 3.2.5 to Beast 2.1.3 combined with BEASTmasteR (phylo.wikidot.com/beastmaster), an R package that automates the conversion of dates, priors, and NEXUS character matrices into the complex Beast2 XML format. We find that unconstrained MrBayes analysis under the uniform node age prior fails to retrieve reasonable results, exhibiting extremely high uncertainty in dates. On the other hand, Beast2 inference matches the ground-truth well, under both birth-death serially sampled (BDSS, disallowing direct ancestors) and sampled ancestor (SABD) tree models, as does MrBayes using BDSS. MrBayes using SABD seems to have difficulty converging in some analyses. These results, on a high quality fossil dataset, indicate that while tip-dating is very promising, methodological issues in tip-dating can have drastic effects, and require close attention, especially on more typical datasets where the distinction between “method problems” and “data problems” will be more difficult to detect.
Main text
Testing phylogenetic inference methods against an externally known truth is highly desirable, but is rarely possible except when an experimenter manufactures a known evolutionary history either with simulations [1] or splitting populations of microbial/virus cultures [2]. Even when conducted, it is debatable to what extent manufactured histories are comparable to the complexity of real evolutionary histories, where heterogeneity of rates, environment, and data acquisition are likely to be significant [3].
Our goal is to assess Bayesian total evidence (“tip-dating”) methods. While node-dating approaches are valuable, they are subject to a number of well-known criticisms [4-7] such as subjectivity and incomplete use of information. In addition, node-dating weakens inferences to the extent that it essentially constrains a priori some of the nodes/dates that we would prefer to infer. Tip-dating methods are becoming popular, but results seem to vary widely between methods and datasets (references in Supplemental Material, SM). It is useful to “ground-truth” [8] the new methods and models on an “ideal” empirical dataset, one where the fossil record is of sufficiently high quality that the true evolutionary tree and dates are broadly known even without complex computational methods. An ideal dataset would also meet the assumptions made by the models (Table 1). When multiple methods are run on the same ground-truth dataset, not only can differences in inference be attributed to the method, but an assessment can be made about which methods are making inferences closer to the “known” truth. Methods that fail on the ideal dataset are unlikely to provide useful inferences on typical, non-ideal datasets.
Fossil datasets ideal for ground-truthing are few and far between, due to the vagaries of preservation and description, but one for which a strong argument can be made is the fossil Canidae (dog family; [9]). Canids avoid the challenges faced by most datasets (Table 1; SM), and have been thoroughly monographed (SM). However, we acknowledge that even a very good fossil dataset does not represent absolute “truth”, so we have no objection to reading our study as a comparison of tip-dating methods against established expert opinion.
Methods
Ground-truth tree and characters
The tree was digitized from the monographs of Wang and Tedford using TreeRogue [10], with judgment calls resolved in favour of preserving the authors’ depiction of divergence times (SM). Morphological characters and dates came from the published matrix of Slater (2015) [11, 12].
Tip-dating analyses
MrBayes analyses were conducted by modification of Slater’s commands file. A large number of variant MrBayes analyses (38 total) were constructed to investigate several issues noticed in the interaction of MrBayes versions and documentation, and Slater’s commands file (SM, Appendix 1).
We focused on six analyses (four MrBayes 3.2.5 analyses and two Beast 2.1.3 analyses) to compare to the ground-truth tree (Figure 1a) and to Slater’s published analysis (Figure 1b: mb1_orig). These were (1c) mb1: Slater’s original uniform node age prior analysis including node date calibrations, with some corrections; (1d) mb8: uniform node age prior, no node dates, flat priors on clock parameters, uniform(45,100) prior on the root age; (1f) mb9: mb8 but with SABD tree prior and flat priors on speciation, extinction, and sampling rate; (1e) mb10: mb9 but BDSS, i.e. disallowing sampled ancestors; (1g) r1: Beast2 BDSS analysis with flat priors on each major parameter (mean and SD of the lognormal relaxed clock; and birth, death, and serial sampling rates); (1h) r2: Beast2 SABD analysis with the same priors. Beast2 analyses were constructed with BEASTmasteR [13, 14]; full details on the analyses and post-processing steps are presented in SM.
Results
The dated trees from the six focal analyses are compared in Figure 1 (plots of all trees are available in SM), and key priors and results are shown in Table S1. The general picture is clear: the unconstrained MrBayes uniform node age prior analysis (mb8) produces implausibly old ages and huge uncertainties, with the age of Canidae overlapping the K-Pg boundary. This behaviour was also noted by Slater [11]. The ground-truth dates of crown Canis (which includes Cuon, Lycaon, and Xenocyon) and crown Caninae are 3.2 and 11.7 Ma, respectively, but mb8 makes mean estimates of 27.5 and 38.9 Ma, and even the very wide 95% highest posterior densities (HPDs), spanning 22-25 my, do not overlap the truth. More surprisingly, even Slater’s highly constrained analysis (mb1), although much closer, does not produce HPDs (5.1-9.6 Ma; 17.8-25.5 Ma) that overlap the ground truth for these nodes. In contrast, both Beast2 analyses (r1 and r2) and the MrBayes BDSS analysis (mb10) produce mean estimates close to the truth, with narrower HPD widths (2-3 my).
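The overlap statements for mb1 can be checked mechanically. The following minimal Python sketch uses only the ground-truth ages and mb1 HPD bounds quoted above; the function and variable names are ours and purely illustrative, not part of the original analysis scripts.

```python
# Ground-truth node ages (Ma) and mb1 95% HPD bounds (Ma), as quoted in the text.
ground_truth = {"crown Canis": 3.2, "crown Caninae": 11.7}
mb1_hpd = {"crown Canis": (5.1, 9.6), "crown Caninae": (17.8, 25.5)}


def hpd_contains(hpd, age):
    """Return True if the age lies inside the closed interval (lower, upper)."""
    lower, upper = hpd
    return lower <= age <= upper


def gap_to_hpd(hpd, age):
    """Distance (my) from the age to the nearest HPD bound; 0.0 if inside."""
    lower, upper = hpd
    if age < lower:
        return lower - age
    if age > upper:
        return age - upper
    return 0.0


for node, true_age in ground_truth.items():
    inside = hpd_contains(mb1_hpd[node], true_age)
    gap = gap_to_hpd(mb1_hpd[node], true_age)
    print(f"{node}: truth={true_age} Ma, HPD={mb1_hpd[node]}, "
          f"inside={inside}, gap={gap:.1f} my")
```

In both cases the ground-truth age falls below the lower HPD bound, which is the sense in which mb1 overestimates these node ages.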
The MrBayes SABD analysis (mb9) wrongly estimated these node ages as identical with the age of Canidae; this is due to mb9 misplacing Lycaon pictus (African wild dog) and Cuon javanicus (Dhole) in the extinct Borophagines. If this is ignored, the estimates are much closer, although the crown Canis estimate still fails to overlap (Table S1, notes 5 and 6). A suggestion to repeat mb9 with 4 runs instead of 2 (Mike Lee, personal communication) did produce an mb9 result that placed these taxa in the conventional position (SM), but we present the unconventional result to emphasize that it appears much greater care is required to achieve convergence with MrBayes SABD than with other methods.
This overall picture is confirmed by additional comparisons, including comparisons of topological distances, correlation plots between estimated and true dates, and posterior prediction of tip dates (SM and Tables S1, S2).
Discussion
The result of greatest interest is the poor performance of the MrBayes uniform node age prior even in a “perfect-case” dataset. Whether or not this is surprising depends on researcher background. We argue from first principles that effective tip-dating under the uniform node age prior will be difficult to impossible without strongly informative priors on node dates and/or clock rate and variability. Apart from such constraints, nothing in the tip dates or the uniform node age prior restricts the age of nodes below the dated tips; thus the node ages are, in effect, scaled up and down as the root age is sampled according to the root age prior (a required setting for the MrBayes uniform node age prior). Without informative priors, the clock rate and variability parameters will adjust along with the tree height, and highly uncertain node dates will result.
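The scaling argument can be illustrated with a deliberately simplified toy calculation. The Python sketch below is not the MrBayes implementation of the uniform node age prior, which is a joint prior over all nodes; it assumes a single internal node whose age is uniform between its oldest descendant tip and the root, with the root drawn from the uniform(45,100) prior used in mb8. The tip age of 5 Ma is hypothetical, and all names are ours.

```python
import random

ROOT_PRIOR = (45.0, 100.0)  # uniform(45,100) root age prior used in mb8 (Ma)
OLDEST_TIP = 5.0            # hypothetical age (Ma) of the oldest tip below the node


def expected_node_age(root_age, oldest_tip=OLDEST_TIP):
    """Mean of a Uniform(oldest_tip, root_age) node age."""
    return 0.5 * (oldest_tip + root_age)


def sample_node_ages(n, seed=0):
    """Draw root ages from the root prior, then node ages uniformly below the root."""
    rng = random.Random(seed)
    ages = []
    for _ in range(n):
        root_age = rng.uniform(*ROOT_PRIOR)
        ages.append(rng.uniform(OLDEST_TIP, root_age))
    return ages


ages = sorted(sample_node_ages(20000))
lower = ages[int(0.025 * len(ages))]
upper = ages[int(0.975 * len(ages))]
print(f"E[node age | root = 45 Ma]  = {expected_node_age(45.0):.1f} Ma")
print(f"E[node age | root = 100 Ma] = {expected_node_age(100.0):.1f} Ma")
print(f"central 95% of sampled node ages: {lower:.1f}-{upper:.1f} Ma")
```

Under these assumptions the expected node age moves in step with the sampled root age, and the marginal prior on the node age spans most of the interval between the tip and the oldest permitted root. Only the character data and the clock priors can narrow it.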
Despite what first principles suggest, we suspect this result may surprise some researchers. The MrBayes uniform node age prior was the leading model in the early tip-dating literature (11/16 papers as of mid-2015, 9 of them as the exclusive Bayesian tip-dating method), and until recently (October 2014, v. 3.2.3) the uniform node age prior was the only option available in MrBayes. Early tip-dating efforts in Beast/Beast2 required tedious manual editing of XML and/or elaborate scripting efforts (such as BEASTmasteR), whereas MrBayes was relatively easy to use. Therefore, many early attempts at tip-dating used the uniform node age prior.
In contrast to the disappointing results with the uniform node age prior, analyses using the BDSS or SABD tree prior (mb10, r1, r2) fared well against ground truth. Given only the characters and tip-dates, and with uninformative priors on parameters and the root age, these analyses were able to estimate node ages with high accuracy. Surprisingly, these analyses outperformed the uniform node age prior even when the latter was given substantial additional information in the form of many node calibrations (mb1). It seems that even well-constrained uniform node age prior analyses have a tendency to space node ages unrealistically evenly between calibrations and tip dates, regardless of morphological branch lengths (SM). The disagreement between the MrBayes BDSS and SABD analyses (mb10 and mb9) about the position of Lycaon+Cuon is puzzling and is discussed further in SM.
Conclusions
Tip-dating with the uniform node age prior was explicitly introduced [6] as an alternative to node-dating, attractive precisely because tip-dating avoided various undesirable compromises that researchers are forced to make when constructing node-age priors. Ronquist et al. [6] also critiqued Stadler’s [15] BDSS prior as being “complete but unrealistic,” particularly due to assumptions about constant birth/death/sampling rates and sampling in the Recent. They offered the uniform prior as an alternative, free of these difficulties. If, however, strongly informative node-age priors are required to produce reasonable results under the uniform node age prior, the main appeal of this prior is lost. The exploration of birth-death-sampling models for MrBayes [16] suggests that the future of tip-dating is likely to lie in adding realism to the BDSS-like models, rather than in attempting to devise wholly agnostic dating priors.
A great deal of work remains in the area of tip-dating in terms of methods testing and implementing more realistic methods. We have shown that “ground-truth” datasets, though rare and imperfect, are extremely useful in evaluating methods and models, bringing to light issues that would be less noticeable with lower-quality datasets and/or more complex setups (e.g., informative priors on parameters and node dates).
Data accessibility
All scripts, data files, and results files are available via a zipfile on Dryad (doi:10.5061/dryad.750p8) [Backups: https://drive.google.com/folderview?id=0B2S6mul1KaCdNk5iR1dieWxHX0U&usp=sharing, or: https://github.com/nmatzke/MatzkeWright2016]
Competing interests
We have no competing interests.
Authors’ Contributions
NJM wrote BEASTmasteR, conducted the Beast2 computational analyses and drafted the manuscript. AW contributed to MrBayes dating efforts and edited and corrected the manuscript.
Funding
NJM was supported by a NIMBioS fellowship under NSF Award #EF-0832858, and ARC DECRA fellowship DE150101773. Work on this topic began under the NSF Bivalves in Time and Space grant (DEB-0919451). AW was supported by NSF DEB-1256993.
Captions for Supplemental Figures, Tables, and Data
Supplemental Data Files
Canidae_traceLogs.pdf – Trace plots of key variables for all 40 analyses.
Canidae_treeLogs.pdf – Plots of the MCC trees for all 40 analyses.
Ground_truth_vs_estimated_node_ages.pdf – Linear regressions showing the correlation between the ground truth and estimated node ages, for nodes shared between the ground truth tree and estimated trees.
Canidae_ground_truth.newick - The “ground-truth” tree, derived from digitization of the phylogenies of Canidae published in the monographs of Wang and Tedford, using TreeRogue.
Table_S2_TipDate_runs_v3.xlsx - Summary of all 40 variant analyses (contains Supplemental Table S2, and some associated notes and file locations)
Matzke_Wright_SuppData.zip - A zipfile of all inputs, outputs, and scripts for all analyses.
Acknowledgements
We thank David Bapst, Graeme Lloyd, Jeremy Beaulieu, Kathryn Massana, Brian O’Meara, and Mike Lee for helpful comments and discussion, as well as the participants of the 2014 Society of Vertebrate Paleontology workshop and symposium on tip-dating. We also thank the BEAST developers and the beast-users Google Group, particularly Remco Bouckaert.
Appendix 1: Issues with the MrBayes dating analysis of Slater (2015)
In setting up variant MrBayes analyses (Supplemental Table S2), a number of issues became apparent with the NEXUS file of the original Slater (2015) analysis. These are detailed below to aid future MrBayes analyses, and in some cases to suggest improvements in the MrBayes code or documentation. These issues do not appear to greatly alter the dating results of Slater (2015), due to the large number of tip- and node-date constraints in that analysis (compare Figure 1b: mb2.3.5_mb1_orig; and Figure 1c: mb3.2.5_mb1), but they did cause major issues for analyses without node-date constraints.
The example NEXUS file being examined is canidae.nex, downloaded May 2015, and re-downloaded (unchanged) in April 2016 from: http://datadryad.org/bitstream/handle/10255/dryad.73273/canidae.nex?sequence=1.
A file correcting the issues identified below, but otherwise maintaining the intended analysis of Slater (2015) (uniform node age prior, node date constraints, etc.) is file “canidae_all_issues_fixed.nex”, located in directory mb_3.2.5b_add_ingroup/mb1/ of the Supplemental Data file “Matzke_Wright_SuppData.zip.”
Issue 1: Root node date calibration
The NEXUS file includes a variety of node-date calibrations, including an offsetexp(min=45, mean=50) calibration for the root node:
Line 433 of canidae.nex:
calibrate root=offsetexponential(45, 50); [mean = 50, median = 48.5, 95% upper = 60]
Unfortunately, this date prior on the root node appears to be ignored by MrBayes. This can be confirmed by inspecting Slater (2015)’s Figure S2, where the age of the root is approximately 42 Ma, despite the fact that the root node constraint has a hard minimum of 45 Ma.
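The summary statistics in the bracketed comment, and the incompatibility of a 42 Ma root with this prior, can be verified with a short calculation. The Python sketch below assumes, consistent with the comment in canidae.nex, that offsetexponential(45, 50) means an exponential distribution shifted to a minimum of 45 Ma with an overall mean of 50 Ma; the function names are ours.

```python
import math

OFFSET = 45.0  # hard minimum age of the root (Ma)
MEAN = 50.0    # mean of the offset exponential (Ma)


def offsetexp_quantile(p, offset=OFFSET, mean=MEAN):
    """Quantile of an exponential with mean (mean - offset), shifted by offset."""
    scale = mean - offset
    return offset - scale * math.log(1.0 - p)


def offsetexp_density(age, offset=OFFSET, mean=MEAN):
    """Prior density at a given root age; zero below the hard minimum."""
    if age < offset:
        return 0.0
    scale = mean - offset
    return math.exp(-(age - offset) / scale) / scale


print(f"median     = {offsetexp_quantile(0.5):.2f} Ma")
print(f"95% upper  = {offsetexp_quantile(0.95):.2f} Ma")
print(f"density at 42 Ma = {offsetexp_density(42.0)}")
```

A root age of about 42 Ma has zero density under this calibration, so such an estimate could not arise if the calibration were being applied.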
The only hint that MrBayes is ignoring the root calibration is the following warning message:
WARNING: Constraint ‘root’ refers only to deleted taxa and will be disregarded
In the screen output of the MrBayes run, this warning is easy to miss, as it is hidden amongst many other warnings of this type:
WARNING: There is one character incompatible with the specified coding bias. This character will be excluded.
The second warning is due to a character in the data matrix being invariant. Another reason that the first warning can be missed is that the warning is inaccurate (the taxa were not deleted).
It appears that, with the root node date calibration ignored, and with no tree age prior (treeagePr setting) given, the MrBayes dating analyses default to a tree height prior with a gamma(1,1) distribution. The message output to the screen at runtime is:
Tree age has a Gamma(1.00,1.00) distribution
Issue 2: Prior on clock rate variability
The igrvar parameter is the variance parameter for the gamma distribution on branchwise rate variability, for independent branch rates. It describes the expected variance given a branch length in expected amounts of change: igrvar is multiplied by each branch length to give the expected variability. The NEXUS file includes this prior for the igrvar parameter.
Line 468 of canidae.nex:
prset igrvarpr=exp(126.887); [a vague prior]
The comment suggests the Exponential(126.887) prior as “a vague prior”. We can see how a user could think this, given the language in the igrvar documentation:
MrBayes > help prset (in MrBayes 3.2.5) gives:
Igrvarpr – This parameter allows you to specify a prior on the variance of the gamma distribution from which the branch lengths are drawn in the independent branch rate (IGR) relaxed clock model. Specifically, the parameter specifies the rate at which the variance increases with respect to the base rate of the clock. If you have a branch of a length corresponding to 0.4 expected changes per site according to the base rate of the clock, and the igrvar parameter has a value of 2.0, then the effective branch length will be drawn from a distribution with a variance of 0.4*2.0.
You can set the parameter to a fixed value, or specify that it is drawn from an exponential or uniform distribution:
prset igrvarpr = fixed(<number>)
prset igrvarpr = exponential(<number>)
prset igrvarpr = uniform(<number>,<number>)
For backward compatibility, ‘ibrvarpr’ is allowed as a synonym of ‘igrvarpr’.
(This text is also found in commref_mb3.2.txt in the MrBayes 3.2.x download)
However, elsewhere in MrBayes, the exponential distribution is generally interpreted such that the input parameter for exp() is the exponential rate parameter, λ, and the mean is β = 1/λ. Thus, the expectation of an Exponential(126.887) distribution is 1/126.887 = 0.00788. Instead of a vague prior on branchwise rate variation, this prior essentially mandates a strict clock.
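The arithmetic can be made explicit. The Python sketch below assumes the rate parameterization described above (mean = 1/λ) and reuses the worked example from the MrBayes help text (a branch of 0.4 expected changes per site); the variable and function names are ours.

```python
import math

RATE = 126.887  # value passed to exp() for igrvarpr in canidae.nex


def exponential_mean(rate):
    """Mean of an exponential distribution parameterized by its rate."""
    return 1.0 / rate


def exponential_quantile(p, rate):
    """Quantile function of an exponential distribution with the given rate."""
    return -math.log(1.0 - p) / rate


prior_mean = exponential_mean(RATE)
prior_q95 = exponential_quantile(0.95, RATE)

# Variance of the effective branch length for a branch of 0.4 expected
# changes per site: the help-text example (igrvar = 2.0) versus igrvar
# at the mean of the Exponential(126.887) prior.
BRANCH_LENGTH = 0.4
doc_example_variance = BRANCH_LENGTH * 2.0
variance_at_prior_mean = BRANCH_LENGTH * prior_mean

print(f"prior mean of igrvar         = {prior_mean:.5f}")
print(f"prior 95% quantile of igrvar = {prior_q95:.5f}")
print(f"branch variance, igrvar=2.0        = {doc_example_variance:.3f}")
print(f"branch variance, igrvar=prior mean = {variance_at_prior_mean:.5f}")
```

Under this reading, 95% of the prior mass on igrvar lies below roughly 0.024, so the prior leaves little room for branchwise rate variation.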
Our interpretation is confirmed by examining the inference of the estimated mean of branch rate variance parameter under MrBayes runs where the igrvar parameter has been changed (Supplemental Table S2).
Issue 3: Relative rate prior (ratepr)
In the NEXUS file, the relative rate prior (ratepr) is set to “variable”:
Line 458 of canidae.nex:
prset applyto=(all) ratepr = variable;
This setting creates a parameter, m{1}, representing the relative rate of the morphology partition compared to other partitions (DNA, RNA, etc.) under a common overall clock model. However, canidae.nex is a morphology-only dataset with a single partition. MrBayes does not detect this situation and hold m{1} fixed. Instead, it attempts to estimate this relative rate along with the clock rate and clock variability. This creates poor mixing due to non-identifiability, and “crenelations” in the MCMC trace of parameters. Page 3 (analysis mb1_orig) of the Supplemental Data file Canidae_traceLogs.pdf shows these crenelations: the MCMC trace jumps to one value, samples around that value for a while, and then jumps to a much different value. Later in the chain, it jumps abruptly back towards the original value, and the cycle repeats. This behaviour leads to low ESS values and bimodal parameter estimates.
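The non-identifiability can be seen in a deliberately simplified sketch. The Python below assumes, as described above, that m{1} acts as a free multiplier on the clock rate of the only partition; it ignores relaxed-clock variation and any internal normalization MrBayes may apply. The branch duration and the (clock rate, m{1}) pairs are hypothetical, and the function name is ours.

```python
import math


def expected_changes(clock_rate, relative_rate, branch_duration_my):
    """Expected changes per character along a branch under a strict clock."""
    return clock_rate * relative_rate * branch_duration_my


BRANCH_DURATION = 10.0  # hypothetical branch duration (my)

# Hypothetical (clock rate, m{1}) pairs sharing the same product.
pairs = [(0.0025, 1.0), (0.00125, 2.0), (0.005, 0.5)]

values = [expected_changes(c, m, BRANCH_DURATION) for c, m in pairs]
for (clock_rate, m1), value in zip(pairs, values):
    print(f"clock rate={clock_rate}, m{{1}}={m1}: expected changes={value}")

# The data inform only the product of the two parameters.
print(all(math.isclose(v, values[0]) for v in values))
```

Because every pair with the same product gives the same expected amount of change, the likelihood cannot separate the two parameters; only their priors can, which is consistent with the slow, jumpy mixing seen in the traces.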
In Slater’s highly-constrained original analysis, the effect on other inferences is not particularly noticeable and presumably makes little difference. However, it becomes a major issue for mixing and parameter estimation as node constraints are removed.
Issue 4: Rate prior on the morphological clock
In the NEXUS file, the prior on the clock rate (clockratepr) is set to a lognormal distribution:
Line 469 of canidae.nex:
prset clockratepr = lognorm(-6,0.1);
Slater set a tight prior on the morphology clock rate. The lognorm(-6, 0.1) distribution has a mean in real space of 0.0025 changes/my, and an SD of 0.00025. This is a user decision rather than a problem, and it is clearly mentioned in Slater (2015).
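The real-space mean and SD follow from the standard lognormal moment formulas. The Python sketch below assumes that the two arguments of lognorm(-6, 0.1) are the mean and standard deviation of the log rate, which is the reading consistent with the values quoted above; the function names are ours.

```python
import math

MU = -6.0    # mean of log(clock rate)
SIGMA = 0.1  # SD of log(clock rate)


def lognormal_mean(mu, sigma):
    """Mean of a lognormal distribution in real space."""
    return math.exp(mu + 0.5 * sigma ** 2)


def lognormal_sd(mu, sigma):
    """Standard deviation of a lognormal distribution in real space."""
    return lognormal_mean(mu, sigma) * math.sqrt(math.exp(sigma ** 2) - 1.0)


def lognormal_central_interval(mu, sigma, z=1.96):
    """Approximate central 95% interval in real space."""
    return math.exp(mu - z * sigma), math.exp(mu + z * sigma)


mean = lognormal_mean(MU, SIGMA)
sd = lognormal_sd(MU, SIGMA)
low, high = lognormal_central_interval(MU, SIGMA)
print(f"mean = {mean:.5f} changes/my, SD = {sd:.6f}")
print(f"central 95% interval = {low:.5f} to {high:.5f} changes/my")
```

The central 95% interval spans only about 0.002 to 0.003 changes/my, which shows how tightly this prior pins the clock rate.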
It may be, however, that the decision for a strongly informative prior on the clock rate was made in part in order to “make the analysis behave,” due to problems caused by the uniform node age prior, and perhaps some of the other issues mentioned in this appendix. We note that tip-dating analyses with BDSS-type tree models function very well even with broad, uninformative priors on the rate of the morphological clock (Supplemental Table S2).
Issue 5: Outgroup, and specifying the outgroup
The outgroup taxon, named “outgroup” in Slater’s analysis, is identified as the outgroup in canidae.nex:
Line 459 of canidae.nex:
outgroup 1;
Taxon 1 is the outgroup OTU. However, in MrBayes dating analyses, it appears that the outgroup setting is ignored. This highlights a fundamental difference between undated and dating analyses. In undated analyses, all trees are formally unrooted, and rooting via an outgroup can take place during or after the phylogenetic inference. Thus, in the original, non-dating versions of MrBayes, the “outgroup” option was simply a convenience for the user, unless the outgroup consisted of multiple OTUs, in which case it serves as a topology constraint.
However, in a dating analysis, all sampled trees are always rooted, whether or not the user has decided on an outgroup. Furthermore, the mechanics of specifying an outgroup are more complex. Merely declaring an OTU an outgroup, or declaring an outgroup clade to be monophyletic, will not necessarily do the job. After all, a clade that is forced to be monophyletic could still be deeply nested inside the ingroup, unless something prevents this.
The simplest way to force the outgroup to be the earliest-branching group in a dating analysis is to set up a node constraint specifying that the ingroup is monophyletic. This could be programmed into the MrBayes outgroup command, but at the time of writing, it was not. In the case of Slater (2015)’s canidae.nex, it happens that there is a node constraint named “Canidae” that includes all living and fossil Canidae in the analysis. This constraint is used in the original Slater analysis, so the effect of the MrBayes outgroup problem is not noticed until the node constraints are removed; in this situation, some uniform clock tip-dating analyses fail to put the outgroup in the outgroup position (Supplemental Table S2). Fossilized BD analyses seem to put the outgroup in the correct position even without any constraints (Figure 1).
Issue 6: Typos in some OTU names
Comparison with the “ground truth” tree manually digitized from the monographs of Tedford and Wang identified several likely typos in canidae.nex (in fairness, comparison also revealed a number of typos in the draft ground truth tree; these are corrected in the final version). The correct spellings were double-checked via Google and comparison to the monographs.
Typo                        Corrected
Cynarctoides_accridens      Cynarctoides_acridens
Phlaocyon_marshlandensis    Phlaocyon_marslandensis
Paracynarctus_sinclari      Paracynarctus_sinclairi
Rhizocyon_oreganensis       Rhizocyon_oregonensis
Cynarctoides_gawanae        Cynarctoides_gawnae
Protomarctus_opatus         Protomarctus_optatus
Urocyon_galushi             Urocyon_galushai
Urocyon_citronus            Urocyon_citrinus
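Corrections of this kind are easy to apply programmatically. The Python sketch below is illustrative only and is not the script used to produce canidae_all_issues_fixed.nex: it maps each misspelled OTU name listed above to its corrected spelling and replaces whole names only. The function name and the example matrix rows are hypothetical.

```python
import re

# Misspelled OTU name -> corrected spelling, as listed above.
OTU_CORRECTIONS = {
    "Cynarctoides_accridens": "Cynarctoides_acridens",
    "Phlaocyon_marshlandensis": "Phlaocyon_marslandensis",
    "Paracynarctus_sinclari": "Paracynarctus_sinclairi",
    "Rhizocyon_oreganensis": "Rhizocyon_oregonensis",
    "Cynarctoides_gawanae": "Cynarctoides_gawnae",
    "Protomarctus_opatus": "Protomarctus_optatus",
    "Urocyon_galushi": "Urocyon_galushai",
    "Urocyon_citronus": "Urocyon_citrinus",
}


def correct_otu_names(nexus_text, corrections=OTU_CORRECTIONS):
    """Replace misspelled OTU names, matching whole names only."""
    alternatives = "|".join(re.escape(name) for name in corrections)
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    return pattern.sub(lambda match: corrections[match.group(1)], nexus_text)


# Hypothetical matrix rows, for illustration only.
example = "Urocyon_galushi   0101?\nUrocyon_citronus  1100?\n"
print(correct_otu_names(example))
```

Matching on word boundaries means that names which are already spelled correctly are left untouched, so the function can safely be run over the taxa block, the character matrix, and the constraint and calibration commands alike.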