Motivation

A chronogram is a dated phylogenetic tree whose branch lengths have been scaled to represent time. Such chronograms are computed based on available date estimates (e.g. from dated fossils), which provide absolute time constraints for one or more nodes of an input undated phylogeny, coupled with an appropriate underlying model for evolutionary rates variation along the branches of the phylogeny. However, traditional methods for phylogenetic dating cannot take into account relative time constraints, such as those provided by inferred horizontal transfer events. In many cases, chronograms computed using only absolute time constraints are inconsistent with known relative time constraints.

Results

In this work, we introduce a new approach, Dating Trees using Relative constraints (DaTeR), for phylogenetic dating that can take into account both absolute and relative time constraints. The key idea is to use existing Bayesian approaches for phylogenetic dating to sample posterior chronograms satisfying desired absolute time constraints, minimally adjust or ‘error-correct’ these sampled chronograms to satisfy all given relative time constraints, and aggregate across all error-corrected chronograms. DaTeR uses a constrained optimization framework for the error-correction step, finding minimal deviations from previously assigned dates or branch lengths. We applied DaTeR to a biological dataset of 170 Cyanobacterial taxa and a reliable set of 24 transfer-based relative constraints, under six different molecular dating models. Our extensive analysis of this dataset demonstrates that DaTeR is both highly effective and scalable and that its application can significantly improve estimated chronograms.

Availability and implementation

Freely available from https://compbio.engr.uconn.edu/software/dater/

Supplementary information

Supplementary data are available at Bioinformatics online.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.