Variational Time-Tree Inference for Time-Calibrated Phylogenies
VBayes is a Python framework for estimating divergence times on fixed rooted phylogenies under a birth–death prior with species sampling and a molecular clock model. It supports both vanilla variational inference (VI) and GNN-based parameterizations.
This README explains installation, usage, command-line arguments, and output files.
Create and activate a virtual environment:

python3 -m venv .venv
source .venv/bin/activate

VBayes requires PyTorch version 2.6.0 or higher, along with the dependencies listed in requirements.txt. Make sure requirements.txt is in the project root, then run:
pip install -r requirements.txt

| Argument | Description | Default value |
|---|---|---|
| --aln-path | Multiple sequence alignment (MSA) file for the analysis. | |
| --tree-path | Phylogenetic tree with time calibrations in Newick format. Internal nodes and the root node can be calibrated by adding a node label 'B(L,U)' for a uniformly distributed soft-bound calibration with lower and upper bounds L and U, respectively. | |
| --aln-name | Placeholder name for the analysis, used to name the output files. | |
| --logs-path | Path to store logs. | |
| --save-path | Path to store VBayes models after optimization. | |
| --lambda-bd | Birth rate for the birth-death prior. | 1.0 |
| --mu-bd | Death rate for the birth-death prior. | 1.0 |
| --rho-bd | Species sampling fraction for the birth-death prior. | 0.5 |
| --mu-clock | Mean rate for the clock prior. | 0.5 |
| --sigma-clock | Variance for the clock prior. | 1.0 |
| --init-clock-rate | Initial clock rate used for optimization. | 1.0 |
| --clock-type | Type of molecular clock (strict or fixed rate). | strict |
| --feature-dim | Number of parameters to optimize per node for vanilla VI models. Typically 2 (mean and variance). | 2 |
| --branch-model | Optimization model for time parameters. Choices: "" (empty string) for direct parameter optimization (vanilla VI), or "gnn" for the GNN implementation. | "" |
| --max-iter | Total number of optimization iterations. | 20,000 |
| --warm-up-steps | Number of warm-up steps. | 10,000 |
| --anneal-rate | Factor by which the learning rate is annealed. | 0.75 |
| --anneal-freq | How often (in iterations) to apply annealing. | 1,000 |
| --init-inverse-temp | Initial inverse temperature used in annealing. | 1e-5 |
| --n-particles | Number of particles used for sampling from the variational distributions during optimization. | 1 |
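
To give a feel for how --anneal-rate, --anneal-freq, --warm-up-steps and --init-inverse-temp might interact, here is a minimal sketch. It assumes a staircase learning-rate decay and a linear inverse-temperature warm-up. Both schedules are assumptions for illustration, not taken from the VBayes source, and `base_lr` is a hypothetical starting learning rate (no such argument is listed above).

```python
def learning_rate(step, base_lr, anneal_rate=0.75, anneal_freq=1000):
    """Staircase decay: multiply by anneal_rate once every anneal_freq iterations."""
    return base_lr * anneal_rate ** (step // anneal_freq)


def inverse_temperature(step, warm_up_steps=10_000, init_inverse_temp=1e-5):
    """Assumed linear warm-up from init_inverse_temp up to 1.0 over warm_up_steps."""
    return min(1.0, max(init_inverse_temp, step / warm_up_steps))


# base_lr=1e-3 is a hypothetical value chosen only for this illustration.
for step in (0, 1_000, 5_000, 10_000, 20_000):
    print(step, learning_rate(step, base_lr=1e-3), inverse_temperature(step))
```

Under these assumptions and the default values, the inverse temperature reaches 1.0 halfway through the 20,000 iterations, while the learning rate keeps shrinking by a factor of 0.75 every 1,000 iterations.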
More details on the command-line arguments can be found using:

python3 main.py --help

To run VBayes with its default arguments:

python3 main.py

This will infer the time tree for the data provided in ./data/16taxa-1x. The logs and the file containing the internal node ages are stored in the log path ./data/16taxa-1x/logs.
VBayes also saves convergence plots for the ELBO (evidence lower bound) and for the prior distributions in the logs path.
To infer a time tree for your own dataset, use the following command line:
python3 main.py --aln-path {path_to_MSA} --tree-path {path_to_tree} --aln-name {name for the analysis} --logs-path {path_for_logs}
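
The tree passed to --tree-path carries its calibrations as B(L,U) node labels. Before starting a run, it can help to check which bounds a tree file actually contains. The sketch below is a standalone helper, not part of VBayes. It assumes the label directly follows the closing parenthesis of a clade and may be wrapped in single quotes, and the four-taxon tree is hypothetical.

```python
import re

# A B(L,U) label right after a clade's closing parenthesis, optionally single-quoted.
CALIBRATION = re.compile(r"\)'?B\(\s*([0-9.eE+-]+)\s*,\s*([0-9.eE+-]+)\s*\)'?")


def find_calibrations(newick):
    """Return the (lower, upper) bounds of every calibrated node, in string order."""
    return [(float(lo), float(hi)) for lo, hi in CALIBRATION.findall(newick)]


# Hypothetical tree: the (A,B) clade and the root are calibrated.
tree = "((A:1.0,B:1.0)'B(0.5,1.5)':1.0,(C:1.5,D:1.5):0.5)'B(1.8,2.5)';"
print(find_calibrations(tree))  # [(0.5, 1.5), (1.8, 2.5)]
```

A regular expression is enough for a quick sanity check like this. For anything beyond listing the bounds, a proper Newick parser is the safer choice.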