Phylorust is a Rust-based command-line tool that generates phylogenetically informative SNP site sets (as FASTA files) and associated trees from one or more VCF files and a reference genome FASTA file.
- Reads a reference FASTA and single sample or multi-sample VCF(s).
- Generates phylogenetically informative SNP site sets at configurable coverage thresholds.
- Produces per-sample FASTA alignments.
- Runs FastTree automatically (if installed) to generate trees.
- Renders ASCII trees directly in the terminal.
- Simple tab-delimited input file (Name_Type_Location.tab) for managing multiple samples.
To build and run Phylorust, you’ll need:
- Rust (stable, installed via rustup).
  - Verify install with:
    rustc --version
    cargo --version
- R (for plotting histograms).
  - Packages: ggplot2, readr, dplyr (install inside R with):
    install.packages(c("ggplot2", "readr", "dplyr"))
- FastTree (optional, for tree generation).
  - Must be in your system PATH.
  - Verify with:
    FastTree -help
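The checks above can be run in one pass. The following is a minimal sketch, not something shipped with Phylorust: `check_tool` is a hypothetical helper, and treating `Rscript` as required is an assumption (only FastTree is described as optional).

```shell
# check_tool is a hypothetical helper for this README, not part of Phylorust.
# $1 = command name, $2 = "required" or "optional".
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    elif [ "$2" = "required" ]; then
        echo "MISSING (required): $1"
        return 1
    else
        echo "missing (optional): $1"
    fi
}

# Assumption: Rscript is treated as required; only FastTree is marked optional.
missing=0
for tool in rustc cargo Rscript; do
    check_tool "$tool" required || missing=1
done
check_tool FastTree optional
echo "required tools missing: $missing"
```

A final value of 0 means every required tool was found on PATH; the R packages still need to be checked from inside R.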
Clone the repo and install with Cargo:
git clone https://github.com/rhysf/Phylorust.git
cd Phylorust
cargo install --path .
If you installed Rust with rustup, ~/.cargo/bin is normally already in your $PATH. If not, you can add it:
echo 'export PATH="$HOME/.cargo/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
You can now run:
phylorust --help
Alternative (manual install): if you prefer to place the binary in ~/.local/bin:
cargo build --release
cp target/release/phylorust ~/.local/bin/
Phylorust uses a tab-delimited file (Name_Type_Location.tab) with three columns:
SampleName VCF /path/to/sample.vcf
- SampleName = Your preferred sample label (used in output FASTAs/trees).
- VCF = File type (must be VCF or vcf).
- /path/to/sample.vcf = Path to the sample’s VCF file.
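Malformed lines in this file are easy to miss, so a quick check before running can help. The sketch below is illustrative only: `validate_sample_sheet` is a hypothetical helper, not part of Phylorust, and skipping blank lines is an assumption about what is acceptable.

```shell
# validate_sample_sheet is a hypothetical helper, not part of Phylorust.
# $1 = path to a Name_Type_Location.tab style file.
# Prints one message per problem line and returns non-zero if any were found.
validate_sample_sheet() {
    awk -F '\t' '
        NF == 0 { next }    # assumption: blank lines are ignored
        NF != 3 {
            printf "line %d: expected 3 tab-separated columns, got %d\n", NR, NF
            bad = 1
            next
        }
        $2 != "VCF" && $2 != "vcf" {
            printf "line %d: type must be VCF or vcf, got %s\n", NR, $2
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Usage: validate_sample_sheet ./examples/Name_Type_location.tab
```

The check covers only column count and the type field; it does not confirm that the VCF paths exist.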
Example run:
phylorust \
--fasta ./examples/Cryp_gatt_R265.genome.fa-scaffold3.14.fasta \
--name_type_location ./examples/Name_Type_location.tab
This will:
- Parse the reference FASTA.
- Parse VCFs listed in Name_Type_location.tab.
- Generate SNP site sets and coverage histograms.
- Produce FASTA alignments for each coverage threshold.
- Run FastTree (if available) and print ASCII trees in the terminal.
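This README does not spell out how a coverage threshold is applied. As a rough illustration only, the sketch below assumes that a site passes when at least a given percentage of samples have a call at it. The toy input format (site ID followed by one call per sample, with `N` meaning no call) and the helper `filter_sites_by_coverage` are hypothetical and do not reflect Phylorust's internals or file formats.

```shell
# filter_sites_by_coverage is a hypothetical illustration, not Phylorust code.
# $1 = minimum percent of samples with a call (e.g. 90)
# $2 = toy TSV: site_id, then one call per sample ("N" = no call)
filter_sites_by_coverage() {
    awk -F '\t' -v min="$1" '
        NF < 2 { next }    # need a site ID and at least one sample column
        {
            called = 0
            for (i = 2; i <= NF; i++) if ($i != "N") called++
            pct = 100 * called / (NF - 1)
            if (pct >= min) print $1    # keep sites at or above the threshold
        }
    ' "$2"
}

# Usage: filter_sites_by_coverage 90 toy_matrix.tsv
```

Under this assumption, lowering the threshold keeps more sites at the cost of more missing data per site, which is presumably why several thresholds are produced.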
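To run Phylorust in Docker instead, clone the repo, build the image, and run it with the examples directory mounted:
git clone https://github.com/rhysf/Phylorust.git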
cd Phylorust
docker build -t phylorust .
docker run --rm -v "$(pwd)/examples:/examples" phylorust \
--fasta /examples/Cryp_gatt_R265.genome.fa-scaffold3.14.fasta \
--name_type_location /examples/Name_Type_location_Docker.tab
On HPC systems without Docker, you can convert the Docker image into a Singularity (Apptainer) image:
apptainer build phylorust.sif docker-daemon://phylorust:latest
apptainer run phylorust.sif \
--fasta examples/Cryp_gatt_R265.genome.fa-scaffold3.14.fasta \
--name_type_location examples/Name_Type_location.tab
Key options (full list available with --help):
- --fasta → Reference FASTA file.
- --name_type_location → Tab-delimited file of sample names, file type, and VCF paths.
- --output_dir → Directory for results (default: Phylorust_output).
- --generate_fastas → FASTA generation mode (all or specific thresholds).
- --skip-fasttree → Skip tree generation.
- --fasttree-bin → Path to FastTree binary (if not in PATH).
Histograms are generated with R and saved to both .png and .pdf. You can also run the plotting script directly:
Rscript plot_histogram.R site_coverage_histogram.tsv 90
This project is licensed under the MIT License.
