BioInfoJava-Utils

BioInfoJava-Utils is a modular Java library providing high-performance implementations of core bioinformatics algorithms, such as distance matrix computation and phylogenetic tree construction from VCF and FASTA files.

This library serves as the computational backend for the fastreeR software suite, which offers a flexible and user-friendly interface to these tools across multiple platforms and environments.

Integration and Accessibility

The functionality of BioInfoJava-Utils is exposed through the fastreeR interface, which is accessible in the following ways:

🆕 Java Backend (v2.3.0): reads gzip (.gz), bzip2 (.bz2), and xz (.xz) compressed VCF files.
Java Backend (v2.2.0): implements streaming bootstrap, producing a Newick tree with encoded bootstrap support values directly from a VCF file.
Java Backend (v2.0.0): 100x faster, needing only a couple hundred MB of RAM. Java 11+ recommended.
Bioconda: install with conda install -c bioconda fastreer (recipe)
Docker: available on DockerHub and GHCR for containerized execution
PyPI: install with pip install fastreer (repository)
Python CLI: through a lightweight Python wrapper that calls the Java backend
R / Bioconductor: via rJava (package)
Galaxy: available on Galaxy Toolshed.
Pure Java API: developers can integrate this library directly in Java-based pipelines or software.
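
To illustrate what reading a compressed VCF file involves, here is a minimal sketch that uses only the JDK. It is not the library's reader: the class VcfReaderSketch and its methods are hypothetical names introduced for this example. It detects gzip by its two magic bytes and falls back to plain text. The JDK has no bzip2 or xz decoder, so those formats would need a third-party decompression library and are left out here.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of BioInfoJava-Utils.
final class VcfReaderSketch {

    // Opens a plain or gzip-compressed stream as text; gzip is detected by magic bytes 0x1f 0x8b.
    static BufferedReader open(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        in.mark(2);
        int b0 = in.read();
        int b1 = in.read();
        in.reset();
        InputStream decoded = (b0 == 0x1f && b1 == 0x8b) ? new GZIPInputStream(in) : in;
        return new BufferedReader(new InputStreamReader(decoded, StandardCharsets.UTF_8));
    }

    // Returns the sample names from the #CHROM header line (columns after the 9 fixed VCF fields).
    static List<String> sampleNames(InputStream raw) {
        try (BufferedReader reader = open(raw)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith("#CHROM")) {
                    String[] fields = line.split("\t");
                    return fields.length > 9
                            ? Arrays.asList(fields).subList(9, fields.length)
                            : List.of();
                }
            }
            return List.of();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Compresses text in memory so the example needs no input file.
    static byte[] gzip(String text) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
                gz.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String vcf = "##fileformat=VCFv4.2\n"
                + "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n"
                + "1\t100\t.\tA\tG\t.\t.\t.\tGT\t0/1\t1/1\n";
        System.out.println(sampleNames(new ByteArrayInputStream(gzip(vcf))));
    }
}
```

Checking the magic bytes rather than the file extension means the same code path handles a misnamed file or a stream with no name at all.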

Overview

BioInfoJava-Utils provides scalable, parallel implementations of widely used bioinformatics algorithms. It is designed to process large-scale genomic datasets in both research and production environments.

Features

Reads directly from plain, gzip, bzip2 or xz VCF files.
🥾 Streaming bootstrap support in the VCF2TREE utility
🚀 Fast multithreaded processing with low memory use: a couple hundred MB of RAM rather than GBs
⚙️ Compute sample-wise distance matrices from VCF (cosine) or FASTA (D2S) files
🌳 Build phylogenetic trees using the neighbor-joining algorithm
🧬 Support for hierarchical clustering with dynamic tree pruning
🔄 Multithreaded processing for large input files
📦 Integrates seamlessly into diverse environments (R, Python, Docker, Java)
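
As an illustration of the sample-wise cosine distance mentioned above, the following sketch computes a distance matrix from per-sample numeric vectors. It is the textbook definition, 1 - cos(a, b), not necessarily the exact formula the library applies to VCF data. The class name CosineDistanceSketch, the encoding of genotypes as alternate-allele counts (0, 1, 2), and the handling of all-zero vectors are assumptions made for this example.

```java
import java.util.Arrays;

// Hypothetical illustration, not part of BioInfoJava-Utils.
final class CosineDistanceSketch {

    // Cosine distance between two equal-length vectors: 1 - (a . b) / (|a| * |b|).
    static double cosineDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 1.0; // assumed convention: an all-zero vector is maximally distant
        }
        return 1.0 - dot / Math.sqrt(normA * normB);
    }

    // Symmetric matrix of pairwise distances; the diagonal stays at 0.
    static double[][] distanceMatrix(double[][] samples) {
        int n = samples.length;
        double[][] d = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                d[i][j] = cosineDistance(samples[i], samples[j]);
                d[j][i] = d[i][j];
            }
        }
        return d;
    }

    public static void main(String[] args) {
        // Rows are samples, columns are variants (assumed alternate-allele counts).
        double[][] samples = {
            {0, 1, 2, 0},
            {0, 1, 2, 0},
            {2, 0, 0, 1}
        };
        for (double[] row : distanceMatrix(samples)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

A matrix like this is the usual input to a neighbor-joining or hierarchical clustering step.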

Installation

Prerequisites

Java 11 or higher
Maven (for building the project)

Building from Source

Clone the repository:

   git clone https://github.com/gkanogiannis/BioInfoJava-Utils.git

Navigate to the project directory:

   cd BioInfoJava-Utils

Build the project using Maven:

   mvn clean package install

This generates the JAR files in the bin directory.

Usage

The main class for executing the utilities is:

com.gkano.bioinfo.javautils.JavaUtils

You can run the utilities via the command line or integrate them into other Java applications.

java -jar bin/BioInfoJavaUtils-VERSION-jar-with-dependencies.jar --help
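
One way to call the utilities from another Java application without depending on the library's internal API is to launch the JAR as a child process. The sketch below builds the same --help command shown above and runs it only if the JAR exists. The class LaunchSketch and its methods are hypothetical names for this example, and VERSION in the default path is the placeholder from the command above, to be replaced with the version you built.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical launcher, not part of BioInfoJava-Utils.
final class LaunchSketch {

    // Builds the command line for printing the tool's help text.
    static List<String> helpCommand(String jarPath) {
        return List.of("java", "-jar", jarPath, "--help");
    }

    // Runs the command, forwarding its output to this process, and returns the exit code.
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // VERSION is a placeholder; pass the real JAR path as the first argument.
        String jar = args.length > 0
                ? args[0]
                : "bin/BioInfoJavaUtils-VERSION-jar-with-dependencies.jar";
        List<String> command = helpCommand(jar);
        System.out.println(String.join(" ", command));
        if (Files.exists(Path.of(jar))) {
            System.exit(run(command));
        } else {
            System.err.println("JAR not found: " + jar);
        }
    }
}
```

Running the tool as a separate process keeps its memory and threads isolated from the calling application, at the cost of exchanging data through files or standard streams.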

License

This project is licensed under the GNU General Public License v3.0.

Citation

If you use BioInfoJava-Utils in your research, please cite the following:

Gkanogiannis, A. et al. A scalable assembly-free variable selection algorithm for biomarker discovery from metagenomes. BMC Bioinformatics 17, 311 (2016). https://doi.org/10.1186/s12859-016-1186-3

Author

Anestis Gkanogiannis
Bioinformatics/ML Scientist
Website: https://www.gkanogiannis.com
ORCID: 0000-0002-6441-0688

