ARG-PRISM

ARGPrism is a deep learning-based pipeline for predicting and annotating Antibiotic Resistance Genes (ARGs) from protein sequences using transformer embeddings and neural networks.

Key Features

Deep Learning Classification: ProtAlbert transformer embeddings + neural network classifier
GPU Accelerated: Fast processing with CUDA support
Reference Mapping: DIAMOND BLAST alignment to ARG databases
Simple Interface: Easy-to-use command line tool
Flexible Deployment: CPU or GPU execution

Installation

Prerequisites

Linux operating system (Ubuntu 20.04+)
Conda, Miniconda, or Mamba installed (Mamba recommended)
8+ GB RAM (16 GB recommended)
NVIDIA GPU with CUDA 11.8+ or 12.x (optional, for acceleration)

Option 1: Install from Conda (Recommended)

# Install from the bioconda channel
mamba install -c bioconda argprism

# Verify installation
argprism --version

Option 2: Install from Source

# Clone repository
git clone https://github.com/haseebmanzur/ARGPrism.git
cd ARGPrism

# Create environment
mamba env create -f environment.yml

# Activate environment  
mamba activate argprism

# Verify installation
argprism --version

Quick Start

# Activate environment
mamba activate argprism

# Run on test data
argprism Test_dataset/Test_data.faa --output-dir results/

Usage

Command Line

argprism INPUT_FILE.faa [OPTIONS]

Options

| Option | Description | Default |
|--------|-------------|---------|
| `-o, --output-dir` | Output directory | `argprism_output` |
| `--device` | Force CPU/CUDA usage | Auto-detect |
| `--quiet` | Reduce output verbosity | False |
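
When ARGPrism is driven from a script, the options above can be assembled into a command line programmatically. The following is a minimal sketch, not part of the package: the helper names `build_argprism_command` and `run_argprism` are hypothetical, and the `--device` values `cpu` and `cuda` are an assumption based on the option description.

```python
import shlex
import subprocess
from typing import List, Optional


def build_argprism_command(
    input_fasta: str,
    output_dir: str = "argprism_output",
    device: Optional[str] = None,  # assumed values: "cpu" or "cuda"
    quiet: bool = False,
) -> List[str]:
    """Build the argv list for the options documented in the table."""
    cmd = ["argprism", input_fasta, "--output-dir", output_dir]
    if device is not None:
        # Omitting --device leaves the tool on its auto-detect default.
        cmd += ["--device", device]
    if quiet:
        cmd.append("--quiet")
    return cmd


def run_argprism(input_fasta: str, **options) -> None:
    """Run argprism as a subprocess and raise if it exits with an error."""
    cmd = build_argprism_command(input_fasta, **options)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

For example, `run_argprism("Test_dataset/Test_data.faa", output_dir="results/", quiet=True)` would invoke the same command as the Quick Start with reduced verbosity, provided `argprism` is on the `PATH`.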

Python API

from argprism import run_pipeline

# Run pipeline
result = run_pipeline(
    input_fasta="input.faa",
    output_dir="results/",
    verbose=True
)

print(f"Predictions: {len(result.predictions)}")
print(f"Predicted ARG FASTA: {result.predicted_fasta}")

Pipeline Overview

ARGPrism processes protein sequences through the following steps:

Input FASTA → ProtAlbert Embeddings → Neural Classifier → ARG Prediction → DIAMOND Mapping → Report

Process Details

Embedding Generation: ProtAlbert generates 4096-dimensional embeddings
Classification: Neural network predicts ARG/Non-ARG for each sequence
Reference Mapping: DIAMOND aligns predicted ARGs to reference database
Report Generation: Creates annotated CSV with ARG names and drug classes
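
For readers curious about the embedding step: the published ProtAlbert model card asks for residues separated by spaces, with the rare amino acids U, Z, O, and B mapped to X before tokenization. The sketch below illustrates that preprocessing only. Whether ARGPrism applies exactly these steps internally is an assumption, the function name `prepare_for_protalbert` is hypothetical, and the uppercasing is added here for convenience.

```python
import re


def prepare_for_protalbert(sequence: str) -> str:
    """Format a protein sequence the way the ProtAlbert tokenizer expects."""
    # Drop line breaks and spaces, and normalize case (our own addition).
    cleaned = re.sub(r"\s+", "", sequence).upper()
    # Map rare or ambiguous residues to X, as the model card recommends.
    cleaned = re.sub(r"[UZOB]", "X", cleaned)
    # The tokenizer treats each space-separated residue as one token.
    return " ".join(cleaned)
```

No manual preprocessing should be needed when using the `argprism` command or `run_pipeline`; this is shown only to make the embedding step concrete.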

Input/Output

Input

FASTA file: Protein sequences to analyze
Built-in models and databases are included
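
Because the pipeline expects protein rather than nucleotide sequences, a quick sanity check on the input file can catch common mistakes before a long run. This is a small standalone sketch using only the standard library; the function names and the 90% threshold are illustrative choices, not part of ARGPrism.

```python
from typing import Dict, Iterable


def read_protein_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA lines into {sequence_id: sequence}, rejecting malformed input."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            if not header:
                raise ValueError("FASTA header with no identifier")
            current_id = header[0]  # ID is the first word after '>'
            if current_id in records:
                raise ValueError(f"duplicate sequence ID: {current_id}")
            records[current_id] = ""
        elif current_id is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[current_id] += line
    return records


def looks_like_nucleotide(sequence: str) -> bool:
    """Heuristic: flag sequences made up almost entirely of A, C, G, T, U, N."""
    if not sequence:
        return False
    nucleotide_chars = sum(sequence.upper().count(c) for c in "ACGTUN")
    return nucleotide_chars / len(sequence) > 0.9
```

A file handle can be passed directly, for example `read_protein_fasta(open("input.faa"))`, and any record for which `looks_like_nucleotide` returns `True` probably needs translating to protein first.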

Output Files

All results are saved to the output directory:

predicted_ARGs.fasta - Sequences classified as ARGs
predicted_ARGs_vs_ref.tsv - DIAMOND alignment results
final_ARG_prediction_report.csv - Annotated predictions with ARG names and drug classes
diamond_arg_db.dmnd - DIAMOND database index
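
The alignment file can be post-processed with a few lines of Python. The sketch below assumes `predicted_ARGs_vs_ref.tsv` uses DIAMOND's default tabular layout (BLAST format 6, 12 columns, no header row); if ARGPrism writes a custom column set, `DIAMOND_COLUMNS` would need adjusting. The function name `best_hits` is hypothetical.

```python
import csv
from typing import Dict, Iterable

# Default DIAMOND / BLAST tabular (outfmt 6) column order.
DIAMOND_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(rows: Iterable[str]) -> Dict[str, dict]:
    """Keep the highest-bitscore reference hit for each query sequence."""
    best: Dict[str, dict] = {}
    reader = csv.DictReader(rows, fieldnames=DIAMOND_COLUMNS, delimiter="\t")
    for hit in reader:
        # Convert the numeric fields used for ranking and reporting.
        hit["pident"] = float(hit["pident"])
        hit["bitscore"] = float(hit["bitscore"])
        query = hit["qseqid"]
        if query not in best or hit["bitscore"] > best[query]["bitscore"]:
            best[query] = hit
    return best
```

Opening the file with `open("results/predicted_ARGs_vs_ref.tsv", newline="")` and passing the handle to `best_hits` gives one entry per predicted ARG, keyed by query ID.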

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

For questions or support, please open an issue on GitHub.

Project PI: Dr. Masood Ur Rehman
Email: m.kayani@sines.nust.edu.pk

Author: Haseeb Manzoor
GitHub: @haseebmanzur

Package Maintainer: Muhammad Muneeb Nasir
GitHub: @muneebdev7

Acknowledgments

ProtAlbert - Protein language model
DIAMOND - Sequence alignment tool

Citation

If you use ARGPrism in your research, please cite:
