# CARMANIA: Context-Aware Regularization with Markovian Integration for Attention-Based Nucleotide Analysis
CARMANIA is a self-supervised genomic language model framework that augments next-token prediction with a transition-matrix regularization loss. By aligning the model's predicted transitions with empirical bigram (2-mer) statistics, this integration improves biological sequence modeling, enabling better long-range dependency modeling and functional interpretation.
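The idea behind the regularization term can be sketched in a few lines: build an empirical bigram transition matrix from the input sequence and penalize the distance between it and the transitions implied by the model's predictions. The sketch below is a minimal illustration using an L1 distance; the helper names and the choice of distance are assumptions for exposition, not the paper's exact implementation.

```python
BASES = "ACGT"
IDX = {b: i for i, b in enumerate(BASES)}

def empirical_transition_matrix(seq):
    """Row-normalized 4x4 bigram (2-mer) transition matrix of a DNA sequence."""
    counts = [[0.0] * 4 for _ in range(4)]
    for a, b in zip(seq, seq[1:]):
        counts[IDX[a]][IDX[b]] += 1.0
    for row in counts:
        total = sum(row)
        if total:
            for j in range(4):
                row[j] /= total
    return counts

def tm_loss(predicted, empirical):
    """L1 distance between predicted and empirical transition matrices --
    a stand-in for the transition-matrix regularization term."""
    return sum(abs(p - e)
               for prow, erow in zip(predicted, empirical)
               for p, e in zip(prow, erow))

emp = empirical_transition_matrix("ACGTACGTAC")
print(emp[IDX["A"]])      # in this sequence A is always followed by C
print(tm_loss(emp, emp))  # identical matrices incur zero penalty
```

In training, the `predicted` matrix would be aggregated from the model's next-token probabilities, so the loss pushes the language model's local transition behavior toward the sequence's observed 2-mer statistics.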
The following models are already available on the Hugging Face Hub:
- 🦠🧬 `MsAlEhR/carmania-big-10k-prok-genome`
- 🦠🧬 `MsAlEhR/carmania-4k-scp-gene-taxa`
- 👤🧬 `MsAlEhR/carmania-160k-seqlen-human`
```python
from transformers import AutoModel, AutoTokenizer
import torch

model = AutoModel.from_pretrained(
    "MsAlEhR/carmania-160k-seqlen-human",
    trust_remote_code=True,
    torch_dtype=torch.float16,  # fixed dtype (or autocast)
).to("cuda")

tokenizer = AutoTokenizer.from_pretrained(
    "MsAlEhR/carmania-160k-seqlen-human",
    trust_remote_code=True,
    model_max_length=160000,
)

inputs = tokenizer("ACGTAGGCTA", return_tensors="pt").to("cuda")
outputs = model(**inputs)
```

An experimental notebook exploring CARMANIA-driven sequence optimization using Enformer scores is now available.
This lightweight module perturbs input DNA sequences and uses Enformer’s predicted regulatory signals as a scoring function to iteratively generate variants with improved activity.
📄 Notebook: `carmania_enformer_guided_generation.ipynb`
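The perturb-and-score pattern described above can be sketched as a simple greedy search. In the sketch below a toy GC-content scorer stands in for Enformer's predicted regulatory signal, and all function names are illustrative rather than taken from the notebook:

```python
import random

BASES = "ACGT"

def mutate(seq, rng):
    """Substitute one random position with a different base."""
    i = rng.randrange(len(seq))
    alt = rng.choice([b for b in BASES if b != seq[i]])
    return seq[:i] + alt + seq[i + 1:]

def optimize(seq, score_fn, steps=200, seed=0):
    """Greedy hill-climbing: keep a perturbation only if it improves the score."""
    rng = random.Random(seed)
    best, best_score = seq, score_fn(seq)
    for _ in range(steps):
        candidate = mutate(best, rng)
        s = score_fn(candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best

# Toy stand-in for an Enformer-derived activity score: reward GC content.
def gc_score(seq):
    return (seq.count("G") + seq.count("C")) / len(seq)

result = optimize("ACGTAGGCTA", gc_score)
print(result, gc_score(result))
```

In the actual notebook the scoring function is Enformer's predicted regulatory signal over the candidate sequence, which is far more expensive per call, so batching candidates per scoring step is a natural refinement.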
```bibtex
@article{refahi2025context,
  title={Context-Aware Regularization with Markovian Integration for Attention-Based Nucleotide Analysis},
  author={Refahi, Mohammadsaleh and Abavisani, Mahdi and Sokhansanj, Bahrad A. and Brown, James R. and Rosen, Gail},
  journal={arXiv preprint arXiv:2507.09378},
  year={2025}
}
```
