Frontier language models from 600M to 480B parameters. Open-weight models for everything from edge devices to cloud-scale deployments, built on the Zen architecture with efficient inference via Rust, MLX, and GGUF.

Zen LM develops state-of-the-art language models spanning ten modalities, from 600M parameters (embedded) to 480B (frontier research). All models use the Zen architecture with RoPE embeddings, SwiGLU activation, grouped-query attention, and Flash Attention 2. They are available through the Hanzo LLM Gateway, Hugging Face, and vLLM, and for local inference via MLX/GGUF.
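Grouped-query attention, mentioned above, lets several query heads share one key/value head, shrinking the KV cache without giving up multi-head queries. A minimal NumPy sketch (function name, head counts, and shapes are illustrative, not the actual Zen implementation):

```python
import numpy as np

def gqa(q, k, v, n_kv_heads):
    """Grouped-query attention: q has n_heads, k/v have n_kv_heads."""
    n_heads, seq_len, d = q.shape
    group = n_heads // n_kv_heads
    # Each group of query heads attends with the same key/value head.
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# 8 query heads sharing 2 KV heads: KV cache is 4x smaller than full MHA.
q = np.random.rand(8, 5, 16)
k = np.random.rand(2, 5, 16)
v = np.random.rand(2, 5, 16)
out = gqa(q, k, v, n_kv_heads=2)
```

With `n_kv_heads == n_heads` this reduces to standard multi-head attention; with `n_kv_heads == 1` it is multi-query attention.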
```python
# Using the Hanzo LLM Gateway
from hanzo import Client

client = Client()
response = client.chat.completions.create(
    model="zenlm/zen4-pro",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
```python
# Using Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("zenlm/zen4-pro")
tokenizer = AutoTokenizer.from_pretrained("zenlm/zen4-pro")
```
| Model | Type | Description |
|---|---|---|
| zen-omni | Audio+Vision | Hypermodal 30B MoE |
| zen-vl | Vision | Vision-language models (4B/8B/30B) |
| zen-video | Video | High-quality video synthesis |
| zen-scribe | Speech | Speech recognition |
| zen-translator | Translation | Real-time multimodal translation |
| Project | Description |
|---|---|
| zen | Zen AI model family -- efficient models for edge and cloud |
| engine | High-performance inference engine -- Rust/MLX/GGUF |
| gym | Unified fine-tuning for 100+ LLMs and VLMs |
| zen-omni | Zen-Omni 30B -- hypermodal AI with MLX/GGUF |
| docs | Documentation and model cards |
- Papers -- 329+ research papers
- Stats -- 38,906+ commits, 59M net LOC
- Security -- cryptographic audit trail
- History -- 2008-2026 timeline
Code is licensed under Apache 2.0; model weights are released under model-specific licenses.
Co-developed by Hanzo AI and Zoo Labs.