Releases · Daemoniorum-LLC/infernum-framework
Infernum v0.1.0
Initial open-source release of the Infernum LLM inference framework.
Features
- Blazingly fast local LLM inference
- OpenAI API compatibility (drop-in replacement; see the example after this list)
- Streaming support with real-time token output
- Multi-backend: CPU, CUDA (NVIDIA), Metal (Apple Silicon)
- Interactive chat with history and session management
- Model caching via HuggingFace Hub
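Because the server speaks the OpenAI API, existing OpenAI clients should work without code changes. The sketch below uses the official `openai` Python package and streams tokens as they arrive; the local base URL, port, and placeholder API key are assumptions rather than documented Infernum defaults, and the model name is the one from the Quick Start.

```python
# Hypothetical drop-in usage: base_url, port, and api_key handling are assumptions.
from openai import OpenAI

# Point the official OpenAI client at the local Infernum server instead of api.openai.com.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

# Request a streaming chat completion using the standard OpenAI API shape.
stream = client.chat.completions.create(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    messages=[{"role": "user", "content": "Explain KV caching in one sentence."}],
    stream=True,
)

# Print each token delta as it is produced.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```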
Quick Start
cargo install --path crates/infernum
infernum config set-model TinyLlama/TinyLlama-1.1B-Chat-v1.0
infernum chat
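Since models are resolved through the Hugging Face Hub cache, you can optionally pre-fetch the weights with the standard `huggingface_hub` package before the first run. This is a sketch, not part of the Infernum CLI, and it assumes Infernum reads from the default Hub cache location.

```python
# Optional: pre-download the model into the Hugging Face Hub cache so the
# first run does not have to fetch it. Assumes Infernum uses the default cache.
from huggingface_hub import snapshot_download

snapshot_download("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
```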
License
Dual-licensed under the MIT License and Apache License 2.0.
Copyright (c) 2024-2025 Daemoniorum, LLC