# System Requirements
Ryan Robson edited this page Sep 16, 2025
This guide helps you determine if your system can run Inferno and what performance to expect.
Minimum for basic usage:
- RAM: 8GB (16GB recommended)
- Storage: 20GB free space (plus model storage)
- CPU: Any modern x64 processor
- OS: Linux, macOS, or Windows
For optimal performance:
- RAM: 32GB+ for large models
- GPU: NVIDIA RTX 3060 or better, Apple M1/M2, AMD RX 6600+
- Storage: SSD recommended, 100GB+ for multiple models
- Network: For model downloads (optional after setup)
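The minimums above can be encoded in a small sanity-check helper. This is a sketch, not part of Inferno's CLI; the `check_minimum_specs` name and the 8GB/20GB thresholds come from the lists above.

```python
def check_minimum_specs(ram_gb: float, free_disk_gb: float) -> list[str]:
    """Compare a machine's specs against Inferno's stated minimums.

    Returns a list of human-readable problems; an empty list means the
    machine meets the baseline (8GB RAM, 20GB free disk).
    """
    MIN_RAM_GB = 8
    MIN_DISK_GB = 20
    problems = []
    if ram_gb < MIN_RAM_GB:
        problems.append(f"RAM {ram_gb}GB is below the {MIN_RAM_GB}GB minimum")
    if free_disk_gb < MIN_DISK_GB:
        problems.append(f"Free disk {free_disk_gb}GB is below the {MIN_DISK_GB}GB minimum")
    return problems
```

For example, `check_minimum_specs(16, 100)` returns an empty list, while `check_minimum_specs(4, 10)` flags both RAM and disk.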
## Linux
✅ Best Performance | ✅ Full GPU Support | ✅ Docker Native
- Ubuntu: 20.04 LTS, 22.04 LTS, 24.04 LTS
- Debian: 11 (Bullseye), 12 (Bookworm)
- CentOS/RHEL: 8, 9
- Fedora: 37, 38, 39
- Arch Linux: Rolling release
- openSUSE: Leap 15.4+, Tumbleweed
```bash
# Ubuntu/Debian
sudo apt update
sudo apt install build-essential curl pkg-config libssl-dev

# CentOS/RHEL/Fedora
sudo dnf install gcc gcc-c++ curl openssl-devel pkgconfig

# Arch Linux
sudo pacman -S base-devel curl openssl pkg-config
```

NVIDIA GPUs (Recommended):
- Driver: 470+ (525+ for RTX 4000 series)
- CUDA: 11.8+ (auto-installed with ONNX Runtime)
- Compute Capability: 6.0+ (GTX 1060 and newer)
- VRAM: 6GB minimum, 12GB+ recommended
```bash
# Install NVIDIA drivers (Ubuntu)
sudo ubuntu-drivers autoinstall
# Or manually:
sudo apt install nvidia-driver-525

# Verify installation
nvidia-smi
```

AMD GPUs:
- ROCm: 5.0+ for RX 6000 series and newer
- VRAM: 8GB minimum
```bash
# Install ROCm (Ubuntu 22.04)
wget https://repo.radeon.com/amdgpu-install/latest/ubuntu/jammy/amdgpu-install_*_all.deb
sudo dpkg -i amdgpu-install_*_all.deb
sudo amdgpu-install --usecase=rocm
```

Intel GPUs:
- Arc A-Series: A380, A750, A770
- Integrated: Iris Xe, UHD Graphics (limited performance)
## macOS
✅ Great for Development | ✅ Apple Silicon Support
- macOS: 12.0 (Monterey) or later
- Recommended: macOS 13.0+ for best Metal performance
Apple Silicon (M1/M2/M3):
- Unified Memory: 16GB minimum, 32GB+ for large models
- Metal: Native support, excellent performance
- Neural Engine: Automatic acceleration for compatible operations
Intel Macs:
- RAM: 16GB minimum (no unified memory benefits)
- GPU: Dedicated GPU recommended (AMD Radeon Pro)
- Note: Performance significantly lower than Apple Silicon
```bash
# Install Xcode Command Line Tools
xcode-select --install

# Install Homebrew (if not already installed)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install required tools
brew install cmake pkg-config openssl
```

## Windows
✅ Full DirectML Support
- Windows: 10 (1903+), 11
- WSL2: Ubuntu 22.04 LTS (recommended for Linux performance)
NVIDIA GPUs:
- DirectML: Native Windows support
- CUDA: Full support via WSL2
- Driver: Game Ready 472+ or Studio 472+
AMD GPUs:
- DirectML: Native Windows support
- Driver: Adrenalin 22.10.1+
Intel GPUs:
- DirectML: Arc A-Series support
- Driver: Latest Intel Arc drivers
```powershell
# Install Visual Studio Build Tools
# Download from: https://visualstudio.microsoft.com/visual-cpp-build-tools/

# Install Rust
# Download from: https://rustup.rs/

# Install Git
# Download from: https://git-scm.com/download/win
```

```powershell
# Enable WSL2
wsl --install -d Ubuntu-22.04
# Inside WSL2, follow the Linux (Ubuntu) instructions
```

| Model Size | Minimum RAM | Recommended RAM | GPU VRAM | Performance |
|---|---|---|---|---|
| 7B parameters | 8GB | 16GB | 6GB | Good |
| 13B parameters | 16GB | 32GB | 12GB | Very Good |
| 30B parameters | 32GB | 64GB | 24GB | Excellent |
| 70B parameters | 64GB | 128GB | 48GB+ | Outstanding |
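These figures follow a simple rule of thumb: quantized weights take roughly `params × bits / 8` bytes, plus some headroom for the KV cache and activations. The sketch below assumes ~20% overhead; `estimate_memory_gb` is an illustrative helper, not Inferno's internal accounting.

```python
def estimate_memory_gb(params_billion: float, bits_per_weight: int = 4,
                       overhead: float = 1.2) -> float:
    """Rough memory footprint for a quantized model.

    Weights occupy params * bits / 8 gigabytes; `overhead` (assumed 20%)
    covers the KV cache and activations.
    """
    weights_gb = params_billion * bits_per_weight / 8
    return round(weights_gb * overhead, 1)
```

A 7B model at 4-bit quantization lands around 4.2GB, and a 13B model at 8-bit around 15.6GB, consistent with the table's minimum-RAM column.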
- NVIDIA: RTX 4090, RTX 4080, A6000
- Apple: M2 Ultra, M3 Ultra (32GB+ unified memory)
- Performance: Run 70B+ models smoothly
- NVIDIA: RTX 4070, RTX 3080, RTX 3090
- Apple: M2 Pro, M3 Pro (16GB+ unified memory)
- AMD: RX 7800 XT, RX 6800 XT
- Performance: Handle 30B models well, 70B with careful management
- NVIDIA: RTX 4060, RTX 3070, RTX 3060 Ti
- Apple: M1 Pro, M2 (16GB unified memory)
- AMD: RX 6700 XT, RX 7600
- Performance: Great for 13B models, acceptable for 30B
- NVIDIA: RTX 3060, GTX 1660 Ti
- Apple: M1 (8GB unified memory)
- AMD: RX 6600, RX 5700 XT
- Intel: Arc A750, A770
- Performance: Perfect for 7B models
- 7B model: 4-8GB per model
- 13B model: 8-16GB per model
- 30B model: 20-40GB per model
- 70B model: 40-80GB per model
```
Inferno Storage Layout
├── /models/ (100-500GB)
│   ├── llama-2-7b.gguf (7GB)
│   ├── mistral-7b.gguf (7GB)
│   └── codellama-13b.gguf (13GB)
├── /cache/ (10-50GB)
│   └── response_cache/
├── /logs/ (1-10GB)
│   └── audit_logs/
└── /config/ (<1GB)
    └── inferno.toml
```
- HDD: 20-50 tokens/sec (acceptable)
- SATA SSD: 50-100 tokens/sec (good)
- NVMe SSD: 100-200+ tokens/sec (excellent)
- Model Downloads: 5-80GB per model (one-time)
- Bandwidth: 10+ Mbps recommended for faster downloads
- Alternative: Transfer models via USB/external drive
- Network: Not required (fully offline capable)
- Updates: Optional, only for Inferno updates
- Monitoring: Optional external metrics collection
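Since model downloads are one-time, it is easy to estimate how long they take from file size and link speed (a sketch; `download_time_minutes` is an illustrative helper, and real throughput usually runs below the advertised rate):

```python
def download_time_minutes(model_gb: float, bandwidth_mbps: float) -> float:
    """Estimate a one-time model download duration.

    Converts gigabytes to megabits (1 GB = 8000 Mb in decimal units)
    and divides by the link speed in Mbps.
    """
    megabits = model_gb * 8000
    return round(megabits / bandwidth_mbps / 60, 1)
```

A 7GB model over a 10 Mbps link takes roughly 93 minutes, which is why the USB/external-drive transfer option above can be worthwhile on slow connections.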
| Format | Linux | macOS | Windows | Performance |
|---|---|---|---|---|
| GGUF | ✅ Full | ✅ Full | ✅ Full | Excellent |
| ONNX | ✅ Full | ✅ Full | ✅ Full | Excellent |
| PyTorch | ⚠️ Convert | ⚠️ Convert | ⚠️ Convert | Via Conversion |
| SafeTensors | ⚠️ Convert | ⚠️ Convert | ⚠️ Convert | Via Conversion |
| GPU Vendor | Linux | macOS | Windows | API |
|---|---|---|---|---|
| NVIDIA | ✅ CUDA/Vulkan | ❌ No | ✅ DirectML/CUDA | CUDA, DirectML |
| AMD | ✅ ROCm/Vulkan | ❌ No | ✅ DirectML | ROCm, DirectML |
| Intel | ✅ Vulkan | ❌ No | ✅ DirectML | Vulkan, DirectML |
| Apple | N/A | β Metal | N/A | Metal |
```bash
# Check the installed Inferno version
inferno --version

# Test with a small model
inferno run --model test-7b --prompt "Hello world" --benchmark

# Check GPU utilization
inferno system info --gpu

# Performance test
inferno benchmark --model your-model --duration 60s
```

7B models (approximate):
- RTX 4090: 150-200 tok/s
- RTX 3080: 80-120 tok/s
- M2 Pro: 60-80 tok/s
- RTX 3060: 40-60 tok/s

13B models (approximate):
- RTX 4090: 80-120 tok/s
- RTX 3080: 40-60 tok/s
- M2 Pro: 30-40 tok/s
- RTX 3060: 15-25 tok/s
Performance Issues?
- Check Performance Tuning for optimization tips
- See Troubleshooting for common problems
Hardware Questions?
- Visit GitHub Discussions for hardware recommendations
- Check Hardware Compatibility for tested configurations
Want to Upgrade?
- See Benchmarks for detailed performance comparisons
- Check our Hardware Buying Guide
System requirements updated for Inferno v1.0.0. Check Changelog for latest updates.