

Ryan Robson edited this page Sep 16, 2025 · 2 revisions

πŸ’» System Requirements

This guide helps you determine if your system can run Inferno and what performance to expect.

🎯 Quick Check

Minimum for basic usage:

  • RAM: 8GB (16GB recommended)
  • Storage: 20GB free space (plus model storage)
  • CPU: Any modern x64 processor
  • OS: Linux, macOS, or Windows

For optimal performance:

  • RAM: 32GB+ for large models
  • GPU: NVIDIA RTX 3060 or better, Apple M1/M2, AMD RX 6600+
  • Storage: SSD recommended, 100GB+ for multiple models
  • Network: For model downloads (optional after setup)
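
If you're not sure where your machine stands, a few standard commands cover the basics. This is a quick sketch for Linux (macOS users can substitute `sysctl -n hw.ncpu` and `hw.memsize`); the thresholds in the comments mirror the minimums above.

```shell
# Quick check of this host against the minimums above (Linux commands).
cores=$(nproc)
ram_gb=$(free -g | awk '/^Mem:/ {print $2}')
free_disk=$(df -h . | awk 'NR==2 {print $4}')
echo "CPU cores: $cores"
echo "RAM: ${ram_gb}GB (8GB minimum, 16GB recommended)"
echo "Free disk here: $free_disk (20GB+ recommended, plus model storage)"
```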

πŸ–₯️ Platform-Specific Requirements

Linux (Recommended)

βœ… Best Performance | βœ… Full GPU Support | βœ… Docker Native

Supported Distributions

  • Ubuntu: 20.04 LTS, 22.04 LTS, 24.04 LTS
  • Debian: 11 (Bullseye), 12 (Bookworm)
  • CentOS/RHEL: 8, 9
  • Fedora: 37, 38, 39
  • Arch Linux: Rolling release
  • openSUSE: Leap 15.4+, Tumbleweed

System Dependencies

```bash
# Ubuntu/Debian
sudo apt update
sudo apt install build-essential curl pkg-config libssl-dev

# CentOS/RHEL/Fedora
sudo dnf install gcc gcc-c++ curl openssl-devel pkgconfig

# Arch Linux
sudo pacman -S base-devel curl openssl pkg-config
```
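
Before installing, you can check which of these prerequisites are already on `PATH`. A minimal, distribution-agnostic sketch:

```shell
# Check whether the build prerequisites are already available.
missing=0
for tool in cc curl pkg-config; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing - install it with your package manager"
    missing=$((missing + 1))
  fi
done
echo "$missing prerequisite(s) missing"
```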

GPU Support

NVIDIA GPUs (Recommended):

  • Driver: 470+ (525+ for RTX 4000 series)
  • CUDA: 11.8+ (auto-installed with ONNX Runtime)
  • Compute Capability: 6.0+ (GTX 1060 and newer)
  • VRAM: 6GB minimum, 12GB+ recommended
```bash
# Install NVIDIA drivers (Ubuntu)
sudo ubuntu-drivers autoinstall
# Or manually:
sudo apt install nvidia-driver-525

# Verify installation
nvidia-smi
```
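
To compare your card against the minimums above, `nvidia-smi` can report the relevant fields directly. A sketch (note `compute_cap` is only queryable on reasonably recent drivers):

```shell
# Check driver version, VRAM, and compute capability against the minimums above.
have_nvidia=0
if command -v nvidia-smi >/dev/null 2>&1; then
  have_nvidia=1
  nvidia-smi --query-gpu=name,driver_version,memory.total,compute_cap --format=csv
else
  echo "nvidia-smi not found - no NVIDIA driver installed"
fi
```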

AMD GPUs:

  • ROCm: 5.0+ for RX 6000 series and newer
  • VRAM: 8GB minimum
```bash
# Install ROCm (Ubuntu 22.04)
wget https://repo.radeon.com/amdgpu-install/latest/ubuntu/jammy/amdgpu-install_*_all.deb
sudo dpkg -i amdgpu-install_*_all.deb
sudo amdgpu-install --usecase=rocm
```

Intel GPUs:

  • Arc A-Series: A380, A750, A770
  • Integrated: Iris Xe, UHD Graphics (limited performance)

macOS

βœ… Great for Development | βœ… Apple Silicon Support | ⚠️ Limited GPU Options

Supported Versions

  • macOS: 12.0 (Monterey) or later
  • Recommended: macOS 13.0+ for best Metal performance

Hardware Requirements

Apple Silicon (M1/M2/M3):

  • Unified Memory: 16GB minimum, 32GB+ for large models
  • Metal: Native support, excellent performance
  • Neural Engine: Automatic acceleration for compatible operations

Intel Macs:

  • RAM: 16GB minimum (no unified memory benefits)
  • GPU: Dedicated GPU recommended (AMD Radeon Pro)
  • Note: Performance significantly lower than Apple Silicon

Installation Dependencies

```bash
# Install Xcode Command Line Tools
xcode-select --install

# Install Homebrew (if not already installed)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install required tools
brew install cmake pkg-config openssl
```

Windows

βœ… Full DirectML Support | ⚠️ Requires WSL2 for Best Experience

Supported Versions

  • Windows: 10 (1903+), 11
  • WSL2: Ubuntu 22.04 LTS (recommended for Linux performance)

GPU Support

NVIDIA GPUs:

  • DirectML: Native Windows support
  • CUDA: Full support via WSL2
  • Driver: Game Ready 472+ or Studio 472+

AMD GPUs:

  • DirectML: Native Windows support
  • Driver: Adrenalin 22.10.1+

Intel GPUs:

  • DirectML: Arc A-Series support
  • Driver: Latest Intel Arc drivers

Windows Native Setup

```powershell
# Install Visual Studio Build Tools
# Download from: https://visualstudio.microsoft.com/visual-cpp-build-tools/

# Install Rust
# Download from: https://rustup.rs/

# Install Git
# Download from: https://git-scm.com/download/win
```

WSL2 Setup (Recommended)

```powershell
# Enable WSL2
wsl --install -d Ubuntu-22.04

# Inside WSL2, follow the Linux (Ubuntu) instructions above
```
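
Once inside the WSL2 distribution, you can confirm GPU passthrough is wired up: WSL2 exposes GPU paravirtualization to Linux through the `/dev/dxg` device. A quick check:

```shell
# Inside WSL2: check that GPU paravirtualization is exposed to Linux.
wsl_gpu=0
if [ -e /dev/dxg ]; then
  wsl_gpu=1
  echo "GPU passthrough available in this WSL2 instance"
else
  echo "/dev/dxg not found - update Windows and the Windows-side GPU driver"
fi
```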

πŸš€ Hardware Performance Guide

Model Size vs. Hardware Requirements

| Model Size | Minimum RAM | Recommended RAM | GPU VRAM | Performance |
|------------|-------------|-----------------|----------|-------------|
| 7B parameters | 8GB | 16GB | 6GB | Good |
| 13B parameters | 16GB | 32GB | 12GB | Very Good |
| 30B parameters | 32GB | 64GB | 24GB | Excellent |
| 70B parameters | 64GB | 128GB | 48GB+ | Outstanding |
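
The figures above follow roughly from parameter count times bytes per weight. As a back-of-envelope sketch (the 20% overhead factor for KV cache and runtime buffers is an illustrative assumption, not a measured constant):

```shell
# Rough memory estimate for a quantized model:
# weight bytes ~= parameters * (quantization bits / 8), plus ~20% overhead.
params_b=13   # parameters, in billions
bits=4        # quantization width (4-bit is a common GGUF default)
weights_gb=$(( params_b * bits / 8 ))
total_gb=$(( weights_gb + weights_gb / 5 ))
echo "~${weights_gb}GB weights, ~${total_gb}GB in use for a ${params_b}B model at ${bits}-bit"
```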

GPU Performance Tiers

Tier 1: Enthusiast πŸš€

  • NVIDIA: RTX 4090, RTX 4080, A6000
  • Apple: M2 Ultra, M3 Ultra (32GB+ unified memory)
  • Performance: Run 70B+ models smoothly

Tier 2: High Performance ⚑

  • NVIDIA: RTX 4070, RTX 3080, RTX 3090
  • Apple: M2 Pro, M3 Pro (16GB+ unified memory)
  • AMD: RX 7800 XT, RX 6800 XT
  • Performance: Handle 30B models well, 70B with careful management

Tier 3: Good Performance πŸ’ͺ

  • NVIDIA: RTX 4060, RTX 3070, RTX 3060 Ti
  • Apple: M1 Pro, M2 (16GB unified memory)
  • AMD: RX 6700 XT, RX 7600
  • Performance: Great for 13B models, acceptable for 30B

Tier 4: Entry Level πŸ“ˆ

  • NVIDIA: RTX 3060, GTX 1660 Ti
  • Apple: M1 (8GB unified memory)
  • AMD: RX 6600, RX 5700 XT
  • Intel: Arc A750, A770
  • Performance: Perfect for 7B models

πŸ’Ύ Storage Requirements

Model Storage Needs

  • 7B model: 4-8GB per model
  • 13B model: 8-16GB per model
  • 30B model: 20-40GB per model
  • 70B model: 40-80GB per model
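
To see what your downloaded models actually consume, `du` gives a quick per-model breakdown. The `MODELS_DIR` path below is a placeholder; point it at wherever Inferno stores models on your system:

```shell
# Per-model and total disk usage (MODELS_DIR is a placeholder path).
MODELS_DIR="${MODELS_DIR:-$HOME/models}"
if [ -d "$MODELS_DIR" ]; then
  du -sh "$MODELS_DIR"/* 2>/dev/null | sort -rh   # largest models first
  du -sh "$MODELS_DIR"                            # total
else
  echo "no models directory at $MODELS_DIR yet"
fi
```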

Recommended Storage Setup

πŸ—‚οΈ Inferno Storage Layout
β”œβ”€β”€ πŸ“ /models/           (100-500GB)
β”‚   β”œβ”€β”€ llama-2-7b.gguf   (7GB)
β”‚   β”œβ”€β”€ mistral-7b.gguf   (7GB)
β”‚   └── codellama-13b.gguf (13GB)
β”œβ”€β”€ πŸ“ /cache/            (10-50GB)
β”‚   └── response_cache/
β”œβ”€β”€ πŸ“ /logs/             (1-10GB)
β”‚   └── audit_logs/
└── πŸ“ /config/           (<1GB)
    └── inferno.toml

Storage Performance Impact

  • HDD: 20-50 tokens/sec (acceptable)
  • SATA SSD: 50-100 tokens/sec (good)
  • NVMe SSD: 100-200+ tokens/sec (excellent)
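
A crude way to see which tier your model drive falls into is a sequential-write test with `dd`; the MB/s figure it reports separates HDD, SATA SSD, and NVMe clearly. This sketch writes and then removes a 256MB file in the current directory:

```shell
# Rough sequential-write check for your model storage.
testfile="./inferno_io_test.tmp"
dd if=/dev/zero of="$testfile" bs=1M count=256 conv=fsync 2>&1 | tail -n 1
rm -f "$testfile"
```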

🌐 Network Requirements

Initial Setup

  • Model Downloads: 5-80GB per model (one-time)
  • Bandwidth: 10+ Mbps recommended for faster downloads
  • Alternative: Transfer models via USB/external drive
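
Download time is easy to estimate from file size and link speed (remembering gigabytes vs. megabits). A sketch with illustrative numbers:

```shell
# Estimate how long a model download takes at a given link speed.
size_gb=40   # e.g. a quantized 70B model
mbps=100     # measured bandwidth, in megabits per second
seconds=$(( size_gb * 8 * 1000 / mbps ))
echo "~$(( seconds / 60 )) minutes to fetch ${size_gb}GB at ${mbps} Mbps"
```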

Runtime Operation

  • Network: Not required (fully offline capable)
  • Updates: Optional, only for Inferno updates
  • Monitoring: Optional external metrics collection

πŸ”§ Compatibility Matrix

Model Format Support by Platform

| Format | Linux | macOS | Windows | Performance |
|--------|-------|-------|---------|-------------|
| GGUF | βœ… Full | βœ… Full | βœ… Full | Excellent |
| ONNX | βœ… Full | βœ… Full | βœ… Full | Excellent |
| PyTorch | βœ… Convert | βœ… Convert | βœ… Convert | Via conversion |
| SafeTensors | βœ… Convert | βœ… Convert | βœ… Convert | Via conversion |

GPU Acceleration Support

| GPU Vendor | Linux | macOS | Windows | API |
|------------|-------|-------|---------|-----|
| NVIDIA | βœ… CUDA/Vulkan | ❌ No | βœ… DirectML/CUDA | CUDA, DirectML |
| AMD | βœ… ROCm/Vulkan | ❌ No | βœ… DirectML | ROCm, DirectML |
| Intel | βœ… Vulkan | ❌ No | βœ… DirectML | Vulkan, DirectML |
| Apple | N/A | βœ… Metal | N/A | Metal |

πŸ§ͺ Performance Testing

Benchmark Your System

```bash
# Check the installed version
inferno --version

# Test with a small model
inferno run --model test-7b --prompt "Hello world" --benchmark

# Check GPU utilization
inferno system info --gpu

# Performance test
inferno benchmark --model your-model --duration 60s
```

Expected Performance Baselines

7B Model Performance (tokens/second)

  • RTX 4090: 150-200 tok/s
  • RTX 3080: 80-120 tok/s
  • M2 Pro: 60-80 tok/s
  • RTX 3060: 40-60 tok/s

13B Model Performance (tokens/second)

  • RTX 4090: 80-120 tok/s
  • RTX 3080: 40-60 tok/s
  • M2 Pro: 30-40 tok/s
  • RTX 3060: 15-25 tok/s
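
To compare your own runs against these baselines, convert a timed generation into tokens per second. A sketch with illustrative numbers:

```shell
# Convert a timed generation run into tok/s.
tokens=512        # tokens generated in the run
elapsed_ms=6400   # wall-clock time for the run, in milliseconds
tok_per_s=$(( tokens * 1000 / elapsed_ms ))
echo "${tok_per_s} tok/s"
```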

❓ Need Help?

  • Performance Issues?
  • Hardware Questions?
  • Want to Upgrade?


System requirements updated for Inferno v1.0.0. Check the Changelog for the latest updates.
