

Ryan Robson edited this page Sep 16, 2025 · 2 revisions

πŸš€ Quick Start Tutorial

Get your first AI conversation running in 10 minutes! This tutorial walks you through everything from installation to your first AI response.

🎯 What You'll Accomplish

By the end of this tutorial, you'll have:

  • βœ… Inferno running on your system
  • βœ… A working AI model loaded
  • βœ… Successfully generated AI responses
  • βœ… Understanding of basic commands

  • Time Required: 10-15 minutes
  • Skill Level: Beginner
  • Prerequisites: None!


Step 1: Install Inferno (2 minutes)

Choose the fastest method for your system:

🐳 Docker (Recommended - Fastest)

# One command to rule them all!
docker run -d --name inferno -p 8080:8080 inferno:latest serve --demo

# Verify it's running
curl http://localhost:8080/health
# Should return: {"status":"healthy"}

πŸ“¦ Pre-built Binary (Alternative)

# Linux/macOS
wget https://github.com/ringo380/inferno/releases/latest/download/inferno-$(uname -s)-$(uname -m).tar.gz
tar xzf inferno-*.tar.gz && sudo mv inferno /usr/local/bin/

# Verify installation
inferno --version

βœ… Checkpoint: You should see version information or a healthy status response.


Step 2: Get Your First Model (3 minutes)

Let's download a small, fast model perfect for testing:

# Download a small conversational model (great for testing)
inferno models download microsoft/DialoGPT-small

# Alternative: if you're using the Docker demo, a model is already built in.
# Either way, confirm what's available:
inferno models list

Expected Output:

Available models:
microsoft-DialoGPT-small  gguf  1.2 GB  2024-12-16 10:30

βœ… Checkpoint: You should see at least one model in your list.


Step 3: Start the Server (1 minute)

If not using Docker, start the Inferno server:

# Start the server
inferno serve --bind 0.0.0.0:8080

# You should see:
# [INFO] Starting HTTP server on 0.0.0.0:8080
# [INFO] HTTP API server is running on http://0.0.0.0:8080

Keep this terminal open and open a new one for the next steps.

βœ… Checkpoint: Server should be running without errors.


Step 4: Your First AI Conversation (2 minutes)

Method 1: Simple cURL Request

# Your first AI question!
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft-DialoGPT-small",
    "messages": [
      {"role": "user", "content": "Hello! What can you help me with?"}
    ],
    "max_tokens": 100
  }'
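If you'd rather drive the API from code, the same payload can be assembled and the reply extracted in a few lines of Python. The response shape below follows the standard OpenAI chat-completions format; the sample body is illustrative, not actual Inferno output:

```python
import json

# Build the same chat-completions payload programmatically
payload = {
    "model": "microsoft-DialoGPT-small",
    "messages": [
        {"role": "user", "content": "Hello! What can you help me with?"}
    ],
    "max_tokens": 100,
}

# A typical OpenAI-style response body (illustrative sample)
sample_response = json.loads("""
{
  "choices": [
    {"message": {"role": "assistant", "content": "Hi! I can answer questions and chat."}}
  ]
}
""")

# The generated text lives at choices[0].message.content
reply = sample_response["choices"][0]["message"]["content"]
print(reply)
```

The same `payload` dict can be sent with any HTTP client (e.g. `requests.post`) to the endpoint shown in the cURL example.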

Method 2: Command Line Interface

# Even simpler - use the CLI
inferno run --model microsoft-DialoGPT-small --prompt "Hello! What can you help me with?"

Method 3: Interactive Mode

# Start interactive chat session
inferno chat --model microsoft-DialoGPT-small

# Type your messages and press Enter
# Type 'exit' to quit

βœ… Checkpoint: You should receive an AI-generated response!


Step 5: Try Different Features (2 minutes)

Streaming Responses (Watch AI "Think")

# See the response generate in real-time
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft-DialoGPT-small",
    "messages": [{"role": "user", "content": "Tell me a short story about AI"}],
    "stream": true
  }'
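Streamed responses arrive as Server-Sent Events, with each `data:` line carrying a JSON chunk. Assuming Inferno follows the standard OpenAI streaming format, a client can stitch the deltas back together like this (the transcript below is an illustrative sample, not real output):

```python
import json

# What an OpenAI-style SSE stream looks like on the wire (sample)
raw_stream = """data: {"choices": [{"delta": {"content": "Once"}}]}

data: {"choices": [{"delta": {"content": " upon a time..."}}]}

data: [DONE]
"""

pieces = []
for line in raw_stream.splitlines():
    line = line.strip()
    if not line.startswith("data: "):
        continue  # skip blank keep-alive lines
    data = line[len("data: "):]
    if data == "[DONE]":
        break  # end-of-stream sentinel
    chunk = json.loads(data)
    # Each chunk carries an incremental piece of the reply in delta.content
    pieces.append(chunk["choices"][0]["delta"].get("content", ""))

story = "".join(pieces)
print(story)
```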

Different Types of Requests

# Code generation
inferno run --model microsoft-DialoGPT-small \
  --prompt "Write a Python function that calculates fibonacci numbers"

# Question answering
inferno run --model microsoft-DialoGPT-small \
  --prompt "What are the benefits of running AI locally?"

# Creative writing
inferno run --model microsoft-DialoGPT-small \
  --prompt "Write a haiku about privacy and AI"
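Each of these CLI calls maps onto the same chat-completions payload, with only the prompt changing, so scripting a batch of tasks is straightforward. A minimal sketch (the model name and token limit here are just examples):

```python
# Each CLI call above corresponds to the same JSON request body shape
prompts = [
    "Write a Python function that calculates fibonacci numbers",
    "What are the benefits of running AI locally?",
    "Write a haiku about privacy and AI",
]

def make_payload(prompt, model="microsoft-DialoGPT-small", max_tokens=200):
    """Build an OpenAI-style chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payloads = [make_payload(p) for p in prompts]
print(len(payloads))
```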

πŸŽ‰ Congratulations!

You've successfully:

  • βœ… Installed Inferno
  • βœ… Downloaded and loaded a model
  • βœ… Generated AI responses
  • βœ… Tried different interaction methods

πŸš€ What's Next?

Immediate Next Steps

  1. Try a Better Model: Model Management - Download larger, more capable models
  2. Explore Features: Usage Examples - See real-world use cases
  3. Optimize Performance: Performance Tuning - Make it faster for your hardware

Common Next Actions

For Developers:

# Use any OpenAI client library with Inferno's OpenAI-compatible endpoint
import openai
client = openai.OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")
response = client.chat.completions.create(
    model="microsoft-DialoGPT-small",
    messages=[{"role": "user", "content": "Hello!"}])
print(response.choices[0].message.content)

For Power Users:

# Enable authentication and monitoring
inferno serve --auth --metrics --config production.toml

For Businesses:

# Deploy with Docker Compose (includes monitoring)
wget https://raw.githubusercontent.com/ringo380/inferno/main/docker-compose.yml
docker-compose up -d

πŸ› οΈ Customize Your Setup

Configuration File

Create inferno.toml for persistent settings:

# ~/.config/inferno/inferno.toml
models_dir = "/home/user/ai-models"
log_level = "info"

[server]
bind_address = "0.0.0.0"
port = 8080

[backend_config]
gpu_enabled = true
context_size = 4096

Environment Variables

# Quick configuration via environment
export INFERNO_LOG_LEVEL=debug
export INFERNO_MODELS_DIR="/custom/path/models"

# Start with custom config
inferno serve
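A common convention, and a reasonable assumption here (check Inferno's configuration docs to confirm), is that environment variables override the config file, which in turn overrides built-in defaults. That lookup order can be sketched as:

```python
import os

def resolve(env_var, file_value, default):
    """Environment variable wins over config-file value, which wins over the default.
    Note: this precedence is an assumption about Inferno's behavior, not confirmed."""
    if env_var in os.environ:
        return os.environ[env_var]
    return file_value if file_value is not None else default

os.environ["INFERNO_LOG_LEVEL"] = "debug"  # as exported above
log_level = resolve("INFERNO_LOG_LEVEL", "info", "warn")
print(log_level)
```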

πŸ› Troubleshooting Your First Run

"Model not found"

# List available models
inferno models list

# Download if missing
inferno models download model-name

"Connection refused"

# Check if server is running
curl http://localhost:8080/health

# Check what's using port 8080
sudo lsof -i :8080

# Use different port
inferno serve --bind 0.0.0.0:8081
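If `lsof` isn't available on your system, a few lines of Python can tell you whether a port is already taken (demonstrated here against a throwaway listener so it runs anywhere):

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        return s.connect_ex((host, port)) == 0

# Demonstrate with a throwaway listener (the OS picks a free port)
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
busy_port = listener.getsockname()[1]

busy = port_in_use(busy_port)  # True while the listener is open
listener.close()
print(busy)
```

Swap in `8080` for `busy_port` to check Inferno's default port on your machine.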

"Out of memory"

# Use smaller model
inferno models download microsoft/DialoGPT-small

# Reduce context size
inferno serve --context-size 2048

Docker Issues

# Check Docker status
docker ps

# View logs
docker logs inferno

# Restart container
docker restart inferno

πŸ’‘ Pro Tips for New Users

Performance Tips

  • Start Small: Use 7B models first, then upgrade
  • SSD Storage: Store models on SSD for faster loading
  • GPU Memory: Monitor GPU usage with nvidia-smi

Usage Tips

  • Batch Processing: Use --batch for multiple prompts
  • Save Responses: Use --output file.txt to save results
  • Custom Instructions: Create prompt templates for consistency
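For the prompt-template tip above, even a plain format string goes a long way toward consistent instructions. A minimal sketch (the template wording is just an example):

```python
# One reusable template keeps instructions consistent across prompts
SUMMARY_TEMPLATE = (
    "You are a concise assistant. Summarize the following text "
    "in at most {max_sentences} sentences:\n\n{text}"
)

def build_prompt(text, max_sentences=3):
    """Fill the template; pass the result to `inferno run --prompt`."""
    return SUMMARY_TEMPLATE.format(max_sentences=max_sentences, text=text)

prompt = build_prompt("Inferno runs AI models locally for privacy and control.")
print(prompt)
```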

Exploration Ideas

# Try different AI tasks
inferno run --model your-model --prompt "Summarize: [paste long text]"
inferno run --model your-model --prompt "Translate to Spanish: Hello world"
inferno run --model your-model --prompt "Extract key points from: [paste content]"
inferno run --model your-model --prompt "Fix this code: [paste code]"


❓ Need Help?

  • Quick Questions: Check our FAQ
  • Technical Issues: See Troubleshooting
  • Community Help: Visit GitHub Discussions
  • Bug Reports: GitHub Issues


🎊 Welcome to the Inferno Community!

You're now part of a growing community of developers, researchers, and organizations using AI while maintaining privacy and control.

Share your success on GitHub Discussions - we love hearing about your use cases!


Tutorial last updated for Inferno v1.0.0. Found this helpful? Consider Contributing to Wiki to help others!
