Step-by-step practical recipes for self-hosted AI systems. Each recipe is standalone — pick the one that matches what you're building.
| # | Recipe | What You'll Build | GPU Required? |
|---|---|---|---|
| 01 | Voice Agent Setup | Whisper STT + vLLM + Kokoro TTS pipeline | Yes |
| 02 | Document Q&A | RAG system with Qdrant/ChromaDB + local LLM | Optional |
| 03 | Code Assistant | Tool-calling code agent with file ops | Yes |
| 04 | Privacy Proxy | PII-stripping proxy for cloud API calls | No |
| 05 | Multi-GPU Cluster | Load-balanced multi-node GPU inference | Yes (2+) |
| 06 | Swarm Patterns | Sub-agent parallelization and coordination | Yes |
| 08 | n8n + Local LLM | Workflow automation with local models | Yes |
| — | Agent Template | Code specialist agent with debugging protocol | Yes |
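Most of the GPU recipes above talk to a local model through vLLM's OpenAI-compatible HTTP API, so one small request helper carries across them. A minimal sketch, assuming a server at `http://localhost:8000/v1` and a placeholder model name (both values are assumptions, not taken from the recipes):

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumed vLLM default; adjust per recipe


def build_chat_body(prompt: str, model: str = "local-model") -> dict:
    """Build an OpenAI-style chat payload ('local-model' is a placeholder name)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt: str, model: str = "local-model") -> str:
    """POST a chat completion to the local server and return the reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_body(prompt, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the API shape matches OpenAI's, the same helper works whether the backend is a single GPU (Recipe 01) or a load-balanced cluster (Recipe 05) — only `BASE_URL` changes.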
Not sure which recipe you need? Match your goal below:

| Goal | Start With |
|---|---|
| Run a voice assistant locally | Recipe 01 |
| Search my documents with AI | Recipe 02 |
| Build a local code copilot | Recipe 03 |
| Use cloud AI without leaking data | Recipe 04 |
| Scale across multiple GPUs | Recipe 05 |
| Run multiple agents in parallel | Recipe 06 |
| Automate workflows with AI | Recipe 08 |
| Set up a coding agent from scratch | Agent Template |
All recipes assume you have:
- A Linux machine (Ubuntu 22.04+ recommended)
- Python 3.10+
- Docker installed
GPU recipes additionally need:
- NVIDIA GPU with CUDA support
- NVIDIA Container Toolkit
- vLLM installed (see SETUP.md for base installation)
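The prerequisites above can be sanity-checked in one pass before starting a recipe. A minimal sketch (the `check_prereqs` helper is hypothetical, not part of any recipe):

```python
import shutil
import sys


def check_prereqs(min_py=(3, 10)) -> dict:
    """Map each prerequisite to True/False (hypothetical helper)."""
    return {
        "python>=3.10": sys.version_info >= min_py,
        # checks the docker CLI is on PATH, not that the daemon is running
        "docker": shutil.which("docker") is not None,
        # GPU recipes only -- nvidia-smi on PATH implies an NVIDIA driver
        "nvidia-smi": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prereqs().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A missing `nvidia-smi` only rules out the GPU recipes; Recipes 02 (CPU mode) and 04 still apply.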
Related documentation:
- SETUP.md — Base vLLM + OpenClaw installation
- HARDWARE-GUIDE.md — GPU buying guide with real benchmarks
- ARCHITECTURE.md — How the tool call proxy works
- PATTERNS.md — Transferable patterns for persistent agents