print(f"Half-life: {decay_result.half_life:.1f} periods")
print(f"Optimal holding period: {decay_result.optimal_holding_period} periods")
```

## LLM Configuration (QuantPod agents)

QuantPod's CrewAI agents require an LLM provider. The system supports local
models via Ollama (free, no API key) and a full cloud provider fallback chain.
Resolution order: per-tier env override → `LLM_PROVIDER` → `LLM_FALLBACK_CHAIN`.

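The resolution order can be sketched in a few lines of Python. This is an illustrative sketch, not QuantPod's actual implementation; it assumes the per-tier variables follow the `LLM_MODEL_<TIER>` naming used below and that `LLM_FALLBACK_CHAIN` is a comma-separated list.

```python
import os

def resolve_model(tier: str, env=os.environ) -> str:
    """Sketch of the documented resolution order:
    per-tier override -> LLM_PROVIDER -> LLM_FALLBACK_CHAIN.
    Illustrative only; the real config loader may differ.
    """
    # 1. Per-tier override, e.g. LLM_MODEL_IC or LLM_MODEL_POD
    override = env.get(f"LLM_MODEL_{tier.upper()}")
    if override:
        return override
    # 2. Global provider (caller maps provider -> its default model)
    provider = env.get("LLM_PROVIDER")
    if provider:
        return provider
    # 3. First non-empty entry of the fallback chain
    for candidate in env.get("LLM_FALLBACK_CHAIN", "").split(","):
        if candidate.strip():
            return candidate.strip()
    raise RuntimeError(f"No LLM configured for tier {tier!r}")
```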
### Option A — Local Ollama (recommended, free)

Requires [Ollama](https://ollama.com) and enough RAM for the model.

```bash
# Install Ollama
brew install ollama   # macOS
# or: https://ollama.com/download for Linux/Windows

# Start Ollama
ollama serve

# Pull the model (one model runs all agent tiers)
ollama pull qwen3.5:9b   # 6.6 GB

# Optional: tune for performance (Apple Silicon)
launchctl setenv OLLAMA_KEEP_ALIVE -1       # keep model loaded permanently
launchctl setenv OLLAMA_FLASH_ATTENTION 1   # Metal acceleration
launchctl setenv OLLAMA_NUM_PARALLEL 10     # 10 parallel IC requests
# restart Ollama after setting these
```

Then set in `.env`:
```bash
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
LLM_MODEL_IC=ollama/qwen3.5:9b
LLM_MODEL_POD=ollama/qwen3.5:9b
LLM_MODEL_ASSISTANT=ollama/qwen3.5:9b
LLM_MODEL_DECODER=ollama/qwen3.5:9b
```

Verify setup:
```bash
python scripts/check_ollama_health.py
```

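As a rough idea of what such a check involves: Ollama exposes a `GET /api/tags` endpoint listing pulled models, so a health check can confirm both that the server is up and that the configured model is installed. A minimal sketch (the actual `check_ollama_health.py` may do more):

```python
import json
import urllib.request

def model_installed(tags_payload: dict, model_name: str) -> bool:
    """Check a parsed /api/tags response for a pulled model."""
    names = {m.get("name", "") for m in tags_payload.get("models", [])}
    return model_name in names

# Live check (requires a running Ollama server):
# with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
#     payload = json.load(resp)
# print(model_installed(payload, "qwen3.5:9b"))
```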
### Option B — Cloud providers

Set `LLM_PROVIDER` to your preferred provider and supply credentials in `.env`.
See `.env.example` for all supported options.

| Provider | Key env var | Model used |
|----------|-------------|------------|
| `bedrock` | AWS credentials (boto3 chain) | Haiku 4.5 (ICs), Sonnet 4.6 (pods) |
| `anthropic` | `ANTHROPIC_API_KEY` | claude-sonnet-4 |
| `openai` | `OPENAI_API_KEY` | gpt-4o |
| `gemini` | `GEMINI_API_KEY` | gemini-2.5-flash |
| `groq` | `GROQ_API_KEY` | llama-3.3-70b (free tier available) |

**Cost per full crew run** (16 agents):

| Setup | Est. cost |
|-------|-----------|
| Local Ollama | $0.00 |
| Groq (free tier) | $0.00 |
| Bedrock Haiku ICs + Sonnet pods | ~$0.02 |
| OpenAI GPT-4o (all agents) | ~$0.12 |

### Per-agent overrides

Any tier can be overridden independently — mix local and cloud:

```bash
# Example: free local ICs, stronger cloud model for pod synthesis
LLM_MODEL_IC=ollama/qwen3.5:9b
LLM_MODEL_POD=bedrock/us.anthropic.claude-sonnet-4-6
LLM_MODEL_ASSISTANT=bedrock/us.anthropic.claude-sonnet-4-6
```

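The model identifiers above follow a `provider/model` convention. A hypothetical helper for splitting such an identifier (not part of QuantPod's API; note that the model portion itself can contain dots and colons, so only the first slash separates the provider):

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a 'provider/model' identifier like 'ollama/qwen3.5:9b'.

    Hypothetical helper for illustration; splits at the first '/'
    so dotted Bedrock model IDs survive intact.
    """
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"Expected 'provider/model', got {model_id!r}")
    return provider, model
```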
---

## Module Maturity

| Module | Status | Notes |