Commit 0340d0b

docs: add LLM configuration section to README
Cover local Ollama setup (free, qwen3.5:9b), cloud provider options, cost comparison table, and per-agent override examples. Targets contributors setting up QuantPod for the first time.
1 parent 4482859 commit 0340d0b

1 file changed: +78 additions, 0 deletions
README.md

Lines changed: 78 additions & 0 deletions
@@ -239,6 +239,84 @@ print(f"Half-life: {decay_result.half_life:.1f} periods")
print(f"Optimal holding period: {decay_result.optimal_holding_period} periods")
```

## LLM Configuration (QuantPod agents)

QuantPod's CrewAI agents require an LLM provider. The system supports local models via Ollama (free, no API key) and a full cloud-provider fallback chain. Resolution order: per-tier env override → `LLM_PROVIDER` → `LLM_FALLBACK_CHAIN`.
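
The resolution order can be sketched roughly as follows. This is illustrative only: the env var names (`LLM_MODEL_*`, `LLM_PROVIDER`, `LLM_FALLBACK_CHAIN`) come from this README, but `resolve_provider` is a hypothetical helper, not QuantPod's actual code.

```python
import os

# Illustrative sketch of the resolution order:
# per-tier env override -> LLM_PROVIDER -> LLM_FALLBACK_CHAIN.
# resolve_provider is a hypothetical helper, not QuantPod's actual code.
def resolve_provider(tier: str) -> str:
    """Return the provider for an agent tier, e.g. tier="ic" or "pod"."""
    # 1. A per-tier override wins, e.g. LLM_MODEL_IC=ollama/qwen3.5:9b
    override = os.getenv(f"LLM_MODEL_{tier.upper()}")
    if override:
        return override.split("/", 1)[0]  # "ollama/qwen3.5:9b" -> "ollama"
    # 2. Otherwise the global default provider
    provider = os.getenv("LLM_PROVIDER")
    if provider:
        return provider
    # 3. Otherwise the first non-empty entry of the comma-separated chain
    for candidate in os.getenv("LLM_FALLBACK_CHAIN", "").split(","):
        if candidate.strip():
            return candidate.strip()
    raise RuntimeError("No LLM provider configured")
```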
248+
### Option A — Local Ollama (recommended, free)
249+
250+
Requires [Ollama](https://ollama.com) and enough RAM for the model.
251+

```bash
# Install Ollama
brew install ollama            # macOS
# or: https://ollama.com/download for Linux/Windows

# Start Ollama
ollama serve

# Pull the model (one model runs all agent tiers)
ollama pull qwen3.5:9b         # 6.6 GB

# Optional: tune for performance (Apple Silicon)
launchctl setenv OLLAMA_KEEP_ALIVE -1      # keep model loaded permanently
launchctl setenv OLLAMA_FLASH_ATTENTION 1  # Metal acceleration
launchctl setenv OLLAMA_NUM_PARALLEL 10    # 10 parallel IC requests
# restart Ollama after setting these
```

Then set in `.env`:

```bash
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
LLM_MODEL_IC=ollama/qwen3.5:9b
LLM_MODEL_POD=ollama/qwen3.5:9b
LLM_MODEL_ASSISTANT=ollama/qwen3.5:9b
LLM_MODEL_DECODER=ollama/qwen3.5:9b
```

Verify setup:

```bash
python scripts/check_ollama_health.py
```
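
If you want a quick standalone probe (separate from the repo script, which may check more), Ollama's `/api/tags` endpoint lists locally pulled models. A minimal sketch, assuming the default base URL from `.env` above:

```python
import json
import urllib.request

# Standalone probe (not the repo's check script): Ollama's /api/tags
# endpoint lists locally pulled models.
def ollama_healthy(base_url: str = "http://localhost:11434") -> bool:
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
            models = json.load(resp).get("models", [])
    except OSError:
        return False  # server not running / unreachable
    # True only if the model from Option A is among the pulled models
    return any(m.get("name", "").startswith("qwen3.5") for m in models)
```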

### Option B — Cloud providers

Set `LLM_PROVIDER` to your preferred provider and supply credentials in `.env`. See `.env.example` for all supported options.

| Provider | Key env var | Model used |
|----------|-------------|------------|
| `bedrock` | AWS credentials (boto3 chain) | Haiku 4.5 (ICs), Sonnet 4.6 (pods) |
| `anthropic` | `ANTHROPIC_API_KEY` | claude-sonnet-4 |
| `openai` | `OPENAI_API_KEY` | gpt-4o |
| `gemini` | `GEMINI_API_KEY` | gemini-2.5-flash |
| `groq` | `GROQ_API_KEY` | llama-3.3-70b (free tier available) |
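
For example, a minimal `.env` for the `anthropic` row (the key value is a placeholder; substitute your own):

```bash
# Cloud provider example: Anthropic (see table above)
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...   # your real key here
```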

**Cost per full crew run** (16 agents):

| Setup | Est. cost |
|-------|-----------|
| Local Ollama | $0.00 |
| Groq (free tier) | $0.00 |
| Bedrock Haiku ICs + Sonnet pods | ~$0.02 |
| OpenAI GPT-4o (all agents) | ~$0.12 |

### Per-agent overrides

Any tier can be overridden independently — mix local and cloud:

```bash
# Example: free local ICs, stronger cloud model for pod synthesis
LLM_MODEL_IC=ollama/qwen3.5:9b
LLM_MODEL_POD=bedrock/us.anthropic.claude-sonnet-4-6
LLM_MODEL_ASSISTANT=bedrock/us.anthropic.claude-sonnet-4-6
```

---

## Module Maturity

| Module | Status | Notes |