Multi-LLM orchestration pipeline for Generative Engine Optimization (GEO) content production. It receives a natural-language demand, decomposes it into atomic tasks via Claude Sonnet 4.5, routes each task to the most appropriate LLM (8 models across 5 providers) using complexity-aware tier routing, an 80% provider concentration cap, and adaptive scoring, then executes waves in parallel with caching, checkpoints, quality gates, FinOps governance, and WhatsApp/email alerts on budget thresholds.
12,500+ lines | 1,189 calls tracked | 8 models / 5 providers | unified tracking via geo-finops
Updated 2026-04-07 — Migrated from single-model-per-task-type (96.7% cost concentration in Opus 4) to tier routing by complexity (Haiku 4.5 → Sonnet 4.5 → Opus 4.6). Added Kimi K2 + Qwen 3 32B in Groq, sonar-deep-research in Perplexity, Gemini 2.5 Pro for analysis. Projected savings: 20-40% per execution. Full audit: docs/AUDIT_2026-04-07.md.
Unified FinOps tracking — All calls (this orchestrator + papers + curso-factory + caramaschi + landing-page-geo probes) now flow into a single local SQLite database with nightly Supabase sync. Live dashboard at https://alexandrecaramaschi.com/finops. See the standalone `geo-finops` repository (initial release v1.1.0).
```
Demand --> Orchestrator (Claude decomposes) --> Router (adaptive scoring)
                        |
        +---------------+---------------+
        |               |               |
  Wave 1 (parallel)  Wave 2 (parallel)  Wave 3
  +--+--+--+         +--+--+           +--+
  |P |G |O |         |C |G |           |C |
  +--+--+--+         +--+--+           +--+
                        |
                        v
              Consolidated result
     (report + Gantt + cost breakdown)

  P=Perplexity  G=Gemini  O=OpenAI  C=Claude  Q=Groq
```
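The wave scheme in the diagram can be sketched as follows. This is an illustrative reduction, not the orchestrator's actual code: tasks within a wave run concurrently, waves run sequentially so later waves can consume earlier results as context, and `run_task` stands in for a real LLM API call.

```python
import asyncio

async def run_task(task: str) -> str:
    await asyncio.sleep(0)  # placeholder for an LLM API round-trip
    return f"result:{task}"

async def run_waves(waves: list[list[str]]) -> list[list[str]]:
    results = []
    for wave in waves:
        # every task in the current wave is dispatched in parallel;
        # gather preserves task order within the wave
        results.append(list(await asyncio.gather(*(run_task(t) for t in wave))))
    return results

out = asyncio.run(run_waves([["research", "analysis"], ["writing"]]))
```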
| Provider | Model | Tier / Role | Cost/1M tokens (in/out) |
|---|---|---|---|
| Anthropic | claude-opus-4-6 | premium · architecture/code complexity 4-5 | $15.00 / $75.00 |
| Anthropic | claude-sonnet-4-5 | balanced · default for code/review complexity 3 | $3.00 / $15.00 |
| Anthropic | claude-haiku-4-5 | economy · classification/summarization complexity 1-2 | $0.80 / $4.00 |
| OpenAI | gpt-4o | writing, copywriting, SEO | $2.50 / $10.00 |
| Google | gemini-2.5-pro | analysis, data_processing (always Pro, never Flash) | $1.25 / $5.00 |
| Perplexity | sonar-pro | standard research with live sources | $3.00 / $15.00 |
| Perplexity | sonar-deep-research | multi-step research for complexity 4-5 (deep reasoning) | $2.00 / $8.00 |
| Groq | llama-3.3-70b-versatile | ultra-fast (~10x), Groq default for tier 1-2 | $0.59 / $0.79 |
| Groq | moonshotai/kimi-k2-instruct | Kimi K2 1T params, agentic reasoning, complexity 4-5 | $1.00 / $3.00 |
| Groq | qwen/qwen3-32b | multilingual, primary for translation | $0.29 / $0.59 |
| Complexity | Claude family | Groq family | Perplexity family |
|---|---|---|---|
| 1-2 (low) | claude-haiku-4-5 | llama-3.3-70b-versatile | sonar-pro |
| 3 (medium) | claude-sonnet-4-5 | qwen/qwen3-32b | sonar-pro |
| 4-5 (high) | claude-opus-4-6 | moonshotai/kimi-k2-instruct | sonar-deep-research |
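The tier table above reduces to a lookup from complexity to model per family. A minimal sketch (the real router also weighs provider caps and adaptive scores; `pick_model` is illustrative, not the actual API):

```python
# Complexity tiers per provider family, transcribed from the table above.
TIERS = {
    "anthropic": {1: "claude-haiku-4-5", 2: "claude-haiku-4-5",
                  3: "claude-sonnet-4-5",
                  4: "claude-opus-4-6", 5: "claude-opus-4-6"},
    "groq": {1: "llama-3.3-70b-versatile", 2: "llama-3.3-70b-versatile",
             3: "qwen/qwen3-32b",
             4: "moonshotai/kimi-k2-instruct", 5: "moonshotai/kimi-k2-instruct"},
    "perplexity": {1: "sonar-pro", 2: "sonar-pro", 3: "sonar-pro",
                   4: "sonar-deep-research", 5: "sonar-deep-research"},
}

def pick_model(provider: str, complexity: int) -> str:
    return TIERS[provider][complexity]
```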
After 5+ tasks executed in a session, if any provider exceeds its CAP_*_SHARE environment variable (default 0.80), the router rebalances to the first viable alternative from a different provider. Configurable per provider via CAP_ANTHROPIC_SHARE, CAP_OPENAI_SHARE, etc.
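The concentration cap can be sketched as below. The function names and the rebalance policy details are assumptions based on the description above, not the repository's code:

```python
from collections import Counter

CAP_DEFAULT = 0.80  # default for CAP_*_SHARE
MIN_TASKS = 5       # cap only enforced after 5+ tasks in the session

def over_cap(history: list[str], provider: str, cap: float = CAP_DEFAULT) -> bool:
    """True when `provider`'s share of routed tasks exceeds the cap."""
    if len(history) < MIN_TASKS:
        return False
    return Counter(history)[provider] / len(history) > cap

def route(history: list[str], preferred: str, alternatives: list[str]) -> str:
    # Rebalance to the first viable alternative from a different provider.
    if over_cap(history, preferred):
        for alt in alternatives:
            if alt != preferred:
                return alt
    return preferred
```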
| Stage | LLM | Function |
|---|---|---|
| Research | Perplexity | Gathers live data with citations |
| Writing | GPT-4o | Produces final long-form content |
| Analysis | Gemini 2.5 Pro | Analyzes and structures data |
| Classification | Groq/Llama-3.3-70B | Fast categorization and tagging |
| Review | Claude Opus 4.6 | Quality check and final revision |
| Type | Primary LLM | Fallback |
|---|---|---|
| research | Perplexity | Gemini |
| analysis | Gemini | Claude |
| writing | GPT-4o | Claude |
| copywriting | GPT-4o | Claude |
| code | Claude | GPT-4o |
| review | Claude | GPT-4o |
| seo | GPT-4o | Perplexity |
| data_processing | Gemini | GPT-4o |
| fact_check | Perplexity | Gemini |
| classification | Groq | Gemini |
| translation | GPT-4o | Gemini |
| summarization | Gemini | GPT-4o |
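The primary/fallback mapping amounts to a table lookup with a retry switch. A sketch (names are illustrative; only a few rows are transcribed):

```python
# Task type -> (primary, fallback), from the routing table above.
# On a failed quality gate or provider error, the pipeline retries
# the task on the fallback LLM.
ROUTES = {
    "research": ("Perplexity", "Gemini"),
    "analysis": ("Gemini", "Claude"),
    "writing": ("GPT-4o", "Claude"),
    "code": ("Claude", "GPT-4o"),
    "classification": ("Groq", "Gemini"),
    # ... remaining types follow the same pattern
}

def llm_for(task_type: str, primary_failed: bool = False) -> str:
    primary, fallback = ROUTES[task_type]
    return fallback if primary_failed else primary
```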
```shell
# Full pipeline
python cli.py run "Write a complete study on GEO vs traditional SEO"

# View plan without executing
python cli.py plan "Research competitors and write report"

# LLM status
python cli.py status

# List configured models and pricing
python cli.py models

# Cost report
python cli.py cost-report

# FinOps
python cli.py finops status          # Current limit state
python cli.py finops reset           # Reset daily counters
python cli.py finops report          # Detailed cost report

# Tracing
python cli.py trace list             # List recent traces
python cli.py trace show <id>        # Trace details
python cli.py trace last             # Last trace

# Useful flags
python cli.py run "demand" --dry-run           # Show plan without executing
python cli.py run "demand" --verbose           # Detailed progress output
python cli.py run "demand" --output-dir ./out  # Custom output directory
python cli.py run "demand" --force             # Override budget guard
```

A bridge script enables direct integration from Claude Code sessions:
```shell
bash C:/Sandyboxclaude/scripts/bin/geo-bridge.sh "your demand here"
```

The bridge script automatically:
- Loads API keys from `geo-orchestrator/.env`
- Changes to the orchestrator directory
- Executes `python cli.py run "$@" --verbose`
- Displays a summary of the LLMs used and their costs
The orchestrator integrates with curso-factory for automated course generation:
- Perplexity researches the topic and market context
- GPT-4o drafts module content and scripts
- Gemini structures the curriculum and learning objectives
- Claude reviews quality and pedagogical consistency
- Output feeds directly into the curso-factory Jinja2 templates
The alexandrecaramaschi.com platform currently hosts 35 courses and 387 modules, with 122K+ lines produced through this pipeline.
| Round | Focus | Key Deliverables |
|---|---|---|
| Round 1 | Foundation | Orchestrator, pipeline, adaptive router, 4 agents, CLI, SHA-256 cache, checkpoints, quality gates, budget guard |
| Round 2 | Resilience & observability | FinOps with daily limits, token bucket rate limiter, tracing with spans, connection pool, cost tracker, context pipeline, feedback loop |
| Round 3 | Advanced intelligence | Circuit breaker, metrics dashboard, token budget allocator, agent memory, session load balancer, task re-prioritization, complexity scoring |
| Round 4 | Web showcase | Showcase page generated by 5 LLMs, automatic deploy |
| Round 5 | Security hardening | API keys via headers (not URL params), git filter-repo to clean history, .gitignore on all repos, key rotation |
| Round 6 | MARL analysis | Analysis based on Foerster/Jaques/Albrecht. 12 tasks, 7 groups, 5 LLMs, $2.68. Proposals: inter-agent communication, collaborative feedback, adaptive balancer |
- Result Cache: SHA-256 with 24h TTL. Identical tasks are not re-executed.
- Checkpoints: State saved per wave. Resume without re-executing completed tasks.
- Quality Gates: Automatic validation per task type. Failure triggers retry on fallback.
- Budget Guard: Pre-execution cost estimate. Blocks execution when the estimate exceeds the limit; alerts when actual cost exceeds 2x the estimate.
- Adaptive Router: Weighted score — success (60%), cost (20%), latency (20%).
- Deduplication: Cosine similarity > 0.7 merges tasks automatically.
- Context Optimization: Long outputs summarized via Gemini before injecting as context.
- Rate Limiter: Token bucket per provider with burst and stagger for Gemini.
- FinOps: Daily limits per provider, cost reports, history in JSONL.
- Tracing: Spans per task with timeline, duration, and metadata.
- Connection Pool: HTTP connection reuse per provider.
- Feedback Loop: Quality gate results adjust router scores.
- Circuit Breaker: Protection against offline providers. Opens circuit after consecutive failures, tries half-open periodically.
- Dashboard: Consolidated usage, cost, and performance metrics per provider.
- Token Budget Allocator: Intelligent distribution of token budget between tasks based on complexity.
- Agent Memory: Agents maintain context between executions to progressively improve quality.
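The adaptive router's weighted score from the list above (success 60%, cost 20%, latency 20%) can be written out directly. Normalization is an assumption: each component is taken as scaled to [0, 1] with higher meaning better (1.0 = cheapest / fastest observed), which the repository may implement differently.

```python
def route_score(success_rate: float, cost_norm: float, latency_norm: float) -> float:
    """Weighted routing score: success 60%, cost 20%, latency 20%."""
    return 0.6 * success_rate + 0.2 * cost_norm + 0.2 * latency_norm

# A cheap, fast model with a modest success rate still scores well:
score = route_score(success_rate=0.9, cost_norm=1.0, latency_norm=0.8)
```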
| Provider | Daily Limit (USD) | Env Variable |
|---|---|---|
| Anthropic | $2.00 | FINOPS_LIMIT_ANTHROPIC |
| OpenAI | $2.00 | FINOPS_LIMIT_OPENAI |
| Google | $1.00 | FINOPS_LIMIT_GOOGLE |
| Perplexity | $1.00 | FINOPS_LIMIT_PERPLEXITY |
| Groq | $1.00 | FINOPS_LIMIT_GROQ |
| Global | $5.00 | FINOPS_LIMIT_GLOBAL |
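A minimal sketch of the daily-limit check implied by the table above. The environment variable names and defaults come from the table; the accounting shape (`spent_today` dict, per-call estimate) is an assumption:

```python
import os

DEFAULTS = {"anthropic": 2.00, "openai": 2.00, "google": 1.00,
            "perplexity": 1.00, "groq": 1.00}
GLOBAL_DEFAULT = 5.00

def allowed(spent_today: dict[str, float], provider: str, estimate: float) -> bool:
    """True when the estimated call stays under both the provider and global daily limits."""
    limit = float(os.environ.get(f"FINOPS_LIMIT_{provider.upper()}",
                                 DEFAULTS[provider]))
    global_limit = float(os.environ.get("FINOPS_LIMIT_GLOBAL", GLOBAL_DEFAULT))
    if spent_today.get(provider, 0.0) + estimate > limit:
        return False
    return sum(spent_today.values()) + estimate <= global_limit
```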
| Demand Type | Tasks | Estimated Cost |
|---|---|---|
| Simple research | 2-3 | $0.01-0.05 |
| Article with research | 4-5 | $0.05-0.15 |
| Complete study | 6-8 | $0.10-0.50 |
| Site with content | 7-10 | $0.50-1.50 |
```shell
git clone https://github.com/alexandrebrt14-sys/geo-orchestrator.git
cd geo-orchestrator
pip install -e .
cp .env.example .env
# Edit .env with your API keys
```

| Variable | Provider |
|---|---|
| ANTHROPIC_API_KEY | Anthropic (Claude) |
| OPENAI_API_KEY | OpenAI (GPT-4o) |
| PERPLEXITY_API_KEY | Perplexity (Sonar) |
| GOOGLE_AI_API_KEY | Google (Gemini) |
| GROQ_API_KEY | Groq (Llama 3.3 70B) |
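A quick pre-flight sanity check that all five keys from the table above are set before running the pipeline (the variable names come from the table; the snippet itself is not part of the repository):

```python
import os

REQUIRED = ["ANTHROPIC_API_KEY", "OPENAI_API_KEY", "PERPLEXITY_API_KEY",
            "GOOGLE_AI_API_KEY", "GROQ_API_KEY"]

# Any key that is unset or empty ends up in `missing`.
missing = [k for k in REQUIRED if not os.environ.get(k)]
if missing:
    print("Missing API keys:", ", ".join(missing))
```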
```
geo-orchestrator/
  cli.py                 # Main CLI (Click) — entry point
  pyproject.toml         # Project configuration and dependencies
  .env.example           # Environment variable template
  src/
    config.py            # LLM configs, task routing, FinOps limits
    models.py            # Pydantic models (Task, Plan, TaskResult, ExecutionReport)
    orchestrator.py      # Core: decompose, deduplicate, cache, budget guard, report
    pipeline.py          # Execution engine: waves, checkpoints, quality gates, fallback
    router.py            # Adaptive router: scoring, fallback, session load balancer
    llm_client.py        # Unified HTTP client for 5 providers (retry, backoff)
    rate_limiter.py      # Token bucket per provider (RPM limits, burst, stagger)
    cost_tracker.py      # Cost tracking per task and per LLM
    finops.py            # FinOps engine: daily limits, alerts, reports
    tracer.py            # Tracing with spans: timeline and observability
    connection_pool.py   # HTTP connection pool per provider
    circuit_breaker.py   # Circuit breaker per provider: CLOSED/OPEN/HALF_OPEN
    agents/
      researcher.py      # Perplexity agent (research with citations)
      writer.py          # GPT-4o agent (writing, copy, SEO)
      architect.py       # Claude agent (code, architecture, review)
      analyzer.py        # Gemini agent (analysis, classification, batch)
      groq_agent.py      # Groq Llama 3.3 70B agent (speed, rapid drafts)
  scripts/
    run_5llm_board.py    # 5-LLM board: collaborative audit and improvement
    implement_improvements.py
    round3_deep_improvements.py
  docs/
    MANUAL.md            # Complete technical manual
    ARCHITECTURE.md      # Detailed technical architecture
  output/                # Execution reports, cache, checkpoints
```
- API keys never in URLs — Google API key sent via the `x-goog-api-key` header
- All secrets via environment variables (`.env` in `.gitignore`)
- `output/` directory excluded from git (contains logs with sensitive data)
- Git history cleaned with `git filter-repo` after a GitGuardian incident
- Audit module: `papers/src/finops/secrets.py` with leak scanning
- GitHub: https://github.com/alexandrebrt14-sys/geo-orchestrator
- Owner: Alexandre Caramaschi — CEO of Brasil GEO, former CMO at Semantix (Nasdaq), co-founder of AI Brasil
| Property | Stack | Status |
|---|---|---|
| alexandrecaramaschi.com | Next.js 16 + React 19 + Supabase | Production — 35 courses, 25 insights, 122K+ lines |
| brasilgeo.ai | Cloudflare Workers | Production — 14 articles |
| geo-orchestrator | Python + 5 LLMs | Active — multi-LLM pipeline |
| curso-factory | Python + Jinja2 | Active — course generation pipeline |
| geo-checklist | Markdown | Open-source — GEO audit checklist |
| llms-txt-templates | Markdown + JSON | Open-source — llms.txt standard |
| geo-taxonomy | JSON + CSV + Markdown | Open-source — 60+ GEO terms |
| entity-consistency-playbook | Markdown | Open-source — entity consistency |
| papers | Python + Supabase | Research — LLM citation study |