Open
Description
Plano already sits in the routing layer, but it doesn't use that position to automatically reduce cost for developers. We should build a first-class GPU free-tier arbitrage policy that routes low-stakes or bursty agent traffic to free or low-cost providers when they are available, falls back deterministically to the primary provider when they are unavailable or overloaded, and gives full trace visibility into every routing decision.
Requirements
- Configurable arbitrage policy at the model_providers level: specify a ranked list of free/low-cost providers with fallback ordering
- Deterministic fallback: when a free-tier provider is unavailable, rate-limited, or errors, Plano falls back to the primary predictably
- All routing decisions surfaced in traces: which provider was selected, why, when it fell back, and what the next selection was
- Reliability guardrails: free-tier providers must not degrade silently; failures must be explicit and logged
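
To make the requirements above concrete, here is a minimal Python sketch of the fallback behavior, not a proposal for Plano's actual implementation. Everything in it is hypothetical: the names `ArbitragePolicy`, `Decision`, `ProviderError`, `AllProvidersFailed`, and `route`, and the `call(provider, prompt)` callback that stands in for a real provider request. It assumes a provider signals "unavailable, rate-limited, or errored" by raising an exception.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


class ProviderError(Exception):
    """Hypothetical: raised when a provider is unavailable, rate-limited, or errors."""


@dataclass
class Decision:
    """One routing decision, intended to be attached to the request trace."""
    provider: str
    tier: str     # "free" or "primary"
    outcome: str  # "selected" or "fallback"
    reason: str


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain failed; carries the full trace."""

    def __init__(self, trace: List[Decision]):
        super().__init__("all providers in the fallback chain failed")
        self.trace = trace


@dataclass
class ArbitragePolicy:
    free_tier: List[str]  # ranked: earlier entries are tried first
    primary: str          # tried last, after every free-tier provider

    def chain(self) -> List[Tuple[str, str]]:
        # The order is fixed by the config, so fallback is deterministic.
        return [(name, "free") for name in self.free_tier] + [(self.primary, "primary")]


def route(
    policy: ArbitragePolicy,
    call: Callable[[str, str], str],
    prompt: str,
) -> Tuple[str, List[Decision]]:
    """Try each provider in order; record every decision, including failures."""
    trace: List[Decision] = []
    for name, tier in policy.chain():
        try:
            response = call(name, prompt)
        except ProviderError as exc:
            # Failures are never swallowed: each one becomes an explicit trace entry.
            trace.append(Decision(name, tier, "fallback", str(exc)))
            continue
        reason = (
            "free-tier available"
            if tier == "free"
            else "primary (no free-tier provider succeeded)"
        )
        trace.append(Decision(name, tier, "selected", reason))
        return response, trace
    raise AllProvidersFailed(trace)
```

The trace is returned alongside the response so the caller can attach it to whatever tracing span the request already has; when every provider fails, the same trace travels on the exception so the failure stays explicit.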
What "done" looks like
A developer can add a minimal config block to enable arbitrage, run a request, and see in the trace: provider selected, reason (free-tier available), fallback chain if applicable.
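
As an illustration of what a developer might see, the sketch below renders a list of routing decisions as a readable fallback chain. The record fields (`provider`, `outcome`, `reason`), the provider names, and the `render_trace` helper are all assumptions made for this example; the real trace schema is still to be designed.

```python
from typing import Dict, List


def render_trace(decisions: List[Dict[str, str]]) -> str:
    """Format routing decisions, in the order they were made, one per line."""
    lines = []
    for step, d in enumerate(decisions, start=1):
        lines.append(f"{step}. {d['provider']} [{d['outcome']}] reason: {d['reason']}")
    return "\n".join(lines)


# Hypothetical decisions for one request that fell back once.
example = [
    {"provider": "free-a", "outcome": "fallback", "reason": "rate limited"},
    {"provider": "free-b", "outcome": "selected", "reason": "free-tier available"},
]

if __name__ == "__main__":
    print(render_trace(example))
```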