Description
Adds a new routing configuration format gated to v0.4.0+.
What's changing
New top-level routing_preferences, which groups multiple candidate models under a named route with a cost/latency-based selection policy:
version: v0.4.0
model_providers:
  - model: openai/gpt-4o
    access_key: $OPENAI_API_KEY
  - model: openai/gpt-4o-mini
    access_key: $OPENAI_API_KEY
    default: true
routing_preferences:
  - name: code understanding
    description: understand and explain existing code snippets
    models:
      - openai/gpt-4o
      - openai/gpt-4o-mini
    selection_policy:
      prefer: cheapest # cheapest | fastest | random
model_metrics_sources:
  cost:
    url: https://example.com/model-costs # { "openai/gpt-4o": 0.005, ... }
    refresh_interval: 300
  latency:
    url: https://example.com/model-latency
    refresh_interval: 60

Inline override: the request body can include routing_preferences to override the config for a single request.
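To make the selection policy and the inline override concrete, here is a minimal Python sketch. It is not the project's implementation. The function names `effective_preferences` and `select_model` are hypothetical, and several behaviors are assumptions not stated in this description: that an inline routing_preferences list replaces the configured list wholesale, that the latency source returns the same `{ model: number }` shape as the cost source, and that a model with no metric is ranked last. The 0.0006 cost is an illustrative value only.

```python
import random


def effective_preferences(config_prefs, request_body):
    """Return the routing preferences that apply to one request.

    Assumption: an inline list in the request body replaces the
    configured list entirely for that request.
    """
    inline = request_body.get("routing_preferences")
    return config_prefs if inline is None else inline


def select_model(route, cost, latency, rng=random):
    """Pick one model from a route according to selection_policy.prefer."""
    models = route["models"]
    prefer = route["selection_policy"]["prefer"]
    if prefer == "random":
        return rng.choice(models)
    if prefer == "cheapest":
        metric = cost
    elif prefer == "fastest":
        metric = latency
    else:
        raise ValueError(f"unknown selection policy: {prefer!r}")
    # Assumption: models without a metric sort last; ties keep config order.
    return min(models, key=lambda m: metric.get(m, float("inf")))


route = {
    "name": "code understanding",
    "models": ["openai/gpt-4o", "openai/gpt-4o-mini"],
    "selection_policy": {"prefer": "cheapest"},
}
cost = {"openai/gpt-4o": 0.005, "openai/gpt-4o-mini": 0.0006}  # illustrative
chosen = select_model(route, cost, latency={})
print(chosen)  # openai/gpt-4o-mini
```

Whether an inline override should replace the whole list or merge with configured routes by name is a design decision this sketch does not settle.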
Rules
- v0.4.0 forbids routing_preferences inside model_providers (startup error if present)
- Models referenced in routing_preferences must be declared in model_providers
- model_metrics_sources provides live cost/latency data via HTTP GET, refreshed in the background
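The two startup checks in the rules above could look roughly like the following Python sketch. `validate_config` is a hypothetical name, the input is assumed to be the already-parsed config as a dict, and the sketch assumes the config declares v0.4.0 or later. The error messages are illustrative, not the project's actual output.

```python
def validate_config(config):
    """Raise ValueError if a v0.4.0+ config breaks the routing rules."""
    providers = config.get("model_providers", [])

    # Rule 1: routing_preferences may not be nested inside a provider entry.
    for provider in providers:
        if "routing_preferences" in provider:
            raise ValueError(
                f"routing_preferences is not allowed inside model_providers "
                f"(found on {provider.get('model')!r})"
            )

    # Rule 2: every model named by a route must be a declared provider.
    declared = {provider["model"] for provider in providers}
    for route in config.get("routing_preferences", []):
        missing = [m for m in route["models"] if m not in declared]
        if missing:
            raise ValueError(
                f"route {route['name']!r} references undeclared models: {missing}"
            )


config = {
    "version": "v0.4.0",
    "model_providers": [
        {"model": "openai/gpt-4o"},
        {"model": "openai/gpt-4o-mini", "default": True},
    ],
    "routing_preferences": [
        {
            "name": "code understanding",
            "models": ["openai/gpt-4o", "openai/gpt-4o-mini"],
            "selection_policy": {"prefer": "cheapest"},
        }
    ],
}
validate_config(config)  # passes silently for a valid config
```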