feat: top-level routing_preferences with selection_policy and metrics fetch (v0.4.0) #848

@adilhafeez

Adds a new routing configuration format gated to v0.4.0+.

What's changing

New top-level routing_preferences — group multiple candidate models under a named route with a cost/latency-based selection policy:

version: v0.4.0

model_providers:
  - model: openai/gpt-4o
    access_key: $OPENAI_API_KEY
  - model: openai/gpt-4o-mini
    access_key: $OPENAI_API_KEY
    default: true

routing_preferences:
  - name: code understanding
    description: understand and explain existing code snippets
    models:
      - openai/gpt-4o
      - openai/gpt-4o-mini
    selection_policy:
      prefer: cheapest   # cheapest | fastest | random

model_metrics_sources:
  cost:
    url: https://example.com/model-costs   # { "openai/gpt-4o": 0.005, ... }
    refresh_interval: 300
  latency:
    url: https://example.com/model-latency
    refresh_interval: 60
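
The selection policy above can be read as a small function over the two metrics maps. A minimal sketch in Python, assuming each metrics source resolves to a flat mapping from model name to a number (as the comment on the cost URL suggests). The name `select_model` and the handling of models with no metric are hypothetical and not taken from the implementation.

```python
import random
from typing import Mapping, Optional, Sequence


def select_model(
    candidates: Sequence[str],
    prefer: str,
    cost: Mapping[str, float],
    latency: Mapping[str, float],
    rng: Optional[random.Random] = None,
) -> str:
    """Pick one candidate according to `prefer` (cheapest | fastest | random)."""
    if not candidates:
        raise ValueError("route has no candidate models")
    if prefer == "random":
        return (rng or random).choice(list(candidates))
    if prefer == "cheapest":
        metric = cost
    elif prefer == "fastest":
        metric = latency
    else:
        raise ValueError(f"unknown selection policy: {prefer!r}")
    # Assumption: a model with no metric sorts last; ties keep config order,
    # because min() returns the first minimal element it encounters.
    return min(candidates, key=lambda m: metric.get(m, float("inf")))
```

With this reading, an empty or not-yet-fetched metrics map degrades to "first model listed in the route", which may or may not match the actual fallback.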

Inline override: a request body can include routing_preferences to override the configured routes for that single request.
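
One way to picture the inline override is as a per-request resolution step. The sketch below assumes the request body carries a `routing_preferences` list with the same shape as the config, and that a non-empty inline list replaces the configured routes wholesale instead of merging with them. Both points are assumptions, and `effective_routing_preferences` is a hypothetical helper.

```python
from typing import Any, Dict, List

Route = Dict[str, Any]


def effective_routing_preferences(
    config: Dict[str, Any], request_body: Dict[str, Any]
) -> List[Route]:
    """Return the routes to use for this request only."""
    inline = request_body.get("routing_preferences")
    # Assumption: a present, non-empty inline list replaces the configured
    # routes for this request; the config itself is left untouched.
    if inline:
        return list(inline)
    return list(config.get("routing_preferences", []))
```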

Rules

  • v0.4.0 forbids routing_preferences inside model_providers (startup error if present)
  • Models referenced in routing_preferences must be declared in model_providers
  • model_metrics_sources provides live cost/latency data via HTTP GET, refreshed in the background
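
The first two rules amount to a startup validation pass over the parsed config. A minimal sketch, assuming the YAML has already been loaded into a plain dict; `parse_version` and `validate_config` are hypothetical names, and the error wording is illustrative only.

```python
from typing import Any, Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn 'v0.4.0' into (0, 4, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def validate_config(config: Dict[str, Any]) -> List[str]:
    """Return startup errors; an empty list means the config passes."""
    errors: List[str] = []
    # The new format is gated to v0.4.0+; older configs are not checked here.
    if parse_version(config.get("version", "v0.0.0")) < (0, 4, 0):
        return errors
    providers = config.get("model_providers", [])
    declared = {p.get("model") for p in providers}
    # Rule 1: routing_preferences may not appear inside a model provider.
    for provider in providers:
        if "routing_preferences" in provider:
            errors.append(
                f"{provider.get('model')}: routing_preferences is not allowed "
                "inside model_providers in v0.4.0+"
            )
    # Rule 2: every model named by a route must be a declared provider.
    for route in config.get("routing_preferences", []):
        for model in route.get("models", []):
            if model not in declared:
                errors.append(
                    f"route {route.get('name')!r}: model {model!r} "
                    "is not declared in model_providers"
                )
    return errors
```

Returning a list of messages, not raising on the first problem, lets a startup check report every violation at once.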
