feat: top-level routing_preferences with selection_policy and metrics fetch (v0.4.0)

Adds a new routing configuration format, available only when the config declares `version: v0.4.0` or later.

## What's changing

**New top-level `routing_preferences`** — group multiple candidate models under a named route with a cost/latency-based selection policy:

```yaml
version: v0.4.0

model_providers:
  - model: openai/gpt-4o
    access_key: $OPENAI_API_KEY
  - model: openai/gpt-4o-mini
    access_key: $OPENAI_API_KEY
    default: true

routing_preferences:
  - name: code understanding
    description: understand and explain existing code snippets
    models:
      - openai/gpt-4o
      - openai/gpt-4o-mini
    selection_policy:
      prefer: cheapest   # cheapest | fastest | random

model_metrics_sources:
  cost:
    url: https://example.com/model-costs   # { "openai/gpt-4o": 0.005, ... }
    refresh_interval: 300
  latency:
    url: https://example.com/model-latency
    refresh_interval: 60
```
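To make the `selection_policy` semantics concrete, here is a minimal Python sketch of how a route might pick a model from its candidates. It is an illustration, not the actual implementation: the function name `select_model`, the shape of the latency payload, and the handling of models with no metric (sorted last, ties broken by config order) are all assumptions that the description above does not specify.

```python
import random
from typing import Mapping, Optional, Sequence


def select_model(
    candidates: Sequence[str],
    prefer: str,
    cost: Mapping[str, float],
    latency: Mapping[str, float],
    rng: Optional[random.Random] = None,
) -> str:
    """Pick one model from a route's candidates according to `prefer`."""
    if not candidates:
        raise ValueError("route has no candidate models")
    if prefer == "random":
        return (rng or random).choice(list(candidates))
    if prefer == "cheapest":
        metric = cost
    elif prefer == "fastest":
        metric = latency
    else:
        raise ValueError(f"unknown selection policy: {prefer!r}")
    # Assumption: a model with no metric sorts last. min() returns the first
    # minimal element, so ties fall back to the order listed in the config.
    return min(candidates, key=lambda m: metric.get(m, float("inf")))


# Hypothetical metric values, shaped like the cost payload in the YAML comment.
cost = {"openai/gpt-4o": 0.005, "openai/gpt-4o-mini": 0.0006}
latency = {"openai/gpt-4o": 0.8, "openai/gpt-4o-mini": 1.2}
models = ["openai/gpt-4o", "openai/gpt-4o-mini"]

cheapest = select_model(models, "cheapest", cost, latency)
fastest = select_model(models, "fastest", cost, latency)
```

With these made-up numbers, `cheapest` resolves to `openai/gpt-4o-mini` and `fastest` to `openai/gpt-4o`. Whether the real router falls back to the `default: true` provider when metrics are missing is not stated here.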

**Inline override** — request body can include `routing_preferences` to override config for a single request.
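A rough sketch of the override lookup, assuming the request body carries a `routing_preferences` field with the same shape as the config entry, and assuming the override replaces the configured routes wholesale rather than merging with them. Neither assumption is confirmed by this description, and `effective_routes` is a hypothetical helper name.

```python
from typing import Any, Dict, List

Route = Dict[str, Any]


def effective_routes(config: Dict[str, Any], request_body: Dict[str, Any]) -> List[Route]:
    """Return the routes that apply to a single request."""
    override = request_body.get("routing_preferences")
    if override is not None:
        # Assumption: an inline override fully replaces the configured routes.
        return override
    return config.get("routing_preferences", [])


config = {
    "routing_preferences": [
        {
            "name": "code understanding",
            "models": ["openai/gpt-4o", "openai/gpt-4o-mini"],
            "selection_policy": {"prefer": "cheapest"},
        }
    ]
}

# A request that overrides the policy for this call only.
request_body = {
    "routing_preferences": [
        {
            "name": "code understanding",
            "models": ["openai/gpt-4o", "openai/gpt-4o-mini"],
            "selection_policy": {"prefer": "fastest"},
        }
    ]
}

overridden = effective_routes(config, request_body)
unchanged = effective_routes(config, {})
```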

## Rules
- `v0.4.0` forbids `routing_preferences` inside `model_providers` (startup error if present)
- Models referenced in `routing_preferences` must be declared in `model_providers`
- `model_metrics_sources` provides live cost/latency data via HTTP GET, refreshed in the background
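
The first two rules can be sketched as a startup check. This is a minimal illustration that operates on the parsed config as a plain dict; the function names, the error wording, and the choice to apply the rules to every version at or above `v0.4.0` are assumptions.

```python
from typing import Any, Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a string such as 'v0.4.0' into a comparable tuple (0, 4, 0)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def validate_routing(config: Dict[str, Any]) -> List[str]:
    """Return a list of rule violations; an empty list means the config passes."""
    errors: List[str] = []
    if parse_version(config.get("version", "v0.0.0")) < (0, 4, 0):
        return errors  # older formats are out of scope for this sketch

    providers = config.get("model_providers", [])
    declared = {p["model"] for p in providers}

    # Rule 1: routing_preferences must not appear inside model_providers.
    for p in providers:
        if "routing_preferences" in p:
            errors.append(
                f"{p['model']}: routing_preferences is not allowed inside model_providers"
            )

    # Rule 2: every model referenced by a route must be a declared provider.
    for route in config.get("routing_preferences", []):
        for model in route.get("models", []):
            if model not in declared:
                errors.append(
                    f"route {route['name']!r}: model {model!r} is not declared in model_providers"
                )
    return errors


bad_config = {
    "version": "v0.4.0",
    "model_providers": [
        {"model": "openai/gpt-4o", "routing_preferences": []},
    ],
    "routing_preferences": [
        {"name": "code understanding", "models": ["openai/gpt-4o", "openai/gpt-4o-mini"]},
    ],
}
problems = validate_routing(bad_config)
```

Here `problems` holds two entries: one for the nested `routing_preferences` and one for the undeclared `openai/gpt-4o-mini`.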

feat: top-level routing_preferences with selection_policy and metrics fetch (v0.4.0) #848
