Support loading routing policy from external HTTP endpoint #812

@adilhafeez

Summary

Add support for fetching routing policies from an external HTTP endpoint. Plano makes an HTTP call to a configured URL, receives routing preferences as JSON, and caches them locally with a configurable TTL. This keeps Plano's policy integration generic — the external service can be backed by anything (database, config service, API gateway).

In a multitenant deployment, the caller includes a policy_id and revision in the routing request payload. Plano uses these when fetching and caching the policy — enabling per-tenant/per-customer routing policies with revision-aware caching.

Configuration

routing:
  policy_provider:
    url: "https://my-service.internal/v1/routing-policy"
    headers:
      Authorization: "Bearer $POLICY_API_KEY"
    ttl_seconds: 300
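
The `$POLICY_API_KEY` placeholder suggests the header value is filled in from the environment. The issue does not say how substitution works, so the following is only a sketch under that assumption; `expand_headers` is a hypothetical helper, not part of Plano.

```python
import os
from string import Template


def expand_headers(headers, env=None):
    """Substitute $NAME placeholders in header values (assumed behavior)."""
    env = os.environ if env is None else env
    # Template.substitute raises KeyError when a referenced variable is unset,
    # so a missing secret is reported when the config is loaded.
    return {name: Template(value).substitute(env) for name, value in headers.items()}


# Example with an explicit mapping instead of the real environment.
headers = expand_headers(
    {"Authorization": "Bearer $POLICY_API_KEY"},
    env={"POLICY_API_KEY": "example-token"},
)
```

Failing on an unset variable is one possible choice; leaving the placeholder untouched would be another.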

Routing request

{
  "messages": [...],
  "policy_id": "customer-abc-123",
  "revision": 42
}

revision is a monotonically increasing integer. When the caller sends a higher revision than what's cached, Plano fetches the updated policy.

When policy_id is present and no inline routing_policy is provided, Plano fetches the policy from the configured endpoint:

GET https://my-service.internal/v1/routing-policy?policy_id=customer-abc-123&revision=42
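
As an illustration of how the fetch URL above could be assembled from the request fields, here is a minimal sketch. `build_policy_url` is a hypothetical name, and dropping the `revision` parameter when the caller omits it is an assumption; the issue only shows the case where both are present.

```python
from urllib.parse import urlencode


def build_policy_url(base_url, policy_id, revision=None):
    """Build the GET URL for the external policy endpoint."""
    params = {"policy_id": policy_id}
    if revision is not None:
        # Assumption: the revision parameter is sent only when the caller supplied one.
        params["revision"] = revision
    return f"{base_url}?{urlencode(params)}"


url = build_policy_url(
    "https://my-service.internal/v1/routing-policy", "customer-abc-123", 42
)
```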

Routing response (returned to the caller)

{
  "model": "gpt-4o",
  "route": "quick",
  "trace_id": "abc123..."
}

Expected payload from external policy endpoint

{
  "policy_id": "customer-abc-123",
  "revision": 42,
  "schema_version": "v1",
  "routing_preferences": [
    {
      "model": "gpt-4o",
      "routing_preferences": [
        {"name": "quick response", "description": "fast lightweight responses"}
      ]
    },
    {
      "model": "claude-sonnet",
      "routing_preferences": [
        {"name": "deep analysis", "description": "comprehensive detailed analysis"}
      ]
    }
  ]
}

  • policy_id: identifies the policy (e.g. per-customer)
  • revision: monotonically increasing integer identifying which revision of the policy this document represents
  • schema_version: the format of the policy document itself (e.g. "v1"). Plano validates this and rejects unsupported versions. The first implementation supports v1 only.
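
The checks described for this payload could look roughly like the sketch below. The function and exception names are hypothetical, and the exact error handling is an assumption; only the three fields and the "v1 only" rule come from the issue.

```python
SUPPORTED_SCHEMA_VERSIONS = {"v1"}  # per the issue, the first implementation supports v1 only


class PolicyValidationError(ValueError):
    """Raised when a fetched policy document is not usable (hypothetical type)."""


def validate_policy(doc, policy_id, revision=None):
    """Return doc if it passes the checks, otherwise raise PolicyValidationError."""
    version = doc.get("schema_version")
    if version not in SUPPORTED_SCHEMA_VERSIONS:
        raise PolicyValidationError(f"unsupported schema_version: {version!r}")
    if doc.get("policy_id") != policy_id:
        raise PolicyValidationError("policy_id does not match the request")
    if revision is not None and doc.get("revision") != revision:
        raise PolicyValidationError("revision does not match the request")
    if not isinstance(doc.get("routing_preferences"), list):
        raise PolicyValidationError("routing_preferences must be a list")
    return doc
```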

Caching behavior

  • Cache key: policy_id. Each entry stores the revision it was fetched at.
  • On request: if the cached revision >= the requested revision, use the cache. If the requested revision > the cached revision, fetch a fresh policy.
  • Cache entries also expire via TTL (ttl_seconds) as a safety net.
  • If revision is omitted, the lookup uses policy_id alone and TTL is the only invalidation.
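
A minimal in-memory sketch of these rules follows. `PolicyCache` is a hypothetical name, and the injectable `clock` exists only so the TTL can be exercised in tests. Treating an entry stored without a revision as stale whenever a revision is requested is an assumption the issue does not cover.

```python
import time


class PolicyCache:
    """Revision-aware cache keyed by policy_id, with TTL as a safety net."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # policy_id -> (revision, policy, stored_at)

    def get(self, policy_id, requested_revision=None):
        """Return the cached policy, or None when a fresh fetch is needed."""
        entry = self._entries.get(policy_id)
        if entry is None:
            return None
        revision, policy, stored_at = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[policy_id]  # expired via TTL
            return None
        if requested_revision is not None:
            # Hit only if the cached revision is at least the requested one.
            if revision is None or requested_revision > revision:
                return None
        return policy

    def put(self, policy_id, revision, policy):
        self._entries[policy_id] = (revision, policy, self._clock())
```

With this shape, a request that omits `revision` is served from the cache until the TTL lapses, matching the last bullet above.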

Flow

  1. Routing request comes in with policy_id and revision (and no inline policy)
  2. Plano checks local cache for policy_id
  3. If cached and cached revision >= requested revision, use cached policy
  4. Otherwise, make HTTP request to configured URL with policy_id and revision
  5. Validate response: policy_id and revision match the request, and schema_version is supported
  6. Cache result and pass policy to RouterService::determine_route()
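
Steps 2 through 6 can be sketched as a single function. This is an illustration only: `resolve_policy` is a hypothetical name, `fetch` stands in for the HTTP call, the cache is a plain dict with TTL handling left out for brevity, and the hand-off to `RouterService::determine_route()` is represented by the return value.

```python
def resolve_policy(policy_id, revision, cache, fetch):
    """Return the policy document for policy_id at (at least) the given revision.

    cache: dict mapping policy_id -> last validated policy document.
    fetch: callable(policy_id, revision) -> dict, standing in for the HTTP GET.
    """
    cached = cache.get(policy_id)
    if cached is not None and cached["revision"] >= revision:
        return cached  # step 3: cached revision is new enough

    doc = fetch(policy_id, revision)  # step 4

    # Step 5: the response must match the request and use a supported schema.
    if doc.get("policy_id") != policy_id or doc.get("revision") != revision:
        raise ValueError("policy response does not match the request")
    if doc.get("schema_version") != "v1":
        raise ValueError("unsupported schema_version")

    cache[policy_id] = doc  # step 6: cache, then hand the policy to the router
    return doc
```

Passing `fetch` in as a parameter keeps the sketch independent of any HTTP client and makes the cache-hit path easy to verify.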

Resolution order

  1. Inline routing_policy in request payload (highest priority)
  2. policy_id + revision → HTTP policy provider (with cache)
  3. Config-file preferences (default fallback)
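
The precedence above could be expressed as a small selector, shown here as a sketch. The function name and the returned labels are hypothetical. Because the caching section allows `revision` to be omitted, this sketch treats `policy_id` alone as enough to select the provider, which is an interpretation rather than something the issue states outright.

```python
def select_policy_source(request, provider_configured=True):
    """Return which source supplies the routing policy for this request."""
    if request.get("routing_policy") is not None:
        return "inline"  # 1. inline policy in the payload wins
    if provider_configured and request.get("policy_id") is not None:
        return "provider"  # 2. external HTTP policy provider (with cache)
    return "config"  # 3. preferences from the config file
```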
