Description
Summary
Add support for fetching routing policies from an external HTTP endpoint. Plano makes an HTTP call to a configured URL, receives routing preferences as JSON, and caches them locally with a configurable TTL. This keeps Plano's policy integration generic — the external service can be backed by anything (database, config service, API gateway).
In a multitenant deployment, the caller includes a policy_id and revision in the routing request payload. Plano uses these when fetching and caching the policy — enabling per-tenant/per-customer routing policies with revision-aware caching.
Configuration
routing:
  policy_provider:
    url: "https://my-service.internal/v1/routing-policy"
    headers:
      Authorization: "Bearer $POLICY_API_KEY"
    ttl_seconds: 300
Routing request
{
  "messages": [...],
  "policy_id": "customer-abc-123",
  "revision": 42
}
revision is a monotonically increasing integer. When the caller sends a higher revision than what's cached, Plano fetches the updated policy.
When policy_id is present and no inline routing_policy is provided, Plano fetches the policy from the configured endpoint:
GET https://my-service.internal/v1/routing-policy?policy_id=customer-abc-123&revision=42
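A minimal sketch of how the fetch URL above could be assembled, using only the Rust standard library. The function names `encode_query_value` and `build_policy_url` are hypothetical, and two behaviors are assumptions not stated in this issue: the configured URL carries no query string of its own, and the `revision` parameter is simply left out when the caller omits it.

```rust
/// Percent-encode a query value, leaving only RFC 3986 unreserved
/// characters (letters, digits, '-', '.', '_', '~') untouched.
fn encode_query_value(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for byte in value.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char);
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Build the GET URL for a policy fetch.
/// Assumption: `base_url` has no existing query string.
/// Assumption: `revision` is dropped from the URL when it is `None`.
fn build_policy_url(base_url: &str, policy_id: &str, revision: Option<u64>) -> String {
    let mut url = format!("{}?policy_id={}", base_url, encode_query_value(policy_id));
    if let Some(rev) = revision {
        url.push_str(&format!("&revision={}", rev));
    }
    url
}
```

Encoding the `policy_id` matters because it is caller-supplied: an unescaped `&` or space would otherwise change the query the policy service sees.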
Routing response (returned to the caller)
{
  "model": "gpt-4o",
  "route": "quick",
  "trace_id": "abc123..."
}
Expected payload from external policy endpoint
{
  "policy_id": "customer-abc-123",
  "revision": 42,
  "schema_version": "v1",
  "routing_preferences": [
    {
      "model": "gpt-4o",
      "routing_preferences": [
        {"name": "quick response", "description": "fast lightweight responses"}
      ]
    },
    {
      "model": "claude-sonnet",
      "routing_preferences": [
        {"name": "deep analysis", "description": "comprehensive detailed analysis"}
      ]
    }
  ]
}
- policy_id: identifies the policy (e.g. per-customer)
- revision: monotonically increasing integer indicating which revision of the policy
- schema_version: the format of the policy document itself (e.g. "v1"). Plano validates this and rejects unsupported versions. The first implementation supports v1 only.
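The payload and its validation rules could be modeled roughly as follows. This is a sketch, not Plano's actual types: `PolicyDocument`, `ModelPreferences`, `RoutingPreference`, `PolicyError` and `validate_policy` are hypothetical names, JSON deserialization is left out, and the revision check is strict equality because the issue only says the values must match.

```rust
/// Schema versions accepted by this sketch; the issue states that the
/// first implementation supports "v1" only.
const SUPPORTED_SCHEMA_VERSIONS: &[&str] = &["v1"];

/// One named routing preference, as in the payload's inner array.
#[derive(Debug, Clone, PartialEq)]
struct RoutingPreference {
    name: String,
    description: String,
}

/// A model together with the preferences that should route to it.
#[derive(Debug, Clone, PartialEq)]
struct ModelPreferences {
    model: String,
    routing_preferences: Vec<RoutingPreference>,
}

/// Hypothetical in-memory form of the policy endpoint's response.
#[derive(Debug, Clone, PartialEq)]
struct PolicyDocument {
    policy_id: String,
    revision: u64,
    schema_version: String,
    routing_preferences: Vec<ModelPreferences>,
}

#[derive(Debug, PartialEq)]
enum PolicyError {
    UnsupportedSchemaVersion(String),
    PolicyIdMismatch { expected: String, got: String },
    RevisionMismatch { expected: u64, got: u64 },
}

/// Check a fetched document against what the routing request asked for.
/// The revision is only compared when the caller supplied one.
fn validate_policy(
    doc: &PolicyDocument,
    requested_id: &str,
    requested_revision: Option<u64>,
) -> Result<(), PolicyError> {
    if !SUPPORTED_SCHEMA_VERSIONS.contains(&doc.schema_version.as_str()) {
        return Err(PolicyError::UnsupportedSchemaVersion(doc.schema_version.clone()));
    }
    if doc.policy_id != requested_id {
        return Err(PolicyError::PolicyIdMismatch {
            expected: requested_id.to_string(),
            got: doc.policy_id.clone(),
        });
    }
    if let Some(expected) = requested_revision {
        if doc.revision != expected {
            return Err(PolicyError::RevisionMismatch { expected, got: doc.revision });
        }
    }
    Ok(())
}
```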
Caching behavior
- Cache key: policy_id with stored revision
- On request: if cached revision >= requested revision, use cache. If requested revision > cached revision, fetch fresh.
- Cache entries also expire via TTL as a safety net
- If revision is omitted, cache key is just policy_id and TTL is the only invalidation
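The caching rules above might look like this in code. It is a sketch under assumptions: `PolicyCache` and `CacheEntry` are hypothetical names, the policy body is kept as an opaque string, and the current time is passed in explicitly so that expiry can be exercised without sleeping.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// A cached policy body plus the metadata needed for invalidation.
struct CacheEntry {
    revision: u64,
    policy_json: String,
    fetched_at: Instant,
}

/// Hypothetical revision-aware cache keyed by policy_id.
struct PolicyCache {
    ttl: Duration,
    entries: HashMap<String, CacheEntry>,
}

impl PolicyCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Store (or overwrite) the entry for `policy_id`.
    fn insert(&mut self, policy_id: &str, revision: u64, policy_json: String, now: Instant) {
        self.entries.insert(
            policy_id.to_string(),
            CacheEntry { revision, policy_json, fetched_at: now },
        );
    }

    /// Return the cached policy if it is still usable for this request,
    /// or `None` when the caller should fetch a fresh copy.
    fn get(&self, policy_id: &str, requested_revision: Option<u64>, now: Instant) -> Option<&str> {
        let entry = self.entries.get(policy_id)?;
        // TTL safety net: expired entries are never served.
        if now.saturating_duration_since(entry.fetched_at) >= self.ttl {
            return None;
        }
        match requested_revision {
            // The caller knows about a newer revision than the cached one.
            Some(requested) if requested > entry.revision => None,
            // Cached revision >= requested, or no revision given (TTL only).
            _ => Some(entry.policy_json.as_str()),
        }
    }
}
```

With a 300-second TTL and revision 42 cached, a request for revision 41 or 42 is served from cache, a request for revision 43 misses, and any request misses once 300 seconds have passed.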
Flow
- Routing request comes in with policy_id and revision (and no inline policy)
- Plano checks local cache for policy_id
- If cached and cached revision >= requested revision, use cached policy
- Otherwise, make HTTP request to configured URL with policy_id and revision
- Validate response: policy_id and revision match, schema_version is supported
- Cache result and pass policy to RouterService::determine_route()
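The flow above can be sketched end to end with the HTTP call abstracted behind a closure, which keeps the example free of any HTTP client dependency. `Policy` and `resolve_policy` are hypothetical names, errors are plain strings for brevity, and TTL expiry is deliberately omitted so the control flow stays visible. The returned policy is what would then be handed to `RouterService::determine_route()`.

```rust
use std::collections::HashMap;

/// Hypothetical fetched policy; the routing preferences are kept opaque.
#[derive(Debug, Clone, PartialEq)]
struct Policy {
    policy_id: String,
    revision: u64,
    schema_version: String,
    body: String,
}

/// Resolve a policy for a request that carries policy_id and revision.
/// `fetch` stands in for the HTTP request to the configured URL.
fn resolve_policy<F>(
    cache: &mut HashMap<String, Policy>,
    policy_id: &str,
    revision: u64,
    fetch: F,
) -> Result<Policy, String>
where
    F: FnOnce(&str, u64) -> Result<Policy, String>,
{
    // Cache check: serve the cached policy if it is at least as new.
    if let Some(cached) = cache.get(policy_id) {
        if cached.revision >= revision {
            return Ok(cached.clone());
        }
    }

    // Cache miss or stale revision: ask the external policy endpoint.
    let fetched = fetch(policy_id, revision)?;

    // Validate before trusting the response.
    if fetched.policy_id != policy_id || fetched.revision != revision {
        return Err(format!(
            "policy mismatch: requested {}@{}, got {}@{}",
            policy_id, revision, fetched.policy_id, fetched.revision
        ));
    }
    if fetched.schema_version != "v1" {
        return Err(format!("unsupported schema_version: {}", fetched.schema_version));
    }

    // Only validated policies reach the cache.
    cache.insert(policy_id.to_string(), fetched.clone());
    Ok(fetched)
}
```

Because validation happens before the cache write, a malformed or mismatched response leaves the previously cached policy in place.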
Resolution order
- Inline routing_policy in request payload (highest priority)
- policy_id + revision → HTTP policy provider (with cache)
- Config-file preferences (default fallback)
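The resolution order could be expressed as a single selection function. This is a sketch: `RoutingRequest` is a hypothetical subset of the request fields, `PolicySource` and `select_policy_source` are illustrative names, and the case where no policy provider is configured is not handled.

```rust
/// Where the routing policy for a request comes from.
#[derive(Debug, PartialEq)]
enum PolicySource {
    /// Inline routing_policy from the request payload (highest priority).
    Inline(String),
    /// Fetch through the HTTP policy provider, with caching.
    Provider { policy_id: String, revision: Option<u64> },
    /// Fall back to preferences from the config file.
    ConfigFile,
}

/// Hypothetical subset of the routing request fields relevant here.
struct RoutingRequest {
    routing_policy: Option<String>,
    policy_id: Option<String>,
    revision: Option<u64>,
}

/// Apply the resolution order: inline policy, then provider, then config.
fn select_policy_source(req: &RoutingRequest) -> PolicySource {
    if let Some(inline) = &req.routing_policy {
        return PolicySource::Inline(inline.clone());
    }
    if let Some(id) = &req.policy_id {
        return PolicySource::Provider {
            policy_id: id.clone(),
            revision: req.revision,
        };
    }
    PolicySource::ConfigFile
}
```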