Support trajectory pinning for consistent model selection in agentic loops

## Summary

In agentic loops, the same conversation hits Plano's routing endpoint multiple times. Today each call re-evaluates routing independently, so the selected model can change mid-conversation. Trajectory pinning (pinning per session) ensures that once a model is selected for a session, subsequent routing calls in that session return the same model.

## How it works

The caller sends an `X-Session-Id` header in the routing request. Plano caches the routing decision keyed by that ID:

- **First call** with `X-Session-Id: abc` → run routing, select model, cache `abc → gpt-4o`
- **Subsequent calls** with `X-Session-Id: abc` → skip routing, return cached `gpt-4o`
- **Cache entries expire** after a configurable TTL (default 30 min)
- **No `X-Session-Id` header** → routing runs fresh every time (current behavior, no breaking change)
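
The decision flow above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Decision`, `resolve`, and the `select_model` closure are hypothetical names standing in for Plano's real routing logic, and TTL handling is left out to keep the flow visible.

```rust
use std::collections::HashMap;

/// Hypothetical result of one routing call: the chosen model and
/// whether it was served from the session cache.
#[derive(Debug, PartialEq)]
struct Decision {
    model: String,
    from_cache: bool,
}

/// Resolve a model for one routing call.
/// `select_model` is a stand-in for the real routing logic.
fn resolve<F>(
    cache: &mut HashMap<String, String>,
    session_id: Option<&str>,
    select_model: F,
) -> Decision
where
    F: FnOnce() -> String,
{
    match session_id {
        // No X-Session-Id header: route fresh and leave the cache untouched.
        None => Decision { model: select_model(), from_cache: false },
        Some(id) => {
            // Subsequent call: skip routing and return the pinned model.
            if let Some(model) = cache.get(id) {
                return Decision { model: model.clone(), from_cache: true };
            }
            // First call: run routing, then pin the result for this session.
            let model = select_model();
            cache.insert(id.to_string(), model.clone());
            Decision { model, from_cache: false }
        }
    }
}
```

With this shape, a second call with the same session ID returns the first call's model even if the routing logic would now pick a different one, while calls without a session ID never read or write the cache.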

## Request

```http
POST /routing/v1/chat/completions
X-Session-Id: session-abc-123
Content-Type: application/json

{
  "messages": [...]
}
```

## Response

```json
{
  "model": "gpt-4o",
  "route": "quick",
  "session_id": "session-abc-123",
  "pinned": true,
  "trace_id": "abc123..."
}
```

`pinned: true` indicates the result came from the cache. `pinned: false` is returned on the first routing decision for a session.

## Implementation

- Extract `X-Session-Id` from request headers in the routing handler
- Add an in-memory TTL cache in `RouterService` keyed by session ID (e.g. `HashMap<String, (String, Instant)>` behind a mutex)
- Before calling `determine_route()`, check the cache for a valid (non-expired) entry
- On a cache miss, run routing and store the result
- TTL configurable via `routing.session_ttl_seconds` in the Plano config (default 1800)
- No database or external state needed
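
A minimal sketch of the TTL cache described above, assuming the `HashMap<String, (String, Instant)>` behind a mutex layout. `SessionCache`, `get_at`, and `insert_at` are hypothetical names, not existing Plano APIs. The current time is passed in explicitly so that expiry can be exercised without sleeping.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical in-memory session cache: session ID -> (model, time stored).
struct SessionCache {
    ttl: Duration,
    entries: Mutex<HashMap<String, (String, Instant)>>,
}

impl SessionCache {
    /// `ttl_seconds` would come from `routing.session_ttl_seconds` (default 1800).
    fn new(ttl_seconds: u64) -> Self {
        SessionCache {
            ttl: Duration::from_secs(ttl_seconds),
            entries: Mutex::new(HashMap::new()),
        }
    }

    /// Return the pinned model if an entry exists and is younger than the TTL
    /// at `now`. An expired entry is removed and treated as a miss.
    fn get_at(&self, session_id: &str, now: Instant) -> Option<String> {
        let mut entries = self.entries.lock().unwrap();
        let fresh = match entries.get(session_id) {
            Some((model, stored_at))
                if now.saturating_duration_since(*stored_at) < self.ttl =>
            {
                Some(model.clone())
            }
            _ => None,
        };
        if fresh.is_none() {
            // Drop a stale entry (no-op if the key was never present).
            entries.remove(session_id);
        }
        fresh
    }

    /// Pin `model` for `session_id`, recording `now` as the time stored.
    fn insert_at(&self, session_id: &str, model: &str, now: Instant) {
        self.entries
            .lock()
            .unwrap()
            .insert(session_id.to_string(), (model.to_string(), now));
    }
}
```

In the routing handler, this would amount to calling `get_at(id, Instant::now())` before `determine_route()` and `insert_at(...)` after a miss. Note that this sketch only evicts an entry when it is looked up after expiry; sessions that are never queried again would stay in memory, so a real implementation would probably also need a periodic sweep or a size bound.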
