
Support v1/responses API with state management in the agent orchestrator path #791

@llrightll

Description


Motivation

Multi-turn chatbots that use Plano's agent orchestration for intent-based routing currently cannot leverage the Responses API's previous_response_id for stateful conversations. Users must manage conversation history client-side.

Problem

The state management infrastructure (StateStorage trait, ResponsesStateProcessor, memory/PostgreSQL backends) is only wired into the direct proxy path (crates/brightstaff/src/handlers/llm.rs), not the agent orchestrator path (crates/brightstaff/src/handlers/agent_chat_completions.rs).

Technical gaps

  1. crates/brightstaff/src/main.rs — state_storage is passed to llm_chat but not to agent_chat. The agent_chat function signature has no StateStorage parameter.

  2. crates/brightstaff/src/handlers/agent_chat_completions.rs — handle_agent_chat_inner() calls client_request.get_messages() early, converting to Vec<OpenAIMessage>. For Responses API requests with InputItem types (tool results, images), this conversion may lose information. No previous_response_id handling exists.

  3. crates/brightstaff/src/handlers/pipeline_processor.rs — invoke_agent hardcodes the endpoint URL path to /v1/chat/completions when calling downstream agents. The request body is serialized generically via ProviderRequestType::to_bytes(), but the URL forces agents to receive calls at the chat completions endpoint regardless of the original request format.

  4. crates/brightstaff/src/handlers/response_handler.rs — create_streaming_response in the agent path is a raw byte passthrough with no stream processing. Compare this to llm.rs, where responses go through ObservableStreamProcessor and optionally ResponsesStateProcessor.

  5. crates/brightstaff/src/state/response_state_processor.rs — Only instantiated in llm.rs. Never used in the agent orchestration flow.
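As a minimal illustration of gap 3, the downstream path could be derived from the original request format instead of being hardcoded. This is a hedged sketch: RequestFormat and downstream_path are hypothetical names for illustration, not Plano types; only the two URL paths come from the issue itself.

```rust
// Hypothetical request-format tag for illustration; not an actual Plano type.
enum RequestFormat {
    ChatCompletions,
    Responses,
}

// Pick the downstream endpoint path from the original request format,
// rather than always sending agents /v1/chat/completions as invoke_agent
// does today.
fn downstream_path(format: &RequestFormat) -> &'static str {
    match format {
        RequestFormat::ChatCompletions => "/v1/chat/completions",
        RequestFormat::Responses => "/v1/responses",
    }
}
```

Whether agents should actually receive /v1/responses calls, or whether the orchestrator should translate to chat completions internally and only speak the Responses API at the edge, is a design decision this issue leaves open.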

Proposed solution (following existing patterns)

  1. Pass state_storage to agent_chat from main.rs (same pattern as llm_chat).
  2. For Responses API requests with previous_response_id: resolve stored state via StateStorage, convert InputItem → OpenAIMessage for determine_orchestration() / agent selection.
  3. For the final agent's response: apply the same stream translation pipeline used in llm.rs — translate chat completions SSE into Responses API format (via hermesllm's translation layer), then wrap with ResponsesStateProcessor to capture response_id and output from the translated response.completed event.

For multi-agent chains within a single turn, the state processor should wrap only the final combined response (the orchestrator already distinguishes is_last_agent), so intermediate agent responses are not stored individually.
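The state flow in steps 2 and 3 can be sketched with an in-memory store. This is a simplified stand-in, assuming nothing about Plano's internals: OpenAIMessage here is a two-field struct and MemoryStateStorage is a hypothetical illustration of the StateStorage pattern, not the actual trait or its backends.

```rust
use std::collections::HashMap;

// Simplified stand-in for a chat message; the real type has more fields.
#[derive(Clone, Debug, PartialEq)]
struct OpenAIMessage {
    role: String,
    content: String,
}

// Hypothetical in-memory store keyed by response_id, mirroring the
// memory backend already wired into the direct proxy path.
#[derive(Default)]
struct MemoryStateStorage {
    turns: HashMap<String, Vec<OpenAIMessage>>,
}

impl MemoryStateStorage {
    // Step 2: resolve previous_response_id by prepending the stored
    // conversation history to the new turn's input.
    fn resolve(
        &self,
        previous_response_id: Option<&str>,
        input: Vec<OpenAIMessage>,
    ) -> Vec<OpenAIMessage> {
        let mut messages = previous_response_id
            .and_then(|id| self.turns.get(id).cloned())
            .unwrap_or_default();
        messages.extend(input);
        messages
    }

    // Step 3: after the final agent's response.completed event, persist the
    // full transcript under the new response_id for the next turn.
    fn store(&mut self, response_id: &str, transcript: Vec<OpenAIMessage>) {
        self.turns.insert(response_id.to_string(), transcript);
    }
}
```

In the real implementation, store() would be driven by ResponsesStateProcessor capturing the response_id from the translated stream, so that only the final combined response of a multi-agent chain is persisted.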

