Description
Motivation
Multi-turn chatbots that use Plano's agent orchestration for intent-based routing currently cannot leverage the Responses API's previous_response_id for stateful conversations. Users must manage conversation history client-side.
Problem
The state management infrastructure (StateStorage trait, ResponsesStateProcessor, memory/PostgreSQL backends) is only wired into the direct proxy path (crates/brightstaff/src/handlers/llm.rs), not the agent orchestrator path (crates/brightstaff/src/handlers/agent_chat_completions.rs).
Technical gaps
- `crates/brightstaff/src/main.rs` — `state_storage` is passed to `llm_chat` but not to `agent_chat`. The `agent_chat` function signature has no `StateStorage` parameter.
- `crates/brightstaff/src/handlers/agent_chat_completions.rs` — `handle_agent_chat_inner()` calls `client_request.get_messages()` early, converting to `Vec<OpenAIMessage>`. For Responses API requests with `InputItem` types (tool results, images), this conversion may lose information. No `previous_response_id` handling exists.
- `crates/brightstaff/src/handlers/pipeline_processor.rs` — `invoke_agent` hardcodes the endpoint URL path to `/v1/chat/completions` when calling downstream agents. The request body is serialized generically via `ProviderRequestType::to_bytes()`, but the URL forces agents to receive calls at the chat completions endpoint regardless of the original request format.
- `crates/brightstaff/src/handlers/response_handler.rs` — `create_streaming_response` in the agent path is a raw byte passthrough with no stream processing. Compare to `llm.rs`, where responses go through `ObservableStreamProcessor` and optionally `ResponsesStateProcessor`.
- `crates/brightstaff/src/state/response_state_processor.rs` — only instantiated in `llm.rs`; never used in the agent orchestration flow.
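To make the first gap concrete, here is a minimal, hypothetical sketch of threading a shared storage handle into the agent path. The `StateStorage` trait below is a simplified stand-in (the real brightstaff trait is async and lives in `crates/brightstaff/src/state/`), and the `agent_chat` signature is illustrative, not the actual function:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical, simplified stand-in for brightstaff's StateStorage trait.
trait StateStorage: Send + Sync {
    fn get(&self, response_id: &str) -> Option<String>;
    fn put(&self, response_id: &str, state: String);
}

// Minimal in-memory backend, mirroring the memory backend mentioned above.
struct MemoryStateStorage {
    inner: Mutex<HashMap<String, String>>,
}

impl StateStorage for MemoryStateStorage {
    fn get(&self, response_id: &str) -> Option<String> {
        self.inner.lock().unwrap().get(response_id).cloned()
    }
    fn put(&self, response_id: &str, state: String) {
        self.inner.lock().unwrap().insert(response_id.to_string(), state);
    }
}

// Today only llm_chat receives the storage handle; the proposal is to pass
// the same Arc into agent_chat so it can resolve previous_response_id.
fn agent_chat(
    state_storage: Option<Arc<dyn StateStorage>>,
    previous_response_id: Option<&str>,
) -> Option<String> {
    let storage = state_storage?;
    let prev_id = previous_response_id?;
    storage.get(prev_id) // resolved prior state, if any
}

fn main() {
    let storage: Arc<dyn StateStorage> =
        Arc::new(MemoryStateStorage { inner: Mutex::new(HashMap::new()) });
    storage.put("resp_1", "stored conversation state".to_string());
    println!("{:?}", agent_chat(Some(storage), Some("resp_1")));
    // → Some("stored conversation state")
}
```

The point of the sketch is only the wiring: `agent_chat` takes an `Option<Arc<dyn StateStorage>>` exactly as `llm_chat` already does, so requests without `previous_response_id` (or deployments without a backend) are unaffected.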
Proposed solution (following existing patterns)
- Pass `state_storage` to `agent_chat` from `main.rs` (same pattern as `llm_chat`).
- For Responses API requests with `previous_response_id`: resolve stored state via `StateStorage`, convert `InputItem` → `OpenAIMessage` for `determine_orchestration()` / agent selection.
- For the final agent's response: apply the same stream translation pipeline used in `llm.rs` — translate chat completions SSE into Responses API format (via hermesllm's translation layer), then wrap with `ResponsesStateProcessor` to capture `response_id` and output from the translated `response.completed` event.
For multi-agent chains within a single turn, the state processor should wrap only the final combined response (the orchestrator already distinguishes `is_last_agent`), so intermediate agent responses are not stored individually.
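The capture step in the last bullet can be sketched as a passthrough observer: events flow to the client unchanged while the processor remembers the `response_id` from the terminal `response.completed` event, so the turn is stored exactly once. Everything below is illustrative (the string-scanning "parse" especially); the real `ResponsesStateProcessor` would deserialize the event properly:

```rust
// Hypothetical sketch of the ResponsesStateProcessor idea: pass each SSE
// data payload through untouched, but capture the response id from the
// terminal `response.completed` event for storage afterwards.
struct StateCapture {
    captured_response_id: Option<String>,
}

impl StateCapture {
    fn new() -> Self {
        Self { captured_response_id: None }
    }

    // Inspect one SSE data payload and return it unchanged (passthrough).
    // Deliberately naive string scan for the sketch; a real implementation
    // would deserialize the JSON event.
    fn observe<'a>(&mut self, event: &'a str) -> &'a str {
        if event.contains("\"type\":\"response.completed\"") {
            if let Some(idx) = event.find("\"id\":\"") {
                let rest = &event[idx + 6..];
                if let Some(end) = rest.find('"') {
                    self.captured_response_id = Some(rest[..end].to_string());
                }
            }
        }
        event
    }
}

fn main() {
    let mut cap = StateCapture::new();
    let events = [
        r#"{"type":"response.output_text.delta","delta":"Hi"}"#,
        r#"{"type":"response.completed","response":{"id":"resp_abc"}}"#,
    ];
    for e in events {
        let passthrough = cap.observe(e);
        assert_eq!(passthrough, e); // the stream itself is untouched
    }
    println!("{:?}", cap.captured_response_id);
    // → Some("resp_abc")
}
```

Because only the final agent's (translated) stream is wrapped this way, intermediate agent responses in a chain never reach the capture step, which matches the `is_last_agent` distinction the orchestrator already makes.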
Related
- OpenAI Responses API #476 (open — Responses API support)
- Full support for responses API plus conversation api like functionality #614 (closed — shipped Responses API state for the proxy path)