feat(openai): add WebSocket mode for Responses API#149

Open
taigrr wants to merge 7 commits into charmbracelet:main from taigrr:cd/websocket-mode

Conversation


@taigrr taigrr commented Feb 24, 2026

Add WebSocket transport support for the OpenAI Responses API, enabling lower-latency persistent connections for tool-call-heavy workflows.

Key features:

  • wsTransport manages WebSocket connection lifecycle with automatic reconnection before the 60-minute connection limit
  • previous_response_id auto-chaining for incremental continuation
  • generate:false warmup support via GenerateWarmup provider option
  • Falls back to HTTP transparently on WebSocket connection failure
  • One in-flight response at a time per connection (mutex-protected)

New provider options:

  • WithWebSocket() enables WebSocket mode (requires WithUseResponsesAPI)
  • PreviousResponseID on ResponsesProviderOptions for explicit chaining
  • GenerateWarmup on ResponsesProviderOptions for prefill/warmup

The WebSocket events use the same JSON structure as HTTP SSE events, so both Generate() and Stream() reuse existing event parsing logic.

No changes to the LanguageModel interface or consumer-facing API.

  • I have read CONTRIBUTING.md.
  • I have created a discussion that was approved by a maintainer (for new features).

…ponse_id

When using WebSocket mode with previous_response_id chaining, the server
already has the prior conversation context. Previously we sent the full
input array every time, which was redundant and incorrect per the spec.

Now wsTransport tracks lastInputLen (the number of input items sent in
the last successful request). On subsequent calls with previous_response_id,
only new items are sent. Additionally, function_call items are filtered
out since the server generated those as part of its own response output.

On chain breaks (previous_response_not_found errors), lastInputLen resets
to 0 so the next call sends the full prompt.
…sponsesFinishReason

- Handle response.failed events in both generateViaWebSocket and
  streamViaWebSocket (previously these returned a nil error with empty content)
- Fix read goroutine leak on context cancellation by setting a read
  deadline when ctx is done, allowing ReadMessage to unblock
- Use mapResponsesFinishReason in generateViaWebSocket to match the HTTP
  path's handling of incomplete_details reasons
@taigrr taigrr marked this pull request as ready for review February 24, 2026 23:45