feat(openai): add WebSocket mode for Responses API#149
Open

taigrr wants to merge 7 commits into charmbracelet:main
Conversation
Add WebSocket transport support for the OpenAI Responses API, enabling lower-latency persistent connections for tool-call-heavy workflows.

Key features:

- wsTransport manages the WebSocket connection lifecycle with automatic reconnection before the 60-minute connection limit
- previous_response_id auto-chaining for incremental continuation
- generate:false warmup support via the GenerateWarmup provider option
- Falls back to HTTP transparently on WebSocket connection failure
- One in-flight response at a time per connection (mutex-protected)

New provider options:

- WithWebSocket() enables WebSocket mode (requires WithUseResponsesAPI)
- PreviousResponseID on ResponsesProviderOptions for explicit chaining
- GenerateWarmup on ResponsesProviderOptions for prefill/warmup

The WebSocket events use the same JSON structure as HTTP SSE events, so both Generate() and Stream() reuse the existing event parsing logic. No changes to the LanguageModel interface or the consumer-facing API.
…ponse_id

When using WebSocket mode with previous_response_id chaining, the server already has the prior conversation context. Previously we sent the full input array every time, which was redundant and incorrect per the spec.

Now wsTransport tracks lastInputLen (the number of input items sent in the last successful request). On subsequent calls with previous_response_id, only new items are sent. Additionally, function_call items are filtered out, since the server generated those as part of its own response output.

On chain breaks (previous_response_not_found errors), lastInputLen resets to 0 so the next call sends the full prompt.
…sponsesFinishReason

- Handle response.failed events in both generateViaWebSocket and streamViaWebSocket (previously these would return a nil error with empty content)
- Fix a read-goroutine leak on context cancellation by setting a read deadline when ctx is done, allowing ReadMessage to unblock
- Use mapResponsesFinishReason in generateViaWebSocket to match the HTTP path's handling of incomplete_details reasons