fix: eliminate 60-second delay for OpenRouter Pipe requests (issue #378) #384
cwawak wants to merge 15 commits into cogwheel0:main
Conversation
Root cause → Fix → Results (please note: I'm not an expert here, just doing my best to help)
- Remove session_id, id, and chat_id from the request payload to bypass OpenWebUI's async task queue (issue cogwheel0#378)
- Add SSE parsing to handle the streaming response directly via HTTP
- Add an isHttpStreamOnly flag to prevent duplicate content from the WebSocket
- When using SSE-only mode, skip WebSocket subscriptions entirely

The 60-second delay was caused by OpenWebUI routing requests through an async task queue when session_id, chat_id, and message_id (id) were all present. By removing these identifiers, requests go directly through SSE streaming instead.
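As a rough sketch of the payload change described above (field names are taken from the PR; Conduit itself is written in Dart, so this Python is purely illustrative, and the model name is hypothetical):

```python
def strip_task_queue_ids(payload: dict) -> dict:
    """Return a copy of the request payload without the identifiers that
    cause OpenWebUI to route the request through its async task queue.

    Per the PR, the ~60 s delay appears only when session_id, chat_id,
    and id (message_id) are all present, so dropping them forces the
    direct SSE streaming path instead.
    """
    queue_triggers = ("session_id", "chat_id", "id")
    return {k: v for k, v in payload.items() if k not in queue_triggers}


payload = {
    "model": "openrouter/some-model",  # hypothetical model name
    "messages": [{"role": "user", "content": "hi"}],
    "stream": True,
    "session_id": "s-1",
    "chat_id": "c-1",
    "id": "m-1",
}
cleaned = strip_task_queue_ids(payload)
# cleaned keeps model/messages/stream but drops the three identifiers
```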
…th cost display and reasoning content
Force-pushed from 9c69496 to f57d791
Quick update: I added SSE replay-dedupe and whitespace handling for SSE-only mode.
This resolved the duplicated responses and the "Thesky" first-word spacing issue in my testing.
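A minimal Python sketch of what replay-dedupe plus whitespace handling might look like (the class and its behavior are my assumptions for illustration, not the PR's actual Dart code): deltas are deduplicated by event, and concatenated verbatim with no trimming, so a leading-space delta like " sky" is not collapsed into "Thesky".

```python
class SseAccumulator:
    """Accumulate SSE text deltas, skipping exact replays.

    Hypothetical sketch: dedupes by (event_id, delta) so a re-delivered
    event is ignored, and concatenates deltas verbatim (no strip()),
    so leading spaces like " sky" are preserved.
    """

    def __init__(self):
        self.seen = set()
        self.text = ""

    def add(self, event_id: str, delta: str) -> None:
        key = (event_id, delta)
        if key in self.seen:
            return  # replayed event; drop it
        self.seen.add(key)
        self.text += delta  # verbatim: whitespace kept intact


acc = SseAccumulator()
acc.add("1", "The")
acc.add("2", " sky")
acc.add("2", " sky")  # replayed event; ignored
# acc.text == "The sky"
```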
Update: I rebased the PR branch onto the refactor version. What changed in the PR branch:
Result: no duplicated responses, and the first-word spacing issues are resolved in my tests.
Hey @cwawak, thank you for the PR! Streaming via SSE has its own problems and is not very reliable for mobile clients. I struggled to balance it and keep it as a path in the initial days of the app. Duplication is just one of the issues you may have come across. OWUI web also primarily relies on WebSockets. Anyway, could you try this and let me know if it alleviates the issue without relying on SSE?: https://docs.openwebui.com/troubleshooting/connection-error#websocket-troubleshooting
Hi @cogwheel0, thanks for being so kind. I don't think my patch is usable, but maybe it's helpful for someone who is troubleshooting this!

I have verified that WebSockets are working fine. Using Chrome on my Mac, I do not have any issues with WebSockets. Using Conduit with Anthropic-hosted or NIM-hosted models works fine with the latest couple of versions of the app. I only encounter the 30s+30s=60s total delay when using models that are deployed using the "Open WebUI OpenRouter Pipe". This pipe allows easy toggling of "OpenRouter search" (where OpenRouter, for a few pennies, inserts search results into the model response), some enforcement of Zero Data Retention flags per model, and a nice little display of tokens consumed and total cost.

In the Web UI I have no problems with WS or delays, only via the Conduit app. I don't necessarily think this is something that should be fixed in Conduit, but I couldn't figure out what was wrong with the OpenRouter Pipe itself. Anyway, I greatly appreciate your kind words; I think your application is positively delightful and very easy to work on for a novice coder!
Ah, I see! If you are just looking to use web search for OpenRouter models, might I suggest another solution? If you put
I haven't tried out the pipe you linked, so I might be missing something as well.
Hi @cwawak. Sorry, just wanted to share that to see consumption in real time I am using LiteLLM, if you want to give it a try. It's open source and can be run locally. It works like a proxy, so all the LLMs you use can be used from one place, and it has a lot of cool functionality. One of its features is that it shows the cost and token consumption of each LLM in real time.
I also use OpenRouter Pipe; it's very useful for integrating OpenRouter features into Open Web UI, but latency is a major issue. Is there a solution? |

Summary
Fixes the ~60 second OpenRouter Pipe delay in Conduit and hardens SSE-only
streaming to restore usage/cost info and prevent duplicate output.
Root Cause
OpenWebUI routes /api/chat/completions through an async task queue when session_id, chat_id, and id (message_id) are all present, adding ~60 seconds of latency. Conduit was sending all three; the web UI does not.
Fix
- Strip session_id, chat_id, and id from the request payload to bypass the async queue and force direct SSE streaming.
- Parse the SSE stream directly over HTTP so content appears reliably.
- Set isHttpStreamOnly to skip WebSocket subscriptions in SSE-only mode, preventing duplicate content from both HTTP and sockets.
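A minimal Python sketch of consuming the direct SSE stream (assuming OpenAI-style "data: {json}" chunks ending with "data: [DONE]", which is the usual shape of /api/chat/completions streaming; Conduit itself is Dart, so this is illustrative only):

```python
import json


def iter_sse_deltas(lines):
    """Yield content deltas from raw SSE lines.

    Assumes OpenAI-style streaming chunks: one "data: {json}" line per
    event, terminated by "data: [DONE]". Non-data lines (comments,
    blank keep-alives) are ignored.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta is not None:
            yield delta


raw = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
# "".join(iter_sse_deltas(raw)) == "Hello"
```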
Changes
lib/core/services/api_service.dartlib/core/services/streaming_helper.dartsocket subscriptions when
httpStreamOnly.lib/features/chat/providers/chat_providers.darthttpStreamOnlyto streaming helper.lib/features/chat/widgets/assistant_message_widget.dartTesting