Conversation
💚 CLA has been signed
OPENAI_BASE_URL=http://localhost:11434/v1
OPENAI_API_KEY=unused
-CHAT_MODEL=qwen3:0.6B
+CHAT_MODEL=${OPENAI_BASE_URL}@qwen3:0.6B
this is a bit odd, maybe cite a github issue to remove this? I think this pattern of adding an entire URL as a prefix isn't something people will want long term.
In general the pattern is: modelprovider@modelname.
At the moment, for popularly known providers it can look like openai@gpt-xx or xai@grok-xx, because open-responses knows the URLs of these model providers. When it comes to a local model, currently this is the only way: URL@model_name.
Long term: boot up local open-responses with model providers configured through a config file.
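To make that concrete, here is a rough sketch of the two env.local variants under this convention (the known-provider model name is a placeholder, and only the local-URL form was actually exercised in this thread):

# pick exactly one CHAT_MODEL line
# variant 1: known provider, open-responses resolves the provider's URL itself (model name is a placeholder)
CHAT_MODEL=openai@gpt-xx
# variant 2: local model, the full base URL doubles as the provider prefix
CHAT_MODEL=${OPENAI_BASE_URL}@qwen3:0.6B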
ok just know that this is unique to open-responses, and no other proxy requires this sort of thing that I'm aware of. I've looked at many many. I would highly encourage you to go back to the simplicity of former versions.
the key aspect that maybe was lost in recent versions is a default provider concept. When there is only one target, there's no sense in qualifying; this is what makes things work naturally. A URL-encoded model name is close to useless as a metrics dimension. There's probably a way to get back to how you had things before and how most proxies work (focus on the API in use, e.g. openai, not the destination of the proxy, which should be unknowable and might be a security leak to disclose like this in a real env). Static config is possible, yes, but it is not dev friendly. I'm giving this hard feedback because I want you to be as easy as you were in the past, and not more difficult than others.
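For contrast, the default-provider style being argued for here is essentially the original env.local from this example, where the single configured base URL acts as the implicit provider and the model name stays plain:

OPENAI_BASE_URL=http://localhost:11434/v1   # the one and only target, i.e. the default provider
OPENAI_API_KEY=unused
CHAT_MODEL=qwen3:0.6B                       # plain model name, no URL prefix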
Thanks for the feedback. Will check with the team and get back!
By the way, I tried this with the whole URL passed through and it didn't work.
$ OPENAI_BASE_URL=http://localhost:8080/v1 uv run --exact -q --env-file env.local ../chat.py
Traceback (most recent call last):
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpx/_transports/default.py", line 101, in map_httpcore_exceptions
yield
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpx/_transports/default.py", line 250, in handle_request
resp = self._pool.handle_request(req)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 256, in handle_request
raise exc from None
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 236, in handle_request
response = connection.handle_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 103, in handle_request
return self._connection.handle_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 136, in handle_request
raise exc
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 106, in handle_request
) = self._receive_response_headers(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 177, in _receive_response_headers
event = self._receive_event(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 231, in _receive_event
raise RemoteProtocolError(msg)
httpcore.RemoteProtocolError: Server disconnected without sending a response.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/openai/_base_client.py", line 972, in request
response = self._client.send(
^^^^^^^^^^^^^^^^^^
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpx/_client.py", line 914, in send
response = self._send_handling_auth(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpx/_client.py", line 942, in _send_handling_auth
response = self._send_handling_redirects(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpx/_client.py", line 979, in _send_handling_redirects
response = self._send_single_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpx/_client.py", line 1014, in _send_single_request
response = transport.handle_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/opentelemetry/instrumentation/httpx/__init__.py", line 1028, in _handle_request_wrapper
raise exception.with_traceback(exception.__traceback__)
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/opentelemetry/instrumentation/httpx/__init__.py", line 992, in _handle_request_wrapper
response = wrapped(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpx/_transports/default.py", line 249, in handle_request
with map_httpcore_exceptions():
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/codefromthecrypt/.local/share/uv/python/cpython-3.12.11-macos-aarch64-none/lib/python3.12/contextlib.py", line 158, in __exit__
self.gen.throw(value)
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpx/_transports/default.py", line 118, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.RemoteProtocolError: Server disconnected without sending a response.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/codefromthecrypt/oss/observability-examples/inference-platforms/open-responses/../chat.py", line 55, in <module>
main()
File "/Users/codefromthecrypt/oss/observability-examples/inference-platforms/open-responses/../chat.py", line 48, in main
chat_completion = client.chat.completions.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/openai/_utils/_utils.py", line 287, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/openai/resources/chat/completions/completions.py", line 1087, in create
return self._post(
^^^^^^^^^^^
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/openai/_base_client.py", line 1249, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/openinference/instrumentation/openai/_request.py", line 306, in __call__
response = wrapped(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/openai/_base_client.py", line 1004, in request
raise APIConnectionError(request=request) from err
openai.APIConnectionError: Connection error.

$ git diff .
diff --git a/inference-platforms/open-responses/docker-compose.yml b/inference-platforms/open-responses/docker-compose.yml
index 6a3b3127..aa5c56e2 100644
--- a/inference-platforms/open-responses/docker-compose.yml
+++ b/inference-platforms/open-responses/docker-compose.yml
@@ -1,6 +1,6 @@
services:
open-responses:
- image: masaicai/open-responses:0.3.2
+ image: masaicai/open-responses:0.5.1
container_name: open-responses
env_file:
- env.local
diff --git a/inference-platforms/open-responses/env.local b/inference-platforms/open-responses/env.local
index 4a453df5..ed28ba8b 100644
--- a/inference-platforms/open-responses/env.local
+++ b/inference-platforms/open-responses/env.local
@@ -1,6 +1,6 @@
OPENAI_BASE_URL=http://localhost:11434/v1
OPENAI_API_KEY=unused
-CHAT_MODEL=qwen3:0.6B
+CHAT_MODEL=${OPENAI_BASE_URL}@qwen3:0.6B
# Disabled by default in open-responses
OTEL_SDK_DISABLED=false
I see you are trying to use 0.5.1.
As I mentioned in the comment, many things have changed from version 0.4.x onwards. For now you can keep using version 0.3.2 with the convention CHAT_MODEL=${OPENAI_BASE_URL}@qwen3:0.6B.
Let's continue the discussion on issue 180 if you have any other question or suggestion.
So far, from all the discussion (here and on issue 180), I am carrying two action items:
- Restore the 'default provider concept'.
- Update this inference example to the upcoming release v0.5.2 (with the correct port mapping 6644, plus some other needed fixes); a rough sketch of that mapping follows below.
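As a sketch of that second item, the compose change could look something like this (the image tag and the assumption that host and container both use port 6644 are not confirmed in this thread):

services:
  open-responses:
    image: masaicai/open-responses:0.5.2   # upcoming release named above (tag is an assumption)
    ports:
      - "6644:6644"                        # assumption: 6644 exposed on both host and container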
IMHO there's no reason to keep old versions around especially with renaming going on. maybe it is best to repurpose this PR to add a directory for AgC and delete this one (remembering to update the markdown links in the parent README). Will wait for you to proceed!
@codefromthecrypt - PR is updated:
- to work with the OPENAI_BASE_URL variable at the deployment level; the model can be sent directly as qwen3:0.6B
- with examples to send requests on port 6644 (see the sketch after this list)
- renamed from open-responses to AgC at the applicable places and links.
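For readers of this thread, a hedged sketch of such a request on port 6644 (assuming AgC keeps the OpenAI-compatible chat completions path that chat.py uses; the payload is illustrative, not copied from the updated example):

$ curl http://localhost:6644/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer unused" \
    -d '{"model": "qwen3:0.6B", "messages": [{"role": "user", "content": "hello"}]}'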
Thanks, will give it a go personally on Friday.
@jas34 sorry this slipped away from me. can you please squash into a single commit and sign the CLA?
- Updated the inference platform from open-responses to AgC
- Adjusted sample requests for new ports
@codefromthecrypt the requested changes have been completed.
Thanks!


Fixed issue mentioned in: #79 (comment)