
Fixed open-responses example #80

Merged
anuraaga merged 1 commit into elastic:main from masaic-ai-platform:main
Oct 13, 2025

Conversation

@jas34
Contributor

@jas34 jas34 commented Sep 5, 2025

Fixed issue mentioned in: #79 (comment)

@jas34 jas34 requested a review from anuraaga as a code owner September 5, 2025 13:22
@cla-checker-service

cla-checker-service bot commented Sep 5, 2025

💚 CLA has been signed

 OPENAI_BASE_URL=http://localhost:11434/v1
 OPENAI_API_KEY=unused
-CHAT_MODEL=qwen3:0.6B
+CHAT_MODEL=${OPENAI_BASE_URL}@qwen3:0.6B
Contributor

this is a bit odd, maybe cite a github issue to remove this? I think this pattern of adding an entire URL as a prefix isn't something people will want long term.

Contributor Author

In general the pattern is: modelprovider@modelname
At the moment, for popularly known providers it can look like openai@gpt-xx or xai@grok-xx, because open-responses knows the URLs of these model providers. When it comes to a local model, currently the only way is URL@model_name.
Long term: boot up a local open-responses with model providers configured through a config file.
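
To make the convention concrete, a minimal env.local sketch of both forms (the hosted-provider model names are illustrative placeholders, not tested values; the local form is the one used in this example):

# Hosted provider known to open-responses: provider@model, e.g.
#   CHAT_MODEL=openai@gpt-xx
#   CHAT_MODEL=xai@grok-xx
# Local OpenAI-compatible server: URL@model
OPENAI_BASE_URL=http://localhost:11434/v1
OPENAI_API_KEY=unused
CHAT_MODEL=${OPENAI_BASE_URL}@qwen3:0.6B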

Contributor

OK, just know that this is unique to open-responses; no other proxy I'm aware of requires this sort of thing, and I've looked at many, many of them. I would highly encourage you to go back to the simplicity of former versions.

Contributor

The key aspect that was maybe lost in recent versions is a default provider concept. When there is only one target, there's no sense in qualifying; this is what makes things work naturally. A URL-encoded model name is close to useless as a metrics dimension. There's probably a way to get back to how you had things before and how most proxies work: focus on the API in use (e.g. openai), not the destination of the proxy, which should be unknowable and might be a security leak to disclose like this in a real environment. Static config is possible, yes, but it is not dev friendly. I'm giving this hard feedback because I want you to be as easy as you were in the past, and not more difficult than others.
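
For contrast, the default-provider style argued for here is roughly what this example's env.local looked like before the change: the proxy has a single configured target, so the model name stays unqualified:

OPENAI_BASE_URL=http://localhost:11434/v1
OPENAI_API_KEY=unused
CHAT_MODEL=qwen3:0.6B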

Contributor Author

Thanks for the feedback. Will check with the team and get back!

Contributor

By the way, I tried passing the whole URL through and it didn't work.

$ OPENAI_BASE_URL=http://localhost:8080/v1 uv run --exact -q --env-file env.local ../chat.py

Traceback (most recent call last):
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpx/_transports/default.py", line 101, in map_httpcore_exceptions
    yield
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpx/_transports/default.py", line 250, in handle_request
    resp = self._pool.handle_request(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 256, in handle_request
    raise exc from None
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py", line 236, in handle_request
    response = connection.handle_request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpcore/_sync/connection.py", line 103, in handle_request
    return self._connection.handle_request(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 136, in handle_request
    raise exc
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 106, in handle_request
    ) = self._receive_response_headers(**kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 177, in _receive_response_headers
    event = self._receive_event(timeout=timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpcore/_sync/http11.py", line 231, in _receive_event
    raise RemoteProtocolError(msg)
httpcore.RemoteProtocolError: Server disconnected without sending a response.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/openai/_base_client.py", line 972, in request
    response = self._client.send(
               ^^^^^^^^^^^^^^^^^^
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpx/_client.py", line 914, in send
    response = self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpx/_client.py", line 942, in _send_handling_auth
    response = self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpx/_client.py", line 979, in _send_handling_redirects
    response = self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpx/_client.py", line 1014, in _send_single_request
    response = transport.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/opentelemetry/instrumentation/httpx/__init__.py", line 1028, in _handle_request_wrapper
    raise exception.with_traceback(exception.__traceback__)
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/opentelemetry/instrumentation/httpx/__init__.py", line 992, in _handle_request_wrapper
    response = wrapped(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpx/_transports/default.py", line 249, in handle_request
    with map_httpcore_exceptions():
         ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/codefromthecrypt/.local/share/uv/python/cpython-3.12.11-macos-aarch64-none/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/httpx/_transports/default.py", line 118, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.RemoteProtocolError: Server disconnected without sending a response.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/codefromthecrypt/oss/observability-examples/inference-platforms/open-responses/../chat.py", line 55, in <module>
    main()
  File "/Users/codefromthecrypt/oss/observability-examples/inference-platforms/open-responses/../chat.py", line 48, in main
    chat_completion = client.chat.completions.create(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/openai/_utils/_utils.py", line 287, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/openai/resources/chat/completions/completions.py", line 1087, in create
    return self._post(
           ^^^^^^^^^^^
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/openai/_base_client.py", line 1249, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/openinference/instrumentation/openai/_request.py", line 306, in __call__
    response = wrapped(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/codefromthecrypt/.cache/uv/environments-v2/chat-fa1cf70dd2d7883c/lib/python3.12/site-packages/openai/_base_client.py", line 1004, in request
    raise APIConnectionError(request=request) from err
openai.APIConnectionError: Connection error.
$ git diff .
diff --git a/inference-platforms/open-responses/docker-compose.yml b/inference-platforms/open-responses/docker-compose.yml
index 6a3b3127..aa5c56e2 100644
--- a/inference-platforms/open-responses/docker-compose.yml
+++ b/inference-platforms/open-responses/docker-compose.yml
@@ -1,6 +1,6 @@
 services:
   open-responses:
-    image: masaicai/open-responses:0.3.2
+    image: masaicai/open-responses:0.5.1
     container_name: open-responses
     env_file:
       - env.local
diff --git a/inference-platforms/open-responses/env.local b/inference-platforms/open-responses/env.local
index 4a453df5..ed28ba8b 100644
--- a/inference-platforms/open-responses/env.local
+++ b/inference-platforms/open-responses/env.local
@@ -1,6 +1,6 @@
 OPENAI_BASE_URL=http://localhost:11434/v1
 OPENAI_API_KEY=unused
-CHAT_MODEL=qwen3:0.6B
+CHAT_MODEL=${OPENAI_BASE_URL}@qwen3:0.6B
 
 # Disabled by default in open-responses
 OTEL_SDK_DISABLED=false

Contributor Author

I see you are trying to access 0.5.1.
As I mentioned in my comment, many things have changed from version 0.4.x onwards. For now you can keep using version 0.3.2 with the convention CHAT_MODEL=${OPENAI_BASE_URL}@qwen3:0.6B.
Let's continue the chat on issue 180 if you have any other questions or suggestions.

So far, from all the discussion (here and issue 180), I am carrying two action items:

  1. Restore the 'default provider concept'.
  2. Update this inference example to the upcoming release, v0.5.2 (with the correct port mapping 6644, plus some other needed fixes; see the sketch below).
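
A rough sketch of what that update could look like, assuming the image tag and port follow the values mentioned above (the ports entry is an assumption; it is not visible in the diff shown earlier):

services:
  open-responses:
    image: masaicai/open-responses:0.5.2   # upcoming release
    container_name: open-responses
    env_file:
      - env.local
    ports:
      - "6644:6644"                        # corrected port mapping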

Contributor

IMHO there's no reason to keep old versions around, especially with the renaming going on. Maybe it is best to repurpose this PR to add a directory for AgC and delete this one (remembering to update the markdown links in the parent README). Will wait for you to proceed!

Contributor Author

@codefromthecrypt - PR is updated:

  • to work with the OPENAI_BASE_URL variable at deployment level; the model can be sent directly as qwen3:0.6B
  • with examples that send requests on port 6644 (see the sketch below)
  • renamed from open-responses to AgC in applicable places and links.
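
For reference, a minimal way to exercise the updated example, assuming AgC exposes the standard OpenAI-compatible chat completions endpoint on port 6644 (the curl payload follows the generic OpenAI request shape and is not taken verbatim from this PR):

# Point the repo's chat.py at the AgC port instead of the model server
OPENAI_BASE_URL=http://localhost:6644/v1 uv run --exact -q --env-file env.local ../chat.py

# Or send a raw request
curl -s http://localhost:6644/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen3:0.6B", "messages": [{"role": "user", "content": "hello"}]}'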

Contributor

Thanks, will give it a go personally on Friday.

@codefromthecrypt
Contributor

@jas34 sorry this slipped away from me. Can you please squash into a single commit and sign the CLA?

- Updated the inference platform from open-responses to AgC
- Adjusted sample requests for new ports
@jas34
Contributor Author

jas34 commented Oct 8, 2025

@codefromthecrypt the needful has been completed (squashed into a single commit and CLA signed).

Contributor

@codefromthecrypt codefromthecrypt left a comment


Note: in a next PR, you can add an example for the agent configuration (by having your server configure the anonymous kiwi server). Take a look at ../aigw for an example of this.

@anuraaga
Collaborator

Thanks!

@anuraaga anuraaga merged commit 7ba514d into elastic:main Oct 13, 2025
2 checks passed