From 2b6a9aaf66c2e1884fd7240584304170c8e2cf4a Mon Sep 17 00:00:00 2001
From: JB
Date: Fri, 5 Sep 2025 18:51:51 +0530
Subject: [PATCH] Switch inference platform to AgC; update sample ports

- Updated the inference platform from open-responses to AgC
- Adjusted sample requests for new ports
---
 inference-platforms/AgC/README.md            | 48 +++++++++++++++++++
 .../docker-compose.yml                       |  4 +-
 .../{open-responses => AgC}/env.local        |  2 +-
 inference-platforms/README.md                |  4 +-
 inference-platforms/open-responses/README.md | 48 -------------------
 5 files changed, 53 insertions(+), 53 deletions(-)
 create mode 100644 inference-platforms/AgC/README.md
 rename inference-platforms/{open-responses => AgC}/docker-compose.yml (77%)
 rename inference-platforms/{open-responses => AgC}/env.local (88%)
 delete mode 100644 inference-platforms/open-responses/README.md

diff --git a/inference-platforms/AgC/README.md b/inference-platforms/AgC/README.md
new file mode 100644
index 0000000..b608f68
--- /dev/null
+++ b/inference-platforms/AgC/README.md
@@ -0,0 +1,48 @@
+# AgC - Agentic Compute
+
+This shows how to use AgC as an [OpenAI Responses adapter][docs],
+using its [OpenTelemetry configuration][config].
+
+AgC API requests are adapted and forwarded to Ollama as chat
+completions.
+
+## Prerequisites
+
+Start Ollama and your OpenTelemetry Collector via this repository's [README](../README.md).
+
+## Run AgC
+
+```bash
+docker compose up --pull always --force-recreate --remove-orphans
+```
+
+Clean up when finished, like this:
+
+```bash
+docker compose down
+```
+
+## Call AgC with python
+
+Once AgC is running, use [uv][uv] to make an OpenAI request via
+[chat.py](../chat.py):
+
+```bash
+# Set the OpenAI base URL to the AgC proxy, not Ollama
+OPENAI_BASE_URL=http://localhost:6644/v1 uv run --exact -q --env-file env.local ../chat.py
+```
+
+Or, for the AgC Responses API:
+```bash
+OPENAI_BASE_URL=http://localhost:6644/v1 uv run --exact -q --env-file env.local ../chat.py --use-responses-api
+```
+
+## Notes
+
+AgC runs its platform service, open-responses (a Spring Boot application), so collected signals are exported
+to OpenTelemetry via the OpenTelemetry SDK.
+
+---
+[docs]: https://github.com/masaic-ai-platform/AgC
+[config]: https://github.com/masaic-ai-platform/AgC/blob/main/platform/README.md#setting-up-the-opentelemetry-collector
+[uv]: https://docs.astral.sh/uv/getting-started/installation/
diff --git a/inference-platforms/open-responses/docker-compose.yml b/inference-platforms/AgC/docker-compose.yml
similarity index 77%
rename from inference-platforms/open-responses/docker-compose.yml
rename to inference-platforms/AgC/docker-compose.yml
index 6a3b312..4abc715 100644
--- a/inference-platforms/open-responses/docker-compose.yml
+++ b/inference-platforms/AgC/docker-compose.yml
@@ -1,10 +1,10 @@
 services:
   open-responses:
-    image: masaicai/open-responses:0.3.2
+    image: masaicai/open-responses:0.5.2
     container_name: open-responses
     env_file:
       - env.local
     ports:
-      - "8080:8080"
+      - "6644:6644"
     extra_hosts:  # send localhost traffic to the docker host, e.g.
your laptop - "localhost:host-gateway" diff --git a/inference-platforms/open-responses/env.local b/inference-platforms/AgC/env.local similarity index 88% rename from inference-platforms/open-responses/env.local rename to inference-platforms/AgC/env.local index 4a453df..6855d51 100644 --- a/inference-platforms/open-responses/env.local +++ b/inference-platforms/AgC/env.local @@ -2,7 +2,7 @@ OPENAI_BASE_URL=http://localhost:11434/v1 OPENAI_API_KEY=unused CHAT_MODEL=qwen3:0.6B -# Disabled by default in open-responses +# Disabled by default in AgC OTEL_SDK_DISABLED=false OTEL_SERVICE_NAME=open-responses diff --git a/inference-platforms/README.md b/inference-platforms/README.md index f467f9b..ec365dc 100644 --- a/inference-platforms/README.md +++ b/inference-platforms/README.md @@ -16,7 +16,7 @@ Elastic Stack. * [Envoy AI Gateway](aigw) - with [OpenTelemetry tracing and metrics][aigw] * [LiteLLM](litellm) - with [OpenTelemetry logging callbacks][litellm] * [LlamaStack](llama-stack) - with [OpenTelemetry sinks][llama-stack] -* [OpenResponses](open-responses) - with [OpenTelemetry export][open-responses] +* [AgC](AgC) - with [OpenTelemetry export][AgC] * [vLLM](vllm) - with [OpenTelemetry POC][vllm] configuration If you use Elastic Stack, an example would look like this in Kibana: @@ -109,7 +109,7 @@ To start and use Ollama, do the following: [archgw]: https://docs.archgw.com/guides/observability/tracing.html [litellm]: https://llama-stack.readthedocs.io/en/latest/building_applications/telemetry.html#configuration [llama-stack]: https://llama-stack.readthedocs.io/en/latest/building_applications/telemetry.html#telemetry -[open-responses]: https://github.com/masaic-ai-platform/docs/blob/main/openresponses/observability.mdx +[AgC]: https://github.com/masaic-ai-platform/AgC/blob/main/platform/README.md#setting-up-the-opentelemetry-collector [vllm]: https://github.com/vllm-project/vllm/blob/main/examples/online_serving/opentelemetry/README.md [uv]: https://docs.astral.sh/uv/getting-started/installation/ [ollama-dl]: https://ollama.com/download diff --git a/inference-platforms/open-responses/README.md b/inference-platforms/open-responses/README.md deleted file mode 100644 index f7ded44..0000000 --- a/inference-platforms/open-responses/README.md +++ /dev/null @@ -1,48 +0,0 @@ -# open-responses - -This shows how to use the Masaic OpenResponses as an [OpenAI Responses adapter][docs], -using its [OpenTelemetry configuration][config]. - -OpenAI Responses API requests are adapted and forwarded to Ollama as chat -completions. - -## Prerequisites - -Start Ollama and your OpenTelemetry Collector via this repository's [README](../README.md). - -## Run OpenResponses - -```bash -docker compose up --pull always --force-recreate --remove-orphans -``` - -Clean up when finished, like this: - -```bash -docker compose down -``` - -## Call OpenResponses with python - -Once OpenResponses is running, use [uv][uv] to make an OpenAI request via -[chat.py](../chat.py): - -```bash -# Set the OpenAI base URL to the OpenResponses proxy, not Ollama -OPENAI_BASE_URL=http://localhost:8080/v1 uv run --exact -q --env-file env.local ../chat.py -``` - -Or, for the OpenAI Responses API -```bash -OPENAI_BASE_URL=http://localhost:8080/v1 uv run --exact -q --env-file env.local ../chat.py --use-responses-api -``` - -## Notes - -OpenResponses is a Spring Boot application, so signals collected are adapted to -OpenTelemetry via a Micrometer bridge. 
-
----
-[doc]: https://openresponses.masaic.ai/openresponses/compatibility
-[config]: https://github.com/masaic-ai-platform/docs/blob/main/openresponses/observability.mdx
-[uv]: https://docs.astral.sh/uv/getting-started/installation/
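
A quick way to smoke-test the new 6644 port mapping, beyond chat.py, is a direct call to the Responses endpoint the AgC proxy exposes. The sketch below is illustrative only: it assumes the OpenAI-compatible `/v1/responses` path and reuses the `CHAT_MODEL` and `OPENAI_API_KEY` values from env.local, so adjust both if your open-responses version expects something different.

```bash
# Hedged sketch: a minimal Responses API request against the AgC proxy started
# by `docker compose up` above. Path, header, and model name mirror the
# env.local defaults and may need adjusting for your setup.
curl -s http://localhost:6644/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer unused" \
  -d '{"model": "qwen3:0.6B", "input": "Say hello in one short sentence."}'
```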