Switch to LiteLLM Responses API; default gpt-5, fast gpt-5-nano (high reasoning) #5
shouryamaanjain wants to merge 4 commits into main from
Conversation
…-5/gpt-5-nano with high reasoning effort
…ens-based trimming with configurable remote window
…o) to avoid 400 'Unsupported parameter'
…user/assistant), skip empty content; treat tool outputs as user text to avoid missing content errors
Pull Request Overview
This PR migrates Emplode from using the OpenAI Chat Completions API to the newer Responses API via LiteLLM, while updating default models to GPT-5 family with high reasoning effort.
- Switches remote model invocation from streaming Chat Completions to non-streaming Responses API
- Updates default model from `gpt-4o` to `gpt-5` and fast mode from `gpt-4o-mini` to `gpt-5-nano`
- Implements custom tool handling with response parsing for the new API structure
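The custom tool handling mentioned above has to walk the Responses API's list of typed output items instead of Chat Completions' `choices`. A minimal sketch of such a parser, assuming the documented item shapes (`message` items carrying `output_text` parts, `function_call` items carrying a JSON-encoded `arguments` string); `parse_response_output` is an illustrative helper, not the PR's actual function:

```python
import json

def parse_response_output(output):
    """Split Responses API output items into assistant text and run_code tool calls."""
    text_parts, tool_calls = [], []
    for item in output:
        if item.get("type") == "message":
            # Message items hold a list of content parts; collect the text ones.
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    text_parts.append(part.get("text", ""))
        elif item.get("type") == "function_call":
            # Tool calls arrive with a JSON string of arguments.
            tool_calls.append({
                "name": item.get("name"),
                "arguments": json.loads(item.get("arguments") or "{}"),
            })
    return "".join(text_parts), tool_calls
```

A caller would execute any returned `run_code` calls and feed the results back as a follow-up request.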
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| emplode/emplode.py | Core API migration to Responses API with new response parsing logic and model defaults |
| emplode/cli.py | Updated CLI help text and fast mode model configuration |
| .capy/pr-body-cap-2-41193abd.md | Added PR description file |
```python
try:
    messages = tt.trim(self.messages, "gpt-4o", system_message=system_message)
except Exception:
    remote_window = int(os.environ.get("EMPLODE_REMOTE_CONTEXT_WINDOW", "128000"))
    budget = max(512, remote_window - self.max_tokens - 1000)
    messages = tt.trim(self.messages, max_tokens=budget, system_message=system_message)
```
The fallback to 'gpt-4o' for token trimming is inconsistent with the new GPT-5 defaults. Consider using 'gpt-5' as the fallback model instead, or use a more generic approach that doesn't hardcode specific model names.
```diff
 missing_azure_info_message = """> Azure OpenAI Service API info not found
-To use `GPT-4` (recommended) please provide an Azure OpenAI API key, a API base, a deployment name and a API version.
+To use `GPT-5` (recommended) please provide an Azure OpenAI API key, an API base, a deployment name and an API version.
```
Grammar issue: should be 'an Azure OpenAI API key, an API base' instead of 'a API base'.
Summary
Migrate Emplode remote model invocation to LiteLLM OpenAI Responses API and set default models to GPT‑5 family with high reasoning effort.
Changes
- Default model is now `gpt-5`; `emplode --fast` now uses `gpt-5-nano`.
- `reasoning: { effort: "high" }` for both default and fast modes.
- `max_output_tokens` wired to Emplode’s `max_tokens`.
- `run_code` tool defined for function-calling.

Nature of change
Enhancement / Refactor (API migration + model defaults).
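The request side of the changes listed above can be sketched as a small kwargs builder. This is illustrative only: `build_responses_kwargs` is a hypothetical helper, and the `run_code` parameter schema shown here is an assumption rather than the PR's exact definition.

```python
def build_responses_kwargs(messages, max_tokens, fast=False):
    """Assemble Responses API arguments: model choice, reasoning effort,
    output-token cap, and the run_code function tool."""
    return {
        "model": "gpt-5-nano" if fast else "gpt-5",
        "input": messages,
        "max_output_tokens": max_tokens,
        "reasoning": {"effort": "high"},
        "tools": [{
            "type": "function",
            "name": "run_code",
            "description": "Execute code on the user's machine",
            "parameters": {  # assumed schema, for illustration
                "type": "object",
                "properties": {
                    "language": {"type": "string"},
                    "code": {"type": "string"},
                },
                "required": ["language", "code"],
            },
        }],
    }
```

The resulting dict would be splatted into the Responses API call.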
Impact
- `run_code` tool calls are parsed and executed as before.
- If streaming is required later, we can enable `stream=True` with the Responses API.
- Azure continues to work via `model=f"azure/<deployment>"` using the existing environment variables.

Why
- Fast mode gets a cheaper model (`gpt-5-nano`) while keeping reasoning quality high.

Configuration notes
- `--api_base` for custom OpenAI-compatible backends (uses the `custom/<model>` path).
- `AZURE_API_KEY` or `OPENAI_API_KEY`, `AZURE_API_BASE`, `AZURE_API_VERSION`, `AZURE_DEPLOYMENT_NAME` (wired to the Responses API).

Generated by Capy
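The Azure wiring in the configuration notes amounts to prefixing the deployment name onto the model string. A minimal sketch, assuming the environment variables listed above (`resolve_model` is an illustrative helper, not code from the PR):

```python
import os

def resolve_model(default: str = "gpt-5") -> str:
    """Prefer an Azure deployment when one is configured, else the default model."""
    deployment = os.environ.get("AZURE_DEPLOYMENT_NAME")
    return f"azure/{deployment}" if deployment else default
```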