
Switch to LiteLLM Responses API; default gpt-5, fast gpt-5-nano (high reasoning) #5

Open

shouryamaanjain wants to merge 4 commits into main from capy/cap-2-41193abd

Conversation

@shouryamaanjain
Contributor

Summary

Migrate Emplode's remote model invocation to the OpenAI Responses API via LiteLLM, and set the default models to the GPT‑5 family with high reasoning effort.

Changes

  • Default model set to gpt-5.
  • emplode --fast now uses gpt-5-nano.
  • Switched remote (non-local) inference from streaming Chat Completions to the Responses API via LiteLLM, with:
    • reasoning: { effort: "high" } for both default and fast modes.
    • max_output_tokens wired to Emplode’s max_tokens.
    • Custom run_code tool defined for function-calling.
  • Updated CLI help text and user-facing messages from GPT-4/4o to GPT-5.
  • Local mode behavior remains unchanged.
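As a rough sketch of the changes listed above, the request could be assembled like this for LiteLLM's `responses()` entry point. The helper name and the exact `run_code` schema are illustrative, not taken from the PR:

```python
# Illustrative sketch: build keyword arguments for litellm.responses().
# build_responses_kwargs and the run_code schema below are assumptions,
# not Emplode's actual identifiers.

def build_responses_kwargs(messages, fast=False, max_tokens=4096):
    """Assemble kwargs for a Responses API call via LiteLLM."""
    run_code_tool = {
        # Responses API function tools use a flattened shape
        # (name/description/parameters at the top level).
        "type": "function",
        "name": "run_code",
        "description": "Execute code on the user's machine and return its output.",
        "parameters": {
            "type": "object",
            "properties": {
                "language": {"type": "string"},
                "code": {"type": "string"},
            },
            "required": ["language", "code"],
        },
    }
    return {
        "model": "gpt-5-nano" if fast else "gpt-5",  # default vs. --fast
        "input": messages,
        "reasoning": {"effort": "high"},   # high effort in both modes
        "max_output_tokens": max_tokens,   # wired to Emplode's max_tokens
        "tools": [run_code_tool],
    }

# The actual (network-bound) call would then be roughly:
#   import litellm
#   response = litellm.responses(**build_responses_kwargs(msgs))
```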

Nature of change

Enhancement / Refactor (API migration + model defaults).

Impact

  • Non-breaking for local mode.
  • Remote mode now uses the Responses API non-streaming flow. Assistant text and any run_code tool calls are parsed and executed as before. If streaming is required later, we can enable stream=True with the Responses API.
  • Azure path continues to work via model=f"azure/<deployment>" using the existing environment variables.
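A minimal sketch of the parsing step described above, assuming the standard Responses API output shape (a list of output items mixing `message` and `function_call` entries). The function name is illustrative, not Emplode's actual code:

```python
# Hypothetical sketch: split a non-streaming Responses API result into
# assistant text and run_code tool calls, per the PR's description.

def parse_response_output(output_items):
    """Return (assistant_text, run_code_calls) from Responses API output."""
    text_parts, tool_calls = [], []
    for item in output_items:
        if item.get("type") == "message":
            # Assistant messages carry a list of content parts.
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    text_parts.append(part.get("text", ""))
        elif item.get("type") == "function_call" and item.get("name") == "run_code":
            tool_calls.append(item)
    return "".join(text_parts), tool_calls
```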

Why

  • Align Emplode with the newer OpenAI Responses API through LiteLLM for better support of reasoning settings and future features.
  • Standardize on GPT‑5 family and expose a fast option (gpt-5-nano) while keeping reasoning quality high.

Configuration notes

  • OPENAI_API_KEY required for OpenAI.
  • Optional: --api_base for custom OpenAI-compatible backends (uses custom/<model> path).
  • Azure: AZURE_API_KEY or OPENAI_API_KEY, AZURE_API_BASE, AZURE_API_VERSION, AZURE_DEPLOYMENT_NAME (wired to Responses API).
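As a sketch, the Azure configuration above might look like the following environment setup. Variable names come from the PR body; all values are placeholders:

```shell
# Illustrative Azure OpenAI environment for Emplode's Responses API path.
# Every value below is a placeholder, not a value from the PR.
export AZURE_API_KEY="<your-azure-openai-key>"        # or set OPENAI_API_KEY instead
export AZURE_API_BASE="https://<resource>.openai.azure.com"
export AZURE_API_VERSION="<api-version>"
export AZURE_DEPLOYMENT_NAME="<deployment>"

# LiteLLM then routes the call via model="azure/<deployment>".
```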

₍ᐢ•(ܫ)•ᐢ₎ Generated by Capy

@shouryamaanjain shouryamaanjain added the capy PR created by Capy label Sep 13, 2025
…ens-based trimming with configurable remote window
…user/assistant), skip empty content; treat tool outputs as user text to avoid missing content errors

Copilot AI left a comment


Pull Request Overview

This PR migrates Emplode from using the OpenAI Chat Completions API to the newer Responses API via LiteLLM, while updating default models to GPT-5 family with high reasoning effort.

  • Switches remote model invocation from streaming Chat Completions to non-streaming Responses API
  • Updates default model from gpt-4o to gpt-5 and fast mode from gpt-4o-mini to gpt-5-nano
  • Implements custom tool handling with response parsing for the new API structure

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| emplode/emplode.py | Core API migration to Responses API with new response parsing logic and model defaults |
| emplode/cli.py | Updated CLI help text and fast mode model configuration |
| .capy/pr-body-cap-2-41193abd.md | Added PR description file |


Comment on lines +476 to +481
```python
try:
    messages = tt.trim(self.messages, "gpt-4o", system_message=system_message)
except Exception:
    remote_window = int(os.environ.get("EMPLODE_REMOTE_CONTEXT_WINDOW", "128000"))
    budget = max(512, remote_window - self.max_tokens - 1000)
    messages = tt.trim(self.messages, max_tokens=budget, system_message=system_message)
```

Copilot AI Sep 28, 2025


The fallback to 'gpt-4o' for token trimming is inconsistent with the new GPT-5 defaults. Consider using 'gpt-5' as the fallback model instead, or use a more generic approach that doesn't hardcode specific model names.
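A hedged sketch of the reviewer's suggestion: derive the trim budget from the configurable window rather than a hardcoded model name. The function name is illustrative:

```python
# Sketch of a model-agnostic token budget for history trimming,
# reusing the PR's EMPLODE_REMOTE_CONTEXT_WINDOW variable.
import os

def trim_budget(max_tokens, default_window=128000, floor=512, headroom=1000):
    """Token budget for trimming, independent of any model name."""
    window = int(os.environ.get("EMPLODE_REMOTE_CONTEXT_WINDOW", str(default_window)))
    # Reserve space for the reply (max_tokens) plus fixed headroom,
    # but never go below a small floor.
    return max(floor, window - max_tokens - headroom)
```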

```diff
 missing_azure_info_message = """> Azure OpenAI Service API info not found

-To use `GPT-4` (recommended) please provide an Azure OpenAI API key, a API base, a deployment name and a API version.
+To use `GPT-5` (recommended) please provide an Azure OpenAI API key, an API base, a deployment name and an API version.
```

Copilot AI Sep 28, 2025


Grammar issue: should be 'an Azure OpenAI API key, an API base' instead of 'a API base'.


Labels

capy PR created by Capy
