Switch to LiteLLM Responses API; default gpt-5, fast gpt-5-nano (high reasoning) #5
shouryamaanjain wants to merge 4 commits into main from
Conversation
…-5/gpt-5-nano with high reasoning effort
…ens-based trimming with configurable remote window
…o) to avoid 400 'Unsupported parameter'
…user/assistant), skip empty content; treat tool outputs as user text to avoid missing content errors
Pull Request Overview
This PR migrates Emplode from using the OpenAI Chat Completions API to the newer Responses API via LiteLLM, while updating default models to GPT-5 family with high reasoning effort.
- Switches remote model invocation from streaming Chat Completions to non-streaming Responses API
- Updates default model from `gpt-4o` to `gpt-5` and fast mode from `gpt-4o-mini` to `gpt-5-nano`
- Implements custom tool handling with response parsing for the new API structure
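The custom tool handling mentioned above has to walk the Responses API's list of typed output items instead of Chat Completions' `choices`. A minimal sketch of such a parser, assuming the documented item shapes (`message` items carrying `output_text` parts, `function_call` items carrying a JSON-encoded `arguments` string); `parse_response_output` is an illustrative helper, not the PR's actual function:

```python
import json

def parse_response_output(output):
    """Split Responses API output items into assistant text and run_code tool calls."""
    text_parts, tool_calls = [], []
    for item in output:
        if item.get("type") == "message":
            # Message items hold a list of content parts; collect the text ones.
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    text_parts.append(part.get("text", ""))
        elif item.get("type") == "function_call":
            # Tool calls arrive with a JSON string of arguments.
            tool_calls.append({
                "name": item.get("name"),
                "arguments": json.loads(item.get("arguments") or "{}"),
            })
    return "".join(text_parts), tool_calls
```

A caller would execute any returned `run_code` calls and feed the results back as a follow-up request.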
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| emplode/emplode.py | Core API migration to Responses API with new response parsing logic and model defaults |
| emplode/cli.py | Updated CLI help text and fast mode model configuration |
| .capy/pr-body-cap-2-41193abd.md | Added PR description file |
```python
try:
    messages = tt.trim(self.messages, "gpt-4o", system_message=system_message)
except Exception:
    remote_window = int(os.environ.get("EMPLODE_REMOTE_CONTEXT_WINDOW", "128000"))
    budget = max(512, remote_window - self.max_tokens - 1000)
    messages = tt.trim(self.messages, max_tokens=budget, system_message=system_message)
```
The fallback to 'gpt-4o' for token trimming is inconsistent with the new GPT-5 defaults. Consider using 'gpt-5' as the fallback model instead, or use a more generic approach that doesn't hardcode specific model names.
```diff
 missing_azure_info_message = """> Azure OpenAI Service API info not found
-To use `GPT-4` (recommended) please provide an Azure OpenAI API key, a API base, a deployment name and a API version.
+To use `GPT-5` (recommended) please provide an Azure OpenAI API key, an API base, a deployment name and an API version.
```
Grammar issue: should be 'an Azure OpenAI API key, an API base' instead of 'a API base'.
Summary
Migrate Emplode remote model invocation to LiteLLM OpenAI Responses API and set default models to GPT‑5 family with high reasoning effort.
Changes
- Default model is now `gpt-5`; `emplode --fast` now uses `gpt-5-nano`.
- `reasoning: { effort: "high" }` for both default and fast modes.
- `max_output_tokens` wired to Emplode’s `max_tokens`.
- `run_code` tool defined for function-calling.

Nature of change
Enhancement / Refactor (API migration + model defaults).
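The request side of the changes listed above can be sketched as a small kwargs builder. This is illustrative only: `build_responses_kwargs` is a hypothetical helper, and the `run_code` parameter schema shown here is an assumption rather than the PR's exact definition.

```python
def build_responses_kwargs(messages, max_tokens, fast=False):
    """Assemble Responses API arguments: model choice, reasoning effort,
    output-token cap, and the run_code function tool."""
    return {
        "model": "gpt-5-nano" if fast else "gpt-5",
        "input": messages,
        "max_output_tokens": max_tokens,
        "reasoning": {"effort": "high"},
        "tools": [{
            "type": "function",
            "name": "run_code",
            "description": "Execute code on the user's machine",
            "parameters": {  # assumed schema, for illustration
                "type": "object",
                "properties": {
                    "language": {"type": "string"},
                    "code": {"type": "string"},
                },
                "required": ["language", "code"],
            },
        }],
    }
```

The resulting dict would be splatted into the Responses API call.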
Impact
- `run_code` tool calls are parsed and executed as before.
- If streaming is required later, we can enable `stream=True` with the Responses API.
- Azure continues to work via `model=f"azure/<deployment>"` using the existing environment variables.

Why
- Fast mode gets a cheaper model (`gpt-5-nano`) while keeping reasoning quality high.

Configuration notes
- `--api_base` for custom OpenAI-compatible backends (uses the `custom/<model>` path).
- `AZURE_API_KEY` or `OPENAI_API_KEY`, `AZURE_API_BASE`, `AZURE_API_VERSION`, `AZURE_DEPLOYMENT_NAME` (wired to the Responses API).

Generated by Capy
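The Azure wiring in the configuration notes amounts to prefixing the deployment name onto the model string. A minimal sketch, assuming the environment variables listed above (`resolve_model` is an illustrative helper, not code from the PR):

```python
import os

def resolve_model(default: str = "gpt-5") -> str:
    """Prefer an Azure deployment when one is configured, else the default model."""
    deployment = os.environ.get("AZURE_DEPLOYMENT_NAME")
    return f"azure/{deployment}" if deployment else default
```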