
Migrate to LiteLLM Responses API; default gpt-5; fast uses gpt-5-nano#4

Open
shouryamaanjain wants to merge 1 commit into main from capy/set-default-model-to-a9340cba

Conversation

@shouryamaanjain
Contributor

Summary (why)

  • Default to gpt-5 for improved capability and alignment with OpenAI Responses API.
  • Fast mode now uses gpt-5-nano to provide a lower-latency option while keeping the same UX.
  • Migrate from litellm.completion (Chat Completions) to litellm.responses (OpenAI Responses API) to support reasoning effort control and modern tool-call semantics.

What changed

  • Default model set to gpt-5 (emplode/emplode.py).
  • --fast now maps to gpt-5-nano and help text updated (emplode/cli.py).
  • All cloud calls now use litellm.responses with:
    • input=messages (instead of messages=...)
    • tools=[{"type":"custom","name":"run_code", ...}] mirroring existing run_code(language, code) schema
    • stream=True
    • reasoning={"effort":"high"}
    • temperature preserved
    • Azure and custom api_base branches preserved
  • Streaming/event handling adapted to the Responses API: streamed events are normalized into the existing merge_deltas buffer, preserving the tool/code-execution flow.
  • Local/llama path unchanged.
  • Dependency: bump litellm to ^1.63.8 to ensure Responses API support.
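A minimal sketch of how the migrated call could be assembled from the bullets above. The tool schema and helper name are illustrative, not the actual code in emplode/emplode.py; the keyword names follow litellm's `responses()` entry point as described in this PR:

```python
# Illustrative tool schema mirroring the existing run_code(language, code) tool.
TOOLS = [{
    "type": "custom",
    "name": "run_code",
    "description": "Execute a code snippet in the given language.",
}]

def build_responses_kwargs(model, messages, temperature=0.0, api_base=None):
    """Assemble keyword arguments for a litellm.responses() call.

    Hypothetical helper; shown only to summarize the parameter changes
    described in this PR (input= instead of messages=, reasoning effort,
    the custom/ model prefix for a user-supplied api_base).
    """
    kwargs = {
        "model": f"custom/{model}" if api_base else model,
        "input": messages,                 # Responses API uses input=, not messages=
        "tools": TOOLS,
        "stream": True,
        "reasoning": {"effort": "high"},
        "temperature": temperature,
    }
    if api_base:
        kwargs["api_base"] = api_base
    return kwargs

kwargs = build_responses_kwargs("gpt-5", [{"role": "user", "content": "hi"}])
# response = litellm.responses(**kwargs)  # streamed events are then normalized
```

The actual call site additionally branches for Azure, which this sketch omits.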

Impact

  • CLI UX unchanged except model names.
  • Streaming output and code execution tool-calls continue to work.
  • Azure mode and custom api_base still function with responses().

Testing

  • Default: run emplode → model gpt-5, responses() streaming, reasoning={effort:high}.
  • Fast: run emplode --fast → model gpt-5-nano with same behavior.
  • Tool call: ask it to run a short Python snippet → confirm run_code executes and follow-up continues.
  • Azure: set EMPLODE_CLI_USE_AZURE=true with env vars (AZURE_API_KEY/BASE/VERSION/DEPLOYMENT_NAME) → confirm streaming works.
  • Custom base: pass --api_base <url> → confirm call uses model="custom/"+model and streams.
  • Local: --local unchanged.
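The streaming normalization being tested above could look roughly like this. Event type names follow the OpenAI Responses streaming events; the exact delta shape fed into merge_deltas is an assumption, not the repository's actual implementation:

```python
def normalize_event(event):
    """Map a Responses API streamed event to a Chat-Completions-style delta.

    Returns None for event types the merge buffer does not consume
    (e.g. response.created, response.completed).
    """
    etype = event.get("type")
    if etype == "response.output_text.delta":
        # Plain assistant text.
        return {"role": "assistant", "content": event.get("delta", "")}
    if etype == "response.custom_tool_call_input.delta":
        # Incremental arguments for the run_code custom tool call.
        return {"role": "assistant",
                "function_call": {"arguments": event.get("delta", "")}}
    return None
```

Keeping this mapping in one function is what lets the existing merge_deltas flow stay unchanged downstream.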

Files

  • emplode/emplode.py: default model + responses() migration + streaming normalization for Responses events.
  • emplode/cli.py: --fast maps to gpt-5-nano; help updated.
  • pyproject.toml: litellm ^1.63.8.

Notes

  • The normalized event handling is encapsulated for minimal churn, so if more complex sessions later require chaining tool output via previous_response_id, it can be extended in one place.
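If that chaining becomes necessary, the extension could thread the prior response id into follow-up calls. A sketch under the assumption that litellm forwards the Responses API's previous_response_id field; the helper name is hypothetical:

```python
def with_chaining(kwargs, last_response_id=None):
    """Return call kwargs, linking to the prior turn when one exists."""
    if last_response_id is not None:
        kwargs = {**kwargs, "previous_response_id": last_response_id}
    return kwargs

first = with_chaining({"model": "gpt-5", "input": [{"role": "user", "content": "hi"}]})
follow_up = with_chaining({"model": "gpt-5", "input": []}, "resp_123")
```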

₍ᐢ•(ܫ)•ᐢ₎ Generated by Capy

…sponses with high reasoning + streaming/tool support; bump litellm to ^1.63.8
@shouryamaanjain shouryamaanjain added the capy PR created by Capy label Sep 13, 2025