@juanmichelini juanmichelini commented Dec 23, 2025

Summary

Fixes #1495

This PR adds all the expected models from the index to the allowed models list in .github/run-eval/resolve_model_config.py.

Changes

New Models Added

| Model ID | LiteLLM Model Path | Display Name |
|---|---|---|
| claude-4.5-opus | litellm_proxy/anthropic/claude-opus-4-5-20251101 | Claude 4.5 Opus |
| claude-4.5-sonnet | litellm_proxy/anthropic/claude-sonnet-4-5-20250929 | Claude 4.5 Sonnet |
| gemini-3-pro | litellm_proxy/gemini/gemini-3-pro-preview | Gemini 3 Pro |
| gemini-3-flash | litellm_proxy/gemini/gemini-3-flash-preview | Gemini 3 Flash |
| gpt-5.2-high-reasoning | litellm_proxy/openai/gpt-5.2-pro | GPT-5.2 High Reasoning |
| gpt-5.2 | litellm_proxy/openai/gpt-5.2 | GPT-5.2 |
| minimax-m2 | litellm_proxy/minimax/minimax-m2 | MiniMax M2 |
| deepseek-v3.2-reasoner | litellm_proxy/deepseek/deepseek-v3.2 | DeepSeek V3.2 Reasoner |
| qwen-3-coder | litellm_proxy/qwen/qwen3-coder | Qwen 3 Coder |

Note: kimi-k2-thinking was already present in the configuration.
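
Based on the code fragments quoted later in this conversation, a new entry in the `MODELS` dictionary has roughly this shape (a sketch; the field names come from those snippets, and the `temperature` value is assumed from the same context):

```python
# Sketch of one new MODELS entry in resolve_model_config.py.
# Field names ("id", "display_name", "llm_config") are taken from the
# snippets quoted in this PR's review threads; exact details may differ.
MODELS = {
    "claude-4.5-opus": {
        "id": "claude-4.5-opus",
        "display_name": "Claude 4.5 Opus",
        "llm_config": {
            "model": "litellm_proxy/anthropic/claude-opus-4-5-20251101",
            "temperature": 0.0,
        },
    },
}
```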

Test Updates

  • Fixed the test file to import from the correct module (resolve_model_config instead of resolve_model_configs)
  • Updated tests to match the current implementation (using global MODELS dictionary)
  • Added new tests to verify all expected models are present and correctly configured
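
The new presence/consistency tests likely follow a pattern like this (a sketch; only the test names match the PR's test output, while the bodies and the inline `MODELS` sample are illustrative):

```python
# Illustrative versions of two of the added tests. In the real test file,
# MODELS is imported from resolve_model_config; a minimal sample is inlined
# here so the sketch is self-contained.
MODELS = {
    "claude-4.5-opus": {
        "id": "claude-4.5-opus",
        "display_name": "Claude 4.5 Opus",
        "llm_config": {"model": "litellm_proxy/anthropic/claude-opus-4-5-20251101"},
    },
}

def test_expected_models_id_matches_key():
    # Each entry's "id" field must match its dictionary key.
    for key, config in MODELS.items():
        assert config["id"] == key

def test_expected_models_have_required_fields():
    # Every entry needs a display name and a resolvable llm_config.
    for config in MODELS.values():
        assert "display_name" in config
        assert "model" in config["llm_config"]
```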

Testing

All tests pass:

```
tests/github_workflows/test_resolve_model_config.py::test_find_models_by_id_single_model PASSED
tests/github_workflows/test_resolve_model_config.py::test_find_models_by_id_multiple_models PASSED
tests/github_workflows/test_resolve_model_config.py::test_find_models_by_id_preserves_order PASSED
tests/github_workflows/test_resolve_model_config.py::test_find_models_by_id_missing_model_exits PASSED
tests/github_workflows/test_resolve_model_config.py::test_find_models_by_id_empty_list PASSED
tests/github_workflows/test_resolve_model_config.py::test_find_models_by_id_preserves_full_config PASSED
tests/github_workflows/test_resolve_model_config.py::test_all_expected_models_present PASSED
tests/github_workflows/test_resolve_model_config.py::test_expected_models_have_required_fields PASSED
tests/github_workflows/test_resolve_model_config.py::test_expected_models_id_matches_key PASSED
tests/github_workflows/test_resolve_model_config.py::test_find_all_expected_models PASSED
```



Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
|---|---|---|---|
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.12-nodejs22 | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

```shell
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:a3f8318-python
```

Run

```shell
docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-a3f8318-python \
  ghcr.io/openhands/agent-server:a3f8318-python
```

All tags pushed for this build

```
ghcr.io/openhands/agent-server:a3f8318-golang-amd64
ghcr.io/openhands/agent-server:a3f8318-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:a3f8318-golang-arm64
ghcr.io/openhands/agent-server:a3f8318-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:a3f8318-java-amd64
ghcr.io/openhands/agent-server:a3f8318-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:a3f8318-java-arm64
ghcr.io/openhands/agent-server:a3f8318-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:a3f8318-python-amd64
ghcr.io/openhands/agent-server:a3f8318-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:a3f8318-python-arm64
ghcr.io/openhands/agent-server:a3f8318-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:a3f8318-golang
ghcr.io/openhands/agent-server:a3f8318-java
ghcr.io/openhands/agent-server:a3f8318-python
```

About Multi-Architecture Support

  • Each variant tag (e.g., a3f8318-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., a3f8318-python-amd64) are also available if needed

Add the following models to the allowed models list for workflows:
- claude-4.5-opus (litellm_proxy/anthropic/claude-opus-4-5-20251101)
- claude-4.5-sonnet (litellm_proxy/anthropic/claude-sonnet-4-5-20250929)
- gemini-3-pro (litellm_proxy/gemini/gemini-3-pro-preview)
- gemini-3-flash (litellm_proxy/gemini/gemini-3-flash-preview)
- gpt-5.2-high-reasoning (litellm_proxy/openai/gpt-5.2-pro)
- gpt-5.2 (litellm_proxy/openai/gpt-5.2)
- minimax-m2 (litellm_proxy/minimax/minimax-m2)
- deepseek-v3.2-reasoner (litellm_proxy/deepseek/deepseek-v3.2)
- qwen-3-coder (litellm_proxy/qwen/qwen3-coder)

Also fix and update tests to match the current implementation.

Co-authored-by: openhands <openhands@all-hands.dev>
```python
            "temperature": 0.0,
        },
    },
    "claude-4.5-sonnet": {
```

this model is already in the dictionary


It's the unpinned version, which highlights that most of the models were added unpinned. I'll ask it to update them.

@juanmichelini

@OpenHands please look at MODELS = {...} inside resolve_model_config.py

Some of the models are pinned, like "claude-sonnet-4-5-20250929" (notice it ends in a date); others are not, like "claude-4.5-opus" (notice there is no date).

For each of the models already added, if it is unpinned, let's check whether there is a pinned version:
that is, a version with the exact same name but with a date suffix.
Check whether a pinned version of the model exists in https://github.com/BerriAI/litellm/.
If so, let's use that pinned version of the model.

Notes:

  • Only update unpinned models; leave already pinned models as they are.
  • When updating an unpinned model, if there are two pinned versions, use the most recent one.
  • Note that we only want to pin the model, so only a date suffix is added; the name should not change except for the date.
  • The pinned model must exist in https://github.com/BerriAI/litellm/; otherwise leave the model unpinned.
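
The pinning rule described above can be sketched in code. This is an illustrative helper, not the agent's actual implementation; `is_pinned` and `pick_pinned` are hypothetical names, and the date-suffix formats (YYYYMMDD and YYYY-MM-DD) are assumed from the model names quoted in this thread:

```python
import re

# A pinned model name is the unpinned name plus a trailing date suffix,
# e.g. "claude-sonnet-4-5-20250929" or "gpt-5.2-2025-12-11".
PINNED_RE = re.compile(r"-(\d{4}-\d{2}-\d{2}|\d{8})$")

def is_pinned(model_name: str) -> bool:
    return bool(PINNED_RE.search(model_name))

def pick_pinned(unpinned: str, available: list[str]) -> str:
    # Keep only candidates that are exactly the unpinned name plus a date
    # suffix, then choose the most recent; lexicographic max works because
    # the date formats sort chronologically.
    candidates = [
        m for m in available
        if m.startswith(unpinned + "-")
        and is_pinned(m)
        and PINNED_RE.sub("", m) == unpinned
    ]
    return max(candidates) if candidates else unpinned
```

Note that `pick_pinned("gpt-5.2", ...)` would not match `gpt-5.2-pro-2025-12-11`, since stripping the date leaves `gpt-5.2-pro`, a different model name.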


openhands-ai bot commented Dec 23, 2025

I'm on it! juanmichelini can track my progress at all-hands.dev

- gpt-5.2-pro -> gpt-5.2-pro-2025-12-11
- gpt-5.2 -> gpt-5.2-2025-12-11

These are the only models that have pinned versions available in litellm.
Other models (deepseek-chat, kimi-k2-thinking, gemini-3-pro-preview,
gemini-3-flash-preview, minimax-m2, deepseek-v3.2, qwen3-coder) do not
have pinned versions in litellm and are left unchanged.

Also fixed test file to use correct model ID (claude-sonnet-4-5-20250929
instead of claude-4.5-sonnet).

Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai bot commented Dec 23, 2025

I've analyzed all models in the MODELS dictionary in resolve_model_config.py and checked for pinned versions in the litellm repository.

Summary of Changes

Models Pinned (2 models updated):

| Model | Before | After |
|---|---|---|
| gpt-5.2-pro | litellm_proxy/openai/gpt-5.2-pro | litellm_proxy/openai/gpt-5.2-pro-2025-12-11 |
| gpt-5.2 | litellm_proxy/openai/gpt-5.2 | litellm_proxy/openai/gpt-5.2-2025-12-11 |

Models Left Unchanged (no pinned version exists in litellm):

  • deepseek-chat - no pinned version available
  • kimi-k2-thinking - no pinned version available
  • gemini-3-pro-preview - no pinned version available
  • gemini-3-flash-preview - no pinned version available
  • minimax-m2 - no pinned version available
  • deepseek-v3.2 - no pinned version available
  • qwen3-coder - no pinned version available

Already Pinned Models (no changes needed):

  • claude-sonnet-4-5-20250929
  • claude-haiku-4-5-20251001
  • gpt-5-mini-2025-08-07
  • claude-opus-4-5-20251101 (in llm_config)

Additional Fix:

Fixed a pre-existing test issue where EXPECTED_MODELS referenced claude-4.5-sonnet but the actual model ID in MODELS is claude-sonnet-4-5-20250929.

Checklist:

  • ✅ Only updated unpinned models
  • ✅ Only added date suffix (name unchanged except for date)
  • ✅ Pinned versions verified to exist in litellm
  • ✅ Used most recent pinned version (2025-12-11)
  • ✅ All tests pass
  • ✅ Pre-commit hooks pass
  • ✅ Changes pushed to PR branch


```python
    "gpt-5.2-high-reasoning": {
        "id": "gpt-5.2-high-reasoning",
        "display_name": "GPT-5.2 High Reasoning",
        "llm_config": {"model": "litellm_proxy/openai/gpt-5.2-pro-2025-12-11"},
```

I thought "GPT 5.2 Pro" is not the same as "GPT 5.2 with reasoning='high'", is it?

Pricing is very different too: Pro is apparently $21 input / $168 output, while GPT-5.2, at any reasoning setting, is $1.75 / $14.

I'm not sure it's available at all via the AH proxy. I can access it via the API with a ChatGPT subscription, though.


More details are, to my knowledge, in Slack.

Please do correct me if I'm wrong.

@all-hands-bot

[Automatic Post]: It has been a while since there was any activity on this PR. @juanmichelini, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.



Development

Successfully merging this pull request may close these issues.

Add desired index models to list of allowed models in workflow
