Conversation

@simonrosenberg
Collaborator

@simonrosenberg simonrosenberg commented Dec 31, 2025

Summary

This PR adds a simple workaround for the transient ConversationInfo validation error described in #1557.

Problem

During an SWT-bench evaluation run, GET /api/conversations/{id} returned HTTP 500 because constructing ConversationInfo raised a pydantic_core.ValidationError. The error reported that the required fields id, agent, and workspace were missing from the input dict.

The cause is a race condition: ConversationState.model_dump() can return incomplete data while the state is being mutated concurrently.

Solution

Add a _get_state_dump_with_retry() function that:

  1. Calls state.model_dump() and checks if all required fields (id, agent, workspace) are present
  2. If fields are missing, waits 0.1s and retries
  3. If still missing, waits 0.3s and retries
  4. If still missing after all retries, raises a RuntimeError with a helpful error message linking to the issue

This is a workaround. The proper fix would be to implement locking in ConversationState.model_dump(), as proposed in PR #1558. A TODO comment linking to the issue has been added for future reference.
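The retry steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function takes a hypothetical `dump` callable in place of `state.model_dump()` and an injectable `sleep` so the delays can be checked without waiting, and the names `REQUIRED_FIELDS` and `RETRY_DELAYS` are assumptions.

```python
import time
from typing import Any, Callable

# Field names and delays are taken from the description above.
REQUIRED_FIELDS = ("id", "agent", "workspace")
RETRY_DELAYS = (0.1, 0.3)  # seconds


def get_state_dump_with_retry(
    dump: Callable[[], dict[str, Any]],
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Call dump() until every required field is present, or give up."""
    missing: list[str] = []
    # One initial attempt, then one more attempt after each delay.
    for delay in (None, *RETRY_DELAYS):
        if delay is not None:
            sleep(delay)
        data = dump()
        missing = [name for name in REQUIRED_FIELDS if name not in data]
        if not missing:
            return data
    raise RuntimeError(
        f"state dump is still missing {missing} after "
        f"{len(RETRY_DELAYS)} retries; see issue #1557"
    )
```

Passing `sleep` as a parameter is only a convenience for testing; patching `time.sleep` in the tests would work equally well.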

Testing

  • Added 4 unit tests for the new _get_state_dump_with_retry() function:
    • test_returns_immediately_when_all_fields_present - verifies no retry when fields are present
    • test_retries_when_fields_missing_then_succeeds - verifies retry after 0.1s
    • test_retries_twice_then_succeeds - verifies retry after 0.1s and 0.3s
    • test_raises_error_after_all_retries_exhausted - verifies error after all retries

All existing tests continue to pass.

Fixes #1557


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
|---|---|---|---|
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.12-nodejs22 | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:47d135d-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-47d135d-python \
  ghcr.io/openhands/agent-server:47d135d-python

All tags pushed for this build

ghcr.io/openhands/agent-server:47d135d-golang-amd64
ghcr.io/openhands/agent-server:47d135d-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:47d135d-golang-arm64
ghcr.io/openhands/agent-server:47d135d-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:47d135d-java-amd64
ghcr.io/openhands/agent-server:47d135d-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:47d135d-java-arm64
ghcr.io/openhands/agent-server:47d135d-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:47d135d-python-amd64
ghcr.io/openhands/agent-server:47d135d-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:47d135d-python-arm64
ghcr.io/openhands/agent-server:47d135d-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:47d135d-golang
ghcr.io/openhands/agent-server:47d135d-java
ghcr.io/openhands/agent-server:47d135d-python

About Multi-Architecture Support

  • Each variant tag (e.g., 47d135d-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 47d135d-python-amd64) are also available if needed

@github-actions
Contributor

github-actions bot commented Dec 31, 2025

Coverage

Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| openhands-agent-server/openhands/agent_server | | | | |
| conversation_service.py | 348 | 217 | 37% | 53–56, 60, 62–63, 84, 87, 98–99, 102–105, 107, 111, 113, 116–123, 126–127, 130–134, 137–139, 141–144, 146, 153–154, 156–158, 161, 165, 167, 169, 176, 182, 190–191, 200–203, 212, 221, 226–227, 230, 243–244, 262, 265, 276–280, 282–285, 288–293, 296–299, 301–303, 306, 309–311, 316–319, 327, 332–334, 348–352, 355, 357, 360–362, 364, 368, 372, 379–383, 386–387, 391–395, 398–399, 403–407, 410–411, 417–422, 429–430, 432, 436, 438–439, 444–445, 451–452, 458–460, 478, 502, 530, 532–533, 559, 561, 563–566, 571, 573–574, 578–579, 581–582, 585–587, 590, 596, 601–604, 611–612, 616–620, 622, 627, 631–633, 637–638, 640–642, 644, 646, 659–661, 664, 667, 670–673, 680–681, 685–687, 690–691, 693 |
| TOTAL | 14452 | 6826 | 52% | |

@simonrosenberg simonrosenberg force-pushed the openhands/fix-conversation-info-validation-retry branch from e9e5ffc to 6db70d0 on December 31, 2025 15:33
Add retry with exponential backoff (0.1s, 0.2s, 0.4s, 0.8s) in
_compose_conversation_info() when ConversationInfo validation fails due to
transient race conditions where model_dump() returns incomplete data during
concurrent state mutations.

This is a workaround for issue #1557. A TODO comment with a link to the
issue has been added for future reference.

Fixes #1557

Co-authored-by: openhands <openhands@all-hands.dev>
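For comparison with the commit message above, a retry loop using the 0.1s, 0.2s, 0.4s, 0.8s schedule might look like the sketch below. It is an illustration under assumptions, not the contents of `_compose_conversation_info()`: `build` is a hypothetical callable standing in for the ConversationInfo construction, `ValueError` stands in for the validation error, and the final attempt after the last delay is a design choice made here.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def backoff_delays(base: float = 0.1, factor: float = 2.0, retries: int = 4) -> list[float]:
    """Return delays that grow geometrically from base: base, base*factor, ..."""
    return [base * factor**i for i in range(retries)]


def call_with_backoff(
    build: Callable[[], T],
    retry_on: tuple[type[BaseException], ...] = (ValueError,),
    delays: Sequence[float] = tuple(backoff_delays()),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call build(), sleeping and retrying whenever it raises one of retry_on."""
    for delay in delays:
        try:
            return build()
        except retry_on:
            # Assumed transient failure: wait, then try again.
            sleep(delay)
    # Final attempt after the last delay; any error now propagates to the caller.
    return build()
```

With the defaults, `build` is attempted at most five times, and a persistent failure surfaces as the original exception rather than being wrapped.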
@simonrosenberg simonrosenberg force-pushed the openhands/fix-conversation-info-validation-retry branch from 6db70d0 to cc5ffd3 on December 31, 2025 15:34
@simonrosenberg simonrosenberg requested a review from enyst December 31, 2025 15:36
@simonrosenberg simonrosenberg marked this pull request as ready for review December 31, 2025 15:36
Development

Successfully merging this pull request may close these issues.

Runtime 500: ConversationInfo validation error (missing id/agent/workspace)

3 participants