Labels: bug (Something isn't working)
Description
Summary
During SWT-bench eval (attempt 2), the eval runtime returned a 500 on GET /api/conversations/{id}. The server log shows a pydantic_core.ValidationError raised while constructing ConversationInfo, with the required fields id, agent, and workspace missing from the input dict. The run continued afterward and eventually completed, but the 500 indicates a runtime instability that should be addressed.
Context
- Benchmark: swtbench
- Eval run: eval-20613780379-claude-son
- Conversation id: 9f538871-e9c7-4bd1-908c-25fc1050867e
- Runtime id: zaerpphmsmedfvor
- Pod: runtime-zaerpphmsmedfvor-78cf77bb79-8b8sj
- Time (UTC): 2025-12-31 06:56:57
Observed behavior
GET /api/conversations/9f538871-e9c7-4bd1-908c-25fc1050867e returned HTTP 500. The error is a ConversationInfo validation failure due to missing id, agent, and workspace.
Log excerpt (gcloud logs)
Unhandled exception on GET /api/conversations/9f538871-e9c7-4bd1-908c-25fc1050867e
pydantic_core._pydantic_core.ValidationError: 3 validation errors for ConversationInfo
id
  Field required [type=missing, input_value={'title': None, 'metrics'...=datetime.timezone.utc)}, input_type=dict]
agent
  Field required [type=missing, input_value={'title': None, 'metrics'...=datetime.timezone.utc)}, input_type=dict]
workspace
  Field required [type=missing, input_value={'title': None, 'metrics'...=datetime.timezone.utc)}, input_type=dict]
Likely source
- openhands-agent-server/openhands/agent_server/conversation_service.py
- _compose_conversation_info()
- ConversationInfo(**state.model_dump(), title=..., metrics=..., created_at=..., updated_at=...)
- The state.model_dump() appears to be missing id/agent/workspace at the moment of the request.
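One detail in the log excerpt may narrow this down. Python preserves keyword-argument order, so with the call shape quoted above, any keys coming from state.model_dump() would appear before title in the input dict. The logged input_value starts with 'title', which suggests that model_dump() returned an empty dict rather than a dict missing only three keys. This is an inference, not a confirmed diagnosis: it assumes the quoted call shape is accurate, and the logged repr is truncated in the middle. The sketch below uses only the standard library; capture_kwargs is a hypothetical stand-in for the ConversationInfo constructor.

```python
from datetime import datetime, timezone


def capture_kwargs(**kwargs):
    """Hypothetical stand-in for ConversationInfo(...): returns the kwargs it received."""
    return kwargs


def compose_input(state_dump):
    """Mirror the call shape quoted in the issue: **state_dump first, then explicit fields."""
    now = datetime.now(timezone.utc)
    return capture_kwargs(
        **state_dump, title=None, metrics=None, created_at=now, updated_at=now
    )


# A populated dump puts its own keys ahead of 'title'.
full = compose_input({"id": "abc", "agent": {}, "workspace": {}})
print(list(full))

# An empty dump yields an input dict that starts with 'title', as in the log.
empty = compose_input({})
print(list(empty))

# A dump that already contains 'title' fails with TypeError before any validation runs,
# so the logged ValidationError implies 'title' was not among the dumped keys.
try:
    compose_input({"title": "duplicate"})
except TypeError as exc:
    print("TypeError:", exc)
```

If this reading holds, the question becomes why model_dump() returned nothing at all at that moment, not why three specific fields were dropped.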
Additional details
- A few seconds before the 500, the runtime logged “Created new conversation ...” with a full state that included id/agent/workspace.
- After this single 500, subsequent GETs returned 200 and the run completed. This suggests a transient or race condition where state.model_dump() can be incomplete.
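Since the failure was a single 500 followed by successful GETs, a caller could tolerate it with a bounded retry until the server side is fixed. The following is a minimal sketch, not the eval harness's actual client code: get_with_retry and the fetch callable (returning a status code and body) are hypothetical names introduced for illustration.

```python
import time


def get_with_retry(fetch, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch() until it returns a non-5xx status or the attempts run out.

    fetch is a hypothetical callable returning (status_code, body).
    Returns the last (status_code, body) observed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        status, body = fetch()
        if status < 500:
            return status, body
        if attempt < attempts - 1:
            # Exponential backoff between attempts: base_delay, 2x, 4x, ...
            sleep(base_delay * (2 ** attempt))
    return status, body


# Simulate the observed pattern: one 500, then 200. No real sleeping in the demo.
simulated = iter([(500, "validation error"), (200, {"id": "example"})])
status, body = get_with_retry(lambda: next(simulated), sleep=lambda _seconds: None)
print(status, body)
```

Retrying only hides the symptom on the client side; the underlying incomplete state still needs investigation.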
Suggested direction
- Investigate why ConversationState.model_dump() can omit required fields (id, agent, workspace).
- Consider hardening _compose_conversation_info() to fall back to stored.id/agent/workspace or return a 503 with retryable semantics instead of 500.
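The hardening idea could look roughly like the sketch below. It is framework-agnostic and uses plain dicts; merge_with_fallback, handle_get, StateNotReadyError, and the shape of the stored record are all assumptions made for illustration and do not reflect the actual agent-server code. The only real-world element is the standard HTTP Retry-After header.

```python
from typing import Any, Mapping

REQUIRED_FIELDS = ("id", "agent", "workspace")


class StateNotReadyError(Exception):
    """Raised when required fields cannot be recovered from either source."""

    def __init__(self, missing):
        super().__init__(
            f"conversation state incomplete, missing: {', '.join(missing)}"
        )
        self.missing = tuple(missing)


def merge_with_fallback(
    state_dump: Mapping[str, Any], stored: Mapping[str, Any]
) -> dict:
    """Fill required fields absent from state_dump using the stored record."""
    merged = dict(state_dump)
    for name in REQUIRED_FIELDS:
        if merged.get(name) is None and stored.get(name) is not None:
            merged[name] = stored[name]
    missing = [name for name in REQUIRED_FIELDS if merged.get(name) is None]
    if missing:
        raise StateNotReadyError(missing)
    return merged


def handle_get(state_dump, stored):
    """Return (status, headers, body); incomplete state maps to a retryable 503."""
    try:
        body = merge_with_fallback(state_dump, stored)
    except StateNotReadyError as exc:
        return 503, {"Retry-After": "1"}, {"detail": str(exc)}
    return 200, {}, body


# Empty dump, but the stored record still has the required fields: 200.
print(handle_get({}, {"id": "abc", "agent": {"name": "a"}, "workspace": {"dir": "/w"}}))
# Neither source has the fields: 503 with Retry-After instead of an unhandled 500.
print(handle_get({}, {}))
```

Falling back first and returning 503 only when the fallback also fails keeps the endpoint usable during the transient window while still signaling to clients that a retry is appropriate.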
References
- Runtime 500 in swtbench attempt 2 (eval-20613780379-claude-son)
- Conversation archive: gs://openhands-evaluation-results/artifacts/eval-20613780379-claude-son/conversations/sympy__sympy-23824.tar.gz