-
Notifications
You must be signed in to change notification settings - Fork 100
refactor: remove reconciliation methods, use runtime agent directly #1542
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Coverage Report •
|
||||||||||||||||||||||||||||||||||||||||
76b5add to
5b3198d
Compare
Remove all reconciliation methods (resolve_diff_from_deserialized) and use the runtime agent directly when restoring conversations. Key changes: - LLM: Remove resolve_diff_from_deserialized method entirely - AgentBase: Remove resolve_diff_from_deserialized method entirely - ConversationState.create(): Use runtime agent directly, no compatibility checking. User is free to change LLM, tools, condenser, agent_context, etc. between sessions. Execution flow for new conversation: 1. Create ConversationState with runtime agent (Pydantic validation happens here) 2. Initialize EventLog for event storage 3. Save initial base state to persistence 4. Return the new state Execution flow for restored conversation: 1. Load persisted base_state.json (only to get conversation metadata) 2. Verify conversation ID matches 3. Create ConversationState with the runtime agent (Pydantic validation happens here - runtime agent is always used) 4. Restore persisted conversation metadata (execution_status, etc.) 5. Attach EventLog to load persisted events 6. Save updated base state (with runtime agent) 7. Return the resumed state NOTE: There's a case for checking that tools already used in the conversation history are still available - see issue #1533. Closes #1451 Co-authored-by: openhands <openhands@all-hands.dev>
5b3198d to
82b191c
Compare
Reintroduce tools restriction from the original reconcile method: - Add AgentBase.load(persisted) method that validates tools match - Tools must match between runtime and persisted agents (they may have been used in conversation history) - All other config (LLM, agent_context, condenser, etc.) can change freely Update ConversationState.create() to use agent.load() on restore path. Co-authored-by: openhands <openhands@all-hands.dev>
xingyaoww
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Just need to fix a few minor things!
| resolved = agent.resolve_diff_from_deserialized(state.agent) | ||
| # Limit to the tools that match - they may have been used in conversation history. | ||
| # All other config (LLM, agent_context, condenser, etc.) can change freely. | ||
| agent.load(persisted_state.agent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we get the return value of this function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! Currently we don't return something else than self or ValueError, but just in case.
Address review comments: - Rename AgentBase.load() to AgentBase.verify() since it's a verification method, not a load method - Update docstring to say 'Verify that we can resume...' - Capture return value: verified_agent = agent.verify(persisted_state.agent) - Update tests to use verify() instead of load() Co-authored-by: openhands <openhands@all-hands.dev>
Address xingyaoww's review comment: instead of creating state from scratch, load persisted state and update specific fields. This is more future-proof - new fields will automatically be preserved. Co-authored-by: openhands <openhands@all-hands.dev>
Load persisted state but override with runtime-provided values: - agent (verified against persisted) - workspace - max_iterations - stuck_detection Keep from persisted state: - id, persistence_dir, execution_status, confirmation_policy - activated_knowledge_skills, blocked_actions, blocked_messages - secret_registry This gives the best of both approaches: future-proof for new fields while respecting user-provided runtime configuration. Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
| # Override with runtime-provided values | ||
| state.agent = verified_agent | ||
| state.workspace = workspace | ||
| state.max_iterations = max_iterations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xingyaoww how does this look like? So we took all from persistent state: except for agent, where we just enforce some limitations; for max iterations, the idea is that if the user set "ok 700 from now on", it will be applied; as for workspace, idk, WDYT?
I'm not sure I understand why it was from persisted state, on main. Wouldn't that prevent "teleporting" a conversation to remote and back to local?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, maybe I'm confused. @OpenHands what exactly does the "workspace" variable here in ConversationState refer to, and what possible values it has? Trace all execution paths it's on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm on it! enyst can track my progress at all-hands.dev
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answer: What is the workspace variable in ConversationState?
The workspace field in ConversationState (line 61-64 in state.py) is of type BaseWorkspace:
workspace: BaseWorkspace = Field(
...,
description="Working directory for agent operations and tool execution",
)What it represents
BaseWorkspace is an abstract base class that provides a sandboxed environment where agents can execute commands, read/write files, and perform other operations. It has a working_dir field that specifies the root directory for agent operations.
Possible Values (Implementations)
- LocalWorkspace - Operates directly on the host filesystem (for development/testing)
- RemoteWorkspace - Connects to an OpenHands agent server via HTTP
- DockerWorkspace - Creates and manages a Docker container running an agent server
- DockerDevWorkspace - Extends DockerWorkspace, builds images on-the-fly
- OpenHandsCloudWorkspace - Connects to OpenHands Cloud (app.all-hands.dev)
- APIRemoteWorkspace - Uses OpenHands runtime API (runtime.all-hands.dev)
Execution Paths
-
ConversationState.create() - On restore, workspace is always overridden from the runtime-provided value (line 238:
state.workspace = workspace) -
LocalConversation - Accepts
str | Path | LocalWorkspace, converts to LocalWorkspace if needed, usesworkspace.working_dirfor directory creation and hook initialization -
RemoteConversation - Accepts
RemoteWorkspace, usesworkspace.hostandworkspace.api_keyfor HTTP/WebSocket connections -
Tool Execution - Tools access
conv_state.workspace.working_dirto set working directories and resolve file paths -
Agent Server - Gets workspace from stored config, asserts it's LocalWorkspace, creates the working directory
Key Insight for Your Question
The workspace is NOT persisted from the previous session - it's always taken from the runtime-provided value. This is intentional because:
- The workspace represents the current execution environment
- It may change between sessions (different machine, different container)
- It contains runtime-specific information (host URLs, API keys for remote workspaces)
This design does support "teleporting" a conversation between local and remote environments - the conversation history is preserved, but the workspace is always the one provided at runtime. So moving a conversation from remote to local (or vice versa) should work fine, as long as the new workspace has access to the same files the agent was working with.
Test coverage for: - Runtime values used on resume: workspace, max_iterations - Persisted values preserved: execution_status, stuck_detection, blocked_actions, blocked_messages - Stats reset on resume (fresh session) - Conversation ID mismatch raises error Co-authored-by: openhands <openhands@all-hands.dev>
|
@OpenHands I look at the diff of this PR and I don't see all I expected to see. For example, in llm.py we have a class var named OVERRIDE... something, which is, I think, only used in the reconciliation method we removed from LLM. Please track it down and confirm what I said, and if it's unused let's clean it up. Verify if in the other files in this PR, e.g. state or agentbase, we have something similar and do the same. Review the code and clean it up from such redundancies or similar. |
|
I'm on it! enyst can track my progress at all-hands.dev |
Co-authored-by: openhands <openhands@all-hands.dev>
Final summary (new since last summary)Double-check: request coverage
Conciseness check
Git / delivery
Diff recap (this update only):
|
Co-authored-by: openhands <openhands@all-hands.dev>
|
@OpenHands Understand the goal that this PR aims to achieve. We have two tests files, modified in this PR, which are no longer named appropriately since we removed reconciliation and replace it with "verification". Rename test_agent_reconciliation.py to test_agent_loading.py (or verification if you prefer, but maybe loading will do, wdyt?) Review the tests in tests/sdk/llm/test_llm_reconciliation.py and if they're not duplicated in other tests/sdk llm-related files and they're still useful (are they?), then move them in a test llm file. |
|
I'm on it! enyst can track my progress at all-hands.dev |
Co-authored-by: openhands <openhands@all-hands.dev>
Final summary (since the previous update)✅ Requests addressed (checklist)
🔍 Conciseness / diff sanity
🧾 Changes included in the pushed commit
|
Co-authored-by: openhands <openhands@all-hands.dev>
Summary
This PR removes all reconciliation methods (
resolve_diff_from_deserialized) and uses the provided Agent directly when restoring conversations. This is an alternative approach to issue #1451.What Was Happening on Main
On
main, when restoring a conversation,resolve_diff_from_deserializedwould:Override from runtime:
agent_context(skills, system_message_suffix, user_message_suffix, secrets)llmsecrets (api_key, aws credentials, litellm_extra_body)condenser.llmsecretsRestore from persistence (and require exact match with runtime):
toolsmcp_configfilter_tools_regexsystem_prompt_filenamesecurity_policy_filenamesystem_prompt_kwargscondenser(except its llm secrets)llmconfig (model, temperature, etc.)The final equality check meant users effectively couldn't change most Agent configuration between sessions.
What This PR Does
Removes reconciliation. The provided Agent is used directly - subject to limitations that would otherwise not work at all, such as, it has to be the same Agent class, or the same tools.
Users are now free to change Agent configuration between sessions:
llm(model, api_key, all settings)mcp_configfilter_tools_regexagent_contextsystem_prompt_filenamesecurity_policy_filenamesystem_prompt_kwargscondenserLimitations:
toolsExecution Flow
New Conversation:
Restored Conversation:
Validation
Pydantic validation happens when creating instances (LLM, Agent, ConversationState) via the constructor.
Note on Tools
There's a case for checking that tools already used in the conversation history are still available. See issue #1533 for discussion.
Scope: LocalConversation Only
This PR only affects LocalConversation.
For RemoteConversation, the server always creates the Agent from the persisted
meta.json- the client's Agent is ignored when restoring. Making RemoteConversation support Agent changes would require:This is out of scope for this PR but could be a follow-up.
Closes #1451
Checklist
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.12-nodejs22golang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:85336b2-pythonRun
All tags pushed for this build
About Multi-Architecture Support
85336b2-python) is a multi-arch manifest supporting both amd64 and arm6485336b2-python-amd64) are also available if needed