Proposal: agent as composed of immutable instances #1451

@enyst

Problem:

We see quite a number of bug reports about restoring conversations with different settings. Many of those I’ve seen involve LLM settings such as the model name or encrypted reasoning, but there are also MCP additions or removals, Skills modifications between sessions, and others.

The issues come from agent immutability, which seemed to imply immutability of the LLM, of the agent context (e.g. skills), of the condenser LLM, etc. The interesting part is that this “freezing” of the agent was one of the design principles of V1.

It’s useful for reproducibility, but maybe not so good for user experience. As of now, we seem to patch things one by one wherever they’re reported, e.g. in metadata, in skills, in security, in condenser, in reasoning attributes.

Proposed solution:

Conceptualize the Agent as:

  • composable, that is composed of
  • immutable parts (llm, condenser, initial context, tools, MCP)
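
To make the idea concrete, here is a minimal sketch of an agent composed of immutable parts. It uses stdlib frozen dataclasses as a stand-in for the real classes; `AgentSketch`, `LLMPart`, `CondenserPart`, and their fields are hypothetical names for illustration, not the SDK's actual API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LLMPart:
    """Hypothetical immutable LLM settings."""
    model: str


@dataclass(frozen=True)
class CondenserPart:
    """Hypothetical immutable condenser settings."""
    max_events: int


@dataclass(frozen=True)
class AgentSketch:
    """Hypothetical agent: a frozen container of frozen parts."""
    llm: LLMPart
    condenser: CondenserPart
    tools: tuple[str, ...] = ()


agent = AgentSketch(
    llm=LLMPart(model="model-a"),
    condenser=CondenserPart(max_events=100),
    tools=("terminal",),
)

# Switching one part builds a new agent; the untouched parts are reused as-is.
switched = replace(agent, llm=LLMPart(model="model-b"))
```

No part is ever mutated here: composition changes by swapping whole instances, which is the property the proposal wants to keep.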

This is the minimal change I can think of that keeps a lot of the guarantees of immutability while unblocking development of flexible user interfaces, and that makes these bug reports a thing of the past.

I think in practice we already do this, sort of. We relaxed the absolute freezing in multiple places, via multiple bug fixes for individual settings or entire components. For example, agent_context is completely replaced with the runtime value.

Code Design Detail

This proposal implies that LLM (and AgentContext, etc.) can be immutable instances, but we have a principled way to

  • copy+update => to a new LLM instance
  • in a single place in the code
  • all the rest of the codebase works on an immutable LLM instance.
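
A minimal sketch of the copy+update step, assuming a frozen settings object. It uses a stdlib frozen dataclass and `dataclasses.replace` as a stand-in so that it runs without dependencies; with a frozen Pydantic v2 model the analogous call would be `model_copy(update=...)`, which does not re-validate the updated fields. `LLMConfig` and `with_updates` are hypothetical names.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class LLMConfig:
    """Hypothetical stand-in for the LLM settings; fields are illustrative."""
    model: str
    enable_encrypted_reasoning: bool = False


def with_updates(llm: LLMConfig, **changes) -> LLMConfig:
    """Return a new instance with the changes applied; the original is untouched."""
    return replace(llm, **changes)


original = LLMConfig(model="model-a")
updated = with_updates(original, enable_encrypted_reasoning=True)

# Direct attribute assignment is rejected, so the rest of the code
# can rely on an instance never changing underneath it.
try:
    original.model = "model-b"
    mutated = True
except FrozenInstanceError:
    mutated = False
```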

What place?

  • for LLM, I think LLMRegistry
  • another option is conversation_state. If we like this one, maybe keep the Agent state components in there? Then it would be easy to enforce the rule: have a single, principled way to copy/update/instantiate/switch. The tradeoffs seem to be 1) that LLM doesn’t quite fit, since it should be an LLMRegistry responsibility, and 2) that managing everything in state puts a lot of responsibility on ConversationState (I tried it, and it feels a bit like a god class)

How will it work?

  • keep LLM immutable in Pydantic implementation
  • however, a new LLM instance can be created as freely as we can allow, from user-triggered, automatic, or restoration updates, and the LLM instance is then switched fully
  • in one place in the code, so that everything else keeps the guarantees.
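
The steps above could be sketched as a registry that owns the single switch point. This is only an illustration under assumptions: `LLMRegistrySketch`, its `add`/`get`/`switch` methods, and the `usage_id` key are hypothetical and are not the existing LLMRegistry API; the frozen dataclass again stands in for the immutable Pydantic model.

```python
from dataclasses import dataclass, replace
from typing import Any


@dataclass(frozen=True)
class LLMConfig:
    """Hypothetical stand-in for the immutable LLM settings."""
    model: str
    enable_encrypted_reasoning: bool = False


class LLMRegistrySketch:
    """Hypothetical registry: the only place where an LLM instance is swapped."""

    def __init__(self) -> None:
        self._llms: dict[str, LLMConfig] = {}

    def add(self, usage_id: str, llm: LLMConfig) -> None:
        self._llms[usage_id] = llm

    def get(self, usage_id: str) -> LLMConfig:
        return self._llms[usage_id]

    def switch(self, usage_id: str, **changes: Any) -> LLMConfig:
        """Copy+update the current instance and replace it as a whole."""
        new_llm = replace(self._llms[usage_id], **changes)
        self._llms[usage_id] = new_llm
        return new_llm


registry = LLMRegistrySketch()
registry.add("agent", LLMConfig(model="model-a"))
persisted = registry.get("agent")  # e.g. what a restored conversation holds

# The user resumes with different settings: switch the instance, not its attributes.
current = registry.switch("agent", enable_encrypted_reasoning=True)
```

Code that still holds `persisted` sees an unchanged object, while anything that asks the registry afterwards gets the new instance.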

Definitions

  • flexibility: the user/client developer can change any/many settings and just do stuff
  • composition/composability: the simple idea that an entity (e.g. agent) is composed of parts that can, themselves, be changed or switched, in place or at restore or when “teleporting”.

Note: keeping the components frozen is maybe not as flexible as it could be, but what we get in exchange are the guarantees from pydantic and from having a single point where we switch instances (not attributes), which I’d suggest is less error-prone and may be worth it.

References:

  1. We actually touched on this in the PR, which we did as an experiment in how to make the agent stateless, and in how much statelessness is possible without affecting usability… 😅
    But it does affect usability, and badly IMO.
  2. Xingyao actually foresaw us ending this practice here.
  3. Example bug: “BUG: cannot resume conversation with different settings for MCP” (OpenHands-CLI#240)
  4. Example bug: “BUG: cannot resume conversation with different settings for enable_encrypted_reasoning” (OpenHands-CLI#238)
  5. Example bug fix: “Fix nested LLM reconciliation in agent deserialization” (#517)
  6. Example bug fix: “Fix agent reconciliation to allow agent_context updates” (#1371)
