Closing the loop: auto-promoting observations into directives and mental models #472
Replies: 2 comments 2 replies
@santiagopereda this is interesting feedback! Mental Models, Observations, and Directives play three different roles, and the client/agent/user interacts with each in very different ways.

Observations are potentially infinite. You search them via natural-language query rather than by ID, and they usually contain the agent's learned patterns, which you can potentially enforce as rules in your agent. They are a network of infinite facts, which means that no matter how large your memory bank is, they never get distilled.

Mental Models are generated from a specific mission/prompt: they search the raw facts and observations and condense them into a bounded number of tokens. Because Mental Models are text, they have real scalability limits. For example, if you want to keep track of a very large number of tickets per customer, a mental-model text can hardly contain them all.

Directives are user-written hard rules that apply only to Reflect and Mental Model generation, so they don't emerge from Observations. They are usually safety settings, and since they are injected into the prompt, they have to be small, clear, and curated.

From your proposal, it sounds like you want Insights: patterns that emerge from the Observations without being explicitly requested by the user, which Hindsight surfaces to you proactively. I believe we need to think about this from an operational point of view. I've been doing this for a while, and the challenge is that unfocused auto-emerging insights tend to generate noise in practice; the value is really in targeting them. Do you have a practical use case, a scenario where you'd like Hindsight to automatically tell you something about its memory that you would then actually use in your agent application?
You're right, Insights is the right framing: patterns that surface from observations without me asking. I'm running a shared bank across multiple projects, so observations accumulate from different contexts. Two scenarios where this would help.

First, I run assessments for clients. After multiple engagements, a recurring scoring pattern was already in my observations, but I only found it because I happened to reflect at the right time. If Hindsight had surfaced it, I could have added it as a check in future assessments.

Second, I've hit the same category of failure across completely unrelated projects. Multiple configuration-based approaches failed for the same structural reason: the tool operated at the wrong abstraction layer. That pattern is in my observations too, but if I don't recognize that it exists beforehand, I'll keep trying the same class of solution.

The common thread: the insight was there, I just didn't know to ask. On noise, tags seem like the natural scoping mechanism to keep it relevant.

I've been thinking about what I can build externally versus what needs to come from Hindsight. The comparison and classification logic can live outside. The one thing I can't build around is knowing when consolidation finishes. A post-consolidation event or webhook would unblock the whole thing. Would that be on the table?
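If such a post-consolidation event existed, the tag-scoped filtering described above could live entirely outside Hindsight. A minimal sketch, assuming an invented payload shape (`bank_id`, `observations` with `text` and `tags` fields) that is not Hindsight's actual schema:

```python
# Hypothetical sketch: react to a (not-yet-existing) post-consolidation
# event and scope the delivered observations by tag. The payload shape
# is an assumption for illustration, not Hindsight's real schema.

def scope_by_tags(payload, watched_tags):
    """Return only observations whose tags overlap the watched set."""
    watched = set(watched_tags)
    return [
        obs for obs in payload.get("observations", [])
        if watched & set(obs.get("tags", []))
    ]

event = {
    "bank_id": "shared-bank",
    "observations": [
        {"text": "Scoring drifts upward on repeat engagements", "tags": ["assessments"]},
        {"text": "Config-layer fixes fail at the wrong abstraction", "tags": ["infra"]},
    ],
}

relevant = scope_by_tags(event, ["assessments"])
```

The point of the sketch is the division of labor: Hindsight only needs to emit the event; everything downstream (scoping, comparison, classification) can stay in user-owned tooling.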
First off, the observation consolidation in Hindsight is one of the things that sets it apart for me. The fact that patterns surface automatically from retained facts without pre-defining categories is exactly the right abstraction.
I've been running it with `enable_observations` across a multi-agent project and the consolidation pipeline is working well. Facts go in via `retain`; observations emerge automatically in the background. What I keep wanting is a way to close the loop on what happens after consolidation.

Right now, if an observation is strong enough to become an operational rule, I have to notice it myself and manually create a directive or mental model. In practice I don't review observations often enough, so good patterns slip through.
What I'd love to see: after observation consolidation runs, Hindsight automatically compares new observations against existing directives and mental models, then surfaces candidates: new patterns worth promoting, existing rules that got reinforced, or rules that new evidence contradicts.
The simplest version might just be a post-consolidation hook, an event that fires when consolidation finishes so external tooling can run the comparison. A more integrated version could be a bank-level config like:
```json
{
  "enable_promotion": true,
  "promotion_mode": "suggest",
  "min_evidence_count": 3
}
```

Where candidates get staged for review rather than auto-promoted.
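To make the intended behavior concrete, here is a rough sketch of how such a gate might act on candidates. The config keys mirror the example above; the candidate fields (`pattern`, `evidence_count`) and the function itself are invented for illustration:

```python
# Hypothetical promotion gate. Candidates that clear the evidence bar
# are staged as "pending_review" when promotion_mode is "suggest",
# never promoted automatically. All shapes here are assumptions.

def stage_candidates(candidates, config):
    """Filter candidates by evidence count and mark their status."""
    if not config.get("enable_promotion"):
        return []
    min_evidence = config.get("min_evidence_count", 3)
    staged = [c for c in candidates if c["evidence_count"] >= min_evidence]
    status = "pending_review" if config.get("promotion_mode") == "suggest" else "promoted"
    return [{**c, "status": status} for c in staged]

config = {"enable_promotion": True, "promotion_mode": "suggest", "min_evidence_count": 3}
candidates = [
    {"pattern": "retry loops mask config errors", "evidence_count": 4},
    {"pattern": "one-off timeout", "evidence_count": 1},
]
staged = stage_candidates(candidates, config)
```

The key design choice is that `"suggest"` mode keeps a human in the loop: nothing becomes a directive until reviewed.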
The reason I think this fits naturally: the research paper describes memory consolidation inspired by human cognition, where facts become observations over time. But in human cognition, observations also become heuristics that shape how new information gets processed. The upward transition (facts → observations) is automated. The next one (observations → directives/mental models) isn't. Completing that loop would mean the bank gets better at extracting what matters over time, without the user pre-defining what to look for.
I've been running a similar pattern manually in another project: extracting heuristics from session observations, validating them across multiple contexts, then promoting the strong ones to operational rules. The results have been solid, but the bottleneck is that it requires manual synthesis passes that don't happen consistently enough.
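The manual pattern above essentially reduces to counting how many distinct contexts support each heuristic before promoting it. A minimal sketch, with invented data shapes standing in for whatever the real extraction step produces:

```python
from collections import defaultdict

# Hypothetical cross-context validation: a heuristic is promoted only
# once it has appeared in enough *distinct* contexts, so a pattern
# repeated within one project doesn't count as independent evidence.

def validate_across_contexts(observations, min_contexts=2):
    contexts_per_heuristic = defaultdict(set)
    for obs in observations:
        contexts_per_heuristic[obs["heuristic"]].add(obs["context"])
    return sorted(
        h for h, ctxs in contexts_per_heuristic.items()
        if len(ctxs) >= min_contexts
    )

observations = [
    {"heuristic": "pin dependency versions", "context": "project-a"},
    {"heuristic": "pin dependency versions", "context": "project-b"},
    {"heuristic": "prefer flat configs", "context": "project-a"},
    {"heuristic": "prefer flat configs", "context": "project-a"},  # same context twice
]

promoted = validate_across_contexts(observations)
```

Counting distinct contexts rather than raw occurrences is what makes the validation meaningful: it is exactly the check that currently depends on inconsistent manual synthesis passes.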
Happy to contribute if there's interest.