
Agenternal

A personal AI assistant with a brain-inspired memory system that forgets, consolidates, reconsolidates, detects contradictions, and abstracts behavioral patterns — capabilities no existing AI memory system (MemGPT, Mem0, Zep, or A-Mem) has shipped in production.

The Problem

Every AI assistant today has the same memory flaw: it stores everything and retrieves by similarity. This is a filing cabinet, not a brain. As conversations accumulate, the system drowns in redundant facts, conflicting information, and outdated priorities — with no mechanism to resolve any of it.

The human brain solves this with five mechanisms that run simultaneously:

  1. Two-stage consolidation — fast capture in the hippocampus, slow distillation to the neocortex during sleep
  2. Selective forgetting — memories decay unless reinforced through spaced retrieval
  3. Reconsolidation — retrieved memories become temporarily unstable and can be refined by new context
  4. Cognitive dissonance — contradicting beliefs trigger error signals proportional to the conflict
  5. Hierarchical compression — raw experience is compressed 2,000,000:1 into schemas and generative models

We implemented all five.

Memory Architecture

Three-Layer Hierarchy

Raw conversation --> Episodic Memory --> Semantic Memory --> Schema
                    (fast capture)     (distilled facts)   (behavioral patterns)
| Layer | Example | Created |
|---|---|---|
| Episodic | "On March 15, user said delay the launch because engineering is behind" | After each conversation turn |
| Semantic | "User delayed product launch due to engineering delays (March 2026)" | During consolidation (every 6h) |
| Schema | "User prioritizes engineering readiness over market timing in launch decisions" | When 3+ semantic memories cluster |

Memory Strength (Ebbinghaus Curve + Spacing Effect)

Every memory has a strength $s \in [0.05, 2.0]$ governed by three forces:

1. Idempotent time decay — target strength is a pure function of elapsed time (no compounding error across consolidation cycles):

$$S(m) = 1 + n_m^{\text{spaced}} \cdot 0.5$$

$$R(m, t) = \frac{1}{1 + 0.1 \cdot \dfrac{\Delta t_m}{S(m)}}$$

$$s_m \leftarrow \max\!\Big(0.05,\; s_m^{(0)} \cdot R(m, t)\Big)$$

where $s_m^{(0)}$ is the strength recorded at last access (idempotent base) and $n_m^{\text{spaced}}$ is the spaced access count — only retrievals with a gap $\geq 12\text{h}$ increment it (Cepeda et al. 2006; Bjork & Bjork 1992).
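A minimal sketch of this decay rule (the function name is ours, and we assume $\Delta t$ is measured in hours; the production code lives in the backend's memory module):

```python
def decay_strength(base_strength: float, hours_since_access: float,
                   spaced_access_count: int) -> float:
    """Idempotent Ebbinghaus decay: a pure function of elapsed time,
    so repeated consolidation passes never compound error."""
    stability = 1 + spaced_access_count * 0.5                    # S(m)
    retention = 1 / (1 + 0.1 * hours_since_access / stability)   # R(m, t)
    return max(0.05, base_strength * retention)                  # floor at 0.05
```

Because the result depends only on the strength recorded at last access and the elapsed time, running the pass twice in a row yields the same value.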

2. Spacing-aware retrieval reinforcement — boost scales with time since last access and diminishes near the ceiling:

$$\alpha(m) = \alpha_{\max} \cdot \underbrace{\left(1 - \frac{s_m}{s_{\max}}\right)}_{\text{diminishing at ceiling}} \cdot \underbrace{\left(1 - e^{-\Delta t_m / \tau}\right)}_{\text{spacing effect}}$$

$$s_m \leftarrow \min\!\Big(s_{\max},\; s_m + \alpha(m)\Big)$$

where $\alpha_{\max} = 0.15$, $s_{\max} = 2.0$, $\tau = 24\text{h}$.

| Scenario | Flat boost (old) | Spacing-aware (new) |
|---|---|---|
| Recalled 30 sec ago, $s=1.0$ | $+5\% = 1.05$ | $+0.001$ (near zero) |
| Recalled 1 day ago, $s=1.0$ | $+5\% = 1.05$ | $+0.059$ |
| Recalled 7 days ago, $s=1.0$ | $+5\% = 1.05$ | $+0.075$ (near max) |
| Recalled 7 days ago, $s=1.8$ | $+5\% = 1.89$ | $+0.015$ (diminishing) |
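The spacing-aware boost can be sketched as follows (constants taken from above; the helper name is ours):

```python
import math

ALPHA_MAX, S_MAX, TAU_HOURS = 0.15, 2.0, 24.0

def reinforce(strength: float, hours_since_access: float) -> float:
    """Retrieval reinforcement: near-zero for back-to-back recalls,
    largest for well-spaced recalls, diminishing near the ceiling."""
    boost = (ALPHA_MAX
             * (1 - strength / S_MAX)                              # ceiling term
             * (1 - math.exp(-hours_since_access / TAU_HOURS)))    # spacing term
    return min(S_MAX, strength + boost)
```

A memory recalled after a week gains close to the full boost, while one recalled seconds later gains essentially nothing.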

3. Evidence-weighted contradiction (Rescorla-Wagner 1972) — penalty modulated by relative strength of new evidence vs old belief:

$$\beta(s_{\text{old}}, c_{\text{new}}) = \beta_{\min} + (\beta_{\max} - \beta_{\min}) \cdot e^{-c_{\text{new}} / s_{\text{old}}}$$

$$s_{\text{old}} \leftarrow \max\!\Big(0.05,\; s_{\text{old}} \cdot \beta\Big)$$

where $\beta_{\min} = 0.2$ (harshest), $\beta_{\max} = 0.85$ (mildest). Strong beliefs resist weak contradictions; weak beliefs yield to strong evidence.
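In Python, the penalty rule reads as a short sketch (function names and the strength floor constant are ours):

```python
import math

BETA_MIN, BETA_MAX = 0.2, 0.85
STRENGTH_FLOOR = 0.05

def contradiction_penalty(old_strength: float, new_evidence: float) -> float:
    """Rescorla-Wagner-style factor: approaches BETA_MAX (mild) when the
    old belief dwarfs the new evidence, BETA_MIN (harsh) otherwise."""
    return BETA_MIN + (BETA_MAX - BETA_MIN) * math.exp(-new_evidence / old_strength)

def apply_contradiction(old_strength: float, new_evidence: float) -> float:
    """Scale the old belief's strength by the penalty, never below the floor."""
    return max(STRENGTH_FLOOR,
               old_strength * contradiction_penalty(old_strength, new_evidence))
```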

Memories whose strength falls below a visibility threshold of $0.1$ become invisible to the AI but remain in the database for the user to inspect.

Consolidation ("Sleep Replay")

A scheduled background daemon runs every $T = 6$ hours:

  1. Clustering: DBSCAN on cosine distance matrix $D_{ij} = 1 - \cos(\mathbf{e}_i, \mathbf{e}_j)$ with $\varepsilon = 0.35$, min_samples $= 3$
  2. Distillation: Each cluster $C_k$ with $|C_k| \geq 3$ is distilled by Claude Haiku into one semantic memory
  3. Centrality-weighted decay: Source episodics fade based on distance from cluster centroid — $\gamma_i = 0.5 + 0.4 \cdot (1 - \text{sim}(\mathbf{e}_i, \bar{\mathbf{e}}_{C_k}))$ — central memories fade more, peripheral ones retain unique details
  4. Schema synthesis: Re-cluster semantics ($\varepsilon = 0.45$), synthesize behavioral patterns from clusters of $\geq 3$
  5. Idempotent decay pass: Ebbinghaus curve applied to all memories not accessed in 7+ days
  6. Priority snapshots: Compares current priorities with 30-day-old snapshot, classifies as deliberate_pivot | gradual_drift | stable
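Steps 1 and 3 can be sketched with scikit-learn (already in the stack); function names are ours, and the real pipeline adds distillation between them:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_episodics(embeddings: np.ndarray) -> np.ndarray:
    """Step 1: DBSCAN over the pairwise cosine-distance matrix.
    Returns one label per memory; -1 marks noise (left un-distilled)."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    distances = np.clip(1.0 - unit @ unit.T, 0.0, 2.0)  # D_ij = 1 - cos(e_i, e_j)
    return DBSCAN(eps=0.35, min_samples=3,
                  metric="precomputed").fit_predict(distances)

def centrality_decay_factor(sim_to_centroid: float) -> float:
    """Step 3: central memories fade more (factor -> 0.5);
    peripheral ones keep their unique details (factor -> 0.9)."""
    return 0.5 + 0.4 * (1 - sim_to_centroid)
```

With `min_samples=3`, a pair of stray memories never forms a cluster, which matches the $|C_k| \geq 3$ distillation rule.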

Memory Reconsolidation (Lability Windows)

Based on Nader, Schafe & LeDoux (2000): when memory $m$ is retrieved at time $t_r$, it enters a labile state for $W = 6$ hours:

$$m.\text{labile} \leftarrow \text{true}, \quad m.t_{\text{recon}} \leftarrow t_r + W$$

If re-retrieved while already labile, the window extends: $m.t_{\text{recon}} \leftarrow \max(m.t_{\text{recon}},; t_r + W)$. During this window, new conversation context can passively refine the memory content without requiring an explicit contradiction. The labile set is capped at $|\mathcal{L}| \leq 10$.

This implements a dual belief-update architecture:

| Pathway | Trigger | Behavior |
|---|---|---|
| Reconsolidation | Memory retrieved | Passive refinement: "I prefer async" $\rightarrow$ "I prefer async, except for urgent issues" |
| Contradiction detection | Explicit conflict | Evidence-weighted $\beta$ penalty + superseded_by link |

Reconsolidation catches gradual belief drift that hard contradiction detection would miss. No other production agent memory system implements retrieval-triggered lability windows.

Contradiction Detection + Decision Ledger

Two-pass detection:

  • Real-time (during extraction): Every new decision or preference is checked against existing memories. Claude Haiku identifies semantic conflicts. Old memory receives evidence-weighted strength penalty.
  • Offline (during consolidation): Full audit across the memory store for subtle contradictions missed in real-time.

Decision ledger — decisions are first-class objects with:

  • decision_text + reasoning + domain + outcome
  • Explicit supersession chains: when a decision is reversed, the old one links to the new one
  • The agent can query the ledger by topic or domain to surface prior decisions with their reasoning

Context-Sensitive Retrieval

The same query produces different results depending on context. The composite query vector blends the current message with recent conversation state:

$$\mathbf{q}_{\text{composite}} = \frac{\mathbf{v}}{\lVert \mathbf{v} \rVert}, \qquad \mathbf{v} = 0.6 \cdot \mathbf{e}_{\text{query}} + 0.4 \cdot \frac{1}{n}\sum_{i=1}^{n} \mathbf{e}_{\text{recent}_i}$$

Candidates are scored by a four-factor product:

$$\text{score}(m, q, I) = \text{sim}(\mathbf{e}_m, \mathbf{q}) \cdot s_m \cdot b_{\text{cat}}(m, I) \cdot b_{\text{layer}}(m, I)$$

where $I \in \{\texttt{decision}, \texttt{research}, \texttt{tasks}, \texttt{chat}\}$ selects the intent-specific boost profile. Decision mode boosts decisions $1.5\times$ and schemas $1.5\times$; research mode boosts the semantic and schema layers.
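A sketch of the composite query and the four-factor score (the decision-mode $1.5\times$ values come from the text; the research-mode multipliers are illustrative placeholders, since the text does not give them):

```python
import numpy as np

# Boost profiles. Decision-mode 1.5x values are from the text above;
# the 1.3x research-mode multipliers are assumed for illustration.
CATEGORY_BOOSTS = {"decision": {"decision": 1.5}}
LAYER_BOOSTS = {"decision": {"schema": 1.5},
                "research": {"semantic": 1.3, "schema": 1.3}}

def composite_query(query_emb: np.ndarray, recent_embs: list) -> np.ndarray:
    """Blend the current message with recent conversation embeddings."""
    v = 0.6 * query_emb + 0.4 * np.mean(recent_embs, axis=0)
    return v / np.linalg.norm(v)

def score(memory_emb, strength, category, layer, q, intent) -> float:
    """Four-factor product: similarity x strength x category x layer boost."""
    sim = float(memory_emb @ q)  # embeddings assumed unit-normalized
    b_cat = CATEGORY_BOOSTS.get(intent, {}).get(category, 1.0)
    b_layer = LAYER_BOOSTS.get(intent, {}).get(layer, 1.0)
    return sim * strength * b_cat * b_layer
```

The same memory thus scores differently under `decision` and `chat` intents even when the similarity is identical.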

How It Differs

| Capability | MemGPT/Letta | Mem0 | Zep | A-Mem | MemoryBank | FADEMEM | Agenternal |
|---|---|---|---|---|---|---|---|
| Memory hierarchy | 2 flat layers | Flat | 3 tiers | Flat | Flat | Flat | Episodic --> Semantic --> Schema |
| Forgetting | None | None | Staleness | Activation | Ebbinghaus | Adaptive exp. | Idempotent Ebbinghaus |
| Spacing effect | None | None | None | None | None | None | Spaced stability + spacing-scaled boost |
| Contradiction model | None | None | None | None | None | Exp. suppression | Evidence-weighted (Rescorla-Wagner) |
| Reconsolidation | None | None | None | None | None | None | 6h lability windows |
| Pattern abstraction | None | None | None | None | None | None | DBSCAN --> behavioral schemas |
| Decision tracking | None | None | None | None | None | None | Ledger with supersession chains |
| Context-sensitive retrieval | None | None | User-aware | None | None | None | Intent + recency + layer weighted |
| Offline consolidation | None | None | None | None | None | None | Scheduled daemon with centrality-weighted decay |
| Priority drift detection | None | None | None | None | None | None | Snapshot comparison + drift classification |

Tech Stack

| Component | Technology |
|---|---|
| LLM | Claude Sonnet 4 (streaming) + Claude Haiku 4.5 (extraction, consolidation) |
| Frontend | Next.js 16, React 19, Tailwind CSS 4 |
| Backend | FastAPI, Python 3.12 |
| Database | PostgreSQL 17 + pgvector |
| Embeddings | fastembed BAAI/bge-small-en-v1.5 (384 dims, local ONNX) |
| Clustering | scikit-learn DBSCAN (cosine distance) |
| Search | DuckDuckGo (no API key) |
| Deployment | Docker Compose / Railway |

Quick Start (Local)

Prerequisites

  • Docker & Docker Compose
  • Anthropic API key

1. Configure

echo "ANTHROPIC_API_KEY=your-key-here" > backend/.env

2. Start

docker compose up -d --build

3. Open the app in your browser

4. Trigger consolidation manually

curl -X POST http://localhost:8000/api/consolidate

Deploy to Railway

1. Create services

In a new Railway project, create three services:

| Service | How | Root directory | Port |
|---|---|---|---|
| PostgreSQL | "New" > "Database" > "PostgreSQL" | | auto |
| backend | "New" > "GitHub Repo" > this repo | /backend | 8000 |
| frontend | "New" > "GitHub Repo" > this repo | /frontend | 3000 |

2. Set environment variables

backend:

| Variable | Value |
|---|---|
| ANTHROPIC_API_KEY | Your Anthropic API key |
| DATABASE_URL | Copy from Railway PostgreSQL service (auto-converts postgresql:// to postgresql+asyncpg://) |
| CORS_ORIGINS | https://<your-frontend>.up.railway.app |

frontend:

| Variable | Value |
|---|---|
| NEXT_PUBLIC_API_URL | https://<your-backend>.up.railway.app |

3. Enable pgvector

Railway's PostgreSQL supports pgvector. The backend automatically runs CREATE EXTENSION IF NOT EXISTS vector on startup.

Notes

  • The backend Dockerfile pre-downloads the ONNX embedding model at build time (~100MB) — no cold-start delay
  • The consolidation scheduler starts automatically with the backend (every 6h)
  • Health check: GET /api/health
  • Currently public (no auth) — add authentication before sharing widely

Project Structure

agenternal/
├── docker-compose.yml
├── docs/
│   └── brain-inspired-memory-research.md   # Full research document (formulas, literature review)
│
├── backend/
│   ├── main.py                             # FastAPI + consolidation scheduler
│   ├── config.py
│   ├── agent/
│   │   └── prompts.py                      # System prompts (response style, memory instructions)
│   ├── memory/
│   │   ├── archival_memory.py              # Spacing-aware search + retrieval reinforcement
│   │   ├── background_agent.py             # Post-turn extraction + evidence-weighted contradictions
│   │   ├── compression.py                  # Conversation rolling summaries
│   │   ├── consolidation.py                # Sleep replay: clustering, distillation, schema synthesis
│   │   ├── core_memory.py                  # Always-in-context user profile (4 blocks)
│   │   ├── decisions.py                    # Decision ledger with supersession chains
│   │   ├── embeddings.py                   # Local ONNX embedding model
│   │   ├── knowledge_graph.py              # Graph RAG with fuzzy entity dedup
│   │   ├── manager.py                      # Context-sensitive retrieval orchestration
│   │   ├── recall.py                       # Conversation history search
│   │   ├── reconsolidation.py              # Lability windows (Nader et al. 2000)
│   │   └── scheduler.py                    # Consolidation background task (6h interval)
│   ├── tools/
│   │   └── agent_tools.py                  # 14 agent tools (memory CRUD, search, delete, insights)
│   ├── api/
│   │   ├── chat.py                         # SSE streaming with tool use loop
│   │   ├── memory.py                       # Memory health API
│   │   ├── knowledge.py                    # Knowledge graph API
│   │   ├── tasks.py                        # Task management API
│   │   └── onboarding.py                   # First-time setup flow
│   └── db/
│       └── models.py                       # 9 tables (conversations, messages, core_memory,
│                                           #   archival_memory, entities, relationships,
│                                           #   tasks, memory_decisions, memory_schemas,
│                                           #   priority_snapshots)
│
└── frontend/
    └── src/
        ├── app/
        │   ├── page.tsx                    # Chat + sidebar + memory panel
        │   ├── memory/page.tsx             # Memory explorer (core, archival, graph)
        │   └── tasks/page.tsx              # Task manager
        ├── components/
        │   ├── ChatWindow.tsx              # Streaming chat with thinking + tool indicators
        │   ├── MessageBubble.tsx           # Message rendering with markdown
        │   ├── MemoryPanel.tsx             # Live memory activity + insights panel
        │   ├── Sidebar.tsx                 # Conversation list
        │   ├── KnowledgeGraph.tsx          # Force-directed graph visualization
        │   └── chat/                       # Sub-components (code blocks, thinking, tools, cards)
        ├── lib/
        │   ├── api.ts                      # API client + SSE streaming
        │   └── context/chat-context.tsx    # React context (chat state + memory events)
        └── types/chat.ts

Agent Tools (14)

| Tool | Purpose |
|---|---|
| web_search | DuckDuckGo search for current information |
| collect_info | Interactive form cards for structured input |
| core_memory_append | Append to always-in-context memory |
| core_memory_replace | Update or remove core memory content |
| graph_memory_add | Create/update knowledge graph entities |
| graph_memory_search | Search graph with 1-2 hop traversal |
| graph_memory_delete | Remove entities and their relationships |
| archival_memory_insert | Store facts in long-term memory |
| archival_memory_search | Semantic search over archival memory |
| archival_memory_delete | Remove incorrect memories |
| memory_insights | Query abstracted behavioral patterns |
| decision_search | Search the decision ledger by topic/domain |
| conversation_search | Search past conversations by content |
| conversation_search_date | Search conversations by date range |

API Endpoints

Chat

  • POST /api/chat/send — SSE streaming with tool use loop
  • GET /api/chat/conversations — List conversations
  • GET /api/chat/conversations/:id/messages — Get messages
  • DELETE /api/chat/conversations/:id — Delete conversation

Memory

  • GET /api/memory/core — Core memory sections
  • PUT /api/memory/core — Update core memory
  • GET /api/memory/archival — Archival memories (with layer, strength)
  • GET /api/memory/search?q= — Semantic search
  • GET /api/memory/health — Layer stats, schemas, decisions, priority timeline
  • GET /api/memory/labile — Count of currently labile memories

Knowledge Graph

  • GET /api/knowledge/entities — List entities
  • GET /api/knowledge/entities/:id — Entity with relationships
  • GET /api/knowledge/graph — Full graph data for visualization
  • GET /api/knowledge/stats — Graph statistics

System

  • POST /api/consolidate — Manually trigger memory consolidation
  • GET /api/health — Service health check

References

Brain Science & Cognitive Psychology

  • Ebbinghaus (1885). Über das Gedächtnis. Original forgetting curve.
  • Bjork & Bjork (1992). "A new theory of disuse." Storage strength vs retrieval strength.
  • McClelland, McNaughton, O'Reilly (1995). "Why there are complementary learning systems."
  • Nader, Schafe & LeDoux (2000). "Fear memories require protein synthesis for reconsolidation." Nature.
  • Walker et al. (2003). "Dissociable stages of memory consolidation and reconsolidation." Nature.
  • Cepeda et al. (2006). "Distributed practice in verbal recall tasks." Psychological Bulletin. Meta-analysis: spacing effect.
  • Karpicke & Roediger (2008). "The critical importance of retrieval for learning." Science.
  • Rescorla & Wagner (1972). "A theory of Pavlovian conditioning." Prediction error in belief updating.
  • Friston (2010). "The free-energy principle." Nature Reviews Neuroscience. Bayesian brain hypothesis.

AI Memory Systems

  • Packer et al. (2023). "MemGPT: Towards LLMs as Operating Systems." arXiv:2310.08560
  • MemoryBank (2023). "Enhancing LLMs with Long-Term Memory." arXiv:2305.10250
  • Zep (2025). "A Temporal Knowledge Graph Architecture for Agent Memory." arXiv:2501.13956
  • FADEMEM (2026). "Biologically-Inspired Forgetting and Adaptive Memory." arXiv:2601.18642
  • TiMem (2026). "Temporal-Hierarchical Memory Consolidation." arXiv:2601.02845
  • TraceMem (2026). "Weaving Narrative Memory Schemata." arXiv:2602.09712

See docs/brain-inspired-memory-research.md for the full research document with LaTeX formulas, literature comparison, and novelty assessment.

License

MIT
