Architecture

Overview

The Agent Memory System uses a hybrid architecture combining:

  • PostgreSQL for structured data and metrics
  • ChromaDB for semantic similarity search

Why Hybrid?

PostgreSQL Strengths

  • Time-based queries ("tasks from last week")
  • Exact filters (status, agent, project)
  • Aggregations (counts, averages, success rates)
  • Relational joins (tasks → executions → agents)
  • ACID guarantees

ChromaDB Strengths

  • Semantic similarity (find "related" tasks without exact keywords)
  • Fast vector search
  • Metadata filtering
  • Scales to millions of embeddings
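To make "semantic similarity plus metadata filtering" concrete, here is a minimal brute-force sketch in plain Python. It is not ChromaDB's API or its indexing strategy. The record layout (`id`, `embedding`, `metadata`) and the helper names `cosine_similarity` and `top_k_similar` are assumptions made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query_vec, records, k=10, where=None):
    """Return the k (id, score) pairs most similar to query_vec.

    `where` is a hypothetical exact-match metadata filter applied before scoring.
    """
    where = where or {}
    candidates = [
        r for r in records
        if all(r["metadata"].get(key) == value for key, value in where.items())
    ]
    scored = [(r["id"], cosine_similarity(query_vec, r["embedding"])) for r in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional embeddings; real embeddings have far more dimensions.
records = [
    {"id": "t1", "embedding": [1.0, 0.0, 0.0], "metadata": {"status": "success"}},
    {"id": "t2", "embedding": [0.9, 0.1, 0.0], "metadata": {"status": "failed"}},
    {"id": "t3", "embedding": [0.0, 1.0, 0.0], "metadata": {"status": "success"}},
]
results = top_k_similar([1.0, 0.0, 0.0], records, k=2, where={"status": "success"})
```

A vector database replaces the linear scan with an index, which is what keeps this kind of lookup fast at large collection sizes.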

Data Flow

User Input → Embed Query → ChromaDB (similarity) + Postgres (stats)
    ↓
Smart Routing (best agent based on history)
    ↓
Execute with Context (past solutions, patterns)
    ↓
Store Results (both DBs updated)

Schema Design

See schema.md for full SQL schema.

Key tables:

  • tasks - Task metadata
  • executions - Execution records with outcomes
  • techniques - Solutions/approaches used
  • agent_stats - Materialized performance metrics
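As a rough illustration of how per-agent metrics could be derived from execution records, the sketch below runs an aggregation in an in-memory SQLite database. SQLite stands in for PostgreSQL so the example is self-contained, and every column name (`agent`, `success`, `duration_s`) is hypothetical. The authoritative definitions are in schema.md.

```python
import sqlite3

# In-memory SQLite is a stand-in for PostgreSQL; columns are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tasks (id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE executions (
    id INTEGER PRIMARY KEY,
    task_id INTEGER REFERENCES tasks(id),
    agent TEXT,
    success INTEGER,      -- 1 = succeeded, 0 = failed
    duration_s REAL
);
""")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?)",
    [(1, "fix login bug"), (2, "add cache")],
)
conn.executemany(
    "INSERT INTO executions (task_id, agent, success, duration_s) VALUES (?, ?, ?, ?)",
    [(1, "agent_a", 1, 30.0), (2, "agent_a", 0, 50.0), (2, "agent_b", 1, 20.0)],
)

# The kind of aggregation an agent_stats table might cache: run count,
# success rate, and average duration per agent.
rows = conn.execute("""
    SELECT agent,
           COUNT(*)        AS runs,
           AVG(success)    AS success_rate,
           AVG(duration_s) AS avg_duration_s
    FROM executions
    GROUP BY agent
    ORDER BY agent
""").fetchall()
```

Storing the result of such a query in a dedicated table trades a little write-time work for cheap reads at routing time.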

Embeddings

Uses OpenAI text-embedding-3-small (1536 dimensions):

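  • Cost: ~$0.00002 per 1k tokens ($0.02 per 1M tokens at OpenAI's published price; check current pricing)
  • Multiple texts can be embedded in a single request to reduce round trips
  • Alternative: a local sentence-transformers model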
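Batching and cost estimation can be sketched without calling any API. The helpers below (`batched`, `estimate_cost_usd`) are hypothetical names, the batch size is arbitrary, and the per-token price is deliberately a parameter so that it can be taken from the provider's current price list rather than hard-coded.

```python
from typing import Iterator, List


def batched(texts: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def estimate_cost_usd(total_tokens: int, usd_per_1k_tokens: float) -> float:
    """Linear cost estimate: tokens are billed per thousand at a flat rate."""
    return total_tokens / 1000 * usd_per_1k_tokens


# 250 task descriptions split into requests of up to 100 texts each.
texts = [f"task {i}" for i in range(250)]
batches = list(batched(texts, batch_size=100))
```

Each batch would then be sent as one embedding request, since the OpenAI embeddings endpoint accepts a list of inputs per call.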

Learning Loop

  1. Task arrives
  2. Query similar historical tasks (ChromaDB)
  3. Get agent performance stats (Postgres)
  4. Route to best agent with confidence score
  5. Inject context from similar successful tasks
  6. Execute
  7. Store outcome in both DBs
  8. Update agent stats (auto-trigger)

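As the execution history grows:

  • Routing improves, because each decision draws on more recorded outcomes
  • Recurring task patterns become easier to recognize
  • Time estimates become more accurate
  • Routing decisions can be explained with the underlying data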
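Steps 2 to 4 of the loop can be sketched as a small scoring function. The formula here is an assumption, not the project's actual routing logic: each agent's score is the summed similarity of the similar tasks it completed successfully, weighted by its overall success rate, and the confidence is the winner's share of the total score. The names `route` and `default_agent` are hypothetical.

```python
from collections import defaultdict


def route(similar_tasks, agent_success_rate, default_agent="default"):
    """Pick an agent and a confidence in [0, 1].

    similar_tasks: list of (agent, similarity, succeeded) tuples from the vector search.
    agent_success_rate: dict mapping agent name to its historical success rate.
    """
    evidence = defaultdict(float)
    for agent, similarity, succeeded in similar_tasks:
        if succeeded:
            # Only successful past executions count as evidence for an agent.
            evidence[agent] += max(similarity, 0.0)

    scores = {a: w * agent_success_rate.get(a, 0.0) for a, w in evidence.items()}
    total = sum(scores.values())
    if total == 0:
        # No usable history: fall back with zero confidence.
        return default_agent, 0.0

    best = max(scores, key=scores.get)
    return best, scores[best] / total


similar = [
    ("agent_a", 0.9, True),
    ("agent_b", 0.8, True),
    ("agent_a", 0.7, False),
]
rates = {"agent_a": 0.5, "agent_b": 1.0}
agent, confidence = route(similar, rates)  # agent_b, confidence about 0.64
```

Because the score is built from named inputs (similar tasks and success rates), the same data can be shown to explain why an agent was chosen.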

Performance

  • Embedding: ~100ms per task
  • ChromaDB query: <50ms for top-10
  • Postgres aggregation: <10ms
  • Total overhead: ~150ms per task
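Per-stage figures like these can be checked in your own deployment with a small timing helper. This is a generic measurement sketch, not the project's instrumentation. The stage names are hypothetical and the `time.sleep` calls are placeholders for the real embedding and query calls.

```python
import time
from contextlib import contextmanager

timings_ms = {}


@contextmanager
def timed(stage):
    """Record the wall-clock duration of the enclosed block in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings_ms[stage] = (time.perf_counter() - start) * 1000.0


with timed("embedding"):
    time.sleep(0.01)   # placeholder for the embedding request

with timed("vector_query"):
    time.sleep(0.005)  # placeholder for the similarity search

total_ms = sum(timings_ms.values())
```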

Scales to:

  • 1M+ tasks (tested)
  • 100+ agents
  • Sub-second query times

Privacy

Everything runs locally:

  • No external calls, except that text to be embedded is sent to the OpenAI embeddings API
  • Stored data stays in your containers
  • Switching to local embeddings enables a fully offline mode