-
-
Notifications
You must be signed in to change notification settings - Fork 5
Description
Current State
Short-term memory (STM):
- Supports both JSONL (default) and SQLite backends (v0.7.5+)
- Configurable via
CORTEXGRAPH_STORAGE_BACKEND=sqlite
Long-term memory (LTM):
- Markdown files in Obsidian vault (human-readable, git-friendly)
- Indexed via lightweight JSONL index (
ltm_index.py) - No graph or vector search capabilities
The Readability Tradeoff
Moving entirely to SQLite would sacrifice human inspectability - a core feature of cortexgraph. Being able to cat memories.jsonl or open markdown files in Obsidian makes the system trustworthy and debuggable.
Proposed Architecture: Hybrid Approach
Keep JSONL/Markdown as source of truth, enhance SQLite as a derived index layer.
┌─────────────────────────────────────┐
│ Source of Truth │
│ - JSONL for STM (human-readable) │
│ - Markdown for LTM (Obsidian) │
│ - Git version control │
└──────────────┬──────────────────────┘
│
│ Indexed by
↓
┌─────────────────────────────────────┐
│ SQLite Index (Derived) │
│ - Graph queries (CTEs) │
│ - Vector search (sqlite-vec) │
│ - FTS5 full-text search │
│ - Rebuilable from source │
└─────────────────────────────────────┘
This is an enhancement of the existing ltm_index.py pattern, not a replacement.
Implementation Plan
1. Enhanced SQLite Index Schema
-- Nodes (memories)
CREATE TABLE memories (
id TEXT PRIMARY KEY,
content TEXT,
embedding BLOB, -- sqlite-vec extension
created_at INTEGER,
last_used INTEGER,
score REAL,
source_file TEXT -- Path to JSONL/markdown source
);
-- Full-text search
CREATE VIRTUAL TABLE memories_fts USING fts5(content);
-- Edges (relations)
CREATE TABLE relations (
id TEXT PRIMARY KEY,
from_id TEXT,
to_id TEXT,
type TEXT,
strength REAL,
FOREIGN KEY(from_id) REFERENCES memories(id),
FOREIGN KEY(to_id) REFERENCES memories(id)
);
-- Vector index (sqlite-vec)
CREATE VIRTUAL TABLE vec_index USING vec0(
embedding FLOAT[384] -- all-MiniLM-L6-v2 dimension
);2. Vector Search with sqlite-vec
- Use sqlite-vec extension (by Alex Garcia, 2024)
- Modern, well-maintained, designed for embedding search
- Cosine similarity queries:
vec_distance_cosine(embedding, query_vector)
3. Graph Queries with Recursive CTEs
SQLite already supports graph traversal via Common Table Expressions:
-- Find all related memories (2 hops)
WITH RECURSIVE related(id, depth) AS (
VALUES('start-id', 0)
UNION
SELECT r.to_id, depth + 1
FROM relations r
JOIN related ON r.from_id = related.id
WHERE depth < 2
)
SELECT m.* FROM memories m JOIN related ON m.id = related.id;4. Automatic Sync
- Update index on
save_memory,promote_memory - Background sync process for batch updates
- Maintain
source_filereference for rebuilding
5. Rebuild Command
cortexgraph-maintenance rebuild-index
# Scans JSONL/Markdown source files
# Rebuilds SQLite index from scratch
# Index is always derived, never canonicalBenefits of Hybrid Approach
✅ Human readability preserved - JSONL/Markdown remains primary storage
✅ Git-friendly - Text diffs, version control, backup
✅ Obsidian integration - LTM stays in vault
✅ Performance - Complex queries use SQLite index
✅ Graph traversal - Recursive CTEs for relationships
✅ Vector search - sqlite-vec for semantic similarity
✅ Disaster recovery - Rebuild index from source anytime
✅ Trustworthy - Inspect source files directly, no black box
What Changes
File organization:
~/.config/cortexgraph/
├── jsonl/ # STM source (unchanged)
│ ├── memories.jsonl
│ └── relations.jsonl
├── index.db # Enhanced SQLite index
└── ltm_vault/ # LTM source (unchanged)
└── *.md files
Query routing:
- Simple lookups: Read JSONL/Markdown directly
- Graph queries: Use SQLite CTEs
- Vector search: Use sqlite-vec index
- Full-text search: Use SQLite FTS5
Principle: Index is derived, source is canonical.
Dependencies
- sqlite-vec - Vector similarity extension (https://github.com/asg017/sqlite-vec)
- FTS5 - Built into SQLite for full-text search
- Recursive CTEs - Built into SQLite for graph queries
Migration Path
- Enhance existing
ltm_index.pywith new schema - Add sqlite-vec extension loading
- Populate index from existing JSONL/Markdown
- Add sync hooks to save/promote operations
- Expose graph/vector query tools
- Maintain backward compatibility
Related Work
- Current LTM index:
src/cortexgraph/storage/ltm_index.py - STM SQLite implementation:
src/cortexgraph/storage/sqlite_storage.py - Graph features:
docs/graph_features.md
Priority
Medium - Enhancement to existing working system, preserves core architectural principles