Releases: rodbland2021/claw-recall

v2.4.0 — DB Cleanup, Cross-Session Dedup, Ingest Filtering

22 Mar 23:30
bc82a51

Database Cleanup & Data Quality Pipeline

This release adds a complete cleanup system for detecting and removing duplicate, noise, and junk data — plus ingest-time prevention to stop bloat before it starts.

Highlights

  • Cleanup Web UI (/cleanup) — scan, review, and delete duplicates, noise, junk, orphaned embeddings, and cross-session copies
  • Cross-session duplicate detection with similarity scoring (Exact/High/Medium) and expandable visual comparison
  • Ingest-time prevention: noise messages are filtered at indexing, and cross-session dedup prevents the same session from being indexed twice
  • Quick-action delete buttons with chunked progress for all categories
  • Snapshot cache for instant page loads (0.3s cached, 1.5s fresh)

Data Quality Pipeline (3 layers)

  1. Ingest filtering — noise content + cross-session dedup at indexing time
  2. File exclusions — configurable patterns via exclude.conf
  3. Cleanup UI — on-demand detection and removal with visual review
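
As a rough illustration of the cross-session dedup idea, the sketch below fingerprints a session's messages and assigns an Exact/High/Medium label to a pair of sessions. It is not Claw Recall's implementation: the function names, the whitespace/case normalization, and the 0.9 and 0.6 thresholds are all assumptions made for the example.

```python
import hashlib
from difflib import SequenceMatcher


def fingerprint(messages: list[str]) -> str:
    """Hash the normalized message text so exact copies collide."""
    # Assumption: collapsing whitespace and lowercasing is "normalization".
    normalized = "\n".join(" ".join(m.split()).lower() for m in messages)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def similarity_tier(a: list[str], b: list[str]) -> str:
    """Label two sessions as Exact, High, Medium, or None."""
    if fingerprint(a) == fingerprint(b):
        return "Exact"
    # Compare the two sessions message by message.
    ratio = SequenceMatcher(None, a, b).ratio()
    if ratio >= 0.9:  # hypothetical threshold
        return "High"
    if ratio >= 0.6:  # hypothetical threshold
        return "Medium"
    return "None"
```

At ingest time, a check like this could skip a session whose fingerprint is already stored, while the High and Medium tiers would be left for manual review in the cleanup UI.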

See the full changelog for details.

v2.3.0 — context_chars param, 12x faster startup, README features

18 Mar 01:35
04da8a9

Added

  • context_chars parameter for search_memory — control result context length (default 500, max 2000)
  • Disk cache for embedding matrix — 12x faster startup (80s → 6.5s)
  • Key Features section in README
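
To make the context_chars limits concrete, here is a minimal sketch of how a requested value might be clamped and applied to a search hit. Only the default (500) and maximum (2000) come from the release note; the helper name, the centering of the window on the match, and the treatment of negative values are assumptions.

```python
DEFAULT_CONTEXT_CHARS = 500   # default stated in the release note
MAX_CONTEXT_CHARS = 2000      # maximum stated in the release note


def clip_context(text: str, match_start: int,
                 context_chars: int = DEFAULT_CONTEXT_CHARS) -> str:
    """Return up to context_chars characters around a match position."""
    # Clamp the request into the range [0, MAX_CONTEXT_CHARS].
    budget = max(0, min(context_chars, MAX_CONTEXT_CHARS))
    # Assumption: the window is roughly centered on the match.
    start = max(0, match_start - budget // 2)
    return text[start:start + budget]
```

A caller that asks for more than the maximum simply receives at most 2000 characters rather than an error, which is one reasonable design choice among several.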

Fixed

  • Health check no longer breaks active MCP sessions
  • Stateless HTTP mode for MCP server — eliminates session tracking errors
  • Search reliability: MCP preloads cache on startup, health check validates results

Changed

  • Discord server management scripts moved to separate repo
  • PA review fixes: atomic disk writes, count accuracy, health check scope
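
The "atomic disk writes" item is not described further, but a common pattern for writing a cache file atomically is to write to a temporary file in the same directory and then rename it over the target. The sketch below shows that pattern under the assumption that the cache is a plain bytes payload; the function name is hypothetical.

```python
import os
import tempfile


def atomic_write_bytes(path: str, data: bytes) -> None:
    """Write data so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push bytes to disk before the rename
        os.replace(tmp_path, path)  # atomic rename over the target
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this approach, a crash in the middle of a write leaves the previous cache file intact instead of a truncated one.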

Full changelog: CHANGELOG.md

v2.2.1 — Bug fixes & improvements

10 Mar 00:10

Bug fixes & improvements

This release enhances secret redaction reporting with per-type counting, improves redact_historical.py output, and updates the Discord invite link.

Commits since v2.2.0

  • e78f6fb Add SECRET_PATTERNS dict for per-type secret counting in reports
  • 1f5ffb9 Enhance redact_historical.py output and add example patterns config
  • 1ab58a1 Update Discord invite link to Claw Recall bot-created invite
  • 10564a6 Update README.md

v2.2.0 — Secret Redaction

08 Mar 02:34

Secret Redaction

Claw Recall now automatically strips sensitive data (API keys, OAuth tokens, passwords, SSH keys, etc.) from all content before it enters the database. This covers every ingestion path:

  • Session indexing — messages from OpenClaw and Claude Code sessions
  • Thought capture — CLI, HTTP, MCP captures
  • External sources — Gmail, Google Drive, Slack polling

Built-in patterns (18 categories)

Google OAuth (client ID + secret), Tailscale keys, AWS keys, generic API keys/tokens, Bearer tokens, passwords, cookie secrets, SSH private keys, Slack tokens, GitHub tokens, OpenAI keys, Anthropic keys, Stripe keys, SendGrid keys, connection strings with embedded passwords, and custom header tokens.

Custom patterns

Add your own regex patterns to redact_patterns.conf (one per line). They're auto-loaded at startup.
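
The following sketch shows one way a one-pattern-per-line file such as redact_patterns.conf could be loaded and applied. It is an illustration only: the function names, the "[REDACTED]" placeholder, and the skipping of blank lines and "#" comments are assumptions, not documented behavior.

```python
import re
from pathlib import Path


def load_patterns(conf_path: str) -> list[re.Pattern]:
    """Compile one regex per non-empty line of the config file."""
    path = Path(conf_path)
    if not path.exists():
        return []
    patterns = []
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Assumption: blank lines and lines starting with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        patterns.append(re.compile(line))
    return patterns


def redact(text: str, patterns: list[re.Pattern]) -> str:
    """Replace every match of every pattern with a placeholder."""
    for pattern in patterns:
        text = pattern.sub("[REDACTED]", text)  # placeholder is hypothetical
    return text
```

For example, a line such as `sk-test-[A-Za-z0-9]{8}` in the file would turn `key=sk-test-abcd1234` into `key=[REDACTED]` under this sketch.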

Historical cleanup

Run the migration script to scan and redact existing records:

python3 -m scripts.redact_historical          # Dry run
python3 -m scripts.redact_historical --apply  # Apply changes

v2.1.1 — Fix Web UI Template Path

07 Mar 21:39

Fixed

  • Web UI was completely broken after v2.1.0 package refactor — _REPO_DIR resolved one level too shallow, causing TemplateNotFound: index.html on every request
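
This class of bug is easy to reproduce once a module moves deeper into a package. The sketch below uses a made-up install path and module location (both hypothetical, as is the templates/ directory name) to show how climbing one directory level too few points the template lookup at the wrong folder.

```python
from pathlib import PurePosixPath

# Hypothetical location of a web module after a package refactor.
module_file = PurePosixPath("/opt/claw-recall/claw_recall/api/web.py")

# parents[0] is the module's directory, parents[1] the package root,
# parents[2] the repository root.
too_shallow = module_file.parents[1]   # /opt/claw-recall/claw_recall
repo_dir = module_file.parents[2]      # /opt/claw-recall

# Assumption: templates live in a templates/ directory at the repo root.
wrong_template = too_shallow / "templates" / "index.html"
right_template = repo_dir / "templates" / "index.html"
```

Resolving the path relative to the module file and asserting that the template directory exists at startup would surface this kind of mistake before the first request.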

v2.1.0 — Package Refactor

07 Mar 20:34

[2.1.0] — 2026-03-08

Package refactor: all code consolidated into claw_recall/ Python package with proper subpackages.

Changed

Package Structure

  • All source code moved from root-level .py files into claw_recall/ package with 4 subpackages: search/, capture/, indexing/, api/
  • All components now invoked via python3 -m claw_recall.xxx instead of python3 filename.py
  • Config centralized in claw_recall/config.py — single source of truth for DB_PATH, embedding settings, agent name mappings
  • Database connection management in claw_recall/database.py with get_db() context manager
  • Systemd service files updated to use module execution
  • CLI wrapper (recall) updated to call python3 -m claw_recall.cli
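
For readers unfamiliar with the pattern, a get_db() context manager might look roughly like the sketch below. The release note only names the function; the use of sqlite3, the commit-on-success and rollback-on-error behavior, and the placeholder DB_PATH value are assumptions.

```python
import sqlite3
from contextlib import contextmanager

DB_PATH = ":memory:"  # placeholder; the real value lives in the project config


@contextmanager
def get_db(path: str = DB_PATH):
    """Yield a connection, commit on success, roll back on error."""
    conn = sqlite3.connect(path)
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()  # always release the connection
```

Callers would then write `with get_db() as conn:` and never need to remember to close the connection themselves.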

Documentation

  • README rewritten for beginners — numbered Quick Start steps, verification at each stage, exact MCP config file paths for Claude Code and OpenClaw, "Keep It Running" section (systemd/screen/cron), Quick Troubleshooting table
  • Prerequisites section moved before Quick Start with platform notes (WSL/Linux/macOS)
  • MCP section explains what MCP is, what stdio vs SSE means, where config files go
  • Comprehensive installation/operations guide split into docs/guide.md
  • Guide Production Deployment section rewritten with step-by-step systemd setup
  • CONTRIBUTING.md updated with correct test commands
  • Internal reference doc (claw-recall-reference) updated with package layout

Root Cleanup

  • 14 root-level Python files removed (replaced by package modules)
  • Scripts moved to scripts/ directory
  • Tests moved to tests/ directory

Fixed

  • mcporter MCP stdio config updated to reference new package module path
  • All 123 tests updated for the new import paths; all pass

v2.0.0 — MCP Integration, External Sources, Production Hardening

06 Mar 20:55

Major release: MCP integration, external source capture, SSE transport, health monitoring, and production hardening.

Highlights

  • MCP Integration — 8 tools via stdio and SSE transport for local and remote agent access
  • External Source Capture — Gmail, Google Drive, and Slack indexing with backfill support
  • Real-Time Indexing — inotify-based watcher + remote HTTP push for cross-machine sync
  • Thought Capture — Persistent insights that survive context compaction
  • Health Monitoring — Service health checks with embedding gap detection
  • Production Ready — systemd services, CSP headers, security hardening

What's New

MCP Tools

search_memory · search_thoughts · capture_thought · browse_recent · browse_activity · memory_stats · poll_sources · capture_source_status

External Sources

  • Gmail with full body extraction and PDF attachment parsing
  • Google Drive document indexing with noise filtering
  • Slack message capture
  • Historical backfill (--backfill --days 90)

Infrastructure

  • inotify file watcher with 5s debounce
  • Remote machine watcher via HTTP push
  • Incremental indexing (only new messages)
  • Production systemd service files
  • /health endpoint for monitoring
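
The 5s debounce on the file watcher can be pictured as follows: each change event resets a per-file timer, and a file is only handed to the indexer once it has been quiet for the full delay. This is a sketch of the general technique, not the watcher's actual code; the class name, method names, and injectable clock are assumptions.

```python
import time


class Debouncer:
    """Release a path only after it has been quiet for `delay` seconds."""

    def __init__(self, delay: float = 5.0, clock=time.monotonic):
        self.delay = delay
        self.clock = clock          # injectable so the logic is testable
        self._last_event = {}       # path -> time of most recent change

    def touch(self, path: str) -> None:
        """Record a change event; this restarts the quiet period."""
        self._last_event[path] = self.clock()

    def ready(self) -> list[str]:
        """Return (and forget) paths with no events for `delay` seconds."""
        now = self.clock()
        due = [p for p, t in self._last_event.items()
               if now - t >= self.delay]
        for p in due:
            del self._last_event[p]
        return sorted(due)
```

A watcher loop would call touch() from its event callback and poll ready() periodically, so a session file that is being appended to continuously is indexed once rather than on every write.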

Bug Fixes

  • Shell injection vulnerability in .env loading
  • Memory leak in embedding cache (5.5GB → 123MB)
  • WSL agent misattribution
  • Resource leaks in watcher and session pusher
  • Improved error handling and logging across codebase
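
The release note does not say how the .env shell injection was fixed. One common way to avoid this class of bug is to parse the file as plain KEY=VALUE text instead of passing it to a shell, as in the sketch below; the function name and the handling of quotes, comments, and an optional "export " prefix are assumptions.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines as data; nothing is ever executed."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip().removeprefix("export ").strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key] = value
    return values
```

Because the value is treated as an opaque string, a line such as `X=$(rm -rf /)` yields the literal text rather than running a command.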

See CHANGELOG.md for full details.