Add long-term memory retrieval powered by qmd (BM25 and VSearch). #34

Open
mczabca-boop wants to merge 6 commits into main from feat/qmd-memory-retrieval-pr-ready

Collaborator

mczabca-boop commented Feb 13, 2026

PR Title

Improve QMD Memory Retrieval Reliability, Observability, and Claude Injection Safety

Summary

This PR hardens TinyClaw’s memory pipeline end-to-end and removes several reliability pitfalls found during real Telegram regression testing.

It keeps memory retrieval QMD-centric (BM25 + VSearch), improves retrieval quality and debug visibility, and fixes the critical issue where memory snippets were retrieved but not reliably consumed by Claude.

Why

Memory behavior was inconsistent in production-like tests due to:

  • semantic hits returning noisy/self-referential snippets
  • stale/low-confidence snippets polluting final answers
  • path normalization mismatches preventing turn hydration/reranking
  • memory being injected into a file Claude CLI did not reliably consume
  • potential cleanup overwrite risk when restoring CLAUDE.md
  • high VSearch latency because embedding ran synchronously on the query path

This PR addresses those issues while preserving optional memory and safe defaults.

Key Changes

1. QMD retrieval flow hardening

  • Retained QMD-only retrieval path:
    • BM25 (qmd search)
    • VSearch (qmd vsearch)
  • Added stronger fallback behavior:
    • if VSearch results are filtered down to unusable snippets, fall back to BM25
  • Reused lexical query variants across precheck/main BM25 flow to avoid redundant recomputation.
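
The fallback order described above can be sketched roughly as follows. This is an illustration only, not the code in src/lib/memory.ts: the `Snippet`, `Searcher`, and `retrieveWithFallback` names are hypothetical, and the two searchers are passed in as functions so the sketch makes no assumption about how qmd is actually invoked. Only the `qmd-vsearch` / `qmd-bm25` source labels come from this PR.

```typescript
// Hypothetical shapes; the real types in src/lib/memory.ts may differ.
interface Snippet {
  source: string; // file or document the snippet came from
  text: string;
  score: number;
}

// A searcher wraps one retrieval mode (e.g. `qmd vsearch` or `qmd search`).
type Searcher = (query: string) => Promise<Snippet[]>;

interface RetrievalResult {
  source: "qmd-vsearch" | "qmd-bm25";
  snippets: Snippet[];
}

// Try VSearch first; if filtering leaves nothing usable, fall back to BM25.
async function retrieveWithFallback(
  query: string,
  vsearch: Searcher,
  bm25: Searcher,
  isUsable: (s: Snippet) => boolean,
): Promise<RetrievalResult> {
  const semantic = (await vsearch(query)).filter(isUsable);
  if (semantic.length > 0) {
    return { source: "qmd-vsearch", snippets: semantic };
  }
  const lexical = (await bm25(query)).filter(isUsable);
  return { source: "qmd-bm25", snippets: lexical };
}
```

Returning the source label alongside the snippets is one way to keep the memory-source log line accurate when the fallback fires.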

2. Reranking and snippet quality improvements

  • Moved rerank heuristics into a dedicated module:
    • src/lib/memory-rerank.ts
  • Added configurable rerank settings under memory.rerank.
  • Added stronger low-confidence filtering:
    • low-confidence assistant snippets are now filtered out from injection (not only downranked).
  • Added rerank debug output to inspect top selected snippets in logs.
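
To make the difference between filtering and downranking concrete, here is a minimal sketch. It assumes hypothetical setting names (`minAssistantScore`, `userBoost`, `topK`) and an invented boost heuristic; the actual keys under memory.rerank and the heuristics in src/lib/memory-rerank.ts are not shown in this PR description.

```typescript
// Hypothetical candidate shape after turn hydration.
interface Candidate {
  role: "user" | "assistant";
  text: string;
  score: number; // retrieval score, higher is better
}

// Hypothetical settings; stand-ins for whatever memory.rerank exposes.
interface RerankSettings {
  minAssistantScore: number; // assistant snippets below this are dropped
  userBoost: number; // illustrative bonus for user-authored snippets
  topK: number;
}

function rerank(candidates: Candidate[], cfg: RerankSettings): Candidate[] {
  return candidates
    // Filter: low-confidence assistant snippets never reach injection.
    .filter((c) => !(c.role === "assistant" && c.score < cfg.minAssistantScore))
    // Rerank: adjust scores on copies, leaving the inputs untouched.
    .map((c) => ({ ...c, score: c.role === "user" ? c.score + cfg.userBoost : c.score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, cfg.topK);
}
```

Dropping a snippet before sorting means it cannot reappear in the injected block even when few other candidates exist, which is the behavior change described above.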

3. Turn hydration bug fixes

  • Fixed source-to-turn-file resolution robustness:
    • handles qmd source normalization differences (case/punctuation variants like _ vs -)
  • Standardized newly persisted turn filenames to lowercase to reduce future mismatch risk.
  • Result: rerank/hydration can reliably use full User/Assistant turn content instead of raw patch-like snippets.
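
One way to make source-to-turn-file matching tolerant of case and `_` vs `-` differences is to compare normalized keys instead of raw names. The sketch below is an assumption about the approach, not the PR's implementation; `normalizeKey`, `resolveTurnFile`, and the example filenames are hypothetical.

```typescript
// Reduce a path or filename to a comparison key: basename only, lowercase,
// with every run of non-alphanumeric characters collapsed to a single "-".
function normalizeKey(name: string): string {
  const base = name.split(/[\\/]/).pop() ?? name;
  return base
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Find the persisted turn file that corresponds to a qmd-reported source.
function resolveTurnFile(source: string, turnFiles: string[]): string | undefined {
  const wanted = normalizeKey(source);
  return turnFiles.find((file) => normalizeKey(file) === wanted);
}

// Usage with made-up names: the source differs in case and separators.
const match = resolveTurnFile("memory/Turn-2026-02-13_001.md", [
  "turn_2026_02_13-001.md",
  "turn_2026_02_13-002.md",
]);
```

Persisting new filenames in lowercase, as this PR does, narrows the gap the normalizer has to bridge for future turns.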

4. Claude memory injection redesign

  • Replaced runtime injection target:
    • from .claude/MEMORY.md
    • to .claude/CLAUDE.md runtime section
  • Added safer cleanup strategy:
    • inject with unique start/end markers
    • cleanup removes only marker-bounded runtime block
    • avoids full-file rollback overwrite risk when file changes during invocation
  • Retained defensive cleanup for legacy MEMORY.md.
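
The marker-bounded approach can be illustrated on plain strings. This is a sketch under assumptions: the marker text, the per-run identifier, and the `injectBlock` / `removeBlock` names are hypothetical, and the real code in src/lib/invoke.ts would also read and write the file.

```typescript
// Hypothetical marker format; a per-run id keeps the markers unique.
function markers(runId: string): { start: string; end: string } {
  return {
    start: `<!-- runtime-memory:start ${runId} -->`,
    end: `<!-- runtime-memory:end ${runId} -->`,
  };
}

// Append the memory block, bounded by markers, to the current file text.
function injectBlock(fileText: string, runId: string, memory: string): string {
  const { start, end } = markers(runId);
  const sep = fileText.length === 0 || fileText.endsWith("\n") ? "" : "\n";
  return `${fileText}${sep}${start}\n${memory}\n${end}\n`;
}

// Remove only the marker-bounded block; everything else is left as found.
function removeBlock(fileText: string, runId: string): string {
  const { start, end } = markers(runId);
  const from = fileText.indexOf(start);
  if (from === -1) return fileText; // nothing injected, nothing to do
  const to = fileText.indexOf(end, from + start.length);
  if (to === -1) return fileText; // incomplete block: do not guess
  let after = to + end.length;
  if (fileText[after] === "\n") after += 1; // drop the block's trailing newline
  return fileText.slice(0, from) + fileText.slice(after);
}
```

Because cleanup edits the file as it exists at cleanup time rather than restoring a saved copy, content written to CLAUDE.md while the invocation was running is preserved.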

5. Embed/update behavior and latency improvements

  • Increased default embed interval to reduce runtime overhead:
    • default embed_interval_seconds: 600
  • Changed embed trigger to asynchronous fire-and-forget (non-blocking query path).
  • Added in-flight guard per collection to avoid duplicate embed runs.
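
A fire-and-forget trigger with a per-collection in-flight guard might look like the sketch below. The function and variable names are hypothetical, and the embed job is passed in as a callback so nothing is assumed about how qmd embedding is started; only the 600-second default comes from this PR.

```typescript
// Collections with an embed run currently in progress.
const inFlight = new Set<string>();
// Start time (ms) of the most recent embed run per collection.
const lastRun = new Map<string, number>();

// Returns true if an embed run was started, false if it was skipped.
// The caller never awaits the run, so the query path is not blocked.
function triggerEmbed(
  collection: string,
  runEmbed: (collection: string) => Promise<void>,
  intervalSeconds: number = 600,
  now: number = Date.now(),
): boolean {
  if (inFlight.has(collection)) return false; // a run is already in progress
  const last = lastRun.get(collection);
  if (last !== undefined && now - last < intervalSeconds * 1000) return false;

  inFlight.add(collection);
  lastRun.set(collection, now);
  void runEmbed(collection)
    .catch((err: unknown) => {
      console.warn(`embed failed for ${collection}:`, err);
    })
    .then(() => {
      inFlight.delete(collection); // runs after success or handled failure
    });
  return true;
}
```

Recording the start time before the run completes means a slow or failed embed still counts toward the interval, which avoids retry storms at the cost of a delayed retry.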

6. Logging and observability

  • Added/kept clearer memory-source logs:
    • qmd-bm25 / qmd-vsearch
  • Added injection-path logs.
  • Added optional debug logging for:
    • mode, timeout, query-used, rerank summary, fallback behavior.
  • Improved warning behavior to be agent-scoped instead of process-global for qmd-unavailable cases.
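
One reading of "agent-scoped" warnings is that each agent gets its own qmd-unavailable warning once, instead of a single process-wide flag silencing every agent after the first. That interpretation, the function name, and the message text below are assumptions for illustration.

```typescript
// Agents that have already been warned about qmd being unavailable.
const warnedAgents = new Set<string>();

// Emit the warning at most once per agent. Returns true if it was logged.
function warnQmdUnavailableOnce(
  agentId: string,
  log: (message: string) => void,
): boolean {
  if (warnedAgents.has(agentId)) return false;
  warnedAgents.add(agentId);
  log(`[memory] qmd unavailable for agent "${agentId}"; continuing without retrieval`);
  return true;
}
```

Keying the set by agent keeps logs quiet on repeated queries while still surfacing the problem for every agent it affects.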

7. Configuration/setup/docs updates

  • Setup wizard wording keeps VSearch explicitly marked as experimental:
    • Use semantic search (vector, experimental)? [y/N]
  • Updated setup defaults and README to reflect:
    • safer semantic behavior
    • async embed behavior
    • longer embed interval
    • retention/rerank controls
    • memory source observability.

Validation Performed

  • npm run build:main passed after changes.
  • bash -n lib/setup-wizard.sh passed.
  • End-to-end Telegram regressions validated:
    • correct memory source logs (qmd-vsearch, qmd-bm25 where applicable)
    • runtime injection log now shows .claude/CLAUDE.md (runtime section)
    • previously failing case (Who likes rock?) now correctly answers from retrieved memory after injection fix
    • repeated reset does not wipe persisted QMD memory (session reset behavior preserved).

Backward Compatibility

  • Memory remains optional.
  • TinyClaw still runs without qmd/bun; memory retrieval degrades gracefully when unavailable.
  • Existing deployments continue to work; behavior is now more explicit and debuggable.

Notes / Reviewer Focus

Please focus review on:

  • src/lib/memory.ts retrieval flow + fallback + hydration/rerank
  • src/lib/memory-rerank.ts configuration-driven heuristics
  • src/lib/invoke.ts marker-based CLAUDE.md runtime injection/cleanup safety
  • setup/README defaults and wording around experimental semantic search.

@mczabca-boop mczabca-boop changed the title feat(memory): add optional qmd retrieval with safe defaults ## Summary Add optional long-term memory retrieval powered by qmd, with safe defaults and documentation updates. ## Changes - add src/lib/memory.ts for memory retrieval integration - wire memory retrieval into invoke/queue flow - extend config types for memory/qmd options - document qmd installation and Linux/WSL dependency in README - add troubleshooting notes for memory/qmd behavior ## Config Notes - memory remains opt-in (memory.enabled: false by default) - qmd can be enabled independently under memory.qmd - supports configurable top_k, min_score, max_chars, and update interval ## Validation - npm run build passes - manual Telegram flow tested for: - memory disabled: no retrieval injection - memory enabled: retrieval hit appears in logs ## Risk / Compatibility - no breaking change for existing users - users without qmd are unaffected when memory is disabled Feb 13, 2026
@mczabca-boop mczabca-boop changed the title ## Summary Add optional long-term memory retrieval powered by qmd, with safe defaults and documentation updates. ## Changes - add src/lib/memory.ts for memory retrieval integration - wire memory retrieval into invoke/queue flow - extend config types for memory/qmd options - document qmd installation and Linux/WSL dependency in README - add troubleshooting notes for memory/qmd behavior ## Config Notes - memory remains opt-in (memory.enabled: false by default) - qmd can be enabled independently under memory.qmd - supports configurable top_k, min_score, max_chars, and update interval ## Validation - npm run build passes - manual Telegram flow tested for: - memory disabled: no retrieval injection - memory enabled: retrieval hit appears in logs ## Risk / Compatibility - no breaking change for existing users - users without qmd are unaffected when memory is disabled Add optional long-term memory retrieval powered by qmd, with safe defaults and documentation updates. Feb 13, 2026
@mczabca-boop mczabca-boop force-pushed the feat/qmd-memory-retrieval-pr-ready branch from a9ae0ca to 5b0f7ce Compare February 15, 2026 06:13
* fix(whatsapp): fail fast when Puppeteer Chrome is missing

* fix(whatsapp): validate Puppeteer executable path exists
@mczabca-boop mczabca-boop changed the title Add optional long-term memory retrieval powered by qmd, with safe defaults and documentation updates. Add long-term memory retrieval powered by qmd (BM25 and VSearch). Feb 18, 2026
@mczabca-boop mczabca-boop requested a review from jlia0 February 18, 2026 05:53