feat(capture): add captureExclude and captureSkipMarker #139
vvvvroot wants to merge 1 commit into nowledge-co:main
Add two-layer filtering to skip unwanted sessions from auto-capture:
1. Pattern-based exclusion (captureExclude): array of glob patterns matched against ctx.sessionKey. Glob `*` matches within a colon-delimited segment. Example: "agent:*:cron:*" excludes all cron job sessions without affecting other conversations.
2. Marker-based exclusion (captureSkipMarker): when any message in the session contains the marker text (default: "#nmem-skip"), the entire session is skipped. Gives users ad-hoc control over which conversations are captured.
Both layers apply to buildAgentEndCaptureHandler (thread append + triage/distill) and buildBeforeResetCaptureHandler (thread-only checkpoints). When a session is excluded, neither thread creation nor distillation occurs.
Use case: OpenClaw users with scheduled cron jobs (news digests, token reports) and executor subagent sessions that produce hundreds of low-value threads, degrading Working Memory quality.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
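The segment-scoped glob rule described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the helper `globToRegExp` is a hypothetical name, and the sketch assumes every `*` (including a trailing one) stays within a single colon-delimited segment.

```javascript
// Hypothetical sketch: not the PR's actual implementation.
// Convert a glob such as "agent:*:cron:*" into an anchored RegExp in which
// `*` matches any run of characters except the ":" segment delimiter.
function globToRegExp(pattern) {
  const body = pattern
    .split("*")
    // Escape regex metacharacters in the literal parts between wildcards.
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join("[^:]*");
  return new RegExp(`^${body}$`);
}

// Return true when the session key matches at least one exclude pattern.
function matchesExcludePattern(sessionKey, patterns) {
  if (typeof sessionKey !== "string" || !Array.isArray(patterns)) return false;
  return patterns.some((p) => globToRegExp(p).test(sessionKey));
}

// The session keys below are made-up examples.
console.log(matchesExcludePattern("agent:main:cron:daily-news", ["agent:*:cron:*"])); // true
console.log(matchesExcludePattern("agent:main:telegram:12345", ["agent:*:cron:*"])); // false
```

Anchoring the expression with `^` and `$` keeps a pattern from matching a mere substring of a longer key, which seems to be the intent of "without affecting other conversations".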
📝 Walkthrough
The PR introduces two new session auto-capture controls: captureExclude and captureSkipMarker.
Sequence Diagram(s)
sequenceDiagram
actor Trigger as Capture Event<br/>(End/Reset)
participant Handler as Capture Handler
participant Config as Config<br/>captureExclude<br/>captureSkipMarker
participant Filter as Filter Logic
participant Session as Session<br/>Capture
Trigger->>Handler: Trigger event
Handler->>Handler: Derive sessionKey
Handler->>Filter: Check matchesExcludePattern(sessionKey, patterns)
Filter-->>Handler: Pattern matched?
alt Pattern Match
Handler->>Session: Return early<br/>(skip capture)
else No Match
Handler->>Filter: Check hasSkipMarker(messages, marker)
Filter-->>Handler: Marker found?
alt Marker Present
Handler->>Session: Return early<br/>(skip capture)
else No Marker
Handler->>Session: Proceed with capture
Session->>Session: Append/Distill thread
end
end
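The guard order in the diagram can be sketched as a small handler factory. This is an illustration under stated assumptions: `buildCaptureHandler`, the injected `isExcluded` and `capture` functions, the string return values, and the message shape (a string, or an object with a string `content`) are all hypothetical and not taken from the plugin.

```javascript
// Hypothetical sketch of the two-layer guard; names and shapes are assumptions.

// Return true when any message's text contains the marker.
function hasSkipMarker(messages, marker) {
  if (!marker || !Array.isArray(messages)) return false;
  return messages.some((m) => {
    const text = typeof m === "string" ? m : m?.content;
    return typeof text === "string" && text.includes(marker);
  });
}

function buildCaptureHandler({ cfg, isExcluded, capture }) {
  return async function handler(event, ctx) {
    const sessionKey = ctx?.sessionKey ?? "";
    // Layer 1: deterministic pattern check on the session key.
    if (isExcluded(sessionKey, cfg.captureExclude)) return "skipped:pattern";
    // Layer 2: ad-hoc marker check on message content.
    if (hasSkipMarker(event?.messages, cfg.captureSkipMarker)) return "skipped:marker";
    // Neither layer matched: proceed with capture.
    await capture(event, ctx);
    return "captured";
  };
}

// Usage with stand-in dependencies (exact-match check instead of a glob matcher).
const handler = buildCaptureHandler({
  cfg: { captureExclude: ["agent:main:cron:daily"], captureSkipMarker: "#nmem-skip" },
  isExcluded: (key, patterns) => patterns.includes(key),
  capture: async () => {},
});
```

Running the cheap key check first means message content is only scanned for sessions that were not already excluded by pattern.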
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~28 minutes
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
nowledge-mem-openclaw-plugin/src/hooks/capture.js (1)
454-454: ⚠️ Potential issue | 🔴 Critical
Bug: `_cfg` is undefined; it should be `cfg`.
Line 454 references `_cfg?.maxThreadMessageChars`, but the parameter was renamed from `_cfg` to `cfg` on line 429. This causes `maxThreadMessageChars` to always fall back to the default (800), ignoring user configuration.
🐛 Proposed fix
 await appendOrCreateThread({
   client,
   logger,
   event,
   ctx,
   reason,
-  maxMessageChars: _cfg?.maxThreadMessageChars,
+  maxMessageChars: cfg.maxThreadMessageChars,
 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nowledge-mem-openclaw-plugin/src/hooks/capture.js` at line 454, The maxMessageChars property is using the old variable name `_cfg` which is undefined after the parameter rename to `cfg`, so update the reference to use `cfg?.maxThreadMessageChars` (replace `_cfg?.maxThreadMessageChars` with `cfg?.maxThreadMessageChars`) where maxMessageChars is set to ensure user-provided maxThreadMessageChars is respected; verify this change in the function using the cfg parameter that constructs the options object (look for maxMessageChars and cfg in capture.js).
🧹 Nitpick comments (1)
nowledge-mem-openclaw-plugin/src/config.js (1)
348-371: Consider adding `_sources` tracking for consistency.
Other config fields track their origin in `_sources` for diagnostic reporting (e.g., `_sources.sessionContext = sc.source`). The new `captureExclude` and `captureSkipMarker` fields don't populate `_sources`, which creates an inconsistency in the config diagnostics.
♻️ Proposed fix to add source tracking
 // --- captureExclude: file > pluginConfig > default ---
 const captureExclude = (() => {
   const fromFile = Array.isArray(resolvedFile.captureExclude)
     ? resolvedFile.captureExclude
     : null;
   const fromPlugin = Array.isArray(resolvedPlugin.captureExclude)
     ? resolvedPlugin.captureExclude
     : null;
   const raw = fromFile ?? fromPlugin ?? [];
+  _sources.captureExclude = fromFile ? "file" : fromPlugin ? "pluginConfig" : "default";
   return raw.filter((v) => typeof v === "string" && v.trim());
 })();
 // --- captureSkipMarker: file > pluginConfig > default ---
 const captureSkipMarker = (() => {
   const fromFile = typeof resolvedFile.captureSkipMarker === "string"
     ? resolvedFile.captureSkipMarker.trim()
     : undefined;
   const fromPlugin = typeof resolvedPlugin.captureSkipMarker === "string"
     ? resolvedPlugin.captureSkipMarker.trim()
     : undefined;
+  _sources.captureSkipMarker = fromFile ? "file" : fromPlugin ? "pluginConfig" : "default";
   return fromFile || fromPlugin || "#nmem-skip";
 })();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nowledge-mem-openclaw-plugin/src/config.js` around lines 348 - 371, The new captureExclude and captureSkipMarker config values don't record their origin in _sources; update the blocks that compute captureExclude and captureSkipMarker to also set _sources.captureExclude and _sources.captureSkipMarker to indicate where the value came from (use "file" when coming from resolvedFile, "plugin" when from resolvedPlugin, and "default" otherwise), referencing the existing computed values from resolvedFile and resolvedPlugin so diagnostics remain consistent with other fields like _sources.sessionContext.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d7ba281b-b4a2-4fff-a050-36eb00d46e8e
📒 Files selected for processing (3)
- nowledge-mem-openclaw-plugin/openclaw.plugin.json
- nowledge-mem-openclaw-plugin/src/config.js
- nowledge-mem-openclaw-plugin/src/hooks/capture.js
wey-gu left a comment
Review: PR #139 — captureExclude and captureSkipMarker
Good motivation, clean implementation. Two things to address before merge:
1. Context Engine path not covered (gap)
matchesExcludePattern() and hasSkipMarker() only guard the hooks path (buildAgentEndCaptureHandler, buildBeforeResetCaptureHandler). The Context Engine's afterTurn() in context-engine.js calls appendOrCreateThread + triageAndDistill directly — it does not check captureExclude or captureSkipMarker.
When a user has CE active (plugins.slots.contextEngine: "nowledge-mem"), excluded sessions will still be captured on every turn. This is the path that most active OpenClaw power users are on.
Suggested fix: export matchesExcludePattern and hasSkipMarker from capture.js, then add an early-return guard in afterTurn() in context-engine.js:
// context-engine.js, inside afterTurn()
if (matchesExcludePattern(sessionKey, cfg.captureExclude)) {
  logger.debug?.(`ce: skipped excluded session ${sessionKey}`);
  return;
}
// For hasSkipMarker, you'd need access to recent messages;
// check if event.messages or the session transcript is available
2. Minor: regex compiled on every call
matchesExcludePattern creates new RegExp() per pattern per invocation. For the hooks path (session-end only) this is fine. But if extended to the CE afterTurn() (per-turn), consider caching compiled patterns — cfg.captureExclude is immutable after parse, so patterns can be compiled once in parseConfig() or lazily on first call.
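One way to do the lazy variant is to cache the compiled expressions per config array. The sketch below is illustrative only: `compilePatterns`, `matchesExcludePatternCached`, and the `compileCount` counter are hypothetical names, and it assumes the same array instance is passed on every call (which holds if `cfg.captureExclude` is never reassigned after parsing).

```javascript
// Hypothetical sketch: cache compiled patterns, keyed by the patterns array itself.
const compiledCache = new WeakMap();
let compileCount = 0; // instrumentation for the demo only

function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Compile each glob once; `*` is assumed to match within one ":"-delimited segment.
function compilePatterns(patterns) {
  let compiled = compiledCache.get(patterns);
  if (!compiled) {
    compileCount += 1;
    compiled = patterns.map(
      (p) => new RegExp(`^${p.split("*").map(escapeRegExp).join("[^:]*")}$`),
    );
    compiledCache.set(patterns, compiled);
  }
  return compiled;
}

function matchesExcludePatternCached(sessionKey, patterns) {
  if (typeof sessionKey !== "string" || !Array.isArray(patterns)) return false;
  return compilePatterns(patterns).some((re) => re.test(sessionKey));
}

// Repeated per-turn checks against the same config array compile only once.
const exclude = ["agent:*:cron:*", "agent:*:subagent:*"];
matchesExcludePatternCached("agent:main:cron:digest", exclude);
matchesExcludePatternCached("agent:main:chat:42", exclude);
console.log(compileCount); // 1
```

A `WeakMap` lets the cache entry be garbage-collected along with the config object, so nothing needs to be invalidated by hand if the config is re-parsed into a new array.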
Overall
The two-layer design (deterministic glob + ad-hoc marker) is elegant. Config precedence follows the established pattern. Schema additions are clean. Backward compatibility is solid.
Merge-ready once the CE path gap is addressed.
Summary
Adds two config options to filter unwanted sessions from auto-capture:
- `captureExclude` (array of glob patterns): matched against `ctx.sessionKey` to skip sessions deterministically. Glob `*` matches within a colon-delimited segment. Example: `["agent:*:cron:*", "agent:*:subagent:*"]` excludes all cron jobs and executor subagent sessions.
- `captureSkipMarker` (string, default `#nmem-skip`): when any message in the session contains this marker, the entire session is skipped. Gives users ad-hoc control directly from conversations.
Both layers apply to `buildAgentEndCaptureHandler` and `buildBeforeResetCaptureHandler`. When excluded, neither thread creation nor triage/distillation occurs.
Motivation
OpenClaw users running scheduled cron jobs (daily news digests, token usage reports) and executor subagent sessions accumulate hundreds of low-value threads that degrade Working Memory quality. In one production instance: 422 cron threads + 46 subagent threads out of 541 total (87% noise).
The existing `sessionDigest` toggle is all-or-nothing. Users need granular control to exclude specific session types while keeping valuable conversations (Telegram, Discord, direct interactions).
Config example
{
  "sessionDigest": true,
  "captureExclude": [
    "agent:*:cron:*",
    "agent:*:subagent:*"
  ],
  "captureSkipMarker": "#nmem-skip"
}
Changes
- `config.js`: Add `captureExclude` and `captureSkipMarker` to `ALLOWED_KEYS` and `parseConfig` (file > pluginConfig > defaults)
- `hooks/capture.js`: Add `matchesExcludePattern()` and `hasSkipMarker()` helpers; guard both handler builders
- `openclaw.plugin.json`: Add schema properties and UI hints
Backward compatibility
- `captureExclude` defaults to `[]` (no exclusions); existing configs unchanged
- `captureSkipMarker` defaults to `"#nmem-skip"`; no impact unless marker is used
🤖 Generated with Claude Code