diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index f32daf9..cbdadeb 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -28,8 +28,20 @@ jobs: - name: Lint run: pnpm lint + - name: Docs Contract + run: pnpm test:docs-contract + + - name: Reviewer Smoke + run: pnpm test:reviewer-smoke + + - name: CLI Smoke + run: pnpm test:cli-smoke + - name: Test run: pnpm test - name: Build run: pnpm build + + - name: Pack Check + run: pnpm pack:check diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 04ee05a..1da6d9f 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -21,6 +21,9 @@ When proposing changes, evaluate them against that product contract first. ```bash pnpm install pnpm lint +pnpm test:docs-contract +pnpm test:reviewer-smoke +pnpm test:cli-smoke pnpm test pnpm build ``` @@ -41,6 +44,7 @@ Use Node 20+ and `pnpm`. - Keep file formats human-readable. - Avoid over-engineering. Start with the simplest version that keeps future migration possible. - Keep comments in English. +- Keep reviewer-only warnings and confidence prose in audit/reviewer surfaces; they should not become continuity body content. 
## Documentation Guidelines @@ -48,14 +52,17 @@ If your change affects one of these areas, update the matching file: - Claude behavior parity: `docs/claude-reference.md` - internals and storage model: `docs/architecture.md` +- reviewer continuity contract: `docs/session-continuity.md` +- release-time reviewer checks: `docs/release-checklist.md` - future native compatibility: `docs/native-migration.md` -- onboarding and positioning: `README.md` +- onboarding and positioning: `README.md` and `README.en.md` The repository now uses a bilingual public-doc setup: - `README.md` is the default Chinese landing page - `README.en.md` is the English landing page - `docs/claude-reference.*`, `docs/architecture.*`, and `docs/native-migration.*` are maintained in both Chinese and English +- `docs/session-continuity.md` and `docs/release-checklist.md` are English-first maintainer/reviewer docs and should still be updated when reviewer surfaces or command contracts change If you change shared meaning in one of those files, update the sibling language version in the same task or explicitly note the follow-up gap in your handoff. diff --git a/README.en.md b/README.en.md index 7862686..56be9fa 100644 --- a/README.en.md +++ b/README.en.md @@ -113,6 +113,7 @@ Not a good fit: | Native hooks / memory integration | Built in | Experimental / under development | Compatibility seam only | `cam memory` is intentionally an inspection and audit surface. It exposes the quoted startup files that actually made it into the startup payload, currently the scoped `MEMORY.md` / index content, plus the startup budget, on-demand topic refs, edit paths, and recent durable sync audit events behind `--recent [count]`. Those topic refs are lookup pointers, not proof that topic bodies were eagerly loaded at startup. +Recent durable sync audit events now also surface conservatively suppressed conflict candidates so contradictory rollout output does not silently merge into durable memory. 
Those recent sync events come from `~/.codex-auto-memory/projects//audit/sync-log.jsonl` and only cover sync-flow `applied`, `no-op`, and `skipped` events. Manual `cam remember` / `cam forget` updates stay outside that audit stream by design. When primary memory files were written but the reviewer sidecar did not complete, `cam memory` will try to expose a pending sync recovery marker so reviewers can see that partial-success state explicitly; that marker is only cleared when the same rollout/session later completes successfully, not by an unrelated successful sync. Explicit updates still happen through `cam remember`, `cam forget`, or direct Markdown edits rather than a `/memory`-style in-command editor. @@ -170,11 +171,11 @@ cam audit # check the repository for unexpected sensitive content | :-- | :-- | | `cam run` / `cam exec` / `cam resume` | compile startup memory and launch Codex through the wrapper | | `cam sync` | manually sync the latest rollout into durable memory | -| `cam memory` | inspect the quoted startup files that actually entered the payload, on-demand topic refs, startup budget, edit paths, and durable sync audit events via `--recent [count]` | +| `cam memory` | inspect the quoted startup files that actually entered the payload, on-demand topic refs, startup budget, edit paths, and durable sync audit events plus suppressed conflict candidates via `--recent [count]` | | `cam remember` / `cam forget` | explicitly add or remove durable memory | | `cam session save` | merge / incremental save; append rollout-derived continuity without cleaning stale state immediately | | `cam session refresh` | replace / clean regeneration; rebuild continuity from the selected provenance and replace the selected scope | -| `cam session load` / `status` | continuity reviewer surface for the latest audit drill-down, compact prior preview, and any pending continuity recovery marker | +| `cam session load` / `status` | continuity reviewer surface for the latest 
continuity diagnostics, including `confidence` / warnings when present, plus the latest audit drill-down, compact prior preview, and any pending continuity recovery marker | | `cam session clear` / `open` | clear active continuity files or open the local continuity directory | | `cam audit` | run privacy and secret-hygiene checks against the repository | | `cam doctor` | inspect local companion wiring and native-readiness posture | @@ -182,10 +183,11 @@ cam audit # check the repository for unexpected sensitive content ## Audit Surface Map - `cam audit`: repository-level privacy and secret-hygiene audit. -- `cam memory --recent [count]`: durable sync audit for recent `applied`, `no-op`, and `skipped` sync events, without mixing in manual `remember` / `forget`. +- `cam memory --recent [count]`: durable sync audit for recent `applied`, `no-op`, and `skipped` sync events, without mixing in manual `remember` / `forget`; suppressed conflict candidates stay reviewer-visible here instead of silently merging. - `cam session save`: the merge path for the continuity audit surface. It records the latest diagnostics, latest rollout, and latest audit drill-down, but it remains an incremental save and does not immediately clean polluted state. - `cam session refresh`: the replace path for the continuity audit surface. It regenerates continuity from selected provenance and replaces the selected scope; `--json` additionally exposes `action`, `writeMode`, and `rolloutSelection`. -- `cam session load|status`: reviewer surface for the latest continuity diagnostics, latest rollout, latest audit drill-down, and a compact prior audit preview sourced from the continuity audit log that excludes the latest entry, coalesces consecutive repeats, and is not a full prior-history replay. Their `--json` output continues to expose raw recent audit entries. 
+- `cam session load|status`: reviewer surface for the latest continuity diagnostics, latest rollout, latest audit drill-down, and a compact prior audit preview sourced from the continuity audit log that excludes the latest entry, coalesces consecutive repeats, and is not a full prior-history replay. Their `--json` output continues to expose raw recent audit entries, plus continuity `confidence` and warnings for conservative summaries. +- continuity reviewer warnings still belong to the audit/reviewer surface rather than the continuity body; the current implementation applies a minimal deterministic scrub so obvious reviewer warning prose is not written back into continuity Markdown. - `pending continuity recovery marker`: a visible warning that continuity Markdown was written but the audit sidecar failed. It is not a general repair mechanism and is not equivalent to `cam session refresh`. ## How it works @@ -205,10 +207,12 @@ flowchart TD B --> C[Inject quoted MEMORY.md startup files plus on-demand topic refs] C --> D[Run Codex] D --> E[Read rollout JSONL] - E --> F[Extract durable memory operations] + E --> F[Extract candidate durable memory operations] E --> G[Optional continuity summary] - F --> H[Update MEMORY.md and topic files] - G --> I[Update shared and local continuity files] + F --> H[Contradiction review and conservative suppression] + H --> I[Update MEMORY.md and topic files] + I --> J[Append durable sync audit] + G --> K[Update shared and local continuity files] ``` ### Why the project does not switch to native memory yet @@ -286,7 +290,7 @@ Current public-ready status: - stronger contradiction handling - clearer `cam memory` and `cam session` reviewer UX -- tighter continuity diagnostics and reviewer packets +- tighter continuity diagnostics and reviewer packets, with explicit confidence and warning surfaces - keep a compatibility seam for future hook surfaces ### v0.3+ diff --git a/README.md b/README.md index f1b1964..f87a5c4 100644 --- a/README.md 
+++ b/README.md @@ -113,6 +113,7 @@ Claude Code 已经公开了一套相对清晰的 auto memory 产品契约: | native hooks / memory | Built in | Experimental / under development | 当前只保留 compatibility seam | `cam memory` 当前是 inspection / audit surface:它会暴露真正进入 startup payload 的 quoted startup files(当前是各 scope 的 `MEMORY.md` / index 内容)、startup budget、按需 topic refs、edit paths,以及 `--recent [count]` 下的 recent durable sync audit。这里的 topic refs 只是按需定位信息,不表示 topic body 已在启动阶段 eager 读取。 +recent durable sync audit 现在也会显式暴露被保守 suppress 的 conflict candidates,避免在同一 rollout 或与现有 durable memory 冲突时发生静默 merge。 这些 recent sync events 来自 `~/.codex-auto-memory/projects//audit/sync-log.jsonl`,只覆盖 sync flow 的 `applied` / `no-op` / `skipped` 事件,不包含 manual `cam remember` / `cam forget`。 如果主 memory 文件已经写入,但 reviewer sidecar(audit / processed-state)没有完整落盘,`cam memory` 会尽力暴露一个 pending sync recovery marker,帮助 reviewer 识别 partial-success 状态;该 marker 只会在同一 rollout/session 后续成功补齐时清理,不会被不相关的成功 sync 顺手抹掉。 显式更新仍通过 `cam remember`、`cam forget` 或直接编辑 Markdown 文件完成,而不是提供 `/memory` 风格的命令内编辑器。 @@ -170,11 +171,11 @@ cam audit # 检查仓库有没有意外的敏感内容 | :-- | :-- | | `cam run` / `cam exec` / `cam resume` | 编译 startup memory 并通过 wrapper 启动 Codex | | `cam sync` | 手动把最近 rollout 同步进 durable memory | -| `cam memory` | 查看真正进入 startup payload 的 quoted startup files、按需 topic refs、startup budget、edit paths,以及 `--recent [count]` 下的 durable sync audit | +| `cam memory` | 查看真正进入 startup payload 的 quoted startup files、按需 topic refs、startup budget、edit paths,以及 `--recent [count]` 下的 durable sync audit 与 suppressed conflict candidates | | `cam remember` / `cam forget` | 显式新增或删除 memory | | `cam session save` | merge / incremental save;从 rollout 增量写入 continuity,不主动清掉已有污染状态 | | `cam session refresh` | replace / clean regeneration;从选定 provenance 重新生成 continuity 并覆盖所选 scope | -| `cam session load` / `status` | continuity reviewer surface;显示 latest audit drill-down、compact prior preview,以及 pending continuity recovery marker | +| `cam session load` / `status` | 
continuity reviewer surface;显示 latest continuity diagnostics(含 `confidence` / warnings)、latest audit drill-down、compact prior preview,以及 pending continuity recovery marker | | `cam session clear` / `open` | 清理当前 active continuity,或打开 local continuity 目录 | | `cam audit` | 做仓库级隐私 / secret hygiene 审查 | | `cam doctor` | 检查当前 companion wiring 与 native readiness posture | @@ -182,10 +183,11 @@ cam audit # 检查仓库有没有意外的敏感内容 ## 审计面地图 - `cam audit`: 仓库级的 privacy / secret hygiene 审计。 -- `cam memory --recent [count]`: durable sync audit,查看 recent `applied` / `no-op` / `skipped` sync 事件,不混入 manual `remember` / `forget`。 +- `cam memory --recent [count]`: durable sync audit,查看 recent `applied` / `no-op` / `skipped` sync 事件,不混入 manual `remember` / `forget`;当本轮提取结果因冲突而被保守 suppress 时,也会在 reviewer surface 中显式暴露。 - `cam session save`: continuity audit surface 的 merge 路径,记录最新 continuity diagnostics、latest rollout 与 latest audit drill-down;它是 incremental save,不会立刻把已有污染状态“洗干净”。 - `cam session refresh`: continuity audit surface 的 replace 路径,从选定 provenance 重新生成 continuity,并覆盖所选 scope;`--json` 会额外暴露 `action`、`writeMode` 与 `rolloutSelection`。 -- `cam session load|status`: reviewer surface,继续展示 latest continuity diagnostics、latest rollout、latest audit drill-down,以及 compact prior audit preview(来自 continuity audit log,排除 latest,并收敛连续重复项,不是完整 prior history 回放);两个命令的 `--json` 继续返回 raw recent audit entries。 +- `cam session load|status`: reviewer surface,继续展示 latest continuity diagnostics、latest rollout、latest audit drill-down,以及 compact prior audit preview(来自 continuity audit log,排除 latest,并收敛连续重复项,不是完整 prior history 回放);最新 diagnostics 现在也会显式带出 `confidence` 与 warnings,帮助 reviewer 区分稳定事实、临时状态与需二次核实的冲突/噪音。 +- continuity reviewer warnings 仍属于 audit / reviewer surface,而不是 continuity body;当前实现会对明显的 reviewer warning prose 做最小 deterministic scrub,避免它们被模型原样写回 continuity Markdown。 - `pending continuity recovery marker`: continuity Markdown 已写入但 audit sidecar 失败时的可见警告;它不等于 `cam session refresh` 
会自动修复一切,只会在逻辑身份匹配的后续成功写入后被清理。 ## 工作方式 @@ -205,10 +207,12 @@ flowchart TD B --> C[注入 quoted MEMORY.md startup files 与按需 topic refs] C --> D[运行 Codex] D --> E[读取 rollout JSONL] - E --> F[提取 durable memory 操作] + E --> F[提取 candidate durable memory 操作] E --> G[可选 continuity 总结] - F --> H[更新 MEMORY.md 与 topic files] - G --> I[更新 shared / local continuity] + F --> H[contradiction review / conservative suppression] + H --> I[更新 MEMORY.md 与 topic files] + I --> J[追加 durable sync audit] + G --> K[更新 shared / local continuity] ``` ### 为什么不是直接上 native memory @@ -286,7 +290,7 @@ Session continuity: - 更稳的 contradiction handling - 更清晰的 `cam memory` / `cam session` 审查 UX -- continuity diagnostics 与 reviewer packet 继续收紧信息层次 +- continuity diagnostics 与 reviewer packet 继续收紧信息层次,并显式暴露 confidence / warnings - 继续保留对未来 hook surface 的 compatibility seam ### v0.3+ diff --git a/docs/architecture.en.md b/docs/architecture.en.md index 5ddcac3..7bfedab 100644 --- a/docs/architecture.en.md +++ b/docs/architecture.en.md @@ -31,10 +31,12 @@ flowchart TD B --> C[Inject quoted MEMORY.md startup files plus on-demand topic refs] C --> D[Run Codex] D --> E[Read rollout JSONL after session] - E --> F[Extract durable memory operations] + E --> F[Extract durable memory candidates] E --> G[Optional continuity summary] - F --> H[Update MEMORY.md and topic files] - G --> I[Update shared and local continuity files] + F --> H[Run contradiction review and conservative suppression] + H --> I[Update MEMORY.md and topic files] + I --> J[Append durable sync audit] + G --> K[Update shared and local continuity files] ``` ## 1. Startup path @@ -63,15 +65,18 @@ The sync path turns session evidence into durable Markdown memory: 1. read the relevant rollout JSONL 2. parse user messages, tool calls, and tool outputs -3. let the extractor produce memory operations -4. apply upserts and deletes to the Markdown store -5. rebuild `MEMORY.md` for the affected scope +3. 
let the extractor produce candidate memory operations +4. run contradiction review so conflicting candidates can be conservatively suppressed while explicit corrections still win +5. apply the reviewed upserts and deletes to the Markdown store +6. rebuild `MEMORY.md` for the affected scope +7. append durable sync audit entries that keep suppressed conflict candidates reviewer-visible The extractor is expected to: - keep stable, future-useful knowledge - avoid transcript replay - handle explicit corrections conservatively +- prefer provable corrections over silent conflict merges - keep temporary next-step noise out of durable memory ## 3. Optional session continuity path @@ -80,6 +85,8 @@ Session continuity is a separate companion layer, not part of the durable memory - shared continuity: project-wide working state shared across worktrees - project-local continuity: worktree-specific working state +- reviewer warnings and confidence remain audit-side metadata, not continuity-body content +- startup provenance only lists continuity files that were actually read for the injected block Its purpose is session recovery, not long-term memory. diff --git a/docs/architecture.md b/docs/architecture.md index bea4d6b..f4080d0 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -31,10 +31,12 @@ flowchart TD B --> C[注入 quoted MEMORY.md startup files 与按需 topic refs] C --> D[运行 Codex] D --> E[会话结束后读取 rollout JSONL] - E --> F[提取 durable memory operations] + E --> F[提取 durable memory candidates] E --> G[可选 continuity summary] - F --> H[更新 MEMORY.md 与 topic files] - G --> I[更新 shared / local continuity files] + F --> H[contradiction review 与保守 suppress] + H --> I[更新 MEMORY.md 与 topic files] + I --> J[写入 durable sync audit] + G --> K[更新 shared / local continuity files] ``` ## 1. Startup path @@ -63,15 +65,18 @@ sync path 的职责是把“值得长期保存的信息”写回 durable memory 1. 读取相关 rollout JSONL 2. 解析 user messages、tool calls、tool outputs -3. 由 extractor 生成 memory operations -4. 
将 upsert / delete 应用到 Markdown store -5. 重建对应 scope 的 `MEMORY.md` +3. 由 extractor 生成 candidate memory operations +4. 经过 contradiction review,对冲突 candidate 做保守 suppress,并优先保留明确更正 +5. 将审查后的 upsert / delete 应用到 Markdown store +6. 重建对应 scope 的 `MEMORY.md` +7. 追加 durable sync audit,显式暴露 suppressed conflict candidates 供 reviewer 审查 当前 extractor 的设计目标是: - 保存稳定、未来有用的信息 - 避免保存原始会话回放 - 对显式 correction 做保守替换 +- 冲突场景下优先保留可证明的更正,而不是静默 merge - 避免把临时 next step / local edit noise 写进 durable memory ## 3. Optional session continuity path @@ -80,6 +85,8 @@ session continuity 是独立 companion layer,不属于 durable memory 契约 - shared continuity:跨 worktree 共享的项目级 working state - project-local continuity:当前 worktree 的本地 working state +- reviewer warning / confidence 属于 audit side metadata,不属于 continuity body +- startup provenance 只列出这次注入时真实读取到的 continuity 文件 它的存在是为了帮助会话恢复,而不是替代 memory。 diff --git a/docs/release-checklist.md b/docs/release-checklist.md index f68ebe7..3360e8a 100644 --- a/docs/release-checklist.md +++ b/docs/release-checklist.md @@ -19,22 +19,29 @@ Use this checklist before cutting any alpha or beta release of `codex-auto-memor ## Code and runtime checks - Run `pnpm lint` +- Run `pnpm test:docs-contract` +- Run `pnpm test:reviewer-smoke` +- Run `pnpm test:cli-smoke` - Run `pnpm test` - Run `pnpm build` -- Run `cam audit` -- Run `cam session refresh --json` and confirm `action`, `writeMode`, and `rolloutSelection` reflect the selected provenance. -- Run `cam session load --json` and confirm older JSON consumers still receive the existing core fields. -- Run `cam session status --json` and confirm the latest explicit audit drill-down matches the newest audit-log entry when present. +- Run `pnpm pack:check` +- Run `pnpm exec tsx src/cli.ts audit` if you want the repository privacy scan; keep it as a manual release-time check instead of a CI gate. 
+- Run `pnpm exec tsx src/cli.ts session refresh --json` and confirm `action`, `writeMode`, and `rolloutSelection` reflect the selected provenance. +- Run `pnpm exec tsx src/cli.ts session load --json` and confirm older JSON consumers still receive the existing core fields. +- Run `pnpm exec tsx src/cli.ts session status --json` and confirm the latest explicit audit drill-down matches the newest audit-log entry when present. +- Run `pnpm exec tsx src/cli.ts memory --recent --json` and confirm suppressed conflict candidates remain reviewer-visible instead of being silently merged. +- Confirm `pnpm exec tsx src/cli.ts session load --json` / `status --json` still expose `confidence` and warnings when the rollout required a conservative continuity summary. +- Confirm continuity reviewer warnings stay in diagnostics / audit surfaces and are not written into continuity Markdown body text. - Run a local smoke flow: - - `cam init` - - `cam remember "..."` - - `cam memory --recent --print-startup` - - `cam session status` - - `cam session save` - - `cam session refresh` - - `cam session load --print-startup` - - `cam forget "..."` - - `cam doctor` + - `pnpm exec tsx src/cli.ts init` + - `pnpm exec tsx src/cli.ts remember "..."` + - `pnpm exec tsx src/cli.ts memory --recent --print-startup` + - `pnpm exec tsx src/cli.ts session status` + - `pnpm exec tsx src/cli.ts session save` + - `pnpm exec tsx src/cli.ts session refresh` + - `pnpm exec tsx src/cli.ts session load --print-startup` + - `pnpm exec tsx src/cli.ts forget "..."` + - `pnpm exec tsx src/cli.ts doctor` ## Documentation checks @@ -45,7 +52,7 @@ Use this checklist before cutting any alpha or beta release of `codex-auto-memor ## Native compatibility checks - Run `cam doctor` and record the current `memories` / `codex_hooks` status. -- Run `cam audit` and record whether any medium/high findings remain. +- Run `pnpm exec tsx src/cli.ts audit` and record whether any medium/high findings remain. 
- Confirm that any native-facing code still preserves companion fallback. - Confirm that Markdown memory remains the user-facing source of truth. @@ -55,6 +62,5 @@ Do not tag a release unless: - tests are green - docs are current -- changelog is updated - review artifacts are in place - the current milestone can be explained without reading every commit in the repository diff --git a/docs/session-continuity.md b/docs/session-continuity.md index b9e94c0..f3160f4 100644 --- a/docs/session-continuity.md +++ b/docs/session-continuity.md @@ -173,6 +173,7 @@ Command contract: - project-local continuity - the effective merged resume brief - the latest continuity generation path and fallback status +- the latest continuity confidence and reviewer warnings when present - the latest rollout path - a small latest-generation drill-down for evidence counts and written continuity paths - a compact prior-generation audit preview sourced from the continuity audit log that excludes the latest entry, coalesces consecutive repeats, and does not attempt to replay full prior history @@ -203,6 +204,7 @@ Each save, refresh, or wrapper auto-save records: - whether the preferred path was `codex` or `heuristic` - which path actually produced the saved continuity - why Codex fell back when it did +- a compact `confidence` level plus reviewer warnings for conflict/noise cases - evidence counts for commands, file writes, next steps, and untried items - the rollout path and written continuity files - `trigger`: `manual-save`, `manual-refresh`, or `wrapper-auto-save` @@ -216,6 +218,7 @@ Reason: - reviewer/debug data belongs in an audit surface, not in the working-state note itself - the latest audit entry now remains exposed explicitly as `latestContinuityAuditEntry` through `cam session save --json`, `cam session refresh --json`, `cam session load --json`, and `cam session status --json` - the compatibility summary field `latestContinuityDiagnostics` still exposes the latest path/fallback view 
for existing consumers +- those same diagnostics now also expose `confidence` and reviewer warnings so consumers can distinguish explicit evidence from conservative fallback or noisy/contradictory rollouts - the same commands now also expose raw recent audit entries so reviewers can verify a short audit window without opening the JSONL directly - the default `load` / `status` text surfaces now show the latest rollout, the latest evidence counts and written paths, plus a compact prior audit preview without becoming a dedicated history browser - compact prior audit preview grouping now includes normalized `trigger` and `writeMode`, so a save and a refresh from the same rollout are still shown as distinct reviewer events diff --git a/package.json b/package.json index 3316dc2..8828aba 100644 --- a/package.json +++ b/package.json @@ -20,10 +20,15 @@ "packageManager": "pnpm@10.11.0", "scripts": { "build": "tsc -p tsconfig.build.json", + "ci": "pnpm lint && pnpm test:docs-contract && pnpm test:reviewer-smoke && pnpm test:cli-smoke && pnpm test && pnpm build && pnpm pack:check", "clean": "rimraf dist coverage .tmp", "dev": "tsx src/cli.ts", "lint": "tsc --noEmit -p tsconfig.json", + "pack:check": "npm pack --dry-run", "test": "vitest run", + "test:cli-smoke": "vitest run test/audit.test.ts test/memory-command.test.ts test/session-command.test.ts", + "test:docs-contract": "vitest run test/docs-contract.test.ts", + "test:reviewer-smoke": "vitest run test/docs-contract.test.ts test/memory-command.test.ts test/session-command.test.ts test/session-continuity.test.ts", "test:watch": "vitest" }, "keywords": [ diff --git a/src/lib/commands/memory.ts b/src/lib/commands/memory.ts index cd47329..318cddc 100644 --- a/src/lib/commands/memory.ts +++ b/src/lib/commands/memory.ts @@ -26,16 +26,26 @@ interface MemoryOptions { } function formatPendingSyncRecovery(record: SyncRecoveryRecord, recoveryPath: string): string[] { - return [ + const lines = [ "Pending sync recovery:", `- Recovery 
file: ${recoveryPath}`, `- Failed stage: ${record.failedStage}`, `- Rollout: ${record.rolloutPath}`, `- Session: ${record.sessionId ?? "unknown"}`, `- Status: ${record.status} (${record.appliedCount} operation${record.appliedCount === 1 ? "" : "s"})`, + `- Suppressed: ${record.suppressedOperationCount ?? 0}`, `- Audit entry written: ${record.auditEntryWritten}`, `- Failure: ${record.failureMessage}` ]; + + if (record.conflicts?.length) { + lines.push("- Conflict review:"); + for (const conflict of record.conflicts) { + lines.push(` - [${conflict.source}] ${conflict.topic}: ${conflict.candidateSummary}`); + } + } + + return lines; } function syncAuditSignature(entry: MemorySyncAuditEntry): string { @@ -50,7 +60,9 @@ function syncAuditSignature(entry: MemorySyncAuditEntry): string { actualExtractorMode: entry.actualExtractorMode, actualExtractorName: entry.actualExtractorName, appliedCount: entry.appliedCount, + suppressedOperationCount: entry.suppressedOperationCount ?? 0, scopesTouched: entry.scopesTouched, + conflicts: entry.conflicts ?? [], resultSummary: entry.resultSummary }); } diff --git a/src/lib/commands/session.ts b/src/lib/commands/session.ts index 928e674..3b66693 100644 --- a/src/lib/commands/session.ts +++ b/src/lib/commands/session.ts @@ -7,6 +7,7 @@ import { buildSessionContinuityAuditEntry, formatSessionContinuityAuditDrillDown, formatSessionContinuityDiagnostics, + normalizeContinuityRecoveryRecord, normalizeSessionContinuityAuditTrigger, normalizeSessionContinuityWriteMode, toSessionContinuityDiagnostics @@ -108,6 +109,8 @@ function formatRecentGenerationLines(entries: SessionContinuityAuditEntry[]): st writeMode: normalizeSessionContinuityWriteMode(entry.writeMode), preferredPath: entry.preferredPath, actualPath: entry.actualPath, + confidence: entry.confidence ?? "high", + warnings: entry.warnings ?? [], fallbackReason: entry.fallbackReason ?? null, codexExitCode: entry.codexExitCode ?? 
null, evidenceCounts: { @@ -126,7 +129,7 @@ function formatRecentGenerationLines(entries: SessionContinuityAuditEntry[]): st } const lines = preview.groups.flatMap((group) => [ - `- ${group.latest.generatedAt}: ${formatSessionContinuityDiagnostics(group.latest)}`, + `- ${group.latest.generatedAt}: ${formatSessionContinuityDiagnostics(toSessionContinuityDiagnostics(group.latest))}`, ` Rollout: ${group.latest.rolloutPath}`, ...(group.rawCount > 1 ? [` Repeated similar generations hidden: ${group.rawCount - 1}`] @@ -144,25 +147,37 @@ function formatPendingContinuityRecovery( record: ContinuityRecoveryRecord, recoveryPath: string ): string[] { + const normalized = normalizeContinuityRecoveryRecord(record); + const warnings = normalized.warnings ?? []; const lines = [ "Pending continuity recovery:", `- Recovery file: ${recoveryPath}`, - `- Failed stage: ${record.failedStage}`, - `- Rollout: ${record.rolloutPath}`, - ...(record.trigger ? [`- Trigger: ${record.trigger}`] : []), - ...(record.writeMode ? [`- Write mode: ${record.writeMode}`] : []), - `- Scope: ${record.scope}`, - `- Generation: ${record.actualPath} | preferred ${record.preferredPath}`, - `- Failure: ${record.failureMessage}` + `- Failed stage: ${normalized.failedStage}`, + `- Rollout: ${normalized.rolloutPath}`, + ...(normalized.trigger ? [`- Trigger: ${normalized.trigger}`] : []), + ...(normalized.writeMode ? [`- Write mode: ${normalized.writeMode}`] : []), + `- Scope: ${normalized.scope}`, + `- Generation: ${normalized.actualPath} | preferred ${normalized.preferredPath}${normalized.confidence ? 
` | confidence ${normalized.confidence}` : ""}`, + `- Failure: ${normalized.failureMessage}` ]; - if (record.writtenPaths.length > 0) { - lines.push(...record.writtenPaths.map((filePath) => `- Written: ${filePath}`)); + if (warnings.length > 0) { + lines.push(...warnings.map((warning) => `- Warning: ${warning}`)); + } + + if (normalized.writtenPaths.length > 0) { + lines.push(...normalized.writtenPaths.map((filePath) => `- Written: ${filePath}`)); } return lines; } +function existingContinuitySourceFiles( + ...locations: Array<{ path: string; exists: boolean }> +): string[] { + return locations.filter((location) => location.exists).map((location) => location.path); +} + async function selectRefreshRollout( runtime: SessionRuntime, scope: SessionContinuityScope | "both", @@ -267,6 +282,8 @@ async function persistSessionContinuity( await options.runtime.sessionContinuityStore.readRecentAuditEntries( recentContinuityPreviewReadLimit ); + const pendingContinuityRecoveryRecord = + await options.runtime.sessionContinuityStore.readRecoveryRecord(); return { rolloutPath: options.rolloutPath, @@ -282,7 +299,9 @@ async function persistSessionContinuity( 0, recentContinuityAuditLimit ), - pendingContinuityRecovery: await options.runtime.sessionContinuityStore.readRecoveryRecord(), + pendingContinuityRecovery: pendingContinuityRecoveryRecord + ? normalizeContinuityRecoveryRecord(pendingContinuityRecoveryRecord) + : null, continuityAuditPath: options.runtime.sessionContinuityStore.paths.auditFile, continuityRecoveryPath: options.runtime.sessionContinuityStore.getRecoveryPath() }; @@ -390,7 +409,6 @@ export async function runSession( recentContinuityAuditLimit ); const latestContinuityAuditEntry = recentContinuityAuditPreviewEntries[0] ?? null; - const pendingContinuityRecovery = await runtime.sessionContinuityStore.readRecoveryRecord(); const latestContinuityDiagnostics = latestContinuityAuditEntry ? 
toSessionContinuityDiagnostics(latestContinuityAuditEntry) : null; @@ -401,9 +419,13 @@ export async function runSession( runtime.project.projectId, runtime.project.worktreeId ); + const pendingContinuityRecoveryRecord = await runtime.sessionContinuityStore.readRecoveryRecord(); + const pendingContinuityRecovery = pendingContinuityRecoveryRecord + ? normalizeContinuityRecoveryRecord(pendingContinuityRecoveryRecord) + : null; const startup = compileSessionContinuity( mergedState, - [projectLocation.path, localLocation.path].filter(Boolean), + existingContinuitySourceFiles(projectLocation, localLocation), runtime.loadedConfig.config.maxSessionContinuityLines ); diff --git a/src/lib/commands/wrapper.ts b/src/lib/commands/wrapper.ts index 7ba7a7d..2a7325a 100644 --- a/src/lib/commands/wrapper.ts +++ b/src/lib/commands/wrapper.ts @@ -82,7 +82,9 @@ async function compileStartupPayload(cwd: string): Promise { const localLocation = await runtime.sessionContinuityStore.getLocation("project-local"); const continuity = compileSessionContinuity( merged, - [projectLocation.path, localLocation.path].filter(Boolean), + [projectLocation, localLocation] + .filter((location) => location.exists) + .map((location) => location.path), runtime.loadedConfig.config.maxSessionContinuityLines ); return `${continuity.text.trimEnd()}\n\n${durable.text.trimStart()}`; diff --git a/src/lib/domain/memory-sync-audit.ts b/src/lib/domain/memory-sync-audit.ts index c3d57ae..19eed12 100644 --- a/src/lib/domain/memory-sync-audit.ts +++ b/src/lib/domain/memory-sync-audit.ts @@ -1,5 +1,6 @@ import type { AppConfig, + MemoryConflictCandidate, MemoryOperation, MemoryScope, MemorySyncAuditEntry, @@ -24,6 +25,14 @@ function isStringArray(value: unknown): value is string[] { return Array.isArray(value) && value.every((item) => typeof item === "string"); } +function isConflictSource(value: unknown): value is MemoryConflictCandidate["source"] { + return value === "within-rollout" || value === "existing-memory"; 
+} + +function isConflictResolution(value: unknown): value is MemoryConflictCandidate["resolution"] { + return value === "suppressed"; +} + function isExtractorMode(value: unknown): value is AppConfig["extractorMode"] { return value === "codex" || value === "heuristic"; } @@ -46,6 +55,22 @@ function isMemoryOperation(value: unknown): value is MemoryOperation { ); } +function isMemoryConflictCandidate(value: unknown): value is MemoryConflictCandidate { + if (!value || typeof value !== "object") { + return false; + } + + const candidate = value as Record<string, unknown>; + return ( + isMemoryScope(candidate.scope) && + typeof candidate.topic === "string" && + typeof candidate.candidateSummary === "string" && + isStringArray(candidate.conflictsWith) && + isConflictSource(candidate.source) && + isConflictResolution(candidate.resolution) + ); +} + function summaryForStatus( status: MemorySyncAuditStatus, appliedCount: number, @@ -73,6 +98,13 @@ export function parseMemorySyncAuditEntry(value: unknown): MemorySyncAuditEntry const actualExtractorName = entry.actualExtractorName ?? entry.extractorName; const configuredExtractorMode = entry.configuredExtractorMode ?? actualExtractorMode; const configuredExtractorName = entry.configuredExtractorName ?? actualExtractorName; + const conflicts = Array.isArray(entry.conflicts) ? entry.conflicts.filter((candidate): candidate is MemoryConflictCandidate => + isMemoryConflictCandidate(candidate) + ) + : []; + const suppressedOperationCount = + typeof entry.suppressedOperationCount === "number" ?
entry.suppressedOperationCount : 0; if ( typeof entry.appliedAt !== "string" || @@ -88,6 +120,7 @@ export function parseMemorySyncAuditEntry(value: unknown): MemorySyncAuditEntry !isMemorySyncAuditStatus(entry.status) || !isMemorySyncAuditSkipReason(entry.skipReason) || typeof entry.appliedCount !== "number" || + suppressedOperationCount < 0 || !Array.isArray(entry.scopesTouched) || !entry.scopesTouched.every((scope) => isMemoryScope(scope)) || typeof entry.resultSummary !== "string" || @@ -114,8 +147,10 @@ export function parseMemorySyncAuditEntry(value: unknown): MemorySyncAuditEntry skipReason: entry.status === "skipped" ? entry.skipReason : undefined, ...(entry.isRecovery === true ? { isRecovery: true } : {}), appliedCount: entry.appliedCount, + suppressedOperationCount, scopesTouched: entry.scopesTouched, resultSummary: entry.resultSummary, + conflicts, operations: entry.operations }; } @@ -137,6 +172,8 @@ interface BuildMemorySyncAuditEntryOptions { sessionId?: string; skipReason?: MemorySyncAuditSkipReason; isRecovery?: boolean; + suppressedOperationCount?: number; + conflicts?: MemoryConflictCandidate[]; operations?: MemoryOperation[]; } @@ -144,6 +181,7 @@ export function buildMemorySyncAuditEntry( options: BuildMemorySyncAuditEntryOptions ): MemorySyncAuditEntry { const operations = options.operations ?? []; + const conflicts = options.conflicts ?? []; const scopesTouched = Array.from(new Set(operations.map((operation) => operation.scope))); const appliedCount = operations.length; @@ -164,8 +202,10 @@ export function buildMemorySyncAuditEntry( skipReason: options.status === "skipped" ? options.skipReason : undefined, ...(options.isRecovery ? { isRecovery: true } : {}), appliedCount, + suppressedOperationCount: options.suppressedOperationCount ?? 
0, scopesTouched, resultSummary: summaryForStatus(options.status, appliedCount, options.skipReason), + conflicts, operations }; } @@ -174,7 +214,7 @@ export function formatMemorySyncAuditEntry(entry: MemorySyncAuditEntry): string[ const lines = [ `- ${entry.appliedAt}: [${entry.status}]${entry.isRecovery ? ' [recovery]' : ''} ${entry.resultSummary}`, ` Session: ${entry.sessionId ?? "unknown"} | Extractor: ${entry.actualExtractorName || entry.actualExtractorMode}`, - ` Applied: ${entry.appliedCount} | Scopes: ${entry.scopesTouched.length ? entry.scopesTouched.join(", ") : "none"}` + ` Applied: ${entry.appliedCount} | Suppressed: ${entry.suppressedOperationCount ?? 0} | Scopes: ${entry.scopesTouched.length ? entry.scopesTouched.join(", ") : "none"}` ]; if ( @@ -191,5 +231,17 @@ export function formatMemorySyncAuditEntry(entry: MemorySyncAuditEntry): string[ } lines.push(` Rollout: ${entry.rolloutPath}`); + + if (entry.conflicts?.length) { + lines.push(" Conflict review:"); + for (const conflict of entry.conflicts) { + lines.push( + ` - [${conflict.source}] ${conflict.topic}: ${conflict.candidateSummary}` + ); + for (const conflictingSummary of conflict.conflictsWith) { + lines.push(` vs ${conflictingSummary}`); + } + } + } return lines; } diff --git a/src/lib/domain/recovery-records.ts b/src/lib/domain/recovery-records.ts index 23b7f82..7cd27d6 100644 --- a/src/lib/domain/recovery-records.ts +++ b/src/lib/domain/recovery-records.ts @@ -2,7 +2,9 @@ import type { AppConfig, ContinuityRecoveryRecord, ContinuityRecoveryFailedStage, + MemoryConflictCandidate, MemoryScope, + SessionContinuityConfidence, SessionContinuityAuditTrigger, SessionContinuityDiagnostics, SessionContinuityEvidenceCounts, @@ -29,6 +31,30 @@ function isStringArray(value: unknown): value is string[] { return Array.isArray(value) && value.every((item) => typeof item === "string"); } +function isConflictSource(value: unknown): value is MemoryConflictCandidate["source"] { + return value === 
"within-rollout" || value === "existing-memory"; +} + +function isConflictResolution(value: unknown): value is MemoryConflictCandidate["resolution"] { + return value === "suppressed"; +} + +function isMemoryConflictCandidate(value: unknown): value is MemoryConflictCandidate { + if (!value || typeof value !== "object") { + return false; + } + + const candidate = value as Record; + return ( + isMemoryScope(candidate.scope) && + typeof candidate.topic === "string" && + typeof candidate.candidateSummary === "string" && + isStringArray(candidate.conflictsWith) && + isConflictSource(candidate.source) && + isConflictResolution(candidate.resolution) + ); +} + function isContinuityTrigger(value: unknown): value is SessionContinuityAuditTrigger { return ( value === undefined || @@ -42,6 +68,10 @@ function isWriteMode(value: unknown): value is SessionContinuityWriteMode { return value === undefined || value === "merge" || value === "replace"; } +function isContinuityConfidence(value: unknown): value is SessionContinuityConfidence { + return value === "high" || value === "medium" || value === "low"; +} + function isEvidenceCounts(value: unknown): value is SessionContinuityEvidenceCounts { if (!value || typeof value !== "object") { return false; @@ -84,6 +114,13 @@ export function isSyncRecoveryRecord(value: unknown): value is SyncRecoveryRecor } const record = value as Record; + const conflicts = Array.isArray(record.conflicts) + ? record.conflicts.filter((candidate): candidate is MemoryConflictCandidate => + isMemoryConflictCandidate(candidate) + ) + : []; + const suppressedOperationCount = + typeof record.suppressedOperationCount === "number" ? 
record.suppressedOperationCount : 0; return ( typeof record.recordedAt === "string" && typeof record.projectId === "string" && @@ -96,8 +133,10 @@ export function isSyncRecoveryRecord(value: unknown): value is SyncRecoveryRecor typeof record.actualExtractorName === "string" && (record.status === "applied" || record.status === "no-op") && typeof record.appliedCount === "number" && + suppressedOperationCount >= 0 && Array.isArray(record.scopesTouched) && record.scopesTouched.every((scope) => isMemoryScope(scope)) && + conflicts.length === (Array.isArray(record.conflicts) ? record.conflicts.length : 0) && isSyncRecoveryFailedStage(record.failedStage) && typeof record.failureMessage === "string" && typeof record.auditEntryWritten === "boolean" @@ -124,6 +163,8 @@ export function isContinuityRecoveryRecord( isStringArray(record.writtenPaths) && isExtractorPath(record.preferredPath) && isExtractorPath(record.actualPath) && + (record.confidence === undefined || isContinuityConfidence(record.confidence)) && + (record.warnings === undefined || isStringArray(record.warnings)) && isContinuityFallbackReason(record.fallbackReason) && (record.codexExitCode === undefined || typeof record.codexExitCode === "number") && isEvidenceCounts(record.evidenceCounts) && @@ -143,7 +184,9 @@ interface BuildSyncRecoveryRecordOptions { actualExtractorName: string; status: "applied" | "no-op"; appliedCount: number; + suppressedOperationCount?: number; scopesTouched: MemoryScope[]; + conflicts?: MemoryConflictCandidate[]; failedStage: SyncRecoveryFailedStage; failureMessage: string; auditEntryWritten: boolean; @@ -164,7 +207,9 @@ export function buildSyncRecoveryRecord( actualExtractorName: options.actualExtractorName, status: options.status, appliedCount: options.appliedCount, + suppressedOperationCount: options.suppressedOperationCount ?? 0, scopesTouched: options.scopesTouched, + conflicts: options.conflicts ?? 
[], failedStage: options.failedStage, failureMessage: options.failureMessage, auditEntryWritten: options.auditEntryWritten @@ -219,6 +264,8 @@ export function buildContinuityRecoveryRecord( writtenPaths: options.writtenPaths, preferredPath: options.diagnostics.preferredPath, actualPath: options.diagnostics.actualPath, + confidence: options.diagnostics.confidence, + warnings: options.diagnostics.warnings, fallbackReason: options.diagnostics.fallbackReason, codexExitCode: options.diagnostics.codexExitCode, evidenceCounts: options.diagnostics.evidenceCounts, diff --git a/src/lib/domain/session-continuity-diagnostics.ts b/src/lib/domain/session-continuity-diagnostics.ts index bc9dced..e9d5734 100644 --- a/src/lib/domain/session-continuity-diagnostics.ts +++ b/src/lib/domain/session-continuity-diagnostics.ts @@ -1,8 +1,10 @@ import type { AppConfig, + ContinuityRecoveryRecord, ProjectContext, SessionContinuityAuditEntry, SessionContinuityAuditTrigger, + SessionContinuityConfidence, SessionContinuityDiagnostics, SessionContinuityFallbackReason, SessionContinuityScope, @@ -39,6 +41,10 @@ function isAuditTrigger(value: unknown): value is SessionContinuityAuditTrigger ); } +function isConfidence(value: unknown): value is SessionContinuityConfidence { + return value === "high" || value === "medium" || value === "low"; +} + function isWriteMode(value: unknown): value is SessionContinuityWriteMode { return value === undefined || value === "merge" || value === "replace"; } @@ -71,12 +77,35 @@ function isEvidenceCounts( ); } +export function normalizeSessionContinuityWarnings(value: unknown): string[] { + return Array.isArray(value) && value.every((item) => typeof item === "string") + ? 
value + : []; +} + +export function normalizeSessionContinuityConfidence( + confidence: unknown, + warnings: string[], + fallbackReason?: SessionContinuityFallbackReason +): SessionContinuityConfidence { + if (isConfidence(confidence)) { + return confidence; + } + + if (fallbackReason) { + return "low"; + } + + return warnings.length > 0 ? "medium" : "high"; +} + export function isSessionContinuityAuditEntry(value: unknown): value is SessionContinuityAuditEntry { if (!value || typeof value !== "object") { return false; } const entry = value as Record<string, unknown>; + const warnings = normalizeSessionContinuityWarnings(entry.warnings); return ( typeof entry.generatedAt === "string" && typeof entry.projectId === "string" && @@ -89,6 +118,8 @@ export function isSessionContinuityAuditEntry(value: unknown): value is SessionC typeof entry.sourceSessionId === "string" && isExtractorPath(entry.preferredPath) && isExtractorPath(entry.actualPath) && + (entry.confidence === undefined || isConfidence(entry.confidence)) && + warnings.length === (Array.isArray(entry.warnings) ?
entry.warnings.length : 0) && isFallbackReason(entry.fallbackReason) && (entry.codexExitCode === undefined || typeof entry.codexExitCode === "number") && isEvidenceCounts(entry.evidenceCounts) && @@ -102,7 +133,8 @@ export function formatSessionContinuityDiagnostics( ): string { const parts = [ `Generation: ${diagnostics.actualPath}`, - `preferred ${diagnostics.preferredPath}` + `preferred ${diagnostics.preferredPath}`, + `confidence ${diagnostics.confidence}` ]; if (diagnostics.fallbackReason) { @@ -127,7 +159,15 @@ function formatEvidenceCounts(entry: SessionContinuityAuditEntry): string { export function formatSessionContinuityAuditDrillDown( entry: SessionContinuityAuditEntry ): string[] { - const lines = [`Evidence: ${formatEvidenceCounts(entry)}`]; + const warnings = normalizeSessionContinuityWarnings(entry.warnings); + const lines = [ + `Confidence: ${normalizeSessionContinuityConfidence(entry.confidence, warnings, entry.fallbackReason)}`, + `Evidence: ${formatEvidenceCounts(entry)}` + ]; + + if (warnings.length > 0) { + lines.push("Warnings:", ...warnings.map((warning) => `- ${warning}`)); + } if (entry.writtenPaths.length === 0) { lines.push("Written paths: none"); @@ -141,18 +181,36 @@ export function formatSessionContinuityAuditDrillDown( export function toSessionContinuityDiagnostics( entry: SessionContinuityAuditEntry ): SessionContinuityDiagnostics { + const warnings = normalizeSessionContinuityWarnings(entry.warnings); return { generatedAt: entry.generatedAt, rolloutPath: entry.rolloutPath, sourceSessionId: entry.sourceSessionId, preferredPath: entry.preferredPath, actualPath: entry.actualPath, + confidence: normalizeSessionContinuityConfidence(entry.confidence, warnings, entry.fallbackReason), + warnings, fallbackReason: entry.fallbackReason, codexExitCode: entry.codexExitCode, evidenceCounts: entry.evidenceCounts }; } +export function normalizeContinuityRecoveryRecord( + record: ContinuityRecoveryRecord +): ContinuityRecoveryRecord { + const 
warnings = normalizeSessionContinuityWarnings(record.warnings); + return { + ...record, + confidence: normalizeSessionContinuityConfidence( + record.confidence, + warnings, + record.fallbackReason + ), + warnings + }; +} + export function normalizeSessionContinuityAuditTrigger( trigger?: SessionContinuityAuditTrigger ): SessionContinuityAuditTrigger | "legacy" { @@ -190,6 +248,8 @@ export function buildSessionContinuityAuditEntry( sourceSessionId: diagnostics.sourceSessionId, preferredPath: diagnostics.preferredPath, actualPath: diagnostics.actualPath, + confidence: diagnostics.confidence, + warnings: diagnostics.warnings, fallbackReason: diagnostics.fallbackReason, codexExitCode: diagnostics.codexExitCode, evidenceCounts: diagnostics.evidenceCounts, diff --git a/src/lib/domain/sync-service.ts b/src/lib/domain/sync-service.ts index 3bffea5..4135aa0 100644 --- a/src/lib/domain/sync-service.ts +++ b/src/lib/domain/sync-service.ts @@ -3,6 +3,7 @@ import path from "node:path"; import { fileURLToPath } from "node:url"; import type { AppConfig, + MemoryConflictCandidate, MemoryEntry, MemoryOperation, ProcessedRolloutIdentity, @@ -13,6 +14,7 @@ import type { import { MemoryStore } from "./memory-store.js"; import { HeuristicExtractor } from "../extractor/heuristic-extractor.js"; import { CodexExtractor } from "../extractor/codex-extractor.js"; +import { reviewExtractedMemoryOperations } from "../extractor/contradiction-review.js"; import { filterMemoryOperations } from "../extractor/safety.js"; import type { MemoryExtractorAdapter } from "../runtime/contracts.js"; import { RolloutSessionSource } from "../runtime/rollout-session-source.js"; @@ -119,9 +121,11 @@ export class SyncService { ]; const extraction = await this.extractOperations(evidence, existingEntries); - const applied = await this.store.applyOperations( - filterMemoryOperations(extraction.operations) + const reviewedOperations = reviewExtractedMemoryOperations( + filterMemoryOperations(extraction.operations), 
+ existingEntries ); + const applied = await this.store.applyOperations(reviewedOperations.operations); const status = applied.length === 0 ? "no-op" : "applied"; const auditEntry = buildMemorySyncAuditEntry({ project: this.project, @@ -133,6 +137,8 @@ export class SyncService { actualExtractorName: extraction.actualExtractorName, sessionSource: this.sessionSource.name, status, + suppressedOperationCount: reviewedOperations.suppressedOperationCount, + conflicts: reviewedOperations.conflicts, operations: applied, ...(isRecovery ? { isRecovery: true } : {}) }); @@ -147,7 +153,9 @@ export class SyncService { actualExtractorName: extraction.actualExtractorName, status, appliedCount: auditEntry.appliedCount, + suppressedOperationCount: auditEntry.suppressedOperationCount ?? 0, scopesTouched: auditEntry.scopesTouched, + conflicts: auditEntry.conflicts ?? [], failedStage: "audit-write", failureMessage: errorMessage(error), auditEntryWritten: false @@ -165,7 +173,9 @@ export class SyncService { actualExtractorName: extraction.actualExtractorName, status, appliedCount: auditEntry.appliedCount, + suppressedOperationCount: auditEntry.suppressedOperationCount ?? 0, scopesTouched: auditEntry.scopesTouched, + conflicts: auditEntry.conflicts ?? 
[], failedStage: "processed-state-write", failureMessage: errorMessage(error), auditEntryWritten: true @@ -247,7 +257,9 @@ export class SyncService { actualExtractorName: string; status: "applied" | "no-op"; appliedCount: number; + suppressedOperationCount: number; scopesTouched: MemoryOperation["scope"][]; + conflicts: MemoryConflictCandidate[]; failedStage: "audit-write" | "processed-state-write"; failureMessage: string; auditEntryWritten: boolean; @@ -265,7 +277,9 @@ export class SyncService { actualExtractorName: options.actualExtractorName, status: options.status, appliedCount: options.appliedCount, + suppressedOperationCount: options.suppressedOperationCount, scopesTouched: options.scopesTouched, + conflicts: options.conflicts, failedStage: options.failedStage, failureMessage: options.failureMessage, auditEntryWritten: options.auditEntryWritten diff --git a/src/lib/extractor/contradiction-review.ts b/src/lib/extractor/contradiction-review.ts new file mode 100644 index 0000000..b477081 --- /dev/null +++ b/src/lib/extractor/contradiction-review.ts @@ -0,0 +1,354 @@ +import type { + MemoryConflictCandidate, + MemoryEntry, + MemoryOperation, + MemoryScope +} from "../types.js"; + +interface DirectiveChoice { + key: string; + value: string; +} + +interface CandidateReview { + index: number; + operation: MemoryOperation; + groupKey: string; + choices: DirectiveChoice[]; + highConfidence: boolean; +} + +export interface ReviewedMemoryOperations { + operations: MemoryOperation[]; + suppressedOperationCount: number; + conflicts: MemoryConflictCandidate[]; +} + +const reviewableTopics = new Set(["preferences", "workflow", "commands"]); +const replacementDeleteReasonPattern = /^Superseded by a newer /u; +const packageManagerValues = ["pnpm", "npm", "yarn", "bun"] as const; +const repoSearchValues = ["rg", "ripgrep", "grep"] as const; +const hedgedCorrectionPattern = + /(?:\bmaybe\b|\bperhaps\b|\bif possible\b|\bwhen possible\b|\bfor 
now\b|\bprobably\b|\busually\b|\bsometimes\b|\btry\b|\bconsider\b|\bmight\b|\bcould\b|尽量|如果可以|可能|暂时)/iu; + +function escapeRegExp(value: string): string { + return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); +} + +function buildGroupKey(scope: MemoryScope, topic: string): string { + return `${scope}::${topic}`; +} + +function isHighConfidenceReplacement(operation: MemoryOperation): boolean { + if (operation.action !== "upsert") { + return false; + } + + if (operation.reason === "Explicit user correction that should replace stale memory.") { + return !hedgedCorrectionPattern.test(operation.summary ?? ""); + } + + return false; +} + +function hasCommandReplacementDelete( + operations: MemoryOperation[], + operation: MemoryOperation +): boolean { + return operations.some( + (candidate) => + candidate.action === "delete" && + candidate.scope === operation.scope && + candidate.topic === operation.topic && + isReplacementDelete(candidate) + ); +} + +function commandSignature(command: string): string | null { + const normalized = command.toLowerCase().trim(); + if (/\b(?:pnpm|npm|bun|yarn)\s+(test|lint|build|install)\b/u.test(normalized)) { + return normalized.match(/\b(?:pnpm|npm|bun|yarn)\s+(test|lint|build|install)\b/u)?.[1] ?? null; + } + + if (/\bcargo\s+(test|build|check)\b/u.test(normalized)) { + return normalized.match(/\bcargo\s+(test|build|check)\b/u)?.[1] ?? 
null; + } + + if (/\b(?:pytest|jest|vitest|go test|dotnet test|rake)\b/u.test(normalized)) { + return "test"; + } + + if (/\b(?:tsc|vite build|next build|gradle|mvn|make)\b/u.test(normalized)) { + return "build"; + } + + return null; +} + +function extractCommandChoice(text: string): DirectiveChoice[] { + const commandMatch = text.match(/`([^`]+)`/u); + const command = commandMatch?.[1]?.trim(); + if (!command) { + return []; + } + + const signature = commandSignature(command); + if (!signature) { + return []; + } + + return [ + { + key: `command:${signature}`, + value: command.toLowerCase() + } + ]; +} + +function extractValueChoice( + text: string, + values: readonly string[], + key: string +): DirectiveChoice[] { + const normalized = text.toLowerCase(); + + for (const value of values) { + const escaped = escapeRegExp(value); + const patterns = [ + new RegExp(`\\b(?:we\\s+)?use\\s+${escaped}\\b`, "u"), + new RegExp(`\\bprefer\\s+${escaped}\\b`, "u"), + new RegExp(`\\balways\\s+use\\s+${escaped}\\b`, "u"), + new RegExp(`\\bnot\\s+[^,.]+[,,]\\s*(?:actually\\s+)?use\\s+${escaped}\\b`, "u"), + new RegExp(`(?:使用|用|优先用|优先使用)\\s*${escaped}\\b`, "u"), + new RegExp(`(?:别用|不要用)[^,,。.;;]*[,,]\\s*用\\s*${escaped}\\b`, "u") + ]; + + if (patterns.some((pattern) => pattern.test(normalized))) { + return [ + { + key, + value + } + ]; + } + } + + return []; +} + +function extractDirectiveChoices(operation: MemoryOperation): DirectiveChoice[] { + if (operation.action !== "upsert" || !operation.summary) { + return []; + } + + if (operation.topic === "commands") { + return extractCommandChoice(operation.summary); + } + + return [ + ...extractValueChoice(operation.summary, packageManagerValues, "package-manager"), + ...extractValueChoice(operation.summary, repoSearchValues, "repo-search") + ]; +} + +function choicesConflict(left: DirectiveChoice[], right: DirectiveChoice[]): boolean { + return left.some((leftChoice) => + right.some( + (rightChoice) => + leftChoice.key === 
rightChoice.key && leftChoice.value !== rightChoice.value + ) + ); +} + +function buildConflictCandidate( + operation: MemoryOperation, + source: MemoryConflictCandidate["source"], + conflictsWith: string[] +): MemoryConflictCandidate | null { + if (operation.action !== "upsert" || !operation.summary || conflictsWith.length === 0) { + return null; + } + + return { + scope: operation.scope, + topic: operation.topic, + candidateSummary: operation.summary, + conflictsWith, + source, + resolution: "suppressed" + }; +} + +function findPreferredWithinRolloutWinner( + review: CandidateReview, + conflictingReviews: CandidateReview[] +): CandidateReview | null { + const highConfidenceReviews = [review, ...conflictingReviews] + .filter((candidate) => candidate.highConfidence) + .sort((left, right) => right.index - left.index); + + return highConfidenceReviews[0] ?? null; +} + +function hasRetainedHighConfidenceCandidate( + reviews: CandidateReview[], + retainedIndices: Set<number>, + groupKey: string +): boolean { + return reviews.some( + (review) => + review.groupKey === groupKey && + review.highConfidence && + retainedIndices.has(review.index) + ); +} + +function isReplacementDelete(operation: MemoryOperation): boolean { + return ( + operation.action === "delete" && + typeof operation.reason === "string" && + replacementDeleteReasonPattern.test(operation.reason) + ); +} + +export function reviewExtractedMemoryOperations( + operations: MemoryOperation[], + existingEntries: MemoryEntry[] +): ReviewedMemoryOperations { + const reviews = operations + .map((operation, index): CandidateReview | null => { + if ( + operation.action !== "upsert" || + !operation.summary || + !reviewableTopics.has(operation.topic) + ) { + return null; + } + + const choices = extractDirectiveChoices(operation); + if (choices.length === 0) { + return null; + } + + return { + index, + operation, + groupKey: buildGroupKey(operation.scope, operation.topic), + choices, + highConfidence: +
isHighConfidenceReplacement(operation) || + (operation.topic === "commands" && hasCommandReplacementDelete(operations, operation)) + }; + }) + .filter((review): review is CandidateReview => Boolean(review)); + + if (reviews.length === 0) { + return { + operations, + suppressedOperationCount: 0, + conflicts: [] + }; + } + + const suppressedIndices = new Set<number>(); + const retainedIndices = new Set(reviews.map((review) => review.index)); + const conflicts: MemoryConflictCandidate[] = []; + + for (const review of reviews) { + const conflictingReviews = reviews + .filter( + (candidate) => + candidate.index !== review.index && + candidate.groupKey === review.groupKey && + choicesConflict(review.choices, candidate.choices) + ); + const conflictingCandidates = conflictingReviews + .map((candidate) => candidate.operation.summary) + .filter((summary): summary is string => typeof summary === "string"); + const preferredWithinRolloutWinner = findPreferredWithinRolloutWinner( + review, + conflictingReviews + ); + + const conflictingExisting = existingEntries + .filter( + (entry) => + entry.scope === review.operation.scope && + entry.topic === review.operation.topic && + choicesConflict(review.choices, extractDirectiveChoices({ + action: "upsert", + scope: entry.scope, + topic: entry.topic, + id: entry.id, + summary: entry.summary, + details: entry.details, + sources: entry.sources, + reason: entry.reason + })) + ) + .map((entry) => entry.summary); + + const shouldSuppressForWithinRollout = + conflictingReviews.length > 0 && + (preferredWithinRolloutWinner + ?
preferredWithinRolloutWinner.index !== review.index + : true); + const hasExistingConflict = conflictingExisting.length > 0; + const shouldSuppress = + shouldSuppressForWithinRollout || (hasExistingConflict && !review.highConfidence); + + if (!shouldSuppress) { + continue; + } + + suppressedIndices.add(review.index); + retainedIndices.delete(review.index); + + const withinRolloutConflict = buildConflictCandidate( + review.operation, + "within-rollout", + conflictingCandidates + ); + if (withinRolloutConflict) { + conflicts.push(withinRolloutConflict); + } + + const existingMemoryConflict = buildConflictCandidate( + review.operation, + "existing-memory", + conflictingExisting + ); + if (existingMemoryConflict) { + conflicts.push(existingMemoryConflict); + } + } + + const groupsNeedingDeleteSuppression = new Set<string>(); + for (const review of reviews) { + if (!suppressedIndices.has(review.index)) { + continue; + } + + if (!hasRetainedHighConfidenceCandidate(reviews, retainedIndices, review.groupKey)) { + groupsNeedingDeleteSuppression.add(review.groupKey); + } + } + + const keptOperations = operations.filter((operation, index) => { + if (suppressedIndices.has(index)) { + return false; + } + + if (!isReplacementDelete(operation)) { + return true; + } + + return !groupsNeedingDeleteSuppression.has(buildGroupKey(operation.scope, operation.topic)); + }); + + return { + operations: keptOperations, + suppressedOperationCount: operations.length - keptOperations.length, + conflicts + }; +} diff --git a/src/lib/extractor/session-continuity-evidence.ts b/src/lib/extractor/session-continuity-evidence.ts index 4e48eea..98dfcf4 100644 --- a/src/lib/extractor/session-continuity-evidence.ts +++ b/src/lib/extractor/session-continuity-evidence.ts @@ -42,6 +42,11 @@ const PROGRESS_NARRATION_PATTERNS = [ /^(?:我会|我们会|我先|我将|下面我会|现在我会|随后我会|最后我再|接下来我会|我会做)/u ]; +const packageManagerValues = ["pnpm", "npm", "yarn", "bun"] as const; +const repoSearchValues = ["rg", "ripgrep", "grep"] as const;
+const hedgedDirectivePattern = + /(?:\bmaybe\b|\bperhaps\b|\bif possible\b|\bwhen possible\b|\bfor now\b|\bprobably\b|\busually\b|\bsometimes\b|\btry\b|\bconsider\b|\bmight\b|\bcould\b|尽量|如果可以|可能|暂时)/iu; + function isPromptLikeContinuityMessage(text: string): boolean { return CONTINUITY_PROMPT_PATTERNS.some((pattern) => pattern.test(text)); } @@ -76,6 +81,7 @@ export interface SessionContinuityEvidenceBuckets { detectedFileWrites: string[]; explicitNextSteps: string[]; explicitUntried: string[]; + warningHints: string[]; } export function normalizeMessage(message: string, maxLength = 240): string { @@ -211,6 +217,126 @@ export function summarizeFileWrite(toolCall: RolloutToolCall): string | null { return `File modified: ${trimText(basename, 120)}`; } +function escapeRegExp(value: string): string { + return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); +} + +interface DirectiveSignal { + key: string; + value: string; + authoritative: boolean; +} + +function extractDirectiveChoice( + text: string, + values: readonly string[], + key: string +): DirectiveSignal | null { + const normalized = text.toLowerCase(); + const hedged = hedgedDirectivePattern.test(text); + + for (const value of values) { + const escaped = escapeRegExp(value); + const authoritativePatterns = [ + new RegExp(`\\b(?:actually\\s+)?use\\s+${escaped}\\s*,\\s*not\\s+[^,.]+`, "iu"), + new RegExp(`\\b(?:actually\\s+)?use\\s+${escaped}\\s+instead of\\s+[^,.]+`, "iu"), + new RegExp(`\\bprefer\\s+${escaped}\\s+over\\s+[^,.]+`, "iu"), + new RegExp(`\\bnot\\s+[^,.]+[,,]\\s*(?:actually\\s+)?use\\s+${escaped}\\b`, "iu"), + new RegExp(`我们用\\s*${escaped}\\s*[,,]\\s*不用\\s*.+`, "u"), + new RegExp(`(?:别用|不要用).+[,,]\\s*用\\s*${escaped}\\b`, "u"), + new RegExp(`实际上用\\s*${escaped}\\s*[,,]\\s*不要用\\s*.+`, "u") + ]; + const genericPatterns = [ + new RegExp(`\\b(?:we\\s+)?use\\s+${escaped}\\b`, "iu"), + new RegExp(`\\bprefer\\s+${escaped}\\b`, "iu"), + new RegExp(`\\balways\\s+use\\s+${escaped}\\b`, "iu"), + new 
RegExp(`(?:使用|用|优先用|优先使用)\\s*${escaped}\\b`, "u") + ]; + + if (!hedged && authoritativePatterns.some((pattern) => pattern.test(text))) { + return { + key, + value, + authoritative: true + }; + } + + if (genericPatterns.some((pattern) => pattern.test(normalized))) { + return { + key, + value, + authoritative: false + }; + } + } + + return null; +} + +function collectWarningHints(agentMessages: string[], userMessages: string[]): string[] { + const warnings = new Set<string>(); + const directiveValues = new Map<string, Set<string>>(); + let promptNoiseDetected = false; + + const applySignal = (signal: DirectiveSignal) => { + const values = directiveValues.get(signal.key) ?? new Set<string>(); + if (signal.authoritative) { + values.clear(); + } + values.add(signal.value); + directiveValues.set(signal.key, values); + }; + + for (const message of agentMessages) { + if (isPromptLikeContinuityMessage(message)) { + promptNoiseDetected = true; + continue; + } + const choices = [ + extractDirectiveChoice(message, packageManagerValues, "package-manager"), + extractDirectiveChoice(message, repoSearchValues, "repo-search") + ].filter((choice): choice is DirectiveSignal => Boolean(choice)); + + for (const choice of choices) { + applySignal(choice); + } + } + + for (const message of userMessages) { + if (isPromptLikeContinuityMessage(message)) { + promptNoiseDetected = true; + continue; + } + + const choices = [ + extractDirectiveChoice(message, packageManagerValues, "package-manager"), + extractDirectiveChoice(message, repoSearchValues, "repo-search") + ].filter((choice): choice is DirectiveSignal => Boolean(choice)); + + for (const choice of choices) { + applySignal(choice); + } + } + + if (promptNoiseDetected) { + warnings.add( + "Reviewer or subagent prompt noise was detected in the rollout; continuity extraction ignored non-product transcript lines."
+ ); + } + + for (const [key, values] of directiveValues) { + if (values.size <= 1) { + continue; + } + + warnings.add( + `Conflicting ${key.replace(/-/g, " ")} signals were detected in the rollout; verify the current preference before trusting this continuity summary.` + ); + } + + return [...warnings]; +} + export function collectSessionContinuityEvidenceBuckets( evidence: RolloutEvidence ): SessionContinuityEvidenceBuckets { @@ -247,7 +373,8 @@ export function collectSessionContinuityEvidenceBuckets( recentFailedCommands, detectedFileWrites, explicitNextSteps: extractPatternMatches(recentMessagesReversed, NEXT_STEP_PATTERNS, 4), - explicitUntried: extractPatternMatches(recentMessagesReversed, UNTRIED_PATTERNS, 6) + explicitUntried: extractPatternMatches(recentMessagesReversed, UNTRIED_PATTERNS, 6), + warningHints: collectWarningHints(recentAgentMessages, recentUserMessages) }; } diff --git a/src/lib/extractor/session-continuity-prompt.ts b/src/lib/extractor/session-continuity-prompt.ts index af3a505..a5bc1fa 100644 --- a/src/lib/extractor/session-continuity-prompt.ts +++ b/src/lib/extractor/session-continuity-prompt.ts @@ -68,7 +68,9 @@ function formatEvidenceBuckets(buckets: SessionContinuityEvidenceBuckets): strin "", formatBucket("Candidate explicit next-step phrases", buckets.explicitNextSteps), "", - formatBucket("Candidate explicit untried phrases", buckets.explicitUntried) + formatBucket("Candidate explicit untried phrases", buckets.explicitUntried), + "", + formatBucket("Reviewer warning hints", buckets.warningHints) ].join("\n"); } @@ -120,6 +122,7 @@ Important product rule: - Put project-wide prerequisites or decisions in project. - Do not guess untried options; only include them when the rollout explicitly suggests them. - Do not mark something as confirmed working unless there is concrete evidence in tool output or clear confirmation in the conversation. +- Reviewer warning hints are reviewer-only confidence context. 
Do not copy those warning phrases into project or projectLocal continuity items. Current rollout: - Session id: ${evidence.sessionId} @@ -144,11 +147,12 @@ ${formatEvidenceBuckets( buckets ?? { recentSuccessfulCommands: [], recentFailedCommands: [], - detectedFileWrites: [], - explicitNextSteps: [], - explicitUntried: [] - } -)} + detectedFileWrites: [], + explicitNextSteps: [], + explicitUntried: [], + warningHints: [] + } + )} Return JSON only, matching the provided schema. `.trim(); diff --git a/src/lib/extractor/session-continuity-summarizer.ts b/src/lib/extractor/session-continuity-summarizer.ts index 7d6c0d8..1cdb4ad 100644 --- a/src/lib/extractor/session-continuity-summarizer.ts +++ b/src/lib/extractor/session-continuity-summarizer.ts @@ -7,6 +7,7 @@ import type { AppConfig, ExistingSessionContinuityState, RolloutEvidence, + SessionContinuityConfidence, SessionContinuityDiagnostics, SessionContinuityGenerationResult, SessionContinuityLayerSummary, @@ -87,6 +88,12 @@ const layerKeys = [ "filesDecisionsEnvironment" ] satisfies Array; +const REVIEWER_WARNING_PATTERNS = [ + /\breviewer or subagent prompt noise\b/iu, + /\bconflicting .+ signals were detected in the rollout\b/iu, + /\bverify the current preference before trusting this continuity summary\b/iu +]; + function isStringArray(value: unknown): value is string[] { return Array.isArray(value) && value.every((item) => typeof item === "string"); } @@ -136,11 +143,85 @@ function shouldFallbackForLowSignal( return hasEvidenceBuckets(buckets) && !hasEvidenceBearingContent(summary); } +function normalizeWarningComparableText(value: string): string { + return value.replace(/\s+/g, " ").trim().toLowerCase(); +} + +function shouldStripReviewerWarningProse(item: string, warningHints: string[]): boolean { + const normalizedItem = normalizeWarningComparableText(item); + if (!normalizedItem) { + return false; + } + + if (REVIEWER_WARNING_PATTERNS.some((pattern) => pattern.test(normalizedItem))) { + return true; + } + + 
return warningHints.some((hint) => { + const normalizedHint = normalizeWarningComparableText(hint); + return ( + normalizedHint.length > 0 && + (normalizedItem === normalizedHint || + normalizedItem.includes(normalizedHint) || + normalizedHint.includes(normalizedItem)) + ); + }); +} + +function scrubReviewerWarningProseFromLayer( + layer: SessionContinuityLayerSummary, + warningHints: string[] +): SessionContinuityLayerSummary { + const stripItems = (items: string[]) => + items.filter((item) => !shouldStripReviewerWarningProse(item, warningHints)); + + return { + goal: shouldStripReviewerWarningProse(layer.goal, warningHints) ? "" : layer.goal, + confirmedWorking: stripItems(layer.confirmedWorking), + triedAndFailed: stripItems(layer.triedAndFailed), + notYetTried: stripItems(layer.notYetTried), + incompleteNext: stripItems(layer.incompleteNext), + filesDecisionsEnvironment: stripItems(layer.filesDecisionsEnvironment) + }; +} + +function scrubReviewerWarningProse( + summary: SessionContinuitySummary, + warningHints: string[] +): SessionContinuitySummary { + if (warningHints.length === 0) { + return summary; + } + + return { + ...summary, + project: scrubReviewerWarningProseFromLayer(summary.project, warningHints), + projectLocal: scrubReviewerWarningProseFromLayer(summary.projectLocal, warningHints) + }; +} + +function determineConfidence( + actualPath: SessionContinuityDiagnostics["actualPath"], + warnings: string[], + fallbackReason?: SessionContinuityDiagnostics["fallbackReason"], + usedFallbackNext = false +): SessionContinuityConfidence { + if (fallbackReason || warnings.length > 0 || usedFallbackNext) { + return "low"; + } + + if (actualPath === "codex") { + return "high"; + } + + return "medium"; +} + function heuristicSummary( evidence: RolloutEvidence, existingState?: ExistingSessionContinuityState, buckets = collectSessionContinuityEvidenceBuckets(evidence) -): SessionContinuitySummary { +): { summary: SessionContinuitySummary; usedFallbackNext: boolean } 
{ const recentUserMessages = evidence.userMessages.map((message) => trimText(message, 240)); const recentAgentMessages = evidence.agentMessages.map((message) => trimText(message, 240)); const recentMessages = [...recentAgentMessages.slice(-10), ...recentUserMessages.slice(-10)]; @@ -167,23 +248,26 @@ function heuristicSummary( const sharedGoal = recentUserMessages.at(-1) ?? existingProject?.goal ?? existingLocal?.goal ?? ""; return { - sourceSessionId: evidence.sessionId, - project: buildLayerSummary(existingProject, { - goal: sharedGoal, - confirmedWorking: buckets.recentSuccessfulCommands, - triedAndFailed: buckets.recentFailedCommands, - notYetTried: projectUntried, - filesDecisionsEnvironment: notes.project - }), - projectLocal: buildLayerSummary(existingLocal, { - goal: "", - notYetTried: localUntried, - incompleteNext: fallbackNext, - filesDecisionsEnvironment: [ - ...buckets.detectedFileWrites, - ...notes.projectLocal - ] - }) + summary: { + sourceSessionId: evidence.sessionId, + project: buildLayerSummary(existingProject, { + goal: sharedGoal, + confirmedWorking: buckets.recentSuccessfulCommands, + triedAndFailed: buckets.recentFailedCommands, + notYetTried: projectUntried, + filesDecisionsEnvironment: notes.project + }), + projectLocal: buildLayerSummary(existingLocal, { + goal: "", + notYetTried: localUntried, + incompleteNext: fallbackNext, + filesDecisionsEnvironment: [ + ...buckets.detectedFileWrites, + ...notes.projectLocal + ] + }) + }, + usedFallbackNext: nextSteps.length === 0 && fallbackNext.length > 0 }; } @@ -193,14 +277,30 @@ function buildDiagnostics( actualPath: SessionContinuityDiagnostics["actualPath"], buckets: SessionContinuityEvidenceBuckets, fallbackReason?: SessionContinuityDiagnostics["fallbackReason"], - codexExitCode?: number + codexExitCode?: number, + warnings: string[] = [], + usedFallbackNext = false ): SessionContinuityDiagnostics { + const normalizedWarnings = [...new Set(warnings)]; + if (usedFallbackNext) { + 
normalizedWarnings.push( + "Next steps were inferred from the latest request because the rollout did not contain an explicit next-step phrase." + ); + } + return { generatedAt: new Date().toISOString(), rolloutPath: evidence.rolloutPath, sourceSessionId: evidence.sessionId, preferredPath, actualPath, + confidence: determineConfidence( + actualPath, + normalizedWarnings, + fallbackReason, + usedFallbackNext + ), + warnings: normalizedWarnings, fallbackReason, codexExitCode, evidenceCounts: buildSessionContinuityEvidenceCounts(buckets) @@ -236,14 +336,18 @@ ): Promise<SessionContinuityGenerationResult> { const buckets = collectSessionContinuityEvidenceBuckets(evidence); if (this.config.extractorMode !== "codex") { + const heuristic = heuristicSummary(evidence, existingState, buckets); return { - summary: heuristicSummary(evidence, existingState, buckets), + summary: heuristic.summary, diagnostics: buildDiagnostics( evidence, "heuristic", "heuristic", buckets, - "configured-heuristic" + "configured-heuristic", + undefined, + buckets.warningHints, + heuristic.usedFallbackNext ) }; } @@ -258,20 +362,24 @@ "codex", buckets, undefined, - attempt.codexExitCode + attempt.codexExitCode, + buckets.warningHints ) }; } + const heuristic = heuristicSummary(evidence, existingState, buckets); return { - summary: heuristicSummary(evidence, existingState, buckets), + summary: heuristic.summary, diagnostics: buildDiagnostics( evidence, "codex", "heuristic", buckets, attempt.fallbackReason, - attempt.codexExitCode + attempt.codexExitCode, + buckets.warningHints, + heuristic.usedFallbackNext ) }; } @@ -329,10 +437,13 @@ }; } - const summary: SessionContinuitySummary = { - ...parsed, - sourceSessionId: parsed.sourceSessionId ?? evidence.sessionId - }; + const summary = scrubReviewerWarningProse( + { + ...parsed, + sourceSessionId: parsed.sourceSessionId ?? 
evidence.sessionId + }, + buckets.warningHints + ); if (shouldFallbackForLowSignal(summary, buckets)) { return { summary: null, diff --git a/src/lib/types.ts b/src/lib/types.ts index debad4d..2565c34 100644 --- a/src/lib/types.ts +++ b/src/lib/types.ts @@ -29,6 +29,19 @@ export interface MemoryOperation { reason?: string; } +export type MemoryConflictSource = "within-rollout" | "existing-memory"; + +export type MemoryConflictResolution = "suppressed"; + +export interface MemoryConflictCandidate { + scope: MemoryScope; + topic: string; + candidateSummary: string; + conflictsWith: string[]; + source: MemoryConflictSource; + resolution: MemoryConflictResolution; +} + export interface CompiledStartupMemory { text: string; lineCount: number; @@ -157,6 +170,8 @@ export interface SessionContinuitySummary { export type SessionContinuityExtractorPath = "codex" | "heuristic"; +export type SessionContinuityConfidence = "high" | "medium" | "low"; + export type SessionContinuityFallbackReason = | "codex-command-failed" | "invalid-json" @@ -178,6 +193,8 @@ export interface SessionContinuityDiagnostics { sourceSessionId: string; preferredPath: SessionContinuityExtractorPath; actualPath: SessionContinuityExtractorPath; + confidence: SessionContinuityConfidence; + warnings: string[]; fallbackReason?: SessionContinuityFallbackReason; codexExitCode?: number; evidenceCounts: SessionContinuityEvidenceCounts; @@ -200,6 +217,8 @@ export interface SessionContinuityAuditEntry { sourceSessionId: string; preferredPath: SessionContinuityExtractorPath; actualPath: SessionContinuityExtractorPath; + confidence?: SessionContinuityConfidence; + warnings?: string[]; fallbackReason?: SessionContinuityFallbackReason; codexExitCode?: number; evidenceCounts: SessionContinuityEvidenceCounts; @@ -244,8 +263,10 @@ export interface MemorySyncAuditEntry { skipReason?: MemorySyncAuditSkipReason; isRecovery?: boolean; appliedCount: number; + suppressedOperationCount?: number; scopesTouched: MemoryScope[]; 
resultSummary: string; + conflicts?: MemoryConflictCandidate[]; operations: MemoryOperation[]; } @@ -263,7 +284,9 @@ actualExtractorName: string; status: "applied" | "no-op"; appliedCount: number; + suppressedOperationCount?: number; scopesTouched: MemoryScope[]; + conflicts?: MemoryConflictCandidate[]; failedStage: SyncRecoveryFailedStage; failureMessage: string; auditEntryWritten: boolean; @@ -348,6 +371,8 @@ writtenPaths: string[]; preferredPath: SessionContinuityExtractorPath; actualPath: SessionContinuityExtractorPath; + confidence?: SessionContinuityConfidence; + warnings?: string[]; fallbackReason?: SessionContinuityFallbackReason; codexExitCode?: number; evidenceCounts: SessionContinuityEvidenceCounts; diff --git a/test/docs-contract.test.ts b/test/docs-contract.test.ts new file mode 100644 index 0000000..1bac61d --- /dev/null +++ b/test/docs-contract.test.ts @@ -0,0 +1,58 @@ +import fs from "node:fs/promises"; +import path from "node:path"; +import { describe, expect, it } from "vitest"; + +async function readDoc(relativePath: string): Promise<string> { + return fs.readFile(path.join(process.cwd(), relativePath), "utf8"); +} + +describe("docs contract", () => { + it("keeps the public reviewer command surface and deterministic verification entry points documented", async () => { + const readme = await readDoc("README.md"); + const readmeEn = await readDoc("README.en.md"); + const releaseChecklist = await readDoc("docs/release-checklist.md"); + const contributing = await readDoc("CONTRIBUTING.md"); + + expect(readme).toContain("cam memory"); + expect(readme).toContain("cam session status"); + expect(readme).toContain("cam session refresh"); + expect(readme).toContain("reviewer warning prose"); + expect(readmeEn).toContain("cam memory"); + expect(readmeEn).toContain("cam session status"); + expect(readmeEn).toContain("confidence"); + expect(readmeEn).toContain("deterministic scrub"); + 
expect(releaseChecklist).toContain("pnpm exec tsx src/cli.ts audit"); + expect(releaseChecklist).toContain("pnpm test:docs-contract"); + expect(releaseChecklist).toContain("pnpm test:reviewer-smoke"); + expect(releaseChecklist).toContain("pnpm test:cli-smoke"); + expect(releaseChecklist).toContain("pnpm pack:check"); + expect(releaseChecklist).toContain("pnpm exec tsx src/cli.ts session refresh --json"); + expect(releaseChecklist).toContain("pnpm exec tsx src/cli.ts session load --json"); + expect(releaseChecklist).toContain("pnpm exec tsx src/cli.ts session status --json"); + expect(contributing).toContain("reviewer-only warnings"); + expect(contributing).toContain("pnpm test:docs-contract"); + }); + + it("keeps continuity, architecture, and migration wording aligned with the current product posture", async () => { + const continuityDoc = await readDoc("docs/session-continuity.md"); + const nativeMigrationDoc = await readDoc("docs/native-migration.md"); + const architecture = await readDoc("docs/architecture.md"); + const architectureEn = await readDoc("docs/architecture.en.md"); + const readme = await readDoc("README.md"); + const readmeEn = await readDoc("README.en.md"); + + expect(continuityDoc).toContain("save` keeps merge semantics"); + expect(continuityDoc).toContain("refresh` ignores existing continuity"); + expect(continuityDoc).toContain("pending continuity recovery marker"); + expect(continuityDoc).toContain("**not** written into the continuity Markdown files themselves"); + expect(continuityDoc).toContain("reviewer/debug data belongs in an audit surface"); + expect(nativeMigrationDoc).toContain("companion-first"); + expect(nativeMigrationDoc).toContain("trusted primary path"); + expect(architecture).toContain("reviewer warning / confidence 属于 audit side metadata"); + expect(architecture).toContain("startup provenance 只列出这次注入时真实读取到的 continuity 文件"); + expect(architectureEn).toContain("reviewer warnings and confidence remain audit-side metadata"); + 
expect(architectureEn).toContain("startup provenance only lists continuity files that were actually read"); + expect(readme).toContain("companion-first"); + expect(readmeEn).toContain("companion-first"); + }); +}); diff --git a/test/extractor.test.ts b/test/extractor.test.ts index 31991cc..78c8f54 100644 --- a/test/extractor.test.ts +++ b/test/extractor.test.ts @@ -4,6 +4,7 @@ import path from "node:path"; import { afterEach, describe, expect, it } from "vitest"; import { parseRolloutEvidence } from "../src/lib/domain/rollout.js"; import { CodexExtractor } from "../src/lib/extractor/codex-extractor.js"; +import { reviewExtractedMemoryOperations } from "../src/lib/extractor/contradiction-review.js"; import { HeuristicExtractor } from "../src/lib/extractor/heuristic-extractor.js"; import { filterMemoryOperations } from "../src/lib/extractor/safety.js"; import type { MemoryEntry, RolloutEvidence } from "../src/lib/types.js"; @@ -463,6 +464,137 @@ describe("HeuristicExtractor", () => { ) ).toBe(false); }); + + it("extracts conflicting same-rollout preference candidates from a real fixture for reviewer-side suppression", async () => { + const extractor = new HeuristicExtractor(); + const evidence = await parseRolloutEvidence( + path.join(process.cwd(), "test/fixtures/rollouts/within-rollout-preference-conflict.jsonl") + ); + + expect(evidence).not.toBeNull(); + + const operations = await extractor.extract(evidence!, []); + const upserts = operations.filter((operation) => operation.action === "upsert"); + + expect(upserts).toHaveLength(2); + expect(upserts.map((operation) => operation.summary)).toEqual( + expect.arrayContaining([ + "we use pnpm in this repository", + "we use bun in this repository" + ]) + ); + }); + + it("treats mixed-language explicit corrections in noisy rollouts as high-confidence replacements", async () => { + const extractor = new HeuristicExtractor(); + const evidence = await parseRolloutEvidence( + path.join(process.cwd(), 
"test/fixtures/rollouts/mixed-language-reviewer-noise.jsonl") + ); + + expect(evidence).not.toBeNull(); + + const operations = await extractor.extract(evidence!, [ + { + id: "use-bun", + scope: "project", + topic: "preferences", + summary: "Use bun in this repository.", + details: ["Use bun instead of pnpm in this repository."], + updatedAt: "2026-03-14T00:00:00.000Z", + sources: ["old"] + } + ]); + + expect( + operations.some((operation) => operation.action === "delete" && operation.id === "use-bun") + ).toBe(true); + expect( + operations.some( + (operation) => + operation.action === "upsert" && + operation.summary?.includes("实际上用 pnpm,不要用 bun") + ) + ).toBe(true); + expect( + operations.some( + (operation) => + operation.action === "upsert" && + /reviewer sub-agent|cookie middleware/u.test(operation.summary ?? "") + ) + ).toBe(false); + }); + + it("keeps a hedged conflict candidate available for later conservative suppression", async () => { + const extractor = new HeuristicExtractor(); + const evidence = await parseRolloutEvidence( + path.join(process.cwd(), "test/fixtures/rollouts/hedged-preference-conflict.jsonl") + ); + + expect(evidence).not.toBeNull(); + + const operations = await extractor.extract(evidence!, [ + { + id: "use-pnpm", + scope: "project", + topic: "preferences", + summary: "Use pnpm in this repository.", + details: ["Use pnpm instead of npm in this repository."], + updatedAt: "2026-03-14T00:00:00.000Z", + sources: ["old"] + } + ]); + + expect( + operations.some((operation) => operation.action === "delete" && operation.id === "use-pnpm") + ).toBe(true); + expect( + operations.some( + (operation) => + operation.action === "upsert" && + operation.summary?.includes("maybe use bun instead of pnpm") + ) + ).toBe(true); + }); + + it("keeps the latest high-confidence correction and suppresses stale same-rollout candidates", async () => { + const extractor = new HeuristicExtractor(); + const operations = await extractor.extract( + baseEvidence({ + 
userMessages: [ + "remember that we use bun in this repository", + "Actually use pnpm, not bun." + ] + }), + [] + ); + + const reviewed = reviewExtractedMemoryOperations(operations, []); + + expect( + reviewed.operations.some( + (operation) => + operation.action === "upsert" && + operation.summary === "Actually use pnpm, not bun" + ) + ).toBe(true); + expect( + reviewed.operations.some( + (operation) => + operation.action === "upsert" && + operation.summary === "we use bun in this repository" + ) + ).toBe(false); + expect(reviewed.suppressedOperationCount).toBe(1); + expect(reviewed.conflicts).toEqual( + expect.arrayContaining([ + expect.objectContaining({ + source: "within-rollout", + candidateSummary: "we use bun in this repository", + conflictsWith: ["Actually use pnpm, not bun"] + }) + ]) + ); + }); }); describe("safety filter", () => { diff --git a/test/fixtures/rollouts/hedged-preference-conflict.jsonl b/test/fixtures/rollouts/hedged-preference-conflict.jsonl new file mode 100644 index 0000000..39e166b --- /dev/null +++ b/test/fixtures/rollouts/hedged-preference-conflict.jsonl @@ -0,0 +1,4 @@ +{"type":"session_meta","payload":{"id":"fixture-hedged-preference-conflict","timestamp":"2026-03-19T00:00:00.000Z","cwd":"/tmp/project"}} +{"type":"event_msg","payload":{"type":"user_message","message":"Remember that maybe use bun instead of pnpm in this repository."}} +{"type":"event_msg","payload":{"type":"agent_message","message":"The package manager note still needs reviewer confirmation before it becomes durable memory."}} + diff --git a/test/fixtures/rollouts/mixed-language-reviewer-noise.jsonl b/test/fixtures/rollouts/mixed-language-reviewer-noise.jsonl new file mode 100644 index 0000000..49ecc3a --- /dev/null +++ b/test/fixtures/rollouts/mixed-language-reviewer-noise.jsonl @@ -0,0 +1,12 @@ +{"type":"session_meta","payload":{"id":"fixture-mixed-language-reviewer-noise","timestamp":"2026-03-19T00:00:00.000Z","cwd":"/tmp/project"}} 
+{"type":"event_msg","payload":{"type":"user_message","message":"We haven't tried switching the login route to cookies() yet."}} +{"type":"event_msg","payload":{"type":"user_message","message":"下一步:更新 src/auth/login.ts,并补 cookie middleware。"}} +{"type":"event_msg","payload":{"type":"agent_message","message":"You are reviewer sub-agent 4 for a high-accountability code review. Work read-only. Focus on docs and contract surfaces only."}} +{"type":"event_msg","payload":{"type":"agent_message","message":"Use bun in this repo for faster installs."}} +{"type":"event_msg","payload":{"type":"user_message","message":"实际上用 pnpm,不要用 bun。"}} +{"type":"response_item","payload":{"type":"function_call","name":"exec_command","call_id":"call-1","arguments":"{\"cmd\":\"pnpm test\"}"}} +{"type":"response_item","payload":{"type":"function_call_output","call_id":"call-1","output":"PASS auth cookie suite\\n0 failing\\nDone in 2.3s"}} +{"type":"response_item","payload":{"type":"function_call","name":"exec_command","call_id":"call-2","arguments":"{\"cmd\":\"pnpm build\"}"}} +{"type":"response_item","payload":{"type":"function_call_output","call_id":"call-2","output":"Error: missing NEXTAUTH_URL\\nProcess exited with code 1"}} +{"type":"response_item","payload":{"type":"function_call","name":"edit_file","call_id":"call-3","arguments":"{\"path\":\"src/auth/login.ts\"}"}} +{"type":"response_item","payload":{"type":"function_call","name":"apply_patch_freeform","call_id":"call-4","arguments":"diff --git a/src/auth/login.ts b/src/auth/login.ts\nindex abc..def 100644\n--- a/src/auth/login.ts\n+++ b/src/auth/login.ts\n@@ -1,3 +1,4 @@\n+setCookie(token);\n export {};"}} diff --git a/test/fixtures/rollouts/within-rollout-preference-conflict.jsonl b/test/fixtures/rollouts/within-rollout-preference-conflict.jsonl new file mode 100644 index 0000000..7a149a2 --- /dev/null +++ b/test/fixtures/rollouts/within-rollout-preference-conflict.jsonl @@ -0,0 +1,6 @@ 
+{"type":"session_meta","payload":{"id":"fixture-within-rollout-preference-conflict","timestamp":"2026-03-19T00:00:00.000Z","cwd":"/tmp/project"}} +{"type":"event_msg","payload":{"type":"user_message","message":"Remember that we use pnpm in this repository."}} +{"type":"event_msg","payload":{"type":"user_message","message":"Remember that we use bun in this repository."}} +{"type":"event_msg","payload":{"type":"agent_message","message":"Next step: keep only one package manager choice in the docs and startup notes."}} +{"type":"response_item","payload":{"type":"function_call","name":"edit_file","call_id":"call-1","arguments":"{\"path\":\"docs/setup.md\"}"}} + diff --git a/test/memory-command.test.ts b/test/memory-command.test.ts index 287adf5..1ef6be3 100644 --- a/test/memory-command.test.ts +++ b/test/memory-command.test.ts @@ -6,6 +6,7 @@ import { runMemory } from "../src/lib/commands/memory.js"; import { configPaths } from "../src/lib/config/load-config.js"; import { detectProjectContext } from "../src/lib/domain/project-context.js"; import { MemoryStore } from "../src/lib/domain/memory-store.js"; +import { runCommandCapture } from "../src/lib/util/process.js"; import type { AppConfig, MemoryCommandOutput } from "../src/lib/types.js"; import { makeAppConfig, @@ -14,6 +15,10 @@ import { const tempDirs: string[] = []; const originalHome = process.env.HOME; +const sourceCliPath = path.resolve("src/cli.ts"); +const tsxBinaryPath = path.resolve( + process.platform === "win32" ? 
"node_modules/.bin/tsx.cmd" : "node_modules/.bin/tsx" +); async function tempDir(prefix: string): Promise { const dir = await fs.mkdtemp(path.join(os.tmpdir(), prefix)); @@ -29,6 +34,10 @@ afterEach(async () => { const buildProjectConfig = makeAppConfig; const writeProjectConfig = writeCamConfig; +function runCli(repoDir: string, args: string[]) { + return runCommandCapture(tsxBinaryPath, [sourceCliPath, ...args], repoDir); +} + describe("runMemory", () => { it("shows scope details and recent audit entries", async () => { const homeDir = await tempDir("cam-memory-home-"); @@ -70,8 +79,19 @@ describe("runMemory", () => { sessionSource: "rollout-jsonl", status: "applied", appliedCount: 1, + suppressedOperationCount: 1, scopesTouched: ["project"], resultSummary: "1 operation(s) applied", + conflicts: [ + { + scope: "project", + topic: "preferences", + candidateSummary: "Maybe use bun instead of pnpm in this repository.", + conflictsWith: ["Prefer pnpm in this repository."], + source: "existing-memory", + resolution: "suppressed" + } + ], operations: [ { action: "upsert", @@ -152,7 +172,10 @@ describe("runMemory", () => { expect(output).toContain("[skipped] Skipped rollout; it was already processed"); expect(output).toContain("Configured: codex-ephemeral (codex) -> Actual: heuristic (heuristic)"); expect(output).toContain("Skip reason: already-processed"); - expect(output).toContain("Applied: 0 | Scopes: none"); + expect(output).toContain("Applied: 0 | Suppressed: 0 | Scopes: none"); + expect(output).toContain("Suppressed: 1"); + expect(output).toContain("Conflict review:"); + expect(output).toContain("[existing-memory] preferences: Maybe use bun instead of pnpm in this repository."); }); it("adds startupFilesByScope, recentSyncAudit, and syncAuditPath in json output", async () => { @@ -195,8 +218,19 @@ describe("runMemory", () => { sessionSource: "rollout-jsonl", status: "applied", appliedCount: 1, + suppressedOperationCount: 1, scopesTouched: ["project"], 
resultSummary: "1 operation(s) applied", + conflicts: [ + { + scope: "project", + topic: "preferences", + candidateSummary: "Maybe use bun instead of pnpm in this repository.", + conflictsWith: ["Prefer pnpm in this repository."], + source: "existing-memory", + resolution: "suppressed" + } + ], operations: [ { action: "upsert", @@ -247,9 +281,18 @@ describe("runMemory", () => { rolloutPath: "/tmp/rollout-1.jsonl", status: "applied", appliedCount: 1, + suppressedOperationCount: 1, configuredExtractorMode: "codex", actualExtractorMode: "heuristic" }); + expect(output.recentSyncAudit[0]?.conflicts).toEqual( + expect.arrayContaining([ + expect.objectContaining({ + source: "existing-memory", + resolution: "suppressed" + }) + ]) + ); expect(output.recentAudit).toEqual(output.recentSyncAudit); const textOutput = await runMemory({ @@ -406,6 +449,77 @@ describe("runMemory", () => { expect(textOutput.match(/\/tmp\/rollout-repeat\.jsonl/g) ?? []).toHaveLength(1); }); + it("supports memory --recent --json and --print-startup from the CLI command surface", async () => { + const homeDir = await tempDir("cam-memory-cli-home-"); + const projectDir = await tempDir("cam-memory-cli-project-"); + const memoryRoot = await tempDir("cam-memory-cli-root-"); + process.env.HOME = homeDir; + + const projectConfig = buildProjectConfig(); + await writeProjectConfig(projectDir, projectConfig, { + autoMemoryDirectory: memoryRoot + }); + + const project = detectProjectContext(projectDir); + const store = new MemoryStore(project, { + ...projectConfig, + autoMemoryDirectory: memoryRoot + }); + await store.ensureLayout(); + await store.remember( + "project", + "workflow", + "prefer-pnpm", + "Prefer pnpm in this repository.", + ["Use pnpm instead of npm in this repository."], + "Manual note." 
+ ); + await store.appendSyncAuditEntry({ + appliedAt: "2026-03-14T12:00:00.000Z", + projectId: project.projectId, + worktreeId: project.worktreeId, + rolloutPath: "/tmp/rollout-memory-cli.jsonl", + sessionId: "session-memory-cli", + configuredExtractorMode: "heuristic", + configuredExtractorName: "heuristic", + actualExtractorMode: "heuristic", + actualExtractorName: "heuristic", + extractorMode: "heuristic", + extractorName: "heuristic", + sessionSource: "rollout-jsonl", + status: "applied", + appliedCount: 1, + scopesTouched: ["project"], + resultSummary: "1 operation(s) applied", + operations: [ + { + action: "upsert", + scope: "project", + topic: "workflow", + id: "prefer-pnpm", + summary: "Prefer pnpm in this repository.", + details: ["Use pnpm instead of npm in this repository."], + reason: "Manual note.", + sources: ["manual"] + } + ] + }); + + const jsonResult = runCli(projectDir, ["memory", "--recent", "2", "--json"]); + expect(jsonResult.exitCode).toBe(0); + const jsonOutput = JSON.parse(jsonResult.stdout) as MemoryCommandOutput; + expect(jsonOutput.recentSyncAudit).toHaveLength(1); + expect(jsonOutput.recentSyncAudit[0]?.rolloutPath).toBe("/tmp/rollout-memory-cli.jsonl"); + expect(jsonOutput.syncAuditPath).toBe(store.getSyncAuditPath()); + + const textResult = runCli(projectDir, ["memory", "--recent", "2", "--print-startup"]); + expect(textResult.exitCode).toBe(0); + expect(textResult.stdout).toContain("Startup memory:"); + expect(textResult.stdout).toContain("# Codex Auto Memory"); + expect(textResult.stdout).toContain("Recent sync events (1 grouped):"); + expect(textResult.stdout).toContain(store.getMemoryFile("project")); + }, 30_000); + it("does not report startup-loaded files when the startup budget cannot fit quoted lines", async () => { const homeDir = await tempDir("cam-memory-header-only-home-"); const projectDir = await tempDir("cam-memory-header-only-project-"); diff --git a/test/memory-sync-audit.test.ts b/test/memory-sync-audit.test.ts 
index a92248e..0e759bd 100644
--- a/test/memory-sync-audit.test.ts
+++ b/test/memory-sync-audit.test.ts
@@ -103,7 +103,7 @@ describe("memory-sync-audit", () => {
     expect(lines[0]).toContain("[skipped] [recovery]");
     expect(lines[1]).toContain("Session: unknown");
-    expect(lines[2]).toContain("Applied: 0 | Scopes: none");
+    expect(lines[2]).toContain("Applied: 0 | Suppressed: 0 | Scopes: none");
     expect(lines).toContain(
       "  Configured: codex-ephemeral (codex) -> Actual: heuristic (heuristic)"
     );
diff --git a/test/recovery-records.test.ts b/test/recovery-records.test.ts
index a28f7f1..f9c1b98 100644
--- a/test/recovery-records.test.ts
+++ b/test/recovery-records.test.ts
@@ -97,6 +97,8 @@ describe("recovery-records", () => {
       sourceSessionId: "session-1",
       preferredPath: "codex",
       actualPath: "heuristic",
+      confidence: "low",
+      warnings: ["Low-signal continuity fallback."],
       fallbackReason: "low-signal",
       codexExitCode: 17,
       evidenceCounts: {
@@ -151,6 +153,8 @@
       sourceSessionId: "session-1",
       preferredPath: "heuristic",
       actualPath: "heuristic",
+      confidence: "low",
+      warnings: [],
       fallbackReason: "configured-heuristic",
       evidenceCounts: {
         successfulCommands: 1,
diff --git a/test/session-command.test.ts b/test/session-command.test.ts
index 49fba85..1b2b9bf 100644
--- a/test/session-command.test.ts
+++ b/test/session-command.test.ts
@@ -148,6 +148,7 @@ describe("runSession", () => {
     });
     expect(saveOutput).toContain("Saved session continuity");
     expect(saveOutput).toContain("Generation: heuristic");
+    expect(saveOutput).toContain("confidence low");
     expect(saveOutput).toContain("Evidence: successful");
     expect(saveOutput).toContain("Written paths:");
@@ -162,10 +163,14 @@
       diagnostics: {
         preferredPath: string;
         actualPath: string;
+        confidence: string;
+        warnings: string[];
         fallbackReason?: string;
       };
       latestContinuityAuditEntry: {
         rolloutPath: string;
+        confidence?: string;
+        warnings?: string[];
         fallbackReason?: string;
         evidenceCounts: {
           successfulCommands: number;
@@ -184,8 +189,20 @@ describe("runSession", () => {
     };
     expect(saveJson.diagnostics.preferredPath).toBe("heuristic");
     expect(saveJson.diagnostics.actualPath).toBe("heuristic");
+    expect(saveJson.diagnostics.confidence).toBe("low");
+    expect(saveJson.diagnostics.warnings).toEqual(
+      expect.arrayContaining([
+        expect.stringContaining("Next steps were inferred from the latest request")
+      ])
+    );
     expect(saveJson.diagnostics.fallbackReason).toBe("configured-heuristic");
     expect(saveJson.latestContinuityAuditEntry?.rolloutPath).toBe(secondRolloutPath);
+    expect(saveJson.latestContinuityAuditEntry?.confidence).toBe("low");
+    expect(saveJson.latestContinuityAuditEntry?.warnings).toEqual(
+      expect.arrayContaining([
+        expect.stringContaining("Next steps were inferred from the latest request")
+      ])
+    );
     expect(saveJson.latestContinuityAuditEntry?.fallbackReason).toBe("configured-heuristic");
     expect(saveJson.latestContinuityAuditEntry?.evidenceCounts.successfulCommands).toBeGreaterThan(0);
     expect(saveJson.latestContinuityAuditEntry?.writtenPaths.length).toBeGreaterThan(0);
@@ -218,6 +235,8 @@ describe("runSession", () => {
       } | null;
       latestContinuityDiagnostics: {
         actualPath: string;
+        confidence: string;
+        warnings: string[];
         fallbackReason?: string;
       } | null;
       recentContinuityAuditEntries: Array<{
@@ -234,6 +253,12 @@
     expect(loadJson.latestContinuityAuditEntry?.writtenPaths.length).toBeGreaterThan(0);
     expect(loadJson.latestContinuityAuditEntry?.evidenceCounts.successfulCommands).toBeGreaterThan(0);
     expect(loadJson.latestContinuityDiagnostics?.actualPath).toBe("heuristic");
+    expect(loadJson.latestContinuityDiagnostics?.confidence).toBe("low");
+    expect(loadJson.latestContinuityDiagnostics?.warnings).toEqual(
+      expect.arrayContaining([
+        expect.stringContaining("Next steps were inferred from the latest request")
+      ])
+    );
     expect(loadJson.latestContinuityDiagnostics?.fallbackReason).toBe("configured-heuristic");
     expect(loadJson.recentContinuityAuditEntries).toHaveLength(2);
     expect(loadJson.recentContinuityAuditEntries[0]?.rolloutPath).toBe(secondRolloutPath);
@@ -241,6 +266,10 @@
     const loadOutput = await runSession("load", { cwd: repoDir });
     expect(loadOutput).toContain("Evidence: successful");
+    expect(loadOutput).toContain("Warnings:");
+    expect(loadOutput).toContain(
+      "Next steps were inferred from the latest request because the rollout did not contain an explicit next-step phrase."
+    );
     expect(loadOutput).toContain("Written paths:");
     expect(loadOutput).toContain(
       "Merged resume brief combines shared continuity with any project-local overrides."
@@ -262,7 +291,7 @@ describe("runSession", () => {
       rolloutPath: string;
       writtenPaths: string[];
     } | null;
-    latestContinuityDiagnostics: { actualPath: string } | null;
+    latestContinuityDiagnostics: { actualPath: string; confidence: string; warnings: string[] } | null;
     recentContinuityAuditEntries: Array<{ rolloutPath: string }>;
     continuityAuditPath: string;
   };
@@ -270,12 +299,22 @@
     expect(statusJson.latestContinuityAuditEntry?.rolloutPath).toBe(secondRolloutPath);
     expect(statusJson.latestContinuityAuditEntry?.writtenPaths.length).toBeGreaterThan(0);
     expect(statusJson.latestContinuityDiagnostics?.actualPath).toBe("heuristic");
+    expect(statusJson.latestContinuityDiagnostics?.confidence).toBe("low");
+    expect(statusJson.latestContinuityDiagnostics?.warnings).toEqual(
+      expect.arrayContaining([
+        expect.stringContaining("Next steps were inferred from the latest request")
+      ])
+    );
     expect(statusJson.recentContinuityAuditEntries).toHaveLength(2);
     expect(statusJson.recentContinuityAuditEntries[0]?.rolloutPath).toBe(secondRolloutPath);
     expect(statusJson.continuityAuditPath).toContain("session-continuity-log.jsonl");
 
     const statusOutput = await runSession("status", { cwd: repoDir });
     expect(statusOutput).toContain("Evidence: successful");
+    expect(statusOutput).toContain("Warnings:");
+    expect(statusOutput).toContain(
+      "Next steps were inferred from the latest request because the rollout did not contain an explicit next-step phrase."
+    );
     expect(statusOutput).toContain("Written paths:");
     expect(statusOutput).toContain(
       "Merged resume brief combines shared continuity with any project-local overrides."
@@ -410,6 +449,66 @@ describe("runSession", () => {
     expect(merged?.filesDecisionsEnvironment.join("\n")).not.toContain("Stale local file note");
   }, 30_000);
 
+  it("supports session load/status from the real CLI surface, including startup source files", async () => {
+    const repoDir = await tempDir("cam-session-load-cli-repo-");
+    const memoryRoot = await tempDir("cam-session-load-cli-memory-");
+    await initRepo(repoDir);
+
+    await writeProjectConfig(
+      repoDir,
+      configJson(),
+      { autoMemoryDirectory: memoryRoot }
+    );
+
+    const store = new SessionContinuityStore(detectProjectContext(repoDir), {
+      ...configJson(),
+      autoMemoryDirectory: memoryRoot
+    });
+    await store.saveSummary(
+      {
+        project: {
+          goal: "Resume the shared continuity path.",
+          confirmedWorking: ["Shared continuity already exists."],
+          triedAndFailed: [],
+          notYetTried: [],
+          incompleteNext: [],
+          filesDecisionsEnvironment: []
+        },
+        projectLocal: {
+          goal: "",
+          confirmedWorking: [],
+          triedAndFailed: [],
+          notYetTried: [],
+          incompleteNext: [],
+          filesDecisionsEnvironment: []
+        }
+      },
+      "project"
+    );
+
+    const loadResult = runCli(repoDir, ["session", "load", "--json", "--print-startup"]);
+    const statusResult = runCli(repoDir, ["session", "status", "--json"]);
+
+    expect(loadResult.exitCode).toBe(0);
+    expect(statusResult.exitCode).toBe(0);
+
+    const loadPayload = JSON.parse(loadResult.stdout) as {
+      startup: { text: string; sourceFiles: string[] };
+      projectLocation: { path: string };
+      localLocation: { path: string };
+    };
+    const statusPayload = JSON.parse(statusResult.stdout) as {
+      projectLocation: { exists: boolean; path: string };
+      localLocation: { exists: boolean; path: string };
+    };
+
+    expect(loadPayload.startup.text).toContain("# Session Continuity");
+    expect(loadPayload.startup.sourceFiles).toEqual([loadPayload.projectLocation.path]);
+    expect(loadPayload.startup.sourceFiles).not.toContain(loadPayload.localLocation.path);
+    expect(statusPayload.projectLocation.exists).toBe(true);
+    expect(statusPayload.localLocation.exists).toBe(false);
+  }, 30_000);
+
   it("refresh replaces only the selected scope", async () => {
     const repoDir = await tempDir("cam-session-refresh-scope-repo-");
     const memoryRoot = await tempDir("cam-session-refresh-scope-memory-");
@@ -490,6 +589,62 @@
     );
   }, 30_000);
 
+  it("supports session load and status from the CLI command surface", async () => {
+    const repoDir = await tempDir("cam-session-load-status-cli-repo-");
+    const memoryRoot = await tempDir("cam-session-load-status-cli-memory-");
+    await initRepo(repoDir);
+
+    await writeProjectConfig(
+      repoDir,
+      configJson(),
+      { autoMemoryDirectory: memoryRoot }
+    );
+
+    const rolloutPath = path.join(repoDir, "rollout.jsonl");
+    await fs.writeFile(
+      rolloutPath,
+      rolloutFixture(repoDir, "Continue the login cookie work and add middleware."),
+      "utf8"
+    );
+
+    const saveResult = runCli(repoDir, ["session", "save", "--json", "--rollout", rolloutPath]);
+    expect(saveResult.exitCode).toBe(0);
+
+    const loadResult = runCli(repoDir, ["session", "load", "--json", "--print-startup"]);
+    expect(loadResult.exitCode).toBe(0);
+    const loadPayload = JSON.parse(loadResult.stdout) as {
+      startup: { text: string };
+      latestContinuityAuditEntry: { rolloutPath: string } | null;
+      latestContinuityDiagnostics: { actualPath: string; warnings: string[] } | null;
+      recentContinuityAuditEntries: Array<{ rolloutPath: string }>;
+    };
+    expect(loadPayload.startup.text).toContain("# Session Continuity");
+    expect(loadPayload.latestContinuityAuditEntry?.rolloutPath).toBe(rolloutPath);
+    expect(loadPayload.latestContinuityDiagnostics?.actualPath).toBe("heuristic");
+    expect(loadPayload.latestContinuityDiagnostics?.warnings).toEqual(
+      expect.arrayContaining([
+        expect.stringContaining("Next steps were inferred from the latest request")
+      ])
+    );
+    expect(loadPayload.recentContinuityAuditEntries[0]?.rolloutPath).toBe(rolloutPath);
+
+    const statusResult = runCli(repoDir, ["session", "status", "--json"]);
+    expect(statusResult.exitCode).toBe(0);
+    const statusPayload = JSON.parse(statusResult.stdout) as {
+      latestContinuityAuditEntry: { rolloutPath: string } | null;
+      latestContinuityDiagnostics: { actualPath: string; warnings: string[] } | null;
+      recentContinuityAuditEntries: Array<{ rolloutPath: string }>;
+    };
+    expect(statusPayload.latestContinuityAuditEntry?.rolloutPath).toBe(rolloutPath);
+    expect(statusPayload.latestContinuityDiagnostics?.actualPath).toBe("heuristic");
+    expect(statusPayload.latestContinuityDiagnostics?.warnings).toEqual(
+      expect.arrayContaining([
+        expect.stringContaining("Next steps were inferred from the latest request")
+      ])
+    );
+    expect(statusPayload.recentContinuityAuditEntries[0]?.rolloutPath).toBe(rolloutPath);
+  }, 30_000);
+
   it("refresh prefers a matching recovery marker over audit and latest primary rollout", async () => {
     const repoDir = await tempDir("cam-session-refresh-recovery-priority-repo-");
     const memoryRoot = await tempDir("cam-session-refresh-recovery-priority-memory-");
@@ -1406,24 +1561,135 @@
       trigger?: string;
       writeMode?: string;
     } | null;
+    latestContinuityDiagnostics: {
+      confidence: string;
+      warnings: string[];
+      fallbackReason?: string;
+    } | null;
     pendingContinuityRecovery: {
       rolloutPath: string;
       trigger?: string;
       writeMode?: string;
+      confidence?: string;
+      warnings?: string[];
     } | null;
   };
     expect(loadJson.latestContinuityAuditEntry?.rolloutPath).toBe("/tmp/rollout-legacy.jsonl");
     expect(loadJson.latestContinuityAuditEntry?.trigger).toBeUndefined();
     expect(loadJson.latestContinuityAuditEntry?.writeMode).toBeUndefined();
+    expect(loadJson.latestContinuityDiagnostics?.confidence).toBe("low");
+    expect(loadJson.latestContinuityDiagnostics?.warnings).toEqual([]);
+    expect(loadJson.latestContinuityDiagnostics?.fallbackReason).toBe("configured-heuristic");
     expect(loadJson.pendingContinuityRecovery?.rolloutPath).toBe(
       "/tmp/rollout-legacy-recovery.jsonl"
     );
     expect(loadJson.pendingContinuityRecovery?.trigger).toBeUndefined();
     expect(loadJson.pendingContinuityRecovery?.writeMode).toBeUndefined();
+    expect(loadJson.pendingContinuityRecovery?.confidence).toBe("high");
+    expect(loadJson.pendingContinuityRecovery?.warnings).toEqual([]);
 
     const statusOutput = await runSession("status", { cwd: repoDir });
     expect(statusOutput).toContain("/tmp/rollout-legacy.jsonl");
     expect(statusOutput).toContain("/tmp/rollout-legacy-recovery.jsonl");
+    expect(statusOutput).toContain("preferred heuristic | confidence high");
   }, 30_000);
 
+  it("normalizes legacy audit and recovery warnings into json and text outputs", async () => {
+    const repoDir = await tempDir("cam-session-legacy-warning-repo-");
+    const memoryRoot = await tempDir("cam-session-legacy-warning-memory-");
+    await initRepo(repoDir);
+
+    await writeProjectConfig(
+      repoDir,
+      configJson(),
+      { autoMemoryDirectory: memoryRoot }
+    );
+
+    const auditWarning = "Legacy audit reviewer warning.";
+    const recoveryWarning = "Legacy recovery reviewer warning.";
+    const project = detectProjectContext(repoDir);
+    const store = new SessionContinuityStore(project, {
+      ...configJson(),
+      autoMemoryDirectory: memoryRoot
+    });
+    await store.ensureAuditLayout();
+    await fs.writeFile(
+      store.paths.auditFile,
+      `${JSON.stringify({
+        generatedAt: "2026-03-17T00:00:00.000Z",
+        projectId: project.projectId,
+        worktreeId: project.worktreeId,
+        configuredExtractorMode: "heuristic",
+        scope: "both",
+        rolloutPath: "/tmp/rollout-legacy-warning.jsonl",
+        sourceSessionId: "session-legacy-warning",
+        preferredPath: "heuristic",
+        actualPath: "heuristic",
+        warnings: [auditWarning],
+        evidenceCounts: makeEvidenceCounts(),
+        writtenPaths: ["/tmp/legacy-warning-continuity.md"]
+      })}\n`,
+      "utf8"
+    );
+    await fs.writeFile(
+      store.getRecoveryPath(),
+      JSON.stringify({
+        recordedAt: "2026-03-17T00:01:00.000Z",
+        projectId: project.projectId,
+        worktreeId: project.worktreeId,
+        rolloutPath: "/tmp/rollout-legacy-warning-recovery.jsonl",
+        sourceSessionId: "session-legacy-warning-recovery",
+        scope: "both",
+        writtenPaths: ["/tmp/legacy-warning-recovery.md"],
+        preferredPath: "heuristic",
+        actualPath: "heuristic",
+        warnings: [recoveryWarning],
+        evidenceCounts: makeEvidenceCounts(),
+        failedStage: "audit-write",
+        failureMessage: "legacy warning recovery marker"
+      }),
+      "utf8"
+    );
+
+    const loadJson = JSON.parse(
+      await runSession("load", { cwd: repoDir, json: true })
+    ) as {
+      latestContinuityDiagnostics: { confidence: string; warnings: string[] } | null;
+      pendingContinuityRecovery: {
+        confidence?: string;
+        warnings?: string[];
+      } | null;
+    };
+    expect(loadJson.latestContinuityDiagnostics?.confidence).toBe("medium");
+    expect(loadJson.latestContinuityDiagnostics?.warnings).toEqual([auditWarning]);
+    expect(loadJson.pendingContinuityRecovery?.confidence).toBe("medium");
+    expect(loadJson.pendingContinuityRecovery?.warnings).toEqual([recoveryWarning]);
+
+    const statusJson = JSON.parse(
+      await runSession("status", { cwd: repoDir, json: true })
+    ) as {
+      latestContinuityDiagnostics: { confidence: string; warnings: string[] } | null;
+      pendingContinuityRecovery: {
+        confidence?: string;
+        warnings?: string[];
+      } | null;
+    };
+    expect(statusJson.latestContinuityDiagnostics?.confidence).toBe("medium");
+    expect(statusJson.latestContinuityDiagnostics?.warnings).toEqual([auditWarning]);
+    expect(statusJson.pendingContinuityRecovery?.confidence).toBe("medium");
+    expect(statusJson.pendingContinuityRecovery?.warnings).toEqual([recoveryWarning]);
+
+    const loadOutput = await runSession("load", { cwd: repoDir });
+    expect(loadOutput).toContain("Warnings:");
+    expect(loadOutput).toContain(auditWarning);
+    expect(loadOutput).toContain(`- Warning: ${recoveryWarning}`);
+    expect(loadOutput).toContain("confidence medium");
+
+    const statusOutput = await runSession("status", { cwd: repoDir });
+    expect(statusOutput).toContain("Warnings:");
+    expect(statusOutput).toContain(auditWarning);
+    expect(statusOutput).toContain(`- Warning: ${recoveryWarning}`);
+    expect(statusOutput).toContain("confidence medium");
+  }, 30_000);
 
   it("writes and surfaces a continuity recovery marker when audit persistence fails", async () => {
@@ -1766,6 +2032,71 @@ describe("runWrappedCodex with session continuity", () => {
     expect(await continuityStore.readRecoveryRecord()).toBeNull();
   }, 30_000);
 
+  it("injects only continuity source files that actually exist", async () => {
+    const repoDir = await tempDir("cam-wrapper-existing-sources-repo-");
+    const memoryRoot = await tempDir("cam-wrapper-existing-sources-memory-");
+    const sessionsDir = await tempDir("cam-wrapper-existing-sources-rollouts-");
+    await initRepo(repoDir);
+    process.env.CAM_CODEX_SESSIONS_DIR = sessionsDir;
+
+    const { capturedArgsPath, mockCodexPath } = await writeWrapperMockCodex(repoDir, sessionsDir, {
+      sessionId: "session-wrapper-existing-sources",
+      message: "Continue with shared-only continuity."
+    });
+
+    await writeProjectConfig(
+      repoDir,
+      configJson({
+        codexBinary: mockCodexPath,
+        sessionContinuityAutoLoad: true,
+        sessionContinuityAutoSave: false
+      }),
+      {
+        autoMemoryDirectory: memoryRoot,
+        sessionContinuityAutoLoad: true,
+        sessionContinuityAutoSave: false
+      }
+    );
+
+    const continuityStore = new SessionContinuityStore(detectProjectContext(repoDir), {
+      ...configJson({
+        codexBinary: mockCodexPath,
+        sessionContinuityAutoLoad: true,
+        sessionContinuityAutoSave: false
+      }),
+      autoMemoryDirectory: memoryRoot
+    });
+    await continuityStore.saveSummary(
+      {
+        project: {
+          goal: "Shared-only continuity goal.",
+          confirmedWorking: ["Shared-only continuity exists."],
+          triedAndFailed: [],
+          notYetTried: [],
+          incompleteNext: [],
+          filesDecisionsEnvironment: []
+        },
+        projectLocal: {
+          goal: "",
+          confirmedWorking: [],
+          triedAndFailed: [],
+          notYetTried: [],
+          incompleteNext: [],
+          filesDecisionsEnvironment: []
+        }
+      },
+      "project"
+    );
+
+    const exitCode = await runWrappedCodex(repoDir, "exec", ["continue"]);
+    expect(exitCode).toBe(0);
+
+    const capturedArgs = JSON.parse(await fs.readFile(capturedArgsPath, "utf8")) as string[];
+    const baseInstructionsArg = capturedArgs.find((arg) => arg.startsWith("base_instructions="));
+    expect(baseInstructionsArg).toContain(continuityStore.paths.sharedFile);
+    expect(baseInstructionsArg).not.toContain(continuityStore.paths.localFile);
+  }, 30_000);
+
   it("auto-saves continuity without injecting it when autoLoad is disabled and autoSave is enabled", async () => {
     const repoDir = await tempDir("cam-wrapper-save-only-repo-");
     const memoryRoot = await tempDir("cam-wrapper-save-only-memory-");
diff --git a/test/session-continuity.test.ts b/test/session-continuity.test.ts
index 28c8cf5..b16de56 100644
--- a/test/session-continuity.test.ts
+++ b/test/session-continuity.test.ts
@@ -588,10 +588,248 @@ describe("session continuity domain", () => {
     expect(prompt).toContain("Detected file writes:");
     expect(prompt).toContain("Candidate explicit next-step phrases:");
     expect(prompt).toContain("Candidate explicit untried phrases:");
+    expect(prompt).toContain("Reviewer warning hints:");
+    expect(prompt).toContain("Do not copy those warning phrases into project or projectLocal continuity items.");
     expect(prompt).toContain("pnpm test");
     expect(prompt).toContain("login.ts");
   });
 
+  it("keeps reviewer noise warnings but clears resolved package-manager conflicts from a real rollout fixture", async () => {
+    const evidence = await parseRolloutEvidence(
+      path.join(process.cwd(), "test/fixtures/rollouts/mixed-language-reviewer-noise.jsonl")
+    );
+
+    expect(evidence).not.toBeNull();
+
+    const buckets = collectSessionContinuityEvidenceBuckets(evidence!);
+    expect(buckets.recentSuccessfulCommands.join("\n")).toContain("pnpm test");
+    expect(buckets.recentFailedCommands.join("\n")).toContain("pnpm build");
+    expect(buckets.detectedFileWrites.join("\n")).toContain("login.ts");
+    expect(buckets.explicitNextSteps.join("\n")).toContain("更新 src/auth/login.ts");
+    expect(buckets.explicitUntried.join("\n")).toContain("switching the login route to cookies()");
+    expect(buckets.warningHints).toEqual(
+      expect.arrayContaining([
+        expect.stringContaining("Reviewer or subagent prompt noise")
+      ])
+    );
+    expect(buckets.warningHints.join("\n")).not.toContain("Conflicting package manager signals");
+
+    const summarizer = new SessionContinuitySummarizer(baseConfig("/tmp/memory-root"));
+    const result = await summarizer.summarizeWithDiagnostics(evidence!);
+
+    expect(result.summary.project.confirmedWorking.join("\n")).toContain("pnpm test");
+    expect(result.summary.project.triedAndFailed.join("\n")).toContain("pnpm build");
+    expect(result.summary.projectLocal.incompleteNext.join("\n")).toContain("更新 src/auth/login.ts");
+    expect(result.summary.projectLocal.filesDecisionsEnvironment.join("\n")).toContain("login.ts");
+    expect(result.diagnostics.confidence).toBe("low");
+    expect(result.diagnostics.warnings).toEqual(
+      expect.arrayContaining([
+        expect.stringContaining("Reviewer or subagent prompt noise")
+      ])
+    );
+    expect(result.diagnostics.warnings.join("\n")).not.toContain("Conflicting package manager signals");
+  });
+
+  it("strips reviewer warning prose from codex continuity body while keeping diagnostics warnings", async () => {
+    const temp = await tempDir("cam-session-codex-warning-prose-");
+    const reviewerWarning =
+      "Reviewer or subagent prompt noise was detected in the rollout; continuity extraction ignored non-product transcript lines.";
+    const mockBinary = await writeMockCodexBinary(
+      temp,
+      `fs.writeFileSync(outputPath, JSON.stringify({
+  project: {
+    goal: ${JSON.stringify(reviewerWarning)},
+    confirmedWorking: ["Command succeeded: pnpm test"],
+    triedAndFailed: [],
+    notYetTried: [],
+    incompleteNext: [],
+    filesDecisionsEnvironment: [${JSON.stringify(reviewerWarning)}]
+  },
+  projectLocal: {
+    goal: "",
+    confirmedWorking: [],
+    triedAndFailed: [],
+    notYetTried: [],
+    incompleteNext: ["Update src/auth/login.ts to set an httpOnly cookie."],
+    filesDecisionsEnvironment: ["File modified: login.ts", ${JSON.stringify(reviewerWarning)}]
+  }
+}));`
+    );
+    const evidence: RolloutEvidence = {
+      sessionId: "session-codex-warning-prose",
+      createdAt: "2026-03-15T00:00:00.000Z",
+      cwd: temp,
+      userMessages: ["Continue the auth rollout."],
+      agentMessages: [
+        "接下来我会做两件并行的只读工作:一是按你要求跑完整校验命令并记录结果,二是把审查范围拆给 4 个 reviewer 子 agent 分域取证。"
+      ],
+      toolCalls: [
+        {
+          name: "exec_command",
+          arguments: JSON.stringify({ cmd: "pnpm test" }),
+          output: "Process exited with code 0"
+        }
+      ],
+      rolloutPath: "/tmp/rollout.jsonl"
+    };
+
+    const summarizer = new SessionContinuitySummarizer(
+      baseConfig("/tmp/memory-root", {
+        extractorMode: "codex",
+        codexBinary: mockBinary
+      })
+    );
+    const result = await summarizer.summarizeWithDiagnostics(evidence);
+    const renderedBody = JSON.stringify({
+      project: result.summary.project,
+      projectLocal: result.summary.projectLocal
+    });
+
+    expect(result.diagnostics.actualPath).toBe("codex");
+    expect(result.diagnostics.warnings).toEqual(
+      expect.arrayContaining([expect.stringContaining("Reviewer or subagent prompt noise")])
+    );
+    expect(result.summary.project.confirmedWorking).toEqual(["Command succeeded: pnpm test"]);
+    expect(result.summary.projectLocal.incompleteNext).toEqual([
+      "Update src/auth/login.ts to set an httpOnly cookie."
+    ]);
+    expect(result.summary.projectLocal.filesDecisionsEnvironment).toEqual(["File modified: login.ts"]);
+    expect(renderedBody).not.toContain(reviewerWarning);
+  });
+
+  it("clears continuity conflict warnings when a later explicit user correction resolves the preference", async () => {
+    const temp = await tempDir("cam-session-resolved-warning-");
+    const mockBinary = await writeMockCodexBinary(
+      temp,
+      `fs.writeFileSync(outputPath, JSON.stringify({
+  project: {
+    goal: "Keep the package-manager guidance aligned.",
+    confirmedWorking: ["Command succeeded: pnpm test"],
+    triedAndFailed: [],
+    notYetTried: [],
+    incompleteNext: [],
+    filesDecisionsEnvironment: []
+  },
+  projectLocal: {
+    goal: "",
+    confirmedWorking: [],
+    triedAndFailed: [],
+    notYetTried: [],
+    incompleteNext: ["update the setup guide to match pnpm"],
+    filesDecisionsEnvironment: []
+  }
+}));`
+    );
+    const evidence: RolloutEvidence = {
+      sessionId: "session-resolved-package-manager-warning",
+      createdAt: "2026-03-15T00:00:00.000Z",
+      cwd: temp,
+      userMessages: [
+        "Actually use pnpm, not bun.",
+        "Next step: update the setup guide to match pnpm."
+      ],
+      agentMessages: ["Use bun in this repo for faster installs."],
+      toolCalls: [],
+      rolloutPath: "/tmp/rollout.jsonl"
+    };
+
+    const buckets = collectSessionContinuityEvidenceBuckets(evidence);
+
+    expect(buckets.warningHints).toEqual([]);
+
+    const summarizer = new SessionContinuitySummarizer(
+      baseConfig("/tmp/memory-root", {
+        extractorMode: "codex",
+        codexBinary: mockBinary
+      })
+    );
+    const result = await summarizer.summarizeWithDiagnostics(evidence);
+
+    expect(result.diagnostics.warnings).toEqual([]);
+    expect(result.diagnostics.confidence).toBe("high");
+  });
+
+  it("falls back after scrubbing warning-only codex output", async () => {
+    const temp = await tempDir("cam-session-codex-warning-only-");
+    const reviewerWarning =
+      "Reviewer or subagent prompt noise was detected in the rollout; continuity extraction ignored non-product transcript lines.";
+    const mockBinary = await writeMockCodexBinary(
+      temp,
+      `fs.writeFileSync(outputPath, JSON.stringify({
+  project: {
+    goal: ${JSON.stringify(reviewerWarning)},
+    confirmedWorking: [${JSON.stringify(reviewerWarning)}],
+    triedAndFailed: [],
+    notYetTried: [],
+    incompleteNext: [],
+    filesDecisionsEnvironment: []
+  },
+  projectLocal: {
+    goal: "",
+    confirmedWorking: [],
+    triedAndFailed: [],
+    notYetTried: [],
+    incompleteNext: [${JSON.stringify(reviewerWarning)}],
+    filesDecisionsEnvironment: [${JSON.stringify(reviewerWarning)}]
+  }
+}));`
+    );
+    const evidence: RolloutEvidence = {
+      sessionId: "session-codex-warning-only",
+      createdAt: "2026-03-15T00:00:00.000Z",
+      cwd: temp,
+      userMessages: [
+        "We haven't tried switching the login route to cookies() yet.",
+        "Next step: update src/auth/login.ts to set an httpOnly cookie."
+      ],
+      agentMessages: [
+        "接下来我会做两件并行的只读工作:一是按你要求跑完整校验命令并记录结果,二是把审查范围拆给 4 个 reviewer 子 agent 分域取证。"
+      ],
+      toolCalls: [
+        {
+          name: "exec_command",
+          arguments: JSON.stringify({ cmd: "pnpm test" }),
+          output: "Process exited with code 0"
+        },
+        {
+          name: "apply_patch_freeform",
+          arguments:
+            "diff --git a/src/auth/login.ts b/src/auth/login.ts\nindex abc..def 100644\n--- a/src/auth/login.ts\n+++ b/src/auth/login.ts\n@@ -1,3 +1,4 @@\n+setCookie(token);\n export {};",
+          output: undefined
+        }
+      ],
+      rolloutPath: "/tmp/rollout.jsonl"
+    };
+
+    const summarizer = new SessionContinuitySummarizer(
+      baseConfig("/tmp/memory-root", {
+        extractorMode: "codex",
+        codexBinary: mockBinary
+      })
+    );
+    const result = await summarizer.summarizeWithDiagnostics(evidence);
+    const renderedBody = JSON.stringify({
+      project: result.summary.project,
+      projectLocal: result.summary.projectLocal
+    });
+
+    expect(result.diagnostics.preferredPath).toBe("codex");
+    expect(result.diagnostics.actualPath).toBe("heuristic");
+    expect(result.diagnostics.fallbackReason).toBe("low-signal");
+    expect(result.diagnostics.warnings).toEqual(
+      expect.arrayContaining([expect.stringContaining("Reviewer or subagent prompt noise")])
+    );
+    expect(result.summary.project.confirmedWorking.join("\n")).toContain("pnpm test");
+    expect(result.summary.project.notYetTried.join("\n")).toContain(
+      "switching the login route to cookies()"
+    );
+    expect(result.summary.projectLocal.incompleteNext.join("\n")).toContain(
+      "update src/auth/login.ts"
+    );
+    expect(result.summary.projectLocal.filesDecisionsEnvironment.join("\n")).toContain("login.ts");
+    expect(renderedBody).not.toContain(reviewerWarning);
+  });
+
   it("collects successful and failed bash tool calls in continuity evidence buckets", () => {
     const evidence: RolloutEvidence = {
       sessionId: "session-bash-buckets",
@@ -675,6 +913,8 @@ describe("session continuity domain", () => {
     ]);
     expect(diagnostics.preferredPath).toBe("codex");
     expect(diagnostics.actualPath).toBe("codex");
+    expect(diagnostics.confidence).toBe("high");
+    expect(diagnostics.warnings).toEqual([]);
     expect(diagnostics.fallbackReason).toBeUndefined();
     expect(diagnostics.codexExitCode).toBe(0);
   });
@@ -757,6 +997,7 @@ describe("session continuity domain", () => {
     expect(summary.projectLocal.filesDecisionsEnvironment.join("\n")).toContain("login.ts");
     expect(diagnostics.preferredPath).toBe("codex");
     expect(diagnostics.actualPath).toBe("heuristic");
+    expect(diagnostics.confidence).toBe("low");
     expect(diagnostics.codexExitCode).toBe(0);
     expect(diagnostics.fallbackReason).toBe(invalidCase.reason);
   }
@@ -830,6 +1071,7 @@ describe("session continuity domain", () => {
     expect(summary.projectLocal.filesDecisionsEnvironment.join("\n")).toContain("login.ts");
     expect(diagnostics.preferredPath).toBe("codex");
     expect(diagnostics.actualPath).toBe("heuristic");
+    expect(diagnostics.confidence).toBe("low");
     expect(diagnostics.fallbackReason).toBe("low-signal");
   });
@@ -860,6 +1102,7 @@ describe("session continuity domain", () => {
     expect(result.summary.sourceSessionId).toBe("session-codex-command-failed");
     expect(result.diagnostics.preferredPath).toBe("codex");
     expect(result.diagnostics.actualPath).toBe("heuristic");
+    expect(result.diagnostics.confidence).toBe("low");
     expect(result.diagnostics.fallbackReason).toBe("codex-command-failed");
     expect(result.diagnostics.codexExitCode).toBe(17);
   });
@@ -880,8 +1123,10 @@ describe("session continuity domain", () => {
     expect(result.diagnostics.preferredPath).toBe("heuristic");
     expect(result.diagnostics.actualPath).toBe("heuristic");
+    expect(result.diagnostics.confidence).toBe("low");
     expect(result.diagnostics.fallbackReason).toBe("configured-heuristic");
     expect(result.diagnostics.evidenceCounts.nextSteps).toBeGreaterThan(0);
+    expect(result.diagnostics.warnings).toEqual([]);
   });
 
   it("applySessionContinuityLayerSummary merges summary into base state", () => {
diff --git a/test/sync-service.test.ts b/test/sync-service.test.ts
index 8fc138b..08f7de1 100644
--- a/test/sync-service.test.ts
+++ b/test/sync-service.test.ts
@@ -109,6 +109,36 @@ function noOpRolloutFixture(projectDir: string, sessionId = "session-2"): string
   ].join("\n");
 }
 
+function sameRolloutCorrectionFixture(projectDir: string, sessionId = "session-correction"): string {
+  return [
+    JSON.stringify({
+      timestamp: "2026-03-14T00:20:00.000Z",
+      type: "session_meta",
+      payload: {
+        id: sessionId,
+        timestamp: "2026-03-14T00:20:00.000Z",
+        cwd: projectDir
+      }
+    }),
+    JSON.stringify({
+      timestamp: "2026-03-14T00:20:01.000Z",
+      type: "event_msg",
+      payload: {
+        type: "user_message",
+        message: "remember that we use bun in this repository"
+      }
+    }),
+    JSON.stringify({
+      timestamp: "2026-03-14T00:20:02.000Z",
+      type: "event_msg",
+      payload: {
+        type: "user_message",
+        message: "Actually use pnpm, not bun."
+      }
+    })
+  ].join("\n");
+}
+
 afterEach(async () => {
   await Promise.all(tempDirs.splice(0).map((dir) => fs.rm(dir, { recursive: true, force: true })));
 });
@@ -192,6 +222,138 @@ describe("SyncService", () => {
     expect(auditEntries[0]?.operations).toEqual([]);
   });
 
+  it("suppresses conflicting preference candidates from the same rollout and records reviewer conflicts", async () => {
+    const projectDir = await tempDir("cam-sync-within-conflict-project-");
+    const memoryRoot = await tempDir("cam-sync-within-conflict-memory-");
+    const rolloutPath = path.join(projectDir, "within-rollout-preference-conflict.jsonl");
+    const fixturePath = path.join(
+      process.cwd(),
+      "test/fixtures/rollouts/within-rollout-preference-conflict.jsonl"
+    );
+    await fs.copyFile(fixturePath, rolloutPath);
+
+    const service = new SyncService(
+      detectProjectContext(projectDir),
+      baseConfig(memoryRoot),
+      path.resolve("schemas/memory-operations.schema.json")
+    );
+
+    const result = await service.syncRollout(rolloutPath, true);
+    const auditEntries = await service.memoryStore.readRecentSyncAuditEntries(5);
+    const projectEntries = await service.memoryStore.listEntries("project");
+
+    expect(result.skipped).toBe(false);
+    expect(result.applied).toEqual([]);
+    expect(projectEntries).toEqual([]);
+    expect(auditEntries[0]).toMatchObject({
+      rolloutPath,
+      status: "no-op",
+      appliedCount: 0,
+      suppressedOperationCount: 2
+    });
+    expect(auditEntries[0]?.conflicts).toEqual(
+      expect.arrayContaining([
+        expect.objectContaining({
+          source: "within-rollout",
+          topic: "preferences",
+          resolution: "suppressed"
+        })
+      ])
+    );
+  });
+
+  it("suppresses hedged preference conflicts against existing durable memory and keeps the old entry", async () => {
+    const projectDir = await tempDir("cam-sync-existing-conflict-project-");
+    const memoryRoot = await tempDir("cam-sync-existing-conflict-memory-");
+    const rolloutPath = path.join(projectDir, "hedged-preference-conflict.jsonl");
+    const fixturePath = path.join(
+      process.cwd(),
+      "test/fixtures/rollouts/hedged-preference-conflict.jsonl"
+    );
+    await fs.copyFile(fixturePath, rolloutPath);
+
+    const service = new SyncService(
+      detectProjectContext(projectDir),
+      baseConfig(memoryRoot),
+      path.resolve("schemas/memory-operations.schema.json")
+    );
+    await service.memoryStore.remember(
+      "project",
+      "preferences",
+      "use-pnpm",
+      "Use pnpm in this repository.",
+      ["Use pnpm instead of npm in this repository."],
+      "Seed durable preference."
+    );
+
+    const result = await service.syncRollout(rolloutPath, true);
+    const auditEntries = await service.memoryStore.readRecentSyncAuditEntries(5);
+    const projectEntries = await service.memoryStore.listEntries("project");
+
+    expect(result.skipped).toBe(false);
+    expect(result.applied).toEqual([]);
+    expect(projectEntries.map((entry) => entry.summary)).toEqual(["Use pnpm in this repository."]);
+    expect(auditEntries[0]).toMatchObject({
+      rolloutPath,
+      status: "no-op",
+      appliedCount: 0,
+      suppressedOperationCount: 2
+    });
+    expect(auditEntries[0]?.conflicts).toEqual(
+      expect.arrayContaining([
+        expect.objectContaining({
+          source: "existing-memory",
+          topic: "preferences",
+          resolution: "suppressed"
+        })
+      ])
+    );
+  });
+
+  it("keeps the latest high-confidence same-rollout correction and audits the suppressed stale candidate", async () => {
+    const projectDir = await tempDir("cam-sync-rollout-correction-project-");
+    const memoryRoot = await tempDir("cam-sync-rollout-correction-memory-");
+    const rolloutPath = path.join(projectDir, "same-rollout-correction.jsonl");
+    await fs.writeFile(
+      rolloutPath,
+      sameRolloutCorrectionFixture(projectDir, "session-rollout-correction"),
+      "utf8"
+    );
+
+    const service = new SyncService(
+      detectProjectContext(projectDir),
+      baseConfig(memoryRoot),
+      path.resolve("schemas/memory-operations.schema.json")
+    );
+
+    const result = await service.syncRollout(rolloutPath, true);
+    const auditEntries = await service.memoryStore.readRecentSyncAuditEntries(5);
+    const projectEntries = await service.memoryStore.listEntries("project");
+
+    expect(result.skipped).toBe(false);
+    expect(result.applied).toHaveLength(1);
+    expect(projectEntries.map((entry) => entry.summary)).toEqual([
+      "Actually use pnpm, not bun"
+    ]);
+    expect(auditEntries[0]).toMatchObject({
+      rolloutPath,
+      status: "applied",
+      appliedCount: 1,
+      suppressedOperationCount: 1,
+      scopesTouched: ["project"]
+    });
+    expect(auditEntries[0]?.conflicts).toEqual(
+      expect.arrayContaining([
+        expect.objectContaining({
+          source: "within-rollout",
+          candidateSummary: "we use bun in this repository",
+          conflictsWith: ["Actually use pnpm, not bun"],
+          resolution: "suppressed"
+        })
+      ])
+    );
+  });
+
   it("records skipped audit entries for already processed and no-evidence rollouts", async () => {
     const projectDir = await tempDir("cam-sync-skip-project-");
     const memoryRoot = await tempDir("cam-sync-skip-memory-");