Reviewer's Guide

Introduce continuity confidence and warning hygiene across session continuity, sync, and CLI surfaces, add contradiction review for memory extraction, and tighten reviewer-facing documentation and release/test contracts while ensuring CLI-based workflows are fully covered by smoke tests.

Sequence diagram for continuity confidence and warning propagation:

```mermaid
sequenceDiagram
    actor Reviewer
    participant CLI as SessionCLI
    participant Store as SessionContinuityStore
    participant Summ as SessionContinuitySummarizer
    participant Ev as collectSessionContinuityEvidenceBuckets
    participant Audit as ContinuityAuditLog
    Reviewer->>CLI: cam session save or refresh
    CLI->>Store: readExistingState
    CLI->>Summ: generate(evidence, existingState, config)
    Summ->>Ev: collectSessionContinuityEvidenceBuckets(evidence)
    Ev-->>Summ: buckets including warningHints
    alt extractorMode heuristic
        Summ->>Summ: heuristicSummary(evidence, existingState, buckets)
        Summ-->>CLI: summary, diagnostics with confidence low|medium
    else extractorMode codex
        Summ->>Summ: tryCodexExtraction(evidence)
        alt codex success
            Summ->>Summ: buildDiagnostics(actualPath codex, warnings buckets.warningHints)
            Summ-->>CLI: summary codex, diagnostics confidence high|medium
        else codex failed or low signal
            Summ->>Summ: heuristicSummary(evidence, existingState, buckets)
            Summ->>Summ: buildDiagnostics(actualPath heuristic, fallbackReason, warnings buckets.warningHints, usedFallbackNext)
            Summ-->>CLI: summary heuristic, diagnostics confidence low
        end
    end
    CLI->>Audit: buildSessionContinuityAuditEntry(diagnostics)
    Audit-->>Audit: persist audit entry with confidence and warnings
    Note over Summ,Audit: Reviewer warning prose is scrubbed from continuity Markdown
    Note over Summ: scrubReviewerWarningProse(summary, buckets.warningHints)
    Reviewer->>CLI: cam session status or load
    CLI->>Store: readRecentAuditEntries
    Store-->>CLI: latest SessionContinuityAuditEntry
    CLI->>CLI: toSessionContinuityDiagnostics(entry)
    CLI->>CLI: normalizeContinuityRecoveryRecord(record)
    CLI-->>Reviewer: text and json including confidence and warnings while keeping continuity body clean
```
Class diagram for contradiction review and memory audit types:

```mermaid
classDiagram
    class MemoryScope
    class MemoryOperation {
        <<interface>>
        action string
        scope MemoryScope
        topic string
        id string
        summary string
        details string
        sources string[]
        reason string
    }
    class MemoryEntry {
        <<interface>>
        id string
        scope MemoryScope
        topic string
        summary string
        details string
        sources string[]
        reason string
    }
    class MemoryConflictCandidate {
        <<interface>>
        scope MemoryScope
        topic string
        candidateSummary string
        conflictsWith string[]
        source MemoryConflictSource
        resolution MemoryConflictResolution
    }
    class MemoryConflictSource {
        <<type>>
        within-rollout
        existing-memory
    }
    class MemoryConflictResolution {
        <<type>>
        suppressed
    }
    class ReviewedMemoryOperations {
        <<interface>>
        operations MemoryOperation[]
        suppressedOperationCount number
        conflicts MemoryConflictCandidate[]
    }
    class SyncService {
        +syncFromRollout(evidence, isRecovery) Promise
        -extractOperations(evidence, existingEntries) Promise
    }
    class MemorySyncAuditEntry {
        <<interface>>
        appliedAt string
        status string
        appliedCount number
        suppressedOperationCount number
        scopesTouched MemoryScope[]
        conflicts MemoryConflictCandidate[]
        resultSummary string
    }
    class SyncRecoveryRecord {
        <<interface>>
        appliedCount number
        suppressedOperationCount number
        scopesTouched MemoryScope[]
        conflicts MemoryConflictCandidate[]
        failedStage string
        failureMessage string
    }
    MemoryConflictCandidate --> MemoryScope
    MemoryConflictCandidate --> MemoryConflictSource
    MemoryConflictCandidate --> MemoryConflictResolution
    ReviewedMemoryOperations --> MemoryOperation
    ReviewedMemoryOperations --> MemoryConflictCandidate
    MemorySyncAuditEntry --> MemoryConflictCandidate
    SyncRecoveryRecord --> MemoryConflictCandidate
    SyncService --> ReviewedMemoryOperations
    SyncService --> MemorySyncAuditEntry
```
📝 Walkthrough

This PR introduces conflict detection and reviewer confidence modeling to the memory sync system. It adds a new contradiction-review module to identify and suppress conflicting memory operations extracted from the same rollout or against existing memory, records conflict metadata in audit entries, and extends session continuity with confidence levels and warning hints to track sync reliability and provide reviewer diagnostics.
Sequence Diagram(s):

```mermaid
sequenceDiagram
    participant Extractor as Extractor<br/>(Extract Ops)
    participant Review as Contradiction<br/>Reviewer
    participant Sync as SyncService
    participant Audit as Audit Entry<br/>Builder
    participant Recovery as Recovery Record
    Extractor->>Review: reviewExtractedMemoryOperations<br/>(operations, existingEntries)
    activate Review
    Note over Review: Identify choice-level<br/>conflicts within same-rollout<br/>& vs existing memory
    Review->>Review: Determine suppression<br/>decisions
    Review-->>Extractor: ReviewedMemoryOperations<br/>(filtered ops, suppressedCount,<br/>conflicts[])
    deactivate Review
    Extractor->>Sync: syncRollout(reviewed)<br/>with suppression metadata
    activate Sync
    Sync->>Audit: buildMemorySyncAuditEntry<br/>(suppressedOperationCount,<br/>conflicts)
    activate Audit
    Audit-->>Sync: MemorySyncAuditEntry<br/>with conflict reviews
    deactivate Audit
    Sync->>Recovery: writeSyncRecoveryRecord<br/>(suppressedOperationCount,<br/>conflicts)
    activate Recovery
    Recovery-->>Sync: Persisted recovery record
    deactivate Recovery
    deactivate Sync
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks: ❌ 3 failed checks (2 warnings, 1 inconclusive)
Hey - I've found 2 issues, and left some high-level feedback:

- The conflict/hedged directive detection logic (e.g., `hedgedDirectivePattern`, package manager/repo-search value handling, and directive extraction) is now implemented in both `session-continuity-evidence.ts` and `contradiction-review.ts`; consider centralizing this into a shared helper to avoid drift between continuity warnings and durable-memory contradiction review.
- Confidence calculation is implemented in multiple places (`determineConfidence` in the summarizer vs. `normalizeSessionContinuityConfidence` for audit/recovery normalization) with slightly different inputs (e.g., `usedFallbackNext`); it would be safer to have a single canonical confidence computation to avoid inconsistent values across surfaces.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The conflict/hedged directive detection logic (e.g., `hedgedDirectivePattern`, package manager/repo-search value handling, and directive extraction) is now implemented in both `session-continuity-evidence.ts` and `contradiction-review.ts`; consider centralizing this into a shared helper to avoid drift between continuity warnings and durable-memory contradiction review.
- Confidence calculation is implemented in multiple places (`determineConfidence` in the summarizer vs. `normalizeSessionContinuityConfidence` for audit/recovery normalization) with slightly different inputs (e.g., `usedFallbackNext`); it would be safer to have a single canonical confidence computation to avoid inconsistent values across surfaces.
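The shared-helper idea from the first comment could look like the following sketch. The module path, the regex, the supported value set, and the `DirectiveChoice` shape are all hypothetical; the real detection logic lives in the two files named above and is richer than this:

```typescript
// Hypothetical shared module, e.g. src/lib/extractor/directive-detection.ts.
// Both session-continuity-evidence.ts and contradiction-review.ts would
// import from here instead of keeping private copies of the pattern.

// Assumed shape of a detected directive; the real one may carry more fields.
export interface DirectiveChoice {
  key: string;    // e.g. "package-manager"
  value: string;  // e.g. "pnpm"
  hedged: boolean;
}

// Illustrative stand-in for the real hedgedDirectivePattern.
const hedgedDirectivePattern = /\b(?:maybe|perhaps|consider using)\s+(\w+)/i;
const packageManagers = new Set(["npm", "pnpm", "yarn", "bun"]);

export function extractDirectiveChoice(text: string): DirectiveChoice | undefined {
  const hedgedMatch = hedgedDirectivePattern.exec(text);
  const candidate = hedgedMatch?.[1] ?? text.match(/\buse\s+(\w+)/i)?.[1];
  if (!candidate) return undefined;
  if (!packageManagers.has(candidate.toLowerCase())) return undefined;
  return {
    key: "package-manager",
    value: candidate.toLowerCase(),
    hedged: Boolean(hedgedMatch)
  };
}
```

With one exported helper, continuity warnings and durable-memory contradiction review cannot drift apart on what counts as a hedged directive.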
## Individual Comments
### Comment 1
<location path="src/lib/extractor/session-continuity-summarizer.ts" line_range="284-287" />
<code_context>
+ warnings: string[] = [],
+ usedFallbackNext = false
): SessionContinuityDiagnostics {
+ const normalizedWarnings = [...new Set(warnings)];
+ if (usedFallbackNext) {
+ normalizedWarnings.push(
+ "Next steps were inferred from the latest request because the rollout did not contain an explicit next-step phrase."
+ );
+ }
</code_context>
<issue_to_address>
**nitpick (bug_risk):** Deduplication of warnings is lost when appending the fallback-next warning; consider re-normalizing or checking before push.
Here you dedupe `warnings` via `new Set`, but then always `push` the inferred-next-steps warning when `usedFallbackNext` is true. If a caller ever passes this same string in `warnings`, it will appear twice. Either add the inferred warning before creating `normalizedWarnings`, or check `normalizedWarnings.includes(...)` before pushing to preserve uniqueness.
</issue_to_address>
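One way to restore uniqueness is to dedupe once, after all warnings (including the fallback one) are collected. This is a sketch; `normalizeWarnings` is a stand-in for the logic inside the real `buildDiagnostics`:

```typescript
const fallbackNextWarning =
  "Next steps were inferred from the latest request because the rollout did not contain an explicit next-step phrase.";

function normalizeWarnings(warnings: string[], usedFallbackNext: boolean): string[] {
  const collected = [...warnings];
  if (usedFallbackNext) {
    collected.push(fallbackNextWarning);
  }
  // Dedupe last, so the fallback warning cannot appear twice even when a
  // caller already passed the same string in `warnings`.
  return [...new Set(collected)];
}
```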
### Comment 2
<location path="src/lib/domain/session-continuity-diagnostics.ts" line_range="86-95" />
<code_context>
+ : [];
+}
+
+export function normalizeSessionContinuityConfidence(
+ confidence: unknown,
+ warnings: string[],
+ fallbackReason?: SessionContinuityFallbackReason
+): SessionContinuityConfidence {
+ if (isConfidence(confidence)) {
+ return confidence;
+ }
+
+ if (fallbackReason) {
+ return "low";
+ }
+
+ return warnings.length > 0 ? "medium" : "high";
+}
+
</code_context>
<issue_to_address>
**question:** Confidence normalization logic slightly diverges from `determineConfidence`; consider aligning the behaviors.
`determineConfidence` currently treats any fallback reason, any warnings, or `usedFallbackNext` as `low`, `high` for codex with no warnings/fallback, and `medium` otherwise. Here, `normalizeSessionContinuityConfidence` only returns `low` when `fallbackReason` is present and treats `warnings.length > 0` as `medium`, so legacy records with warnings but no fallback will differ from newly generated diagnostics. Consider either matching `determineConfidence`’s thresholds or clearly documenting the intentional difference and, if needed, passing additional inputs (e.g., `usedFallbackNext`) into this function.
</issue_to_address>
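A single canonical computation, as the comment suggests, might look like this sketch. The thresholds follow the `determineConfidence` behavior described above; the function name, input shape, and string values are assumptions:

```typescript
type SessionContinuityConfidence = "low" | "medium" | "high";

interface ConfidenceInputs {
  extractorPath: "codex" | "heuristic";
  warnings: string[];
  fallbackReason?: string;
  usedFallbackNext: boolean;
}

// One canonical computation; both the summarizer and the audit/recovery
// normalizer would delegate here instead of re-deriving thresholds.
function computeConfidence(inputs: ConfidenceInputs): SessionContinuityConfidence {
  if (inputs.fallbackReason || inputs.warnings.length > 0 || inputs.usedFallbackNext) {
    return "low";
  }
  return inputs.extractorPath === "codex" ? "high" : "medium";
}
```

Legacy records that lack some inputs would then be normalized through the same function with conservative defaults, rather than through a second, slightly different rule set.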
4 issues found across 32 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="package.json">
<violation number="1" location="package.json:31">
P3: `test:reviewer-smoke` redundantly includes `test/docs-contract.test.ts` (already run by `test:docs-contract` immediately before it in `ci`) and shares two test files with `test:cli-smoke`. Consider removing the overlap between the smoke scripts so each file runs once during the smoke phase and once in the full `pnpm test`.</violation>
</file>
<file name="src/lib/domain/recovery-records.ts">
<violation number="1" location="src/lib/domain/recovery-records.ts:123">
P2: Type guard accepts non-number `suppressedOperationCount` values. Because the variable defaults invalid types to `0`, any non-number value (string, boolean, object) silently passes. Validate the field type directly instead.</violation>
<violation number="2" location="src/lib/domain/recovery-records.ts:139">
P2: Type guard accepts non-array `conflicts` values. When `record.conflicts` is a non-array truthy value (e.g., a string or number), both sides of the length comparison resolve to `0`, so the check passes silently. Consider validating with `(record.conflicts === undefined || (Array.isArray(record.conflicts) && ...))` to reject invalid types.</violation>
</file>
<file name="src/lib/domain/session-continuity-diagnostics.ts">
<violation number="1" location="src/lib/domain/session-continuity-diagnostics.ts:99">
P2: `normalizeSessionContinuityConfidence` returns `"medium"` when warnings are present but `fallbackReason` is absent, whereas `determineConfidence` returns `"low"` for the same condition. Legacy records without a stored `confidence` that have warnings will therefore be normalized to `medium` instead of `low`, creating an inconsistency with newly generated diagnostics. Align the thresholds or document the intentional difference.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
```ts
suppressedOperationCount >= 0 &&
Array.isArray(record.scopesTouched) &&
record.scopesTouched.every((scope) => isMemoryScope(scope)) &&
conflicts.length === (Array.isArray(record.conflicts) ? record.conflicts.length : 0) &&
```
P2: Type guard accepts non-array conflicts values. When record.conflicts is a non-array truthy value (e.g., a string or number), both sides of the length comparison resolve to 0, so the check passes silently. Consider validating with (record.conflicts === undefined || (Array.isArray(record.conflicts) && ...)) to reject invalid types.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/lib/domain/recovery-records.ts, line 139:
<comment>Type guard accepts non-array `conflicts` values. When `record.conflicts` is a non-array truthy value (e.g., a string or number), both sides of the length comparison resolve to `0`, so the check passes silently. Consider validating with `(record.conflicts === undefined || (Array.isArray(record.conflicts) && ...))` to reject invalid types.</comment>
<file context>
@@ -96,8 +133,10 @@ export function isSyncRecoveryRecord(value: unknown): value is SyncRecoveryRecor
+ suppressedOperationCount >= 0 &&
Array.isArray(record.scopesTouched) &&
record.scopesTouched.every((scope) => isMemoryScope(scope)) &&
+ conflicts.length === (Array.isArray(record.conflicts) ? record.conflicts.length : 0) &&
isSyncRecoveryFailedStage(record.failedStage) &&
typeof record.failureMessage === "string" &&
</file context>
```ts
  )
  : [];
const suppressedOperationCount =
  typeof record.suppressedOperationCount === "number" ? record.suppressedOperationCount : 0;
```
P2: Type guard accepts non-number suppressedOperationCount values. Because the variable defaults invalid types to 0, any non-number value (string, boolean, object) silently passes. Validate the field type directly instead.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/lib/domain/recovery-records.ts, line 123:
<comment>Type guard accepts non-number `suppressedOperationCount` values. Because the variable defaults invalid types to `0`, any non-number value (string, boolean, object) silently passes. Validate the field type directly instead.</comment>
<file context>
@@ -84,6 +114,13 @@ export function isSyncRecoveryRecord(value: unknown): value is SyncRecoveryRecor
+ )
+ : [];
+ const suppressedOperationCount =
+ typeof record.suppressedOperationCount === "number" ? record.suppressedOperationCount : 0;
return (
typeof record.recordedAt === "string" &&
</file context>
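Both guard findings point at the same pattern: defaulting an invalid value instead of rejecting it. A sketch of the stricter optional-field validation (helper names are illustrative, not the actual `recovery-records.ts` API):

```typescript
// Reject wrong types outright instead of coercing them to harmless defaults.
function isOptionalNonNegativeInteger(value: unknown): boolean {
  return (
    value === undefined ||
    (typeof value === "number" && Number.isInteger(value) && value >= 0)
  );
}

function isOptionalArrayOf(value: unknown, check: (item: unknown) => boolean): boolean {
  // A truthy non-array (string, number, object) fails here, where the
  // original length comparison would have silently passed it as 0 === 0.
  return value === undefined || (Array.isArray(value) && value.every(check));
}
```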
```ts
  return "low";
}

return warnings.length > 0 ? "medium" : "high";
```
P2: normalizeSessionContinuityConfidence returns "medium" when warnings are present but fallbackReason is absent, whereas determineConfidence returns "low" for the same condition. Legacy records without a stored confidence that have warnings will therefore be normalized to medium instead of low, creating an inconsistency with newly generated diagnostics. Align the thresholds or document the intentional difference.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/lib/domain/session-continuity-diagnostics.ts, line 99:
<comment>`normalizeSessionContinuityConfidence` returns `"medium"` when warnings are present but `fallbackReason` is absent, whereas `determineConfidence` returns `"low"` for the same condition. Legacy records without a stored `confidence` that have warnings will therefore be normalized to `medium` instead of `low`, creating an inconsistency with newly generated diagnostics. Align the thresholds or document the intentional difference.</comment>
<file context>
@@ -71,12 +77,35 @@ function isEvidenceCounts(
+ return "low";
+ }
+
+ return warnings.length > 0 ? "medium" : "high";
+}
+
</file context>
Suggested change:

```diff
-return warnings.length > 0 ? "medium" : "high";
+return warnings.length > 0 ? "low" : "high";
```
```json
"test": "vitest run",
"test:cli-smoke": "vitest run test/audit.test.ts test/memory-command.test.ts test/session-command.test.ts",
"test:docs-contract": "vitest run test/docs-contract.test.ts",
"test:reviewer-smoke": "vitest run test/docs-contract.test.ts test/memory-command.test.ts test/session-command.test.ts test/session-continuity.test.ts",
```
P3: test:reviewer-smoke redundantly includes test/docs-contract.test.ts (already run by test:docs-contract immediately before it in ci) and shares two test files with test:cli-smoke. Consider removing the overlap between the smoke scripts so each file runs once during the smoke phase and once in the full pnpm test.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At package.json, line 31:
<comment>`test:reviewer-smoke` redundantly includes `test/docs-contract.test.ts` (already run by `test:docs-contract` immediately before it in `ci`) and shares two test files with `test:cli-smoke`. Consider removing the overlap between the smoke scripts so each file runs once during the smoke phase and once in the full `pnpm test`.</comment>
<file context>
@@ -20,10 +20,15 @@
"test": "vitest run",
+ "test:cli-smoke": "vitest run test/audit.test.ts test/memory-command.test.ts test/session-command.test.ts",
+ "test:docs-contract": "vitest run test/docs-contract.test.ts",
+ "test:reviewer-smoke": "vitest run test/docs-contract.test.ts test/memory-command.test.ts test/session-command.test.ts test/session-continuity.test.ts",
"test:watch": "vitest"
},
</file context>
Suggested change:

```diff
-"test:reviewer-smoke": "vitest run test/docs-contract.test.ts test/memory-command.test.ts test/session-command.test.ts test/session-continuity.test.ts",
+"test:reviewer-smoke": "vitest run test/session-continuity.test.ts",
```
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/lib/commands/session.ts (1)
Lines 100-124: ⚠️ Potential issue | 🟡 Minor: Normalize diagnostics before compact-history grouping.
The preview row is rendered from `toSessionContinuityDiagnostics(group.latest)`, but the grouping key still uses raw `entry.confidence`/`entry.warnings`. Legacy audit entries without explicit confidence therefore stop coalescing with equivalent normalized entries and show up as duplicate "similar" generations. Use the normalized diagnostics in the signature too.

♻️ Proposed fix:
```diff
 const preview = buildCompactHistoryPreview(entries, {
   excludeLeadingCount: 1,
   maxGroups: recentContinuityPreviewGroupLimit,
-  getSignature: (entry) =>
-    JSON.stringify({
+  getSignature: (entry) => {
+    const diagnostics = toSessionContinuityDiagnostics(entry);
+    return JSON.stringify({
       rolloutPath: entry.rolloutPath,
       sourceSessionId: entry.sourceSessionId,
       scope: entry.scope,
       trigger: normalizeSessionContinuityAuditTrigger(entry.trigger),
       writeMode: normalizeSessionContinuityWriteMode(entry.writeMode),
       preferredPath: entry.preferredPath,
       actualPath: entry.actualPath,
-      confidence: entry.confidence ?? "high",
-      warnings: entry.warnings ?? [],
+      confidence: diagnostics.confidence,
+      warnings: diagnostics.warnings,
       fallbackReason: entry.fallbackReason ?? null,
       codexExitCode: entry.codexExitCode ?? null,
       evidenceCounts: {
         successfulCommands: entry.evidenceCounts.successfulCommands,
         failedCommands: entry.evidenceCounts.failedCommands,
         fileWrites: entry.evidenceCounts.fileWrites,
         nextSteps: entry.evidenceCounts.nextSteps,
         untried: entry.evidenceCounts.untried
       },
       writtenPaths: entry.writtenPaths
-    })
+    });
+  }
 });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/commands/session.ts` around lines 100 - 124, The grouping signature passed to buildCompactHistoryPreview uses raw fields (e.g., entry.confidence, entry.warnings) causing legacy entries to not coalesce; update the getSignature logic inside buildCompactHistoryPreview to use the normalized diagnostics instead (call toSessionContinuityDiagnostics(entry) or apply the same normalization helpers) and replace raw entry.confidence/entry.warnings (and other diagnostic-derived fields like fallbackReason, codexExitCode if produced by normalization) with the normalized values so the signature matches what preview rows render; keep existing keys (rolloutPath, sourceSessionId, scope, trigger via normalizeSessionContinuityAuditTrigger, writeMode via normalizeSessionContinuityWriteMode, preferredPath, actualPath, evidenceCounts, writtenPaths) but source diagnostic values from the normalized diagnostics object.

src/lib/domain/sync-service.ts (1)
Lines 123-141: ⚠️ Potential issue | 🟡 Minor: Surface fully suppressed syncs explicitly.
If contradiction review suppresses every candidate, `applied` stays empty and this path collapses into the same no-op result as "nothing was extracted". Callers then get the generic no-op message even though `suppressedOperationCount > 0` and conflict metadata exists. Consider a dedicated all-suppressed message so the new reviewer flow is visible without opening the audit log.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/domain/sync-service.ts` around lines 123 - 141, Currently code sets status = "no-op" whenever applied.length === 0, which hides cases where reviewExtractedMemoryOperations suppressed all candidates; change the status calculation after calling reviewExtractedMemoryOperations/applyOperations so that if applied.length === 0 AND reviewedOperations.suppressedOperationCount > 0 (or there are non-empty reviewedOperations.conflicts if you want to surface conflict-only cases) you set status to "all-suppressed" instead of "no-op" before calling buildMemorySyncAuditEntry, ensuring callers see a distinct all-suppressed audit status.
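The distinction could be as small as the status derivation. A sketch, assuming the real `SyncService` only has a "no-op" status today; the "applied" and "all-suppressed" values are illustrative:

```typescript
type SyncStatus = "applied" | "all-suppressed" | "no-op";

function resolveSyncStatus(appliedCount: number, suppressedOperationCount: number): SyncStatus {
  if (appliedCount > 0) return "applied";
  // Make "contradiction review suppressed everything" visible instead of
  // collapsing it into the generic "nothing was extracted" no-op.
  return suppressedOperationCount > 0 ? "all-suppressed" : "no-op";
}
```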
🧹 Nitpick comments (6)
src/lib/commands/wrapper.ts (1)
Lines 85-87: Consider centralizing continuity source-path selection logic. This exists+map pattern is likely to recur across command surfaces; extracting a shared helper would reduce future drift between wrapper/session code paths.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/commands/wrapper.ts` around lines 85 - 87, Extract the exists+map logic into a shared helper (e.g., a function like selectExistingPaths or pickExistingPaths) and replace the inline array handling in wrapper.ts where [projectLocation, localLocation].filter(...).map(...) is used; the helper should accept an array of location-like objects (with exists and path properties) and return an array of path strings for those with exists=true so both wrapper and session command code paths can call the same utility and avoid divergence.

package.json (1)
Line 23: De-duplicate the smoke suites in `ci`. `test/docs-contract.test.ts`, `test/memory-command.test.ts`, and `test/session-command.test.ts` now run in the targeted suites and then run again under `pnpm test`. That makes `pnpm ci` slower and increases flake surface without strengthening the gate. Consider keeping the smoke suites disjoint, or trimming the redundant targeted runs from the top-level chain.

Also applies to: lines 29-31
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@package.json` at line 23, The CI script "ci" currently runs targeted smoke suites ("test:docs-contract", "test:reviewer-smoke", "test:cli-smoke") and then runs the full "test" script again causing duplication; remove the redundant targeted runs from the "ci" pipeline or modify the "test" script to exclude those specific tests so smoke suites run only once. Locate the "ci" npm script entry in package.json and either delete the duplicate targeted scripts ("test:docs-contract", "test:reviewer-smoke", "test:cli-smoke") from the chain or update "test" to accept an env/flag that skips those suites, ensuring the smoke suite scripts remain disjoint from the top-level "test" invocation.

src/lib/commands/memory.ts (1)
Lines 63-65: Normalize `conflicts` before folding it into the compaction key. The recent-audit preview now treats conflict array order as part of the grouping signature. If the same conflict set is emitted in a different order on retry/recovery, equivalent sync events will stop coalescing and reviewers will see duplicate groups. Sorting by stable fields before `JSON.stringify` would keep this preview deterministic.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/commands/memory.ts` around lines 63 - 65, The compaction key currently embeds entry.conflicts as-is which makes grouping order-sensitive; before folding conflicts into the key (where suppressedOperationCount, scopesTouched, conflicts are combined), normalize the conflicts array by sorting it deterministically (e.g., by a stable field such as conflict.id or conflict.path, then secondary fields as needed) and/or by deep-sorting objects' keys, then use that sorted structure when JSON.stringify-ing the key; update the place that builds the compaction key (referencing the conflicts property used alongside suppressedOperationCount and scopesTouched) to sort/normalize conflicts first so equivalent conflict sets always produce the same grouping signature.

test/docs-contract.test.ts (1)
Lines 19-23: Prefer stable contract anchors over exact prose fragments. A few of these checks lock CI to editorial wording (`"reviewer warning prose"`, `"deterministic scrub"`, mixed-language sentence fragments) instead of stable contract markers like command names, field names, or section headings. That will create noisy failures on harmless copy edits. I'd narrow these assertions to the normative surface and leave descriptive wording flexible.

Also applies to: lines 51-54
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/docs-contract.test.ts` around lines 19 - 23, The assertions in test/docs-contract.test.ts currently assert exact prose fragments (the expect(...) calls checking "reviewer warning prose", "cam memory", "cam session status", "confidence", "deterministic scrub"), which is brittle; change those asserts to check for stable contract anchors instead (e.g., section headings, field or command names, or backticked identifiers) using case-insensitive substring or regex matches so copyedits don't break CI—update the expect calls that read readme/readmeEn (also the similar checks around 51-54) to look for headings like a "Reviewer" or "Review" section header, explicit field names such as "confidence" as a standalone word boundary match (e.g. /\bconfidence\b/i) or backticked API/command names, rather than exact editorial phrases.

src/lib/extractor/contradiction-review.ts (1)
Lines 273-289: Consider caching directive choices for existing entries. The `extractDirectiveChoices` call inside the filter loop reconstructs a synthetic operation for each existing entry on each review iteration. For large existing entry sets, this could be optimized.

♻️ Potential optimization:
```diff
+const existingChoicesCache = new Map<string, DirectiveChoice[]>();
+for (const entry of existingEntries) {
+  if (!reviewableTopics.has(entry.topic)) continue;
+  const key = `${entry.scope}::${entry.topic}::${entry.id}`;
+  existingChoicesCache.set(key, extractDirectiveChoices({
+    action: "upsert",
+    scope: entry.scope,
+    topic: entry.topic,
+    id: entry.id,
+    summary: entry.summary,
+    details: entry.details,
+    sources: entry.sources,
+    reason: entry.reason
+  }));
+}

 // Then in the loop:
 const conflictingExisting = existingEntries
   .filter((entry) => {
+    const key = `${entry.scope}::${entry.topic}::${entry.id}`;
+    const existingChoices = existingChoicesCache.get(key) ?? [];
     return (
       entry.scope === review.operation.scope &&
       entry.topic === review.operation.topic &&
-      choicesConflict(review.choices, extractDirectiveChoices({...}))
+      choicesConflict(review.choices, existingChoices)
     );
   })
```

This is a minor optimization since existing entries are typically small in practice.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/extractor/contradiction-review.ts` around lines 273 - 289, The filter is calling extractDirectiveChoices for every existing entry on each review which is wasteful; precompute and cache each existing entry's directive choices once (e.g., build a Map keyed by entry.id or by the entry object) by invoking extractDirectiveChoices for the synthetic upsert operation for each entry, then replace the inline extractDirectiveChoices call inside the conflictingExisting computation with a lookup to the cache so choicesConflict(review.choices, cachedChoices) is used instead.

src/lib/domain/recovery-records.ts (1)
Lines 34-56: Consider extracting shared type guards to reduce duplication. The type guards `isConflictSource`, `isConflictResolution`, and `isMemoryConflictCandidate` are duplicated in both `recovery-records.ts` and `memory-sync-audit.ts`. Consider extracting these to a shared validation utilities module.

♻️ Suggested approach:
Create a shared module like `src/lib/util/type-guards.ts`:

```ts
// src/lib/util/type-guards.ts
export function isMemoryScope(value: unknown): value is MemoryScope { ... }
export function isConflictSource(value: unknown): value is MemoryConflictCandidate["source"] { ... }
export function isConflictResolution(value: unknown): value is MemoryConflictCandidate["resolution"] { ... }
export function isMemoryConflictCandidate(value: unknown): value is MemoryConflictCandidate { ... }
export function isStringArray(value: unknown): value is string[] { ... }
```

Then import from both `recovery-records.ts` and `memory-sync-audit.ts`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/domain/recovery-records.ts` around lines 34 - 56, Extract the duplicated guards into a single shared module (e.g., type-guards) and export isConflictSource, isConflictResolution, isMemoryConflictCandidate, plus the helper guards used inside them (isMemoryScope and isStringArray); then remove the duplicated implementations from recovery-records.ts and memory-sync-audit.ts and import the exported functions from the new module, ensuring all callers still use the same function names (isConflictSource, isConflictResolution, isMemoryConflictCandidate) so type narrowing behavior is preserved.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/release-checklist.md`:
- Line 28: The release checklist currently contradicts itself by treating the
`pnpm exec tsx src/cli.ts audit` (the "audit" step) as both optional and
required; pick one policy (either make the audit step mandatory or clearly
optional) and make the wording consistent in both places that mention the audit
and the phrase "record its findings" so they match; update the two checklist
entries that reference the `audit` command and the instruction to record
findings to use identical language reflecting the chosen policy (e.g., "Run and
record the results of `pnpm exec tsx src/cli.ts audit`" if required, or
"Optionally run `pnpm exec tsx src/cli.ts audit`; record findings if run" if
optional).
In `@src/lib/extractor/session-continuity-evidence.ts`:
- Around line 276-319: collectWarningHints applies authoritative "clear and
replace" semantics via applySignal but processes all agentMessages then all
userMessages, which breaks interleaved ordering; change collectWarningHints to
operate on a single ordered transcript instead of separate role buckets by
merging agentMessages and userMessages into one sequence preserving original
chronological order (e.g., by attaching an index/timestamp to each message or by
accepting a pre-interleaved array) and then iterate that combined array, calling
isPromptLikeContinuityMessage and extractDirectiveChoice for each entry and
applying applySignal in that real order so authoritative clears respect actual
message order.
---
Outside diff comments:
In `@src/lib/commands/session.ts`:
- Around line 100-124: The grouping signature passed to
buildCompactHistoryPreview uses raw fields (e.g., entry.confidence,
entry.warnings) causing legacy entries to not coalesce; update the getSignature
logic inside buildCompactHistoryPreview to use the normalized diagnostics
instead (call toSessionContinuityDiagnostics(entry) or apply the same
normalization helpers) and replace raw entry.confidence/entry.warnings (and
other diagnostic-derived fields like fallbackReason, codexExitCode if produced
by normalization) with the normalized values so the signature matches what
preview rows render; keep existing keys (rolloutPath, sourceSessionId, scope,
trigger via normalizeSessionContinuityAuditTrigger, writeMode via
normalizeSessionContinuityWriteMode, preferredPath, actualPath, evidenceCounts,
writtenPaths) but source diagnostic values from the normalized diagnostics
object.
In `@src/lib/domain/sync-service.ts`:
- Around line 123-141: Currently code sets status = "no-op" whenever
applied.length === 0, which hides cases where reviewExtractedMemoryOperations
suppressed all candidates; change the status calculation after calling
reviewExtractedMemoryOperations/applyOperations so that if applied.length === 0
AND reviewedOperations.suppressedOperationCount > 0 (or there are non-empty
reviewedOperations.conflicts if you want to surface conflict-only cases) you set
status to "all-suppressed" instead of "no-op" before calling
buildMemorySyncAuditEntry, ensuring callers see a distinct all-suppressed audit
status.
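The status decision described above can be sketched as a small pure function. Names like `"all-suppressed"` follow the review's proposal; the real `sync-service.ts` logic may carry more states than shown here.

```typescript
type SyncStatus = "applied" | "no-op" | "all-suppressed";

// Distinguish "nothing to do" from "review suppressed every candidate",
// so audit consumers can tell the two apart.
function resolveSyncStatus(appliedCount: number, suppressedCount: number): SyncStatus {
  if (appliedCount > 0) return "applied";
  return suppressedCount > 0 ? "all-suppressed" : "no-op";
}
```

The key point is that the suppressed count is consulted before falling back to `"no-op"`.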
---
Nitpick comments:
In `@package.json`:
- Line 23: The CI script "ci" currently runs targeted smoke suites
("test:docs-contract", "test:reviewer-smoke", "test:cli-smoke") and then runs
the full "test" script again causing duplication; remove the redundant targeted
runs from the "ci" pipeline or modify the "test" script to exclude those
specific tests so smoke suites run only once. Locate the "ci" npm script entry
in package.json and either delete the duplicate targeted scripts
("test:docs-contract", "test:reviewer-smoke", "test:cli-smoke") from the chain
or update "test" to accept an env/flag that skips those suites, ensuring the
smoke suite scripts remain disjoint from the top-level "test" invocation.
In `@src/lib/commands/memory.ts`:
- Around line 63-65: The compaction key currently embeds entry.conflicts as-is
which makes grouping order-sensitive; before folding conflicts into the key
(where suppressedOperationCount, scopesTouched, conflicts are combined),
normalize the conflicts array by sorting it deterministically (e.g., by a stable
field such as conflict.id or conflict.path, then secondary fields as needed)
and/or by deep-sorting objects' keys, then use that sorted structure when
JSON.stringify-ing the key; update the place that builds the compaction key
(referencing the conflicts property used alongside suppressedOperationCount and
scopesTouched) to sort/normalize conflicts first so equivalent conflict sets
always produce the same grouping signature.
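A deterministic key along the lines suggested above could look like this. The field names (`suppressedOperationCount`, `scopesTouched`, `conflicts`) mirror the review text, and the conflict shape is a stand-in for the real type.

```typescript
// Stringify with object keys deep-sorted so structurally equal values
// always serialize identically.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const keys = Object.keys(record).sort();
    return `{${keys.map((k) => `${JSON.stringify(k)}:${stableStringify(record[k])}`).join(",")}}`;
  }
  return JSON.stringify(value);
}

interface ConflictLike { key: string; value: string; }

// Sort conflicts deterministically before folding them into the key so that
// equivalent conflict sets yield the same grouping signature.
function buildCompactionKey(entry: {
  suppressedOperationCount: number;
  scopesTouched: string[];
  conflicts: ConflictLike[];
}): string {
  const sortedConflicts = [...entry.conflicts].sort((a, b) =>
    a.key === b.key ? a.value.localeCompare(b.value) : a.key.localeCompare(b.key)
  );
  return stableStringify({
    suppressedOperationCount: entry.suppressedOperationCount,
    scopesTouched: entry.scopesTouched,
    conflicts: sortedConflicts
  });
}
```

Two entries whose conflict arrays differ only in order then compact into the same group.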
In `@src/lib/commands/wrapper.ts`:
- Around line 85-87: Extract the exists+map logic into a shared helper (e.g., a
function like selectExistingPaths or pickExistingPaths) and replace the inline
array handling in wrapper.ts where [projectLocation,
localLocation].filter(...).map(...) is used; the helper should accept an array
of location-like objects (with exists and path properties) and return an array
of path strings for those with exists=true so both wrapper and session command
code paths can call the same utility and avoid divergence.
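The proposed helper is small enough to sketch in full; the `exists`/`path` shape below follows the review's description of the location objects, not a verified type from the codebase.

```typescript
// Minimal shared helper: keep only the locations that exist on disk and
// return their paths, so wrapper and session code paths cannot diverge.
interface LocationLike { exists: boolean; path: string; }

function selectExistingPaths(locations: LocationLike[]): string[] {
  return locations.filter((location) => location.exists).map((location) => location.path);
}
```

The wrapper call site would then become `selectExistingPaths([projectLocation, localLocation])`.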
In `@src/lib/domain/recovery-records.ts`:
- Around line 34-56: Extract the duplicated guards into a single shared module
(e.g., type-guards) and export isConflictSource, isConflictResolution,
isMemoryConflictCandidate, plus the helper guards used inside them
(isMemoryScope and isStringArray); then remove the duplicated implementations
from recovery-records.ts and memory-sync-audit.ts and import the exported
functions from the new module, ensuring all callers still use the same function
names (isConflictSource, isConflictResolution, isMemoryConflictCandidate) so
type narrowing behavior is preserved.
In `@src/lib/extractor/contradiction-review.ts`:
- Around line 273-289: The filter is calling extractDirectiveChoices for every
existing entry on each review which is wasteful; precompute and cache each
existing entry's directive choices once (e.g., build a Map keyed by entry.id or
by the entry object) by invoking extractDirectiveChoices for the synthetic
upsert operation for each entry, then replace the inline extractDirectiveChoices
call inside the conflictingExisting computation with a lookup to the cache so
choicesConflict(review.choices, cachedChoices) is used instead.
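The caching move described above could look roughly like this. The stand-in extractor, entry shape, and `choicesConflict` body are assumptions made for illustration; the real signatures in `contradiction-review.ts` may differ.

```typescript
type Entry = { id: string; text: string };
type Choice = { key: string; value: string };

// Stand-in for the real extractor; assumed pure per entry, which is what
// makes caching safe.
function extractDirectiveChoices(entry: Entry): Choice[] {
  return entry.text.includes("pnpm") ? [{ key: "package-manager", value: "pnpm" }] : [];
}

// Two choice sets conflict when they disagree on the same directive key.
function choicesConflict(a: Choice[], b: Choice[]): boolean {
  return a.some((x) => b.some((y) => x.key === y.key && x.value !== y.value));
}

// Precompute once per review pass instead of per (review, entry) pair.
function buildChoiceCache(existing: Entry[]): Map<string, Choice[]> {
  const cache = new Map<string, Choice[]>();
  for (const entry of existing) {
    cache.set(entry.id, extractDirectiveChoices(entry));
  }
  return cache;
}

// The inline extractDirectiveChoices call becomes a cache lookup.
function findConflicting(reviewChoices: Choice[], existing: Entry[], cache: Map<string, Choice[]>): Entry[] {
  return existing.filter((entry) => choicesConflict(reviewChoices, cache.get(entry.id) ?? []));
}
```

This turns the per-review cost from O(reviews × entries) extractions into O(entries).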
In `@test/docs-contract.test.ts`:
- Around line 19-23: The assertions in test/docs-contract.test.ts currently
assert exact prose fragments (the expect(...) calls checking "reviewer warning
prose", "cam memory", "cam session status", "confidence", "deterministic
scrub"), which is brittle; change those asserts to check for stable contract
anchors instead (e.g., section headings, field or command names, or backticked
identifiers) using case-insensitive substring or regex matches so copyedits
don't break CI—update the expect calls that read readme/readmeEn (also the
similar checks around 51-54) to look for headings like a "Reviewer" or "Review"
section header, explicit field names such as "confidence" as a standalone word
boundary match (e.g. /\bconfidence\b/i) or backticked API/command names, rather
than exact editorial phrases.
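The anchor-based assertion style suggested above can be sketched as follows; the `readme` string is a made-up stand-in for the real README content, not a quote from it.

```typescript
// Stable contract anchors: word-boundary regexes and backticked identifiers
// survive copyedits that exact prose fragments do not.
const readme = "## Reviewer workflow\nRun `cam session status` and check the Confidence field.";

function hasAnchor(text: string, pattern: RegExp): boolean {
  return pattern.test(text);
}
```

A test would then assert `hasAnchor(readme, /\bconfidence\b/i)` and `hasAnchor(readme, /`cam session status`/)` rather than matching editorial sentences verbatim.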
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7ed7756e-998f-41c7-a684-e81ca6e955cc
📒 Files selected for processing (32)
.github/workflows/ci.yml, CONTRIBUTING.md, README.en.md, README.md, docs/architecture.en.md, docs/architecture.md, docs/release-checklist.md, docs/session-continuity.md, package.json, src/lib/commands/memory.ts, src/lib/commands/session.ts, src/lib/commands/wrapper.ts, src/lib/domain/memory-sync-audit.ts, src/lib/domain/recovery-records.ts, src/lib/domain/session-continuity-diagnostics.ts, src/lib/domain/sync-service.ts, src/lib/extractor/contradiction-review.ts, src/lib/extractor/session-continuity-evidence.ts, src/lib/extractor/session-continuity-prompt.ts, src/lib/extractor/session-continuity-summarizer.ts, src/lib/types.ts, test/docs-contract.test.ts, test/extractor.test.ts, test/fixtures/rollouts/hedged-preference-conflict.jsonl, test/fixtures/rollouts/mixed-language-reviewer-noise.jsonl, test/fixtures/rollouts/within-rollout-preference-conflict.jsonl, test/memory-command.test.ts, test/memory-sync-audit.test.ts, test/recovery-records.test.ts, test/session-command.test.ts, test/session-continuity.test.ts, test/sync-service.test.ts
- Run `cam session load --json` and confirm older JSON consumers still receive the existing core fields.
- Run `cam session status --json` and confirm the latest explicit audit drill-down matches the newest audit-log entry when present.
- Run `pnpm pack:check`
- Run `pnpm exec tsx src/cli.ts audit` if you want the repository privacy scan; keep it as a manual release-time check instead of a CI gate.
Make the audit step either required or optional, not both.
Line 28 frames `pnpm exec tsx src/cli.ts audit` as an optional release-time check, while Line 55 requires recording its findings. That contradiction makes the checklist easy to apply inconsistently. Pick one policy and use the same wording in both sections.
Also applies to: 55-55
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/release-checklist.md` at line 28, The release checklist currently
contradicts itself by treating the `pnpm exec tsx src/cli.ts audit` (the "audit"
step) as both optional and required; pick one policy (either make the audit step
mandatory or clearly optional) and make the wording consistent in both places
that mention the audit and the phrase "record its findings" so they match;
update the two checklist entries that reference the `audit` command and the
instruction to record findings to use identical language reflecting the chosen
policy (e.g., "Run and record the results of `pnpm exec tsx src/cli.ts audit`"
if required, or "Optionally run `pnpm exec tsx src/cli.ts audit`; record
findings if run" if optional).
```typescript
function collectWarningHints(agentMessages: string[], userMessages: string[]): string[] {
  const warnings = new Set<string>();
  const directiveValues = new Map<string, Set<string>>();
  let promptNoiseDetected = false;

  const applySignal = (signal: DirectiveSignal) => {
    const values = directiveValues.get(signal.key) ?? new Set<string>();
    if (signal.authoritative) {
      values.clear();
    }
    values.add(signal.value);
    directiveValues.set(signal.key, values);
  };

  for (const message of agentMessages) {
    if (isPromptLikeContinuityMessage(message)) {
      promptNoiseDetected = true;
      continue;
    }
    const choices = [
      extractDirectiveChoice(message, packageManagerValues, "package-manager"),
      extractDirectiveChoice(message, repoSearchValues, "repo-search")
    ].filter((choice): choice is DirectiveSignal => Boolean(choice));

    for (const choice of choices) {
      applySignal(choice);
    }
  }

  for (const message of userMessages) {
    if (isPromptLikeContinuityMessage(message)) {
      promptNoiseDetected = true;
      continue;
    }

    const choices = [
      extractDirectiveChoice(message, packageManagerValues, "package-manager"),
      extractDirectiveChoice(message, repoSearchValues, "repo-search")
    ].filter((choice): choice is DirectiveSignal => Boolean(choice));

    for (const choice of choices) {
      applySignal(choice);
    }
  }
```
Preserve transcript order before applying authoritative directive clears.
applySignal() implements a “latest authoritative choice wins” rule, but this function only sees role-separated arrays and then processes all agentMessages before all userMessages. src/lib/domain/rollout.ts::parseRolloutEvidence() does not preserve a single interleaved message stream, so warning/confidence outcomes here can flip based on speaker bucket rather than actual rollout order. This needs an ordered transcript input before using authoritative replacement semantics.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/extractor/session-continuity-evidence.ts` around lines 276 - 319,
collectWarningHints applies authoritative "clear and replace" semantics via
applySignal but processes all agentMessages then all userMessages, which breaks
interleaved ordering; change collectWarningHints to operate on a single ordered
transcript instead of separate role buckets by merging agentMessages and
userMessages into one sequence preserving original chronological order (e.g., by
attaching an index/timestamp to each message or by accepting a pre-interleaved
array) and then iterate that combined array, calling
isPromptLikeContinuityMessage and extractDirectiveChoice for each entry and
applying applySignal in that real order so authoritative clears respect actual
message order.
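The interleaving fix suggested above could be sketched like this. The message shape is hypothetical: the real evidence may carry timestamps rather than the indices assumed here.

```typescript
type RoleMessage = { role: "agent" | "user"; index: number; text: string };

// Merge the two role buckets back into one chronological transcript so that
// "latest authoritative choice wins" respects real rollout order rather than
// speaker-bucket order.
function interleaveTranscript(
  agentMessages: RoleMessage[],
  userMessages: RoleMessage[]
): RoleMessage[] {
  return [...agentMessages, ...userMessages].sort((a, b) => a.index - b.index);
}
```

collectWarningHints would then iterate this single ordered array, applying `applySignal` per message, so an authoritative clear cannot be undone by a stale message that merely lived in the other bucket.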
Summary
Describe the behavior change in user-facing terms.
What changed
Claude parity
Explain whether this change improves Claude Code auto memory parity, preserves it, or intentionally diverges.
Validation
- `pnpm lint`
- `pnpm test`
- `pnpm build`

Docs

- `docs/claude-reference.md` updated if parity behavior changed
- `docs/architecture.md` updated if internals changed
- `docs/native-migration.md` updated if migration assumptions changed

Risks
List any risks, edge cases, or follow-up work.
Summary by Sourcery
Tighten continuity and memory reviewer hygiene by adding explicit confidence/warning surfaces, contradiction review for preference conflicts, and richer CLI/JSON diagnostics, while ensuring docs and CI enforce the updated reviewer-facing contract.
New Features:
Enhancements:
Build:
- `pnpm ci` script and `pack:check` script to run lint, contract tests, smoke tests, unit tests, build, and dry-run packaging in a single command.

CI:
Documentation:
Tests:
Summary by cubic
Add contradiction review and reviewer confidence to keep session continuity and durable memory clean. Conflicting or hedged preferences are suppressed and surfaced in CLI output and audit instead of silently merging.
New Features
- `conflicts` and `suppressedOperationCount` in audit/recovery.
- `cam session` shows confidence/warnings in text and JSON; prompt guards against copying reviewer hints into continuity.
- `cam memory` shows suppressed counts and a compact conflict review in pending recovery; wrapper only includes existing continuity file paths.

Refactors

- `test:docs-contract`, `test:reviewer-smoke`, `test:cli-smoke`, and `pack:check`; wire into workflow and `pnpm ci`.
- `MemoryConflictCandidate` and continuity confidence; extend audit and recovery records; add fixtures and tests for conflicting preferences and reviewer noise.

Written for commit 0f112a4. Summary will update on new commits.
Summary by CodeRabbit
New Features
Documentation
Tests