feat: v2 dual-write for checkpoints behind feature flag #759
... for transcript writes to /full/current ref
... instead of adding it to the permanent `/main` ref.
| Transcript string `json:"transcript"` | ||
| ContentHash string `json:"content_hash"` | ||
| Transcript string `json:"transcript,omitempty"` | ||
| ContentHash string `json:"content_hash,omitempty"` |
These attributes might not be set going forward. Omitting them in such a case made sense to me.
Why would Transcript be missing but Prompt isn't? Should Prompt be omitempty too? (Also, is this for the sub-agent case? Or is there some other case you had in mind?)
The original thought was that by moving the full transcript to the /full/* refs we wouldn't need this anymore unless a checkpoint was pinned. Given that these attributes were only used for the full transcripts we should be okay omitting them unless checkpoints are pinned down the road.
Your comment brought up a few more questions, though:
- Do we want this attribute to point to the compacted `transcript.jsonl` file instead?
- Do we want to introduce new attributes for compacted transcripts and their hashes?
- Do we want a way to refer to objects in the `/full/*` namespace to somehow make the lookup simpler?
Let's take a look at those a bit more closely tomorrow.
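To make the `omitempty` question above concrete, here is a minimal sketch of how the two tag styles serialize. The struct is a hypothetical stand-in for `SessionFilePaths`: only the `transcript` and `content_hash` tags come from the diff, while the `prompt` tag and the example paths are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sessionFilePaths is a hypothetical stand-in for the PR's SessionFilePaths.
// Only the transcript and content_hash tags are taken from the diff.
type sessionFilePaths struct {
	Prompt      string `json:"prompt"` // assumed tag, not shown in the diff
	Transcript  string `json:"transcript,omitempty"`
	ContentHash string `json:"content_hash,omitempty"`
}

// encode marshals p and returns the JSON document as a string.
func encode(p sessionFilePaths) (string, error) {
	b, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// No raw transcript recorded: both omitempty fields are dropped.
	mainOnly, err := encode(sessionFilePaths{Prompt: "0/prompt.txt"})
	if err != nil {
		panic(err)
	}
	fmt.Println(mainOnly) // {"prompt":"0/prompt.txt"}

	// All fields empty: prompt is still emitted because it lacks omitempty.
	empty, err := encode(sessionFilePaths{})
	if err != nil {
		panic(err)
	}
	fmt.Println(empty) // {"prompt":""}
}
```

The second case is the asymmetry the reviewer asks about: a field without `omitempty` serializes as an empty string, whereas the tagged fields disappear from the document entirely.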
Pull request overview
Adds a feature-flagged “checkpoints v2” dual-write path so the manual-commit strategy can persist checkpoint data to both the existing v1 metadata branch and the new v2 custom refs, enabling a gradual migration without breaking current read paths.
Changes:
- Wire dual-write into `CondenseSession` and stop-time finalization (`UpdateCommitted`) behind `settings.IsCheckpointsV2Enabled()`, with v2 failures treated as best-effort.
- Introduce `V2GitStore` to write v2 checkpoint data to custom refs (`/main` for metadata/prompts, `/full/current` for raw transcript + content hash).
- Add unit + strategy-level + integration tests, and extend integration `TestEnv` helpers to read from arbitrary refs.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| cmd/entire/cli/strategy/manual_commit_condensation.go | Builds shared v1/v2 write opts and adds best-effort v2 dual-write during condensation. |
| cmd/entire/cli/strategy/manual_commit_hooks.go | Adds best-effort v2 dual-write during stop-time checkpoint finalization. |
| cmd/entire/cli/strategy/manual_commit_test.go | Adds strategy-level tests for v2 dual-write enabled/disabled behavior. |
| cmd/entire/cli/paths/paths.go | Defines v2 custom ref names under refs/entire/.... |
| cmd/entire/cli/integration_test/v2_dual_write_test.go | Adds integration coverage for full workflow, disabled path, and stop-time finalization. |
| cmd/entire/cli/integration_test/testenv.go | Adds helpers for reading files from full refs and for checking ref existence. |
| cmd/entire/cli/checkpoint/v2_store.go | Adds low-level v2 ref creation/state/update primitives. |
| cmd/entire/cli/checkpoint/v2_committed.go | Implements v2 committed write/update logic for /main and /full/current. |
| cmd/entire/cli/checkpoint/v2_store_test.go | Adds unit tests for v2 ref management and committed write/update behavior. |
| cmd/entire/cli/checkpoint/committed.go | Reuses shared write-option validation via validateWriteOpts. |
| cmd/entire/cli/checkpoint/checkpoint.go | Makes SessionFilePaths JSON fields omitempty to support v2 layouts with missing fields. |
| // writeMainSessionToSubdirectory writes a single session's metadata, prompts, and | ||
| // content hash to a session subdirectory (0/, 1/, 2/, … indexed by session order | ||
| // within the checkpoint). Unlike the v1 equivalent, this does NOT write the raw | ||
| // transcript (full.jsonl) — that goes to /full/current. |
Docstring states that writeMainSessionToSubdirectory writes a content hash to /main, but the implementation doesn't write content_hash.txt (it only writes prompts + metadata). Please adjust the docstring to avoid implying content_hash.txt exists on /main.
| // Dual-write: update v2 refs when enabled | ||
| updateCommittedV2IfEnabled(logCtx, repo, updateOpts) | ||
updateCommittedV2IfEnabled() calls settings.IsCheckpointsV2Enabled(), which loads and parses settings from disk each time. Since this is invoked inside the per-checkpoint loop, stop-time finalization can end up reloading settings N times. Consider evaluating the feature flag once before the loop (or passing a precomputed bool into the helper) to avoid unnecessary I/O during stop-time finalization.
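One way to apply this suggestion is to read the flag once and pass the result into the loop. The sketch below is illustrative only: `isCheckpointsV2Enabled` is a hypothetical stand-in for `settings.IsCheckpointsV2Enabled()` that counts simulated disk reads, and `finalizeAll` is an invented name for the stop-time loop.

```go
package main

import "fmt"

// settingsLoads counts simulated settings reads from disk.
var settingsLoads int

// isCheckpointsV2Enabled is a hypothetical stand-in for
// settings.IsCheckpointsV2Enabled(); each call simulates one disk read.
func isCheckpointsV2Enabled() bool {
	settingsLoads++
	return true
}

// finalizeAll evaluates the feature flag once, then reuses the result
// for every checkpoint instead of reloading settings per iteration.
func finalizeAll(checkpointIDs []string, updateV2 func(id string)) {
	v2Enabled := isCheckpointsV2Enabled()
	for _, id := range checkpointIDs {
		// The v1 update for this checkpoint would happen here.
		if v2Enabled {
			updateV2(id)
		}
	}
}

func main() {
	var updated []string
	finalizeAll([]string{"a", "b", "c"}, func(id string) {
		updated = append(updated, id)
	})
	fmt.Println(len(updated), settingsLoads) // 3 1
}
```

With the flag check inside the helper instead, the same run would read settings three times, once per checkpoint.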
| // Determine session index | ||
| sessionIndex := s.gs.findSessionIndex(ctx, basePath, existingSummary, entries, opts.SessionID) | ||
| // Write session files (metadata, prompts, content hash — no transcript) |
The comment says /main session writes include a content hash, but this function only writes metadata.json and (optionally) prompt.txt; no content_hash.txt is written on /main. Please update the comment to match the actual behavior (or add the content hash write if the intent is to store it on /main).
| // Write session files (metadata, prompts, content hash — no transcript) | |
| // Write session files (metadata and prompts — no transcript or content hash) |
Summary
Introduces checkpoints v2 dual-write support. When `checkpoints_v2` is enabled in strategy options, the CLI writes checkpoint data to both v1 (`entire/checkpoints/v1` branch) and v2 custom refs (`refs/entire/checkpoints/v2/main` and `refs/entire/checkpoints/v2/full/current`).
- `V2GitStore`: new store type for v2 ref operations, separate from `GitStore` (v1) to simplify future v1 removal
- `/main` ref: receives metadata, prompts, session metadata (no raw transcript). Compact `transcript.jsonl` will be added here once compaction (workstream B) is ready
- `/full/current` ref: receives raw transcript (`full.jsonl`) + `content_hash.txt`. Each write replaces the entire tree (no accumulation); generation rotation is future work
- `CondenseSession` and stop-time finalization both dual-write, gated by `settings.IsCheckpointsV2Enabled()`. V2 failures are best-effort (logged as warnings, never block v1)
- `validateWriteOpts` used by both v1 and v2 stores
What's NOT in scope (future steps per design spec)
- ... `/main`
- ... (`entire status`, `entire explain`, `entire resume`)
- ... `/full/current`
Test plan
- ... `V2GitStore` (ref management, write/update for both refs, multi-session, edge cases)
- ... `CondenseSession`)
- `mise run fmt && mise run lint`: 0 issues
- `mise run test:ci`: all unit, integration, and E2E canary tests pass
🤖 Generated with Claude Code
Note
Medium Risk
Adds a new v2 checkpoint storage path using custom git refs and wires it into commit/stop-time flows; mistakes could lead to missing or inconsistent checkpoint data even though v1 remains the source of truth. Risk is mitigated by feature-flag gating and best-effort v2 writes that do not block v1.
Overview
Adds feature-flagged v2 dual-write for checkpoints: when `checkpoints_v2` is enabled, the manual-commit strategy continues writing v1 checkpoint data while also writing to new custom refs `refs/entire/checkpoints/v2/main` and `refs/entire/checkpoints/v2/full/current`.
Introduces `V2GitStore` with ref management and write/update flows that split data: `/main` stores checkpoint/session metadata + prompts (no transcript), while `/full/current` stores the redacted, chunked `full.jsonl` transcript plus `content_hash.txt` and replaces prior contents on each write. Validation for write options is centralized via `validateWriteOpts`, and tests are expanded with new unit tests, integration tests, and test-env helpers to read custom refs.
Written by Cursor Bugbot for commit 66841b2.
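The split between the two refs can be pictured with a toy in-memory model. This is a sketch under stated assumptions, not how `V2GitStore` is implemented: the `store` and `tree` types are hypothetical, and the `<checkpoint>/<session>/` path layout is assumed. The ref names and file names are the ones quoted in this PR.

```go
package main

import "fmt"

const (
	refMain        = "refs/entire/checkpoints/v2/main"
	refFullCurrent = "refs/entire/checkpoints/v2/full/current"
)

// tree maps file paths to contents; a toy stand-in for a git tree.
type tree map[string]string

// store maps a ref name to the tree it currently points at.
type store map[string]tree

// writeCheckpoint adds metadata and prompt files to /main, which keeps
// earlier checkpoints, and replaces the whole /full/current tree with
// this checkpoint's transcript and content hash.
func (s store) writeCheckpoint(id, metadata, prompt, transcript, hash string) {
	if s[refMain] == nil {
		s[refMain] = tree{}
	}
	s[refMain][id+"/0/metadata.json"] = metadata
	s[refMain][id+"/0/prompt.txt"] = prompt

	// Replacement, not accumulation: the previous tree is discarded.
	s[refFullCurrent] = tree{
		id + "/0/full.jsonl":       transcript,
		id + "/0/content_hash.txt": hash,
	}
}

func main() {
	s := store{}
	s.writeCheckpoint("cp1", "{}", "first prompt", "raw transcript 1", "hash1")
	s.writeCheckpoint("cp2", "{}", "second prompt", "raw transcript 2", "hash2")

	// /main holds both checkpoints; /full/current holds only the latest.
	fmt.Println(len(s[refMain]), len(s[refFullCurrent])) // 4 2
}
```

After the second write, the first checkpoint's transcript is no longer reachable from `/full/current`, which is why the description lists generation rotation as future work.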