Make the debug stack serve the real parity workflow with the minimum number of contracts:
- one owned capture wire contract: raw Frida JSONL
- one durable trace kind: typed replay `.cdt`
- one replay schema shared by Frida finalize, Python record, and Zig record
- diff/focus/health as the center of the design
- query/entity/tick as thin projections, not schema drivers
If we sequence the cuts correctly, we should not need a more complicated artifact taxonomy. The target architecture should fall out of the cutover.
Historical differential sessions show the actual loop is:
flowchart LR
A["Capture original run\nFrida JSONL"] --> B["Finalize to typed replay .cdt + .crd"]
B --> C["Record candidate replay .cdt\nPython today, Zig later"]
C --> D["Find first sustained divergence"]
D --> E["Focus one tick window deeply"]
E --> F["Use decompile / hotspot evidence"]
F --> G["Patch deterministic runtime"]
G --> C
The sessions do not show strong evidence for multiple durable `.cdt` dialects or
for broad trace-query abstractions driving the schema.
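
The "find first sustained divergence" step of the loop can be sketched roughly as follows. This is an illustration only, not the project's diff implementation: it assumes each trace has already been reduced to one comparable value per tick (for example a per-tick checksum), and the function name and the `min_run` threshold are hypothetical.

```python
from __future__ import annotations

from collections.abc import Sequence


def first_sustained_divergence(
    original: Sequence[int],
    candidate: Sequence[int],
    min_run: int = 3,
) -> int | None:
    """Return the tick that starts the first run of `min_run` consecutive mismatches."""
    run_start: int | None = None
    for tick, (a, b) in enumerate(zip(original, candidate)):
        if a == b:
            # A matching tick ends any mismatch run in progress.
            run_start = None
            continue
        if run_start is None:
            run_start = tick
        if tick - run_start + 1 >= min_run:
            return run_start
    return None
```

With `min_run=3`, an isolated one-tick mismatch is ignored, and the returned tick is a natural left edge for the focus window in the next step.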
- Do not preserve long-lived compatibility wrappers inside `src/crimson/dbg`.
- Do not add a `trace_kind` taxonomy unless we later prove we truly need more than one durable `.cdt` kind.
- Do not let `query`, `entity`, `tick`, or `viz` force weak typing back into the replay path.
- Do not optimize around stored bisect repro `.cdt` bundles before proving they materially improve parity work.
flowchart LR
J["Raw Frida JSONL"] --> F["frida_finalize.py\nstrict typed boundary"]
F --> O["Original replay .cdt"]
R["record.py (python)"] --> P["Python replay .cdt"]
Z["crimson-zig/src/cdt_trace.zig"] --> Q["Zig replay .cdt"]
O --> D["diff / health / verify"]
P --> D
Q --> D
D --> X["focus"]
X --> H["decompile / hotspot analysis"]
- Fix the owned capture contract first.
- Cut secondary artifact paths before adding more metadata structure.
- Make all producers emit one replay schema before polishing peripheral consumers.
- Delete old paths in the same wave as caller migration.
- Use real captures and real replay artifacts as acceptance gates.
flowchart LR
S0["Stage 0\nLock invariants"] --> S1["Stage 1\nDelete viz now"]
S1 --> S2["Stage 2\nMake capture contract strict"]
S2 --> S3["Stage 3\nDelete bisect .cdt dialect"]
S3 --> S4["Stage 4\nType one replay meta/schema"]
S4 --> S5["Stage 5\nAlign Zig with replay schema"]
S5 --> S6["Stage 6\nTrim remaining secondary consumers"]
S6 --> S7["Stage 7\nRemove duplicate authorities + stale docs"]
Create a safety rail around the real parity loop so later cuts are measured against actual artifacts, not just type-checking.
- Expand fixture-driven coverage around:
  - `tests/debug/test_dbg_frida_finalize.py`
  - `tests/debug/test_dbg_trace.py`
  - `tests/debug/test_dbg_record.py`
  - `tests/debug/test_dbg_cli.py`
- Add at least one fixture path for:
  - raw Frida JSONL -> finalized replay `.cdt`
  - `.crd` -> Python replay `.cdt`
  - trace diff/focus happy path
- Add a single doc note in `DBG_TRACE_ARCHITECTURE_MEMO.md` pointing to this plan as the execution document.
- No schema changes yet.
- No consumer cleanup yet.
- `uv run pytest tests/debug`
- `uv run ty check src/crimson/dbg src/crimson/cli/dbg.py tests/debug`
- We can change the internals of finalize/trace/diff without losing coverage on the real artifact flow.
test(dbg): lock replay trace cutover invariants
Remove the lowest-value surface immediately while the architecture is still
fresh in our heads. `viz` is a static HTML convenience layer, not part of the
core parity loop.
- no historical evidence that it drove parity fixes
- weaker diagnostic value than `diff` or `focus`
- easy to delete cleanly
- shrinks the public surface before we start deeper schema work
- delete:
  - `src/crimson/dbg/viz.py`
  - the `dbg viz` command in `src/crimson/cli/dbg.py`
  - `tests/debug/test_dbg_cli.py` coverage specific to `viz`
- remove `viz` mentions from docs where it is presented as a standard parity workflow step:
  - `docs/frida/differential-playbook.md`
  - `DBG_TRACE_ARCHITECTURE_MEMO.md`
  - `plan.md` stage descriptions that would otherwise imply it survives longer
- `uv run pytest tests/debug/test_dbg_cli.py`
- `uv run ty check src/crimson/dbg src/crimson/cli/dbg.py tests/debug`
- `rg -n "dbg viz|write_viz_html|viz.py" src tests docs`
- there is no `viz` command left
- no docs still imply `viz` is part of the standard parity workflow
refactor(dbg): remove trace viz
Raw Frida JSONL is not an untrusted boundary; it is an owned producer/consumer contract. The goal is to make that contract strict on both sides:
- `gameplay_diff_capture.js` emits one narrow, canonical wire shape
- `frida_finalize.py` decodes that shape directly and validates invariants
- neither side relies on permissive cleanup to recover from sloppy producer output
The only tolerated failure modes should be transport issues like abrupt shutdown or truncated tail rows, not schema looseness.
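
One way to encode "strict everywhere except the transport tail" is to hold each line back until the next one arrives, so that only the final line can ever be forgiven. The sketch below is a minimal illustration under assumptions: `iter_capture_rows` and `CaptureContractError` are hypothetical names, and the string `"event"` discriminator is an assumed row field, not the documented wire shape.

```python
from __future__ import annotations

import json
from collections.abc import Iterable, Iterator


class CaptureContractError(ValueError):
    """Raised when a row violates the capture contract (hypothetical name)."""


def _decode_strict(lineno: int, text: str) -> dict[str, object]:
    try:
        row = json.loads(text)
    except json.JSONDecodeError as exc:
        raise CaptureContractError(f"line {lineno}: invalid JSON") from exc
    # Assumption: every row is an object carrying a string "event" discriminator.
    if not isinstance(row, dict) or not isinstance(row.get("event"), str):
        raise CaptureContractError(f"line {lineno}: expected an object with a string 'event'")
    return row


def iter_capture_rows(lines: Iterable[str]) -> Iterator[dict[str, object]]:
    """Decode JSONL strictly, tolerating only an undecodable final line."""
    held: tuple[int, str] | None = None
    for lineno, raw in enumerate(lines, start=1):
        if held is not None:
            # A later line exists, so the held line is not the tail: be strict.
            yield _decode_strict(*held)
        held = (lineno, raw.rstrip("\r\n"))
    if held is None:
        return
    try:
        json.loads(held[1])
    except json.JSONDecodeError:
        return  # truncated tail row: a transport failure, dropped without error
    # The tail parsed as JSON, so it must still satisfy the row contract.
    yield _decode_strict(*held)
```

A malformed row in the middle of the file raises immediately, while a capture cut off mid-write simply loses its last partial row.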
- `scripts/frida/gameplay_diff_capture.js`
  - document the exact emitted row shapes for `session_start`, `run_start`, `tick`, `run_end`, and `session_end`
  - make the row-builder functions authoritative for those shapes instead of passing through ad hoc objects
  - emit canonical scalars/lists only:
    - stable ints for ids/counts/ticks
    - explicit booleans
    - normalized string enums
    - canonical replay input tuples
  - stop silently papering over missing required tick fields with permissive `{}`/`[]`/`0.0` fallbacks where the row is supposed to be replay-grade
  - fail fast or shut down capture when required replay fields are missing or non-finite, instead of writing a weak row and hoping finalize sorts it out
  - keep truly auxiliary debug events separate in spirit from the finalized replay row contract
  - identify which objects are truly stable contracts and which are genuinely flexible
- `src/crimson/dbg/frida_finalize.py`
  - replace `_SessionStartRow.config: dict[str, Any]` with a typed config struct
  - replace `_SessionStartRow.session_fingerprint: dict[str, Any]` with a typed fingerprint struct
  - type any remaining stable capture row payloads instead of leaving them as generic maps
  - keep semantic finalize validation where it is
- add regression fixtures/tests for malformed boundary rows so boundary validation stays strict
- narrow or remove `Any` usage in capture-row structs
- reduce `payloads.py` involvement in finalize to presentation-only metadata if still needed at all
- reduce finalize-time normalization that exists only because the JS producer is currently too permissive
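
To make "replace `dict[str, Any]` with a typed config struct" concrete, here is a minimal sketch using only standard-library dataclasses. The field names (`tick_rate_hz`, `player_count`, `capture_rng`) and the `decode_capture_config` helper are illustrative assumptions; the real keys are whatever the capture script documents, and the project may well use a different decoding library.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CaptureConfig:
    # Hypothetical fields; the real config keys are defined by the capture script.
    tick_rate_hz: int
    player_count: int
    capture_rng: bool


def _require_int(raw: dict[str, object], key: str) -> int:
    value = raw[key]
    # bool is a subclass of int, so compare the exact type to keep them apart.
    if type(value) is not int:
        raise TypeError(f"config.{key}: expected int, got {type(value).__name__}")
    return value


def _require_bool(raw: dict[str, object], key: str) -> bool:
    value = raw[key]
    if type(value) is not bool:
        raise TypeError(f"config.{key}: expected bool, got {type(value).__name__}")
    return value


def decode_capture_config(raw: object) -> CaptureConfig:
    """Decode one config payload with no defaults, no coercion, and no extra keys."""
    if not isinstance(raw, dict):
        raise TypeError("config: expected an object")
    expected = {f.name for f in fields(CaptureConfig)}
    if set(raw) != expected:
        raise ValueError(f"config: keys {sorted(raw)} do not match {sorted(expected)}")
    return CaptureConfig(
        tick_rate_hz=_require_int(raw, "tick_rate_hz"),
        player_count=_require_int(raw, "player_count"),
        capture_rng=_require_bool(raw, "capture_rng"),
    )
```

The decoder deliberately has no fallbacks: a missing key, an extra key, or a `"60"` where `60` is expected is a producer bug to fix in the capture script, not something for finalize to repair.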
- `uv run pytest tests/debug/test_dbg_frida_finalize.py tests/debug/test_dbg_trace.py`
- `uv run ty check src/crimson/dbg/frida_finalize.py`
- `uv run sg scan --report-style short src/crimson/dbg/frida_finalize.py`
- The finalized replay trace is fully trusted typed data.
- The Frida script emits one narrow replay-grade row contract for `session_start`/`run_start`/`tick`/`run_end`/`session_end`.
- The JSONL capture format is treated as a strict owned wire contract, not as a loose boundary that needs defensive repair.
- Any remaining generic maps in finalize are explicitly documented as truly dynamic boundary payloads.
refactor(dbg): type frida finalize boundary rows
Replay `.cdt` should be the only durable trace kind. `dbg bisect` may still
produce an output artifact, but not a second trace dialect that drags extra
reader/writer complexity through the system.
- `src/crimson/dbg/diff.py`
  - stop constructing `BisectTickRecord` and `BisectTickChannels`
  - make `bisect_traces()` return report data only
  - if `--out` stays, write a lightweight report bundle that references the two replay traces plus `first_bad_tick` and window bounds
- `src/crimson/cli/dbg.py`
  - change `dbg bisect --out` semantics away from `repro.cdt`
  - prefer JSON or a small report directory over a third trace file
- `tests/debug/test_dbg_cli.py`
  - rewrite repro assertions around the new report output
- delete:
  - `BisectTickChannels`
  - `BisectTickRecord`
  - `BisectTickBlock`
  - `BisectTraceReader`
  - `write_bisect_trace_iter`
  - `write_bisect_trace`
  - any bisect-only schema branches in `trace.py`
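
A lightweight report that references the two replay traces instead of embedding ticks could look roughly like this. It is a sketch under assumptions: `BisectReport`, `render_bisect_report`, and the field names other than `first_bad_tick` are hypothetical, and JSON is just one of the two output options mentioned above.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BisectReport:
    # Paths point at existing replay traces; no tick data is copied into the report.
    original_trace: str
    candidate_trace: str
    first_bad_tick: int | None  # None when no divergence was found
    window_start: int
    window_end: int


def render_bisect_report(report: BisectReport) -> str:
    """Serialize the report as stable, human-diffable JSON."""
    return json.dumps(asdict(report), indent=2, sort_keys=True) + "\n"
```

Because the report only names the traces and a tick window, it needs no reader/writer pair of its own; any consumer can reopen the referenced replay `.cdt` files through the normal replay reader.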
- `uv run pytest tests/debug/test_dbg_cli.py tests/debug/test_dbg_trace.py`
- `uv run ty check src/crimson/dbg src/crimson/cli/dbg.py`
- `rg -n "BisectTraceReader|write_bisect_trace|BisectTick" src tests`
- There is only one durable `.cdt` reader/writer path left.
- We do not need `trace_kind` because we no longer have competing durable trace dialects.
refactor(dbg): remove bisect repro trace dialect
Once only replay `.cdt` remains, `TraceMeta` can stop pretending to be a
generic artifact envelope and become a typed replay-trace contract.
- `src/crimson/dbg/schema.py`
  - replace generic `TraceMeta` dict fields with typed replay metadata structs
  - define typed producer/source/config shapes for:
    - Frida-finalized original traces
    - Python-recorded candidate traces
    - Zig-recorded candidate traces
- `src/crimson/dbg/record.py`
  - emit the typed replay meta directly
- `src/crimson/dbg/frida_finalize.py`
  - emit the same typed replay meta directly
- `src/crimson/dbg/trace.py`
  - decode one replay meta contract
  - keep the shared container logic, but not a generic artifact abstraction
- Prefer one replay meta with a small producer-specific typed section.
- Do not reintroduce an artifact union unless a second durable `.cdt` kind reappears for a real reason.
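
"One replay meta with a small producer-specific typed section" might be shaped roughly like the sketch below. Every class and field name here is a hypothetical placeholder (the real `TraceMeta` fields are not shown in this plan); the point is only that the shared fields live once on the meta and the producer differences are confined to one small typed member.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class FridaProducer:
    capture_path: str  # raw JSONL the trace was finalized from


@dataclass(frozen=True)
class PythonProducer:
    replay_path: str  # replay input the candidate was recorded from


@dataclass(frozen=True)
class ZigProducer:
    replay_path: str
    build_id: str


# The only union is over producers, not over artifact kinds.
Producer = Union[FridaProducer, PythonProducer, ZigProducer]


@dataclass(frozen=True)
class ReplayMeta:
    schema_version: int
    tick_count: int
    producer: Producer


def producer_label(meta: ReplayMeta) -> str:
    """Consumers branch on the typed section instead of probing dict keys."""
    if isinstance(meta.producer, FridaProducer):
        return "frida"
    if isinstance(meta.producer, PythonProducer):
        return "python"
    return "zig"
```

Everything outside `producer` is identical for all three trace sources, so diff, focus, and health never need to know who wrote the file.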
- `uv run pytest tests/debug`
- `uv run ty check src/crimson/dbg src/crimson/cli/dbg.py tests/debug`
- `uv run sg scan --report-style short src/crimson/dbg src/crimson/cli/dbg.py`
- All replay `.cdt` producers emit one typed metadata contract.
- No consumer needs producer folklore plus generic dict access to read replay metadata.
refactor(dbg): type replay trace metadata
Python and Zig should emit the same replay trace contract, not merely two typed contracts that happen to be close.
- `crimson-zig/src/cdt_trace.zig`
  - align replay channel shapes with Python canonical replay channels
  - align ownership representation
  - remove single-player-only assumptions where the schema should permit more
- Python side:
  - remove Zig-specific schema accommodations once the contracts match
  - add cross-implementation fixture coverage
- player container shapes
- owner representation
- metadata/source/config layout
- channel/version alignment
- Python can read Zig-produced traces through the normal replay reader
- Zig can read or validate Python-produced replay traces if a reader exists
- `uv run pytest tests/debug`
- `uv run ty check src/crimson/dbg src/crimson/cli/dbg.py tests/debug`
- Replay traces from original/Frida, Python, and Zig all pass through the same replay reader and the same diff/focus pipeline.
refactor(zig): align replay trace schema with python
`query`, `entity`, and `tick` should project from typed replay rows or typed
diff/focus reports. They should not shape the underlying trace model.
- `src/crimson/dbg/query.py`
  - remove any generic row-materialization that is not strictly needed at the query edge
- `src/crimson/cli/dbg.py`
  - keep builtin conversion at JSON/HTML/text emission only
- any leftover replay-path coercion helpers whose only purpose is to hop through builtins and come back to typed data
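
The layering described here, typed rows all the way through and builtins only at emission, can be sketched as below. `TickRow` and its fields are hypothetical stand-ins for the real replay row type, and the two functions are illustrative, not existing helpers.

```python
from __future__ import annotations

import json
from collections.abc import Iterable
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TickRow:
    # Hypothetical replay row; the real channels are defined by the replay schema.
    tick: int
    player_x: float
    player_y: float
    rng_state: int


def select_window(rows: Iterable[TickRow], start: int, end: int) -> list[TickRow]:
    """A projection: filters typed rows and returns typed rows."""
    return [row for row in rows if start <= row.tick <= end]


def emit_json(rows: Iterable[TickRow]) -> str:
    """The only place typed rows become builtins is at the output edge."""
    return json.dumps([asdict(row) for row in rows])
```

Nothing between the reader and `emit_json` ever sees a `dict`, so there is no path by which a projection's needs can loosen the row type.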
- `uv run pytest tests/debug/test_dbg_cli.py`
- `uv run ty check src/crimson/dbg/query.py src/crimson/cli/dbg.py`
- `uv run sg scan --report-style short src/crimson/dbg src/crimson/cli/dbg.py`
- Secondary commands are clearly layered on top of typed replay traces and reports.
refactor(dbg): thin secondary trace consumers
Finish the simplification by deriving facts once and making docs match reality.
- `src/crimson/dbg/trace.py`
  - derive footer counts from actual rows
  - decide whether `meta.tick_range` stays stored or becomes purely derived
- `src/crimson/dbg/diff.py` and `src/crimson/dbg/health.py`
  - stop trusting different competing metadata authorities for the same facts
- docs:
  - update `docs/rewrite/cdt-trace-format.md`
  - update `docs/frida/differential-playbook.md` if bisect output semantics change
  - update `DBG_TRACE_ARCHITECTURE_MEMO.md` to mark completed stages
- add a structural rule or targeted test to stop `dict[str, object]`/`Any` from creeping back into replay-path structs
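
The targeted test in the last bullet could be approximated by walking dataclass type hints, as in the sketch below. It assumes the replay-path structs are plain dataclasses with no self-referential fields, which may not hold for the real schema; `find_weak_fields`, `GoodMeta`, and `BadMeta` are hypothetical names used only for illustration.

```python
import dataclasses
from typing import Any, get_args, get_origin, get_type_hints


def find_weak_fields(struct: type, prefix: str = "") -> list[str]:
    """Return paths of fields typed as Any or as dict[..., object | Any]."""
    weak: list[str] = []
    for name, hint in get_type_hints(struct).items():
        weak.extend(_weak_in_hint(hint, f"{prefix}{struct.__name__}.{name}"))
    return weak


def _weak_in_hint(hint: object, where: str) -> list[str]:
    if hint is Any:
        return [where]
    if isinstance(hint, type) and dataclasses.is_dataclass(hint):
        # Recurse into nested structs (assumes no self-referential types).
        return find_weak_fields(hint, prefix=f"{where} -> ")
    args = get_args(hint)
    if get_origin(hint) is dict and len(args) == 2 and (args[1] is object or args[1] is Any):
        return [where]
    found: list[str] = []
    for arg in args:
        # Look inside containers and unions such as list[...] or Optional[...].
        found.extend(_weak_in_hint(arg, where))
    return found


@dataclasses.dataclass(frozen=True)
class GoodMeta:
    tick_count: int
    tags: list[str]


@dataclasses.dataclass(frozen=True)
class BadMeta:
    config: dict[str, object]
    inner: GoodMeta
    extra: Any
```

A test would then assert that `find_weak_fields(...)` returns an empty list for each replay-path struct, so a reintroduced generic map fails CI with the offending field path in the message.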
- `just check`
- `uv run pytest tests/debug`
- `uv run ty check src/crimson/dbg src/crimson/cli/dbg.py tests/debug`
- one source of truth per replay fact
- docs describe the shipped architecture
- no ghost APIs or stale repro-trace language remain
docs(dbg): sync replay trace architecture
Deleting the bisect `.cdt` dialect early keeps us from doing unnecessary work:
- no `trace_kind`
- no replay-vs-bisect meta union
- no second reader/writer pair to maintain
- no consumer branching over artifact kind
If the team later proves that a stored bisect bundle is still useful, it should re-enter as a small report artifact, not as a second core trace contract.
If we type metadata before removing the extra dialect, we risk designing a typed union around a split that we may delete. Removing the extra artifact kind first lets the simpler replay meta shape emerge naturally.
The final code should read like this:
- `gameplay_diff_capture.js` emits stable raw rows
- `frida_finalize.py` strictly validates those rows and writes typed replay traces
- `record.py` writes the same replay trace contract for Python
- Zig writes the same replay trace contract
- `diff`, `focus`, and `health` operate only on that contract
- everything else is a projection
That is the smallest architecture that still serves the actual parity loop.