fix(kona): use BTreeMap for deterministic JSON serialization in interop host #19906

Open
ajsutton wants to merge 2 commits into develop from aj/fix/kona-hashmap-nondeterminism

@ajsutton ajsutton commented Apr 2, 2026

Summary

  • Replace HashMap with BTreeMap for DependencySet.dependencies and read_rollup_configs return type in kona's interop host
  • Fixes non-deterministic JSON serialization that caused cannon VM state divergence between test helper and challenger processes

Root Cause

Rust's HashMap uses a randomly seeded hasher by default, so its iteration order is not stable across processes. When kona-host serializes DependencySet and rollup configs as JSON preimages for the cannon VM, two separate kona-host processes (test helper vs challenger) can produce different byte sequences for the same logical data:

Process A: {"900200":{},"900201":{}}
Process B: {"900201":{},"900200":{}}

These different bytes get read into MIPS memory inside the cannon VM, producing different state hashes from that point forward. This causes the challenger to disagree with the test's claims, attack instead of defend, and step at a trace index that has no preimage data — so the preimage never gets uploaded to the oracle, and WaitForPreimageInOracle times out.
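The ordering guarantee the fix relies on can be sketched with the standard library alone. This is a minimal illustration, not the kona code: `ChainDependency` and `to_json` are hypothetical stand-ins, and the hand-rolled serializer only mimics a JSON serializer that emits entries in the map's iteration order.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a per-chain dependency entry (empty, as in the JSON above).
#[derive(Debug, Clone, Default)]
struct ChainDependency;

/// Hand-rolled serializer, used only to keep this sketch dependency-free.
/// It emits entries in the map's iteration order.
fn to_json(deps: &BTreeMap<u64, ChainDependency>) -> String {
    let entries: Vec<String> = deps
        .keys()
        .map(|chain_id| format!("\"{}\":{{}}", chain_id))
        .collect();
    format!("{{{}}}", entries.join(","))
}

fn main() {
    // Two builders that insert the same chain IDs in opposite orders.
    let mut a = BTreeMap::new();
    a.insert(900200u64, ChainDependency);
    a.insert(900201u64, ChainDependency);

    let mut b = BTreeMap::new();
    b.insert(900201u64, ChainDependency);
    b.insert(900200u64, ChainDependency);

    // BTreeMap iterates in ascending key order, so the serialized bytes match.
    assert_eq!(to_json(&a), to_json(&b));
    println!("{}", to_json(&a)); // {"900200":{},"900201":{}}
}
```

With a default-hashed HashMap in place of BTreeMap, the same assertion is not guaranteed to hold across separate processes, which is the failure mode described above.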

Evidence

  • Binary diff of cannon VM snapshots at step 10M showed divergence at exactly the byte where the chain ID order differs in the serialized JSON (offset 6,827,040): after the shared prefix dencies":{"90020, one snapshot continues with 0":{},"900201" and the other with 1":{},"900200"
  • Step 0 snapshots are identical (both from the same absolute prestate)
  • The fix was verified with 68+ consecutive local test passes (the 100-run verification is still in progress); without the fix, the test failed on the first run
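The snapshot comparison above amounts to finding the first offset at which two byte buffers differ. A minimal sketch of that check follows; `first_divergence` is a hypothetical helper, and the two short buffers are illustrative stand-ins for the much larger cannon VM snapshots, not the actual tooling or data used.

```rust
/// Returns the offset of the first byte at which the two inputs differ,
/// or None if they are byte-for-byte identical. A length mismatch counts
/// as a divergence at the end of the shorter input.
fn first_divergence(a: &[u8], b: &[u8]) -> Option<usize> {
    match a.iter().zip(b.iter()).position(|(x, y)| x != y) {
        Some(i) => Some(i),
        None if a.len() != b.len() => Some(a.len().min(b.len())),
        None => None,
    }
}

fn main() {
    // Illustrative stand-ins for the serialized JSON embedded in each snapshot.
    let snap_a = br#"{"dependencies":{"900200":{},"900201":{}}}"#;
    let snap_b = br#"{"dependencies":{"900201":{},"900200":{}}}"#;

    let off = first_divergence(snap_a, snap_b).expect("snapshots should differ");
    // The first differing byte is the last digit of the first chain ID.
    assert_eq!((snap_a[off], snap_b[off]), (b'0', b'1'));
    println!("first divergence at offset {}", off);
}
```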

Test plan

  • Reproduced locally — test failed on first run without fix
  • Applied fix, rebuilt kona-host, ran 68+ consecutive passes (100-run verification still in progress)
  • CI develop-fault-proofs workflow passes

Fixes #19892

🤖 Generated with Claude Code

…op host

Replace HashMap with BTreeMap for DependencySet.dependencies and
read_rollup_configs return type. HashMap iteration order is randomized
per-process in Rust, so when kona-host serializes these maps as JSON
preimages for the cannon VM, two separate kona-host processes
(test helper vs challenger) can produce different byte sequences for
the same data. This causes cannon VM state divergence, making the
challenger disagree with the test's claims and step at the wrong
trace index — without uploading the required preimage.

The fix ensures deterministic key ordering via BTreeMap, so all
kona-host processes produce identical JSON serialization regardless
of process-level hash seeds.

Fixes #19892

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ajsutton ajsutton requested review from a team as code owners April 2, 2026 20:17
@ajsutton ajsutton requested a review from sebastianst April 2, 2026 20:17

@wwared wwared left a comment

Good catch!

ajsutton commented Apr 2, 2026

I will admit, I went to bed and left Claude running. My earlier hands-on session with Claude failed to find it, so I just told it to run the test 100 times to verify any change actually works. It spent about 45 minutes thinking about it and 11 hours rerunning tests...

@ajsutton ajsutton enabled auto-merge April 2, 2026 20:28
codecov bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 60.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.8%. Comparing base (58320f5) to head (457682b).

Files with missing lines                 Patch %   Lines
rust/kona/bin/host/src/interop/cfg.rs    0.0%      2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop   #19906      +/-   ##
===========================================
+ Coverage     75.6%    75.8%    +0.1%     
===========================================
  Files          195      489     +294     
  Lines        11348    61816   +50468     
===========================================
+ Hits          8581    46859   +38278     
- Misses        2623    14957   +12334     
+ Partials       144        0     -144     
Flag                       Coverage Δ
cannon-go-tests-64         ?
contracts-bedrock-tests    ?
unit                       75.8% <60.0%> (?)

Flags with carried forward coverage won't be shown.

Files with missing lines                           Coverage Δ
rust/kona/crates/protocol/interop/src/depset.rs    92.3% <100.0%> (ø)
rust/kona/bin/host/src/interop/cfg.rs              31.1% <0.0%> (ø)

... and 682 files with indirect coverage changes

Development

Successfully merging this pull request may close these issues.

Flaky test: TestSuperCannonStepWithPreimage_nonExistingPreimage — challenger steps without uploading preimage

2 participants