
feat(tokamak-debugger): Smart Contract Autopsy Lab + Sentinel Real-Time Detection#5

Open
jason-h23 wants to merge 127 commits into main from feat/tokamak-autopsy

Conversation


jason-h23 commented Feb 28, 2026

Summary

  • Smart Contract Autopsy Lab (E-4): Post-hack forensic analysis tool that replays transactions through LEVM, classifies attack patterns (reentrancy, flash loan, price manipulation, access control bypass), traces fund flows (ETH + ERC-20), and generates comprehensive JSON/Markdown reports
  • Production Readiness (E-4 hardening): RPC timeout/retry, ERC-20 amount decoding, 80+ known labels, bounded caches, observability metrics, confidence scoring with evidence chains
  • Sentinel Real-Time Detection (H-1~H-3): 7-heuristic pre-filter engine, deep analysis with LEVM replay, background worker thread integrated into block processing pipeline
  • Alert & Dashboard (H-4/H-5): AlertDispatcher fan-out (JSONL/stdout/webhook), deduplication, rate limiting, WebSocket live feed, JSONL query engine, Prometheus metrics, Astro+React dashboard
  • E2E Pipeline: Live reentrancy detection demo — real bytecode execution through all 6 phases (deploy → trace → classify → fund-flow → sentinel → alert)
  • Sentinel H-6 Expansion: CLI/config wiring (H-6a), mempool pre-execution monitoring (H-6b), adaptive ML pipeline with z-score anomaly detection (H-6c), auto-pause circuit breaker with fail-open safety (H-6d)

H-6 Expansion Details

H-6a: CLI & Configuration

  • TOML-based SentinelFullConfig with 6 sub-configs
  • load_config / merge_cli_overrides / validate functions
  • 6 --sentinel.* CLI flags wired into ethrex cmd
  • init_sentinel() bootstrap returning SentinelComponents

H-6b: Mempool Monitoring

  • MempoolPreFilter with 5 calldata heuristics (flash-loan selector, high-value DeFi, high-gas known contract, suspicious creation, multicall pattern)
  • MempoolObserver trait in ethrex-blockchain with hooks in add_transaction_to_pool / add_blob_transaction_to_pool
  • MempoolAlert / MempoolSuspicionReason types

H-6c: Adaptive Pipeline

  • AnalysisStep trait + StepResult (Continue/Dismiss/AddSteps)
  • FeatureVector with 16 numerical features
  • 6 pipeline steps: FlashLoan, Reentrancy, PriceManipulation, AccessControl, FundFlow, AnomalyDetection
  • StatisticalAnomalyDetector (z-score + sigmoid confidence)
  • AnalysisPipeline orchestrator with PipelineMetrics
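
The z-score and sigmoid scoring named for StatisticalAnomalyDetector can be sketched roughly as below. This is a minimal illustration and not the PR's code: `FeatureStats`, `z_score`, `sigmoid_confidence`, and the `threshold` parameter are hypothetical names, and the exact squashing function used in the PR is not stated.

```rust
/// Hypothetical per-feature baseline (not the PR's actual type).
pub struct FeatureStats {
    pub mean: f64,
    pub std_dev: f64,
}

/// Distance of `value` from the baseline mean, in standard deviations.
/// Returns 0.0 when the baseline has no spread, to avoid dividing by zero.
pub fn z_score(value: f64, stats: &FeatureStats) -> f64 {
    if stats.std_dev <= f64::EPSILON {
        return 0.0;
    }
    (value - stats.mean) / stats.std_dev
}

/// Maps |z| into (0, 1) with a logistic curve centred on `threshold`:
/// exactly 0.5 at the threshold, approaching 1.0 for large deviations.
pub fn sigmoid_confidence(z: f64, threshold: f64) -> f64 {
    1.0 / (1.0 + (-(z.abs() - threshold)).exp())
}
```

With a baseline of mean 10 and standard deviation 2, an observation of 16 has a z-score of 3, which a threshold of 3 maps to a confidence of 0.5.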

H-6d: Auto Pause

  • PauseController (AtomicBool + Condvar + auto-resume timer) in ethrex-blockchain
  • AutoPauseHandler (AlertHandler circuit breaker with configurable score threshold)
  • sentinel_resume / sentinel_status JSON-RPC endpoints (authrpc-only for resume)
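
One way to combine an AtomicBool, a Condvar, and an auto-resume timeout is sketched below. `PauseGate` and its methods are hypothetical names for illustration only; the PR's PauseController may be structured differently. The sketch fails open: a poisoned lock clears the pause instead of panicking.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Hypothetical pause gate (not the PR's PauseController).
pub struct PauseGate {
    paused: AtomicBool,
    lock: Mutex<()>,
    resumed: Condvar,
}

impl PauseGate {
    pub fn new() -> Self {
        Self { paused: AtomicBool::new(false), lock: Mutex::new(()), resumed: Condvar::new() }
    }

    pub fn is_paused(&self) -> bool {
        self.paused.load(Ordering::SeqCst)
    }

    pub fn pause(&self) {
        self.paused.store(true, Ordering::SeqCst);
    }

    pub fn resume(&self) {
        self.paused.store(false, Ordering::SeqCst);
        // Take the lock (ignoring poisoning) so a waiter cannot miss the wakeup.
        let _guard = self.lock.lock();
        self.resumed.notify_all();
    }

    /// Blocks while paused, for at most `auto_resume`. On timeout or on a
    /// poisoned lock the pause flag is cleared, so callers always proceed.
    pub fn wait_if_paused(&self, auto_resume: Duration) {
        if !self.is_paused() {
            return;
        }
        let guard = match self.lock.lock() {
            Ok(guard) => guard,
            Err(_) => {
                self.paused.store(false, Ordering::SeqCst); // fail open
                return;
            }
        };
        let waited = self
            .resumed
            .wait_timeout_while(guard, auto_resume, |_| self.paused.load(Ordering::SeqCst));
        match waited {
            Ok((_guard, timeout)) if timeout.timed_out() => {
                self.paused.store(false, Ordering::SeqCst); // auto-resume
            }
            Ok(_) => {}
            Err(_) => self.paused.store(false, Ordering::SeqCst), // fail open
        }
    }
}
```

A block-processing loop would call `wait_if_paused` before each block, while an alert handler calls `pause` and an RPC handler calls `resume`.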

Security Hardening (Code Review)

  • sentinel_resume moved to authrpc-only (was exposed on public HTTP)
  • Dynamic pipeline step queue bounded to MAX_DYNAMIC_STEPS = 64
  • Combined confidence score (max(prefilter, pipeline)) in alerts
  • PauseController fail-open on lock poisoning (no panic)
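
Two of the hardening items above, the bounded dynamic step queue and the combined confidence score, might look like the following sketch. Only `MAX_DYNAMIC_STEPS = 64` and the `max(prefilter, pipeline)` rule come from the description; `enqueue_bounded`, `combined_confidence`, and the use of `String` step names are assumptions.

```rust
use std::collections::VecDeque;

/// Upper bound on dynamically added steps, as stated in the PR description.
pub const MAX_DYNAMIC_STEPS: usize = 64;

/// Hypothetical helper: appends follow-up steps without letting the total
/// number of dynamically added steps exceed the bound. Returns how many
/// steps were accepted; the rest are dropped.
pub fn enqueue_bounded(
    queue: &mut VecDeque<String>,
    total_added: &mut usize,
    new_steps: Vec<String>,
) -> usize {
    let remaining = MAX_DYNAMIC_STEPS.saturating_sub(*total_added);
    let accepted = new_steps.len().min(remaining);
    queue.extend(new_steps.into_iter().take(accepted));
    *total_added += accepted;
    accepted
}

/// Hypothetical helper: the alert carries the higher of the two scores,
/// clamped to the 0.0-1.0 range.
pub fn combined_confidence(prefilter: f64, pipeline: f64) -> f64 {
    prefilter.max(pipeline).clamp(0.0, 1.0)
}
```

Tracking `total_added` separately from the queue length means a step that keeps returning AddSteps cannot refill the queue forever as earlier entries are consumed.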

Test Plan

  • cargo test -p tokamak-debugger — 28 base tests
  • cargo test -p tokamak-debugger --features sentinel — 184 tests
  • cargo test -p tokamak-debugger --features autopsy — 118 + 10 ignored
  • cargo test -p tokamak-debugger --features "sentinel,autopsy" — 283 + 10 ignored
  • cargo test -p tokamak-debugger --features "cli,autopsy,sentinel" — 310 + 10 ignored
  • cargo test -p ethrex-blockchain pause_controller — 5 tests
  • cargo clippy -p tokamak-debugger --features "cli,autopsy,sentinel" — clean
  • cargo clippy -p ethrex-blockchain — clean
  • cargo build (default features) — clean

jason-h23 and others added 30 commits February 22, 2026 11:51
Complete Phase 0 analysis: evaluate ethrex, Reth, from-scratch, and
revm-only options via weighted decision matrix. ethrex fork selected
(score 4.85/5) for its custom LEVM, ZK-native architecture, Hook
system, and manageable 133K-line codebase.

Includes vision, competitive landscape, feature specs, team discussion
summaries, Volkov review history, and branch strategy.
- Rebalance decision matrix to ethrex vs Reth binary comparison;
  move "from scratch" and "revm only" to appendix
- Adjust Reth scores: ZK 1→2 (Zeth exists), manageability 2→3
  (modular arch acknowledged), sync 5→4 for ethrex (less battle-tested)
- Add EXIT criteria with 4 elements: metric, deadline, action, owner
- Add Tier S PoC section: perf_opcode_timings build verification
  and code path analysis
- Add JIT technical barriers (dynamic jumps, revmc reference)
- Fix weighted sum arithmetic (Reth 2.85→2.80)
Record completed work: DECISION.md creation, Volkov R6 review (6.5/10),
three mandatory fixes (matrix rebalance, EXIT criteria, Tier S PoC),
Reth/Zeth/ExEx research findings, and next steps for Phase 1.1.
- DECISION.md: DRAFT → FINAL
- Replace human staffing model with AI Agent development model
- Add bus factor policy (Kevin as interim decision-maker)
- Replace staffing risks with agent-specific risks
- Remove the "two Senior Rust engineers" EXIT criterion
- Add 11 custom commands (.claude/commands/):
  - Development: /rust, /evm, /jit, /debugger, /l2
  - Verification: /quality-gate, /safety-review, /diff-test
  - Operations: /rebase-upstream, /phase, /bench
- Volkov R8: 7.5/10 PROCEED achieved
Architecture analysis documents:
- OVERVIEW.md: 25+2 crate dependency graph, node startup flow, CI inventory
- LEVM.md: VM struct, execution flow, dual-dispatch loop, hook system
- MODIFICATION-POINTS.md: 5 modification points, hybrid isolation strategy
- PHASE-1-1.md: Phase 1.1 execution plan with success criteria

Phase 1.1 infrastructure:
- Skeleton crates: tokamak-jit, tokamak-bench, tokamak-debugger
- Feature flag: `tokamak` propagation chain (cmd → vm → levm)
- Workspace registration for 3 new crates
- Fix OpcodeTimings: remove false min/max claim, document 4 actual fields
- Fix CallFrame: caller→msg_sender, Bytes→Code, return_data→output/sub_return_data
- Fix opcode table: describe const fn chaining pattern accurately
- Label all pseudocode snippets consistently (JIT, debugger, L2 hook)
- Plan feature flag split: tokamak → tokamak-jit/debugger/l2
- Add JIT-VM interface complexity analysis (5 challenges)
- Add failure scenarios & mitigations table (5 scenarios)
- Record build results: 5m53s clean, 718 tests passed
- Fix line count ~133K → ~103K (verified via wc -l)
- Add tokamak feature to OVERVIEW.md feature tables
Split monolithic `tokamak` feature into 3 independent features
(tokamak-jit, tokamak-debugger, tokamak-l2) with umbrella re-export.
Add pr-tokamak.yaml CI workflow for quality-gate and format checks.
Update snapsync action default image to tokamak-network/ethrex.
Document sync architecture, Hive test matrix, and success criteria.
Add structured benchmark infrastructure to tokamak-bench crate:
- timings.rs: reset(), raw_totals(), raw_counts() accessors
- tokamak-bench: types, runner, report, regression modules + CLI binary
- CI workflow: pr-tokamak-bench.yaml (bench PR vs base, post comparison)
- 11 unit tests covering regression detection, JSON roundtrip, reporting
Feature unification causes these modules to be compiled during L2
workspace clippy. Add targeted allows for arithmetic_side_effects,
as_conversions, expect_used, and unsafe_code lints.
Add the core JIT tiered compilation modules that were missing from
the branch: execution counter, code cache dispatch, types, and
module declaration. These provide the lightweight in-process
infrastructure gated behind the tokamak-jit feature flag.
- tokamak-jit: compiler, backend, adapter, validation, error modules
- JIT backend CI job with LLVM 18 in pr-tokamak.yaml
- jit_bench module in tokamak-bench for interpreter vs JIT comparison
- Phase 2 architecture documentation
- Updated HANDOFF with current status
Add Phase 3 JIT execution wiring so JIT-compiled bytecode actually
runs through the VM dispatch instead of only being compiled.

Key changes:
- JitBackend trait in dispatch.rs for dependency inversion (LEVM
  defines interface, tokamak-jit implements)
- LevmHost: revm Host v14.0 implementation backed by LEVM state
  (GeneralizedDatabase, Substate, Environment)
- Execution bridge: builds revm Interpreter, wraps state in LevmHost,
  transmutes CompiledCode to EvmCompilerFn, maps result to JitOutcome
- vm.rs wiring: try_jit_dispatch() && execute_jit() before interpreter
  loop, with fallback on failure
- register_jit_backend() for startup registration
- E2E tests: fibonacci JIT execution + JIT vs interpreter validation
  (behind revmc-backend feature, requires LLVM 21)
Close 7 gaps preventing production use of the JIT system:

- 4A: Propagate is_static from CallFrame to revm Interpreter
- 4B: Sync gas refunds after JIT execution, pass storage_original_values
  through JIT chain for correct SSTORE original vs present value
- 4C: Add LRU eviction to CodeCache (VecDeque + max_entries)
- 4D: Auto-compile when execution counter hits threshold, add compile()
  to JitBackend trait and backend() accessor to JitState
- 4E: Detect CALL/CREATE/DELEGATECALL/STATICCALL opcodes in analyzer,
  skip JIT compilation for contracts with external calls
- 4F: Skip JIT when tracer is active, add JitMetrics with atomic
  counters, log fallback events via eprintln
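
The LRU eviction described in 4C (a VecDeque plus max_entries) could be sketched as follows. `LruCache` here is a hypothetical stand-in keyed by `u64`; the real CodeCache keys and value types are not reproduced.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical LRU cache; the front of `order` is the least recently used key.
pub struct LruCache<V> {
    entries: HashMap<u64, V>,
    order: VecDeque<u64>,
    max_entries: usize,
}

impl<V> LruCache<V> {
    pub fn new(max_entries: usize) -> Self {
        Self { entries: HashMap::new(), order: VecDeque::new(), max_entries }
    }

    /// Looks up `key` and marks it as most recently used on a hit.
    pub fn get(&mut self, key: u64) -> Option<&V> {
        if self.entries.contains_key(&key) {
            self.touch(key);
        }
        self.entries.get(&key)
    }

    /// Inserts a value and returns the evicted key, if the cache overflowed.
    pub fn insert(&mut self, key: u64, value: V) -> Option<u64> {
        if self.entries.insert(key, value).is_some() {
            self.touch(key); // overwrite: refresh recency, nothing evicted
            return None;
        }
        self.order.push_back(key);
        if self.entries.len() > self.max_entries {
            let evicted = self.order.pop_front()?;
            self.entries.remove(&evicted);
            return Some(evicted);
        }
        None
    }

    /// Moves `key` to the most-recently-used end (linear scan of the deque).
    fn touch(&mut self, key: u64) {
        if let Some(pos) = self.order.iter().position(|k| *k == key) {
            self.order.remove(pos);
        }
        self.order.push_back(key);
    }
}
```

The linear scan in `touch` is acceptable for small caches; a larger cache would want an index or a linked structure instead.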
…compilation, and validation

Phase 5 addresses three remaining JIT gaps:

5A — Multi-fork support: Cache key changed from H256 to (H256, Fork) so the
same bytecode compiled at different forks gets separate cache entries.
fork_to_spec_id() adapter added. Hardcoded SpecId::CANCUN removed from
compiler, execution, and host — all now use the environment's fork.

5B — Background async compilation: New CompilerThread with std::sync::mpsc
channel and a single background thread. On threshold hit, vm.rs tries
request_compilation() first (non-blocking); falls back to synchronous
compile if no thread is registered. register_jit_backend() now also
starts the background compiler thread.

5C — Validation mode wiring: JitConfig.max_validation_runs (default 3)
gates logging to first N executions per (hash, fork). JitState tracks
validation_counts and logs [JIT-VALIDATE] with gas_used and output_len
for offline comparison. Full dual-execution deferred to Phase 6.
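
A single background worker fed through a std::sync::mpsc channel, as in 5B, can be sketched like this. `Worker` and its methods are hypothetical; the "compilation" is replaced by a counter so the example stays self-contained. Shutdown works by dropping the sender, which ends the receiver loop, and then joining the thread.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical background worker (not the PR's CompilerThread).
pub struct Worker {
    sender: Option<Sender<u64>>,
    handle: Option<JoinHandle<u64>>,
}

impl Worker {
    pub fn spawn() -> Self {
        let (sender, receiver) = mpsc::channel::<u64>();
        let handle = thread::spawn(move || {
            let mut processed = 0u64;
            // The loop ends once every sender has been dropped.
            for _code_hash in receiver {
                processed += 1; // placeholder for the actual compile step
            }
            processed
        });
        Self { sender: Some(sender), handle: Some(handle) }
    }

    /// Queues a request without blocking; returns false if the worker is gone.
    pub fn request_compilation(&self, code_hash: u64) -> bool {
        self.sender.as_ref().map_or(false, |tx| tx.send(code_hash).is_ok())
    }

    /// Drops the sender to signal shutdown, then joins the thread.
    /// Returns the number of processed requests, or None if already stopped.
    pub fn shutdown(&mut self) -> Option<u64> {
        drop(self.sender.take());
        let handle = self.handle.take()?;
        match handle.join() {
            Ok(processed) => Some(processed),
            Err(_) => {
                eprintln!("background worker panicked");
                None
            }
        }
    }
}

impl Drop for Worker {
    fn drop(&mut self) {
        let _ = self.shutdown();
    }
}
```

Holding both fields in `Option` allows `take()` inside `drop`, which only receives `&mut self`.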
M1: CompilerThread now implements Drop — drops sender to signal
    shutdown, then joins the background thread. Panics are caught
    and logged (no silent swallowing). Fields changed to Option
    for take-on-drop pattern.

M2: SELFDESTRUCT (0xFF) added to has_external_calls detection in
    analyzer.rs. Bytecodes containing SELFDESTRUCT are now skipped
    by the JIT compiler, preventing the incomplete Host::selfdestruct
    (missing balance transfer) from being exercised.

M3: Negative gas refund cast fixed in execution.rs. Previously
    `refunded as u64` would wrap negative i64 (EIP-3529) to a huge
    u64. Now uses `u64::try_from(refunded)` — negative values are
    silently ignored (already reflected in gas remaining).

M4: Documented fork assumption in counter.rs and vm.rs. Counter is
    keyed by bytecode hash only (not fork). Safe because forks don't
    change during a node's runtime; cache miss on new fork falls back
    to interpreter.
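
The cast issue described in M3 can be reproduced in isolation. The function names below are hypothetical; only the `as u64` versus `u64::try_from` contrast comes from the commit message.

```rust
use std::convert::TryFrom;

/// The old behaviour: an `as` cast reinterprets the bits, so a negative
/// refund becomes a very large unsigned value.
pub fn refund_wrapping(refunded: i64) -> u64 {
    refunded as u64
}

/// The fixed behaviour: negative refunds fail the conversion and are
/// treated as zero here instead of wrapping.
pub fn refund_checked(refunded: i64) -> u64 {
    u64::try_from(refunded).unwrap_or(0)
}
```

For example, `refund_wrapping(-1)` yields `u64::MAX`, while `refund_checked(-1)` yields 0 and `refund_checked(4800)` yields 4800.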
…ment

Phase 6A — CALL/CREATE Resume:
- Add JitResumeState, SubCallResult, JitSubCall types for suspend/resume
- Add JitOutcome::Suspended variant for mid-execution suspension
- Extend JitBackend trait with execute_resume for resume-after-subcall
- Rewrite execution.rs: single-step execute, translate_frame_input,
  apply_subcall_result, handle_interpreter_action
- Add resume loop in vm.rs JIT dispatch block
- Add handle_jit_subcall() to execute sub-calls via LEVM interpreter
- Add run_subcall() with depth-bounded interpreter loop
- Remove has_external_calls compilation gate in backend.rs

Phase 6B — LLVM Memory Management:
- Add func_id: Option<u32> to CompiledCode for lifecycle tracking
- Return evicted func_id from CodeCache::insert() on eviction
- Add CompilerRequest enum (Compile/Free) to compiler_thread
- Add send_free() method for cache eviction notifications
- Wire Free request handling in register_jit_backend()
M1: Credit unused child gas back to revm interpreter via erase_cost()
M2: Write CALL output to interpreter memory at return_memory_offset
M3: Complete CREATE semantics (EIP-3860 initcode limit, nonce increment,
    EIP-170 code size check, deploy code storage)
M4: Extract shared interpreter_loop(stop_depth) to eliminate opcode
    dispatch table duplication between run_execution and run_subcall
M5: Add 7 tests for CALL/CREATE resume path (subcall.rs)
M6: Add balance validation before transfer in handle_jit_subcall
…daclass#6197)

## Motivation

The L2 integration test (`test_erc20_roundtrip`) panics with `unwrap()
on a None value` at `integration_tests.rs:705` after ~8 consecutive test
runs against the same L1/L2 instance. The `find_withdrawal_with_widget`
helper creates a fresh `L2ToL1MessagesTable` (starting from block 0),
fetches all withdrawal logs, and searches for the latest withdrawal —
but `on_tick` uses `truncate(50)` which keeps the **oldest** 50 items.
After enough runs accumulate >50 withdrawal events, the newest
withdrawal falls outside the window.

The bug is not easily reproducible manually because `--dev` mode removes
the databases on startup, so you can't restart with a pre-existing store
that has >50 entries. It surfaces in CI when integration tests run
repeatedly against the same L1/L2 instance without clearing state
between runs.

## Description

Replace `truncate(50)` with `drain(..len - 50)` in the `on_tick` methods
so that the **newest** 50 messages are kept instead of the oldest. This
fix is applied to all three monitor widgets that had the same pattern:

- `L2ToL1MessagesTable` — withdrawal messages (original bug)
- `L1ToL2MessagesTable` — deposit messages (same latent bug)
- `BlocksTable` — block list (same latent bug)
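
The difference between the two calls can be shown on a plain `Vec`. This is a sketch with hypothetical helper names, and it assumes the items are stored oldest first; the saturating subtraction guards the case where fewer than `max` items exist.

```rust
/// Old behaviour: `truncate` keeps the first `max` items, i.e. the oldest.
pub fn keep_oldest(items: &mut Vec<u64>, max: usize) {
    items.truncate(max);
}

/// Fixed behaviour: drain the surplus from the front so the newest remain.
pub fn keep_newest(items: &mut Vec<u64>, max: usize) {
    let excess = items.len().saturating_sub(max);
    items.drain(..excess);
}
```

With 60 items numbered 0 to 59 and a limit of 50, `keep_oldest` leaves 0 through 49 and `keep_newest` leaves 10 through 59.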

## Checklist

- [ ] Updated `STORE_SCHEMA_VERSION` (crates/storage/lib.rs) if the PR
includes breaking changes to the `Store` requiring a re-sync.
R13 fixes (3.0 → 6.0):
- M1: JIT CREATE tests exercising handle_jit_subcall CREATE arm
- M2: EIP-7702 delegation gap documented with TODO comment
- M3: Use from_bytecode_unchecked for CREATE init code
- R1: Precompile value transfer test with identity precompile
- R2: Non-precompile transfer guard aligned with generic_call
- R3: Comment reference format unified (no line numbers)

R14 fixes:
- M1: JitState::reset_for_testing() with clear() on CodeCache,
  ExecutionCounter, JitMetrics for test isolation across #[serial] tests
- M2: Differential JIT vs interpreter comparison in CREATE tests with
  jit_executions metrics assertion proving JIT path execution
- M3: Remaining line number reference removed from vm.rs
- R1: Precompile test strengthened with interpreter baseline comparison
- R2: CREATE collision JIT test with pre-seeded address verification

handle_jit_subcall CALL path: balance check, precompile BAL recording,
value transfer with EIP-7708 log, non-precompile BAL checkpoint.
handle_jit_subcall CREATE path: max nonce check, add_accessed_address,
BAL recording, collision check, deploy nonce, EIP-7708 log.
Gate test-only methods (reset_for_testing, clear, reset) behind
#[cfg(any(test, feature = "test-utils"))] to prevent production
exposure. Add missing reset_for_testing() calls to remaining serial
tests, gas_used differential assertions, and unit tests for new methods.
Reduce per-CALL overhead in JIT suspend/resume cycles without modifying
revmc. Three runtime-level optimizations target DeFi router patterns
(5-10 CALLs per tx):

- Tier 1: Bytecode zero-copy — cache Arc<Bytes> in CompiledCode at
  compile time, use Arc::clone in execute_jit instead of
  Bytes::copy_from_slice (~1-5us/CALL saved)
- Tier 2: Resume state pool — thread-local pool of JitResumeStateInner
  boxes (16-entry cap) eliminates Box alloc/dealloc per suspend/resume
- Tier 3: TX-scoped bytecode cache — FxHashMap<H256, Code> on VM avoids
  repeated db.get_code() for same contract in multi-CALL transactions

Adds bytecode_cache_hits metric to JitMetrics (9-tuple snapshot).
11 new tests in recursive_call_opt.rs, 69 total tokamak-jit tests.
…ifetime erasure

EvmCompiler contains raw LLVM pointers that aren't Send. Remove the
+ Send bound from ArenaCompiler.compilers since it only lives in
thread_local! storage. Add explicit lifetime transmute ('ctx -> 'static)
with safety comment. Also fix HashMap import scope and remove unused
JitBackend import.
Add KeccakLoop (chained hashing, 2.50x), BitwiseOps (XOR/AND/OR/SHL/SHR,
3.50x), and Exponentiation (MULMOD/ADDMOD, 2.19x). All bytecodes are
under 24KB (226-257 bytes), making them JIT-compilable. Includes Solidity
sources and solc-compiled bin-runtime files.
…tests

- fibonacci: add INTRINSIC_GAS import, use 20% tolerance for JIT vs
  interpreter gas comparison (EIP-2929 access list pre-warming causes
  small gas discrepancy between direct execute_jit and full VM paths)
- oversized: remove unused mut on db variable
- group-ib-analysis.md: product ideas from Group-IB Crime Trends 2026
- volkov-derived-ideas.md: 5 service ideas from internal Volkov Review
- autopsy-lab-plan.md: detailed implementation plan for hack post-mortem
  analysis service using existing Time-Travel Debugger infrastructure
Post-hack analysis service built on the Time-Travel Debugger:
- RemoteVmDatabase: archive node RPC client + LEVM Database impl with caching
- StepRecord enrichment: CALL value, LOG topics, SSTORE capture in recorder
- AttackClassifier: reentrancy, flash loan, price manipulation, access control detection
- FundFlowTracer: ETH transfers + ERC-20 Transfer event tracking
- AutopsyReport: JSON + Markdown output with suggested fixes
- CLI: `autopsy --tx-hash <HASH> --rpc-url <URL>` subcommand

New `autopsy` feature flag gates reqwest/sha3/serde_json/rustc-hash deps.
Tests: 28 base + 42 autopsy + 27 cli = 97 total (was 55).
… and all sections

- Add ExecutionOverview with call depth, opcode stats, contract count
- Always show all report sections (empty ones show "None detected")
- Fix legacy/EIP-1559 TX type auto-detection in CLI
- Set gas_price, tx_max_fee_per_gas, tx_nonce in Environment
- Use TX gas limit (not block) for Environment.gas_limit
Reports are now saved to a file instead of printing to stdout. Default
filename is autopsy-{hash_prefix}.{ext} in the current directory.
…callback patterns

Three detection strategies:
1. ETH value: existing CALL value borrow/repay matching
2. ERC-20: matching Transfer events (same token, to/from same address)
3. Callback depth: >60% of ops at depth > entry+1 indicates flash loan callback
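
The third strategy reduces to a ratio over recorded call depths. A minimal sketch, assuming each recorded step exposes its call depth as a `usize` (the function names and that representation are hypothetical):

```rust
/// Fraction of steps that ran deeper than `entry_depth + 1`.
pub fn callback_ratio(depths: &[usize], entry_depth: usize) -> f64 {
    if depths.is_empty() {
        return 0.0;
    }
    let deep = depths.iter().filter(|d| **d > entry_depth + 1).count();
    deep as f64 / depths.len() as f64
}

/// Applies the 60% threshold from the strategy description.
pub fn looks_like_flash_loan_callback(depths: &[usize], entry_depth: usize) -> bool {
    callback_ratio(depths, entry_depth) > 0.6
}
```

A trace where 8 of 10 steps run at depth 2 or deeper (entry depth 0) gives a ratio of 0.8 and is flagged; a trace with 1 of 5 such steps is not.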

Tested on Euler Finance stETH exploit TX — correctly identifies flash loan
provider (Lido stETH) and borrow/repay steps (257→9966 of 9991 total).
100 tests (was 97).
Address 9 issues from /devil review (6.8→target 8.5/10):
- Verdict-first summary ("VERDICT: Flash Loan detected.")
- Known contract labels (~20 mainnet addresses: DAI, WETH, Lido, Aave)
- PUSHn/DUPn/SWAPn aggregated in top opcodes (no duplicates)
- Zero-amount flash loans show "amount unknown" instead of "0 wei"
- All affected contracts listed with Role column (was 3/9, now 9/9)
- Storage value interpretation (MAX_UINT256, 0→nonzero, etc.)
- Section transition text for narrative flow
- Protocol-specific suggested fixes with disclaimer
- Conclusion section with attack timeline and callback span %
W1: Add Fund Flow limitation note (callback amounts not captured)
W2: Provider → "Suspected provider (heuristic)" throughout report
W2: Add "Unlabeled contracts" footer to Affected Contracts table
W3: Truncate storage slot hashes (0xabcdef01…89abcdef) + ABI footnote
W3: ERC-20 zero-value transfers show "(undecoded)" instead of "0"
W4: Uniform table separators (--- for all columns)
W5: Key Steps expanded — SSTORE, CREATE, ERC-20 events (2→7 entries)
W5: Conclusion replaces timeline copy with storage impact analysis
- E-4 (Smart Contract Autopsy Lab): mark as complete in ROADMAP + STATUS
  - RemoteVmDatabase, AttackClassifier (4 patterns), FundFlowTracer,
    AutopsyReport (verdict-first MD/JSON), CLI subcommand, 100 tests
- Phase H (Real-Time Attack Detection — Sentinel): new 5-task roadmap
  - H-1: Block execution recording hook
  - H-2: Lightweight pre-filter (depth/gas/calls/watchlist)
  - H-3: Real-time classification pipeline (async producer-consumer)
  - H-4: Alert & notification system (webhook/Slack/log)
  - H-5: Sentinel dashboard (WebSocket + historical browsing)
- STATUS: update Feature #21 (85%→95%), add Feature #22 (0%)
…provements

Phase I — Network Resilience:
- RPC timeout (30s) + exponential backoff retry (3 retries, 1s→2s→4s)
- Structured RpcError enum (6 variants) with retryable classification
- RpcConfig struct with --rpc-timeout/--rpc-retries CLI flags
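
The retry schedule (3 retries, 1s→2s→4s) can be sketched with a generic helper. Everything here is hypothetical apart from the schedule itself; the sleep function is injected so the logic can be exercised without real delays, and `is_retryable` stands in for the retryable classification on RpcError.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base, 2x base, 4x base, ...
pub fn backoff_delay(attempt: u32, base: Duration) -> Duration {
    base.checked_mul(2u32.saturating_pow(attempt)).unwrap_or(Duration::MAX)
}

/// Runs `op` until it succeeds, a non-retryable error occurs, or
/// `max_retries` retries have been spent.
pub fn retry_with_backoff<T, E>(
    max_retries: u32,
    base: Duration,
    mut op: impl FnMut() -> Result<T, E>,
    is_retryable: impl Fn(&E) -> bool,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) if attempt < max_retries && is_retryable(&err) => {
                sleep(backoff_delay(attempt, base));
                attempt += 1;
            }
            Err(err) => return Err(err),
        }
    }
}
```

In production the `sleep` argument would typically be `std::thread::sleep`; in tests it can record the requested delays instead.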

Phase II — Data Quality:
- ERC-20 transfer amount decoding from LOG3 data bytes
- Price delta estimation via SLOAD value comparison
- 80+ known mainnet contract labels (DEX, lending, bridges, oracles)
- ABI-based storage slot decoding (keccak256 mapping support)
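
For the standard ERC-20 `Transfer(address,address,uint256)` event, the sender and recipient are indexed topics and the amount is the 32-byte big-endian data word. A minimal decoding sketch follows; the function names are hypothetical, and narrowing to `u128` is a simplification for the example (a full implementation would use a 256-bit integer type).

```rust
/// Decodes the amount from a Transfer event's data field.
/// Returns None if the data is not one 32-byte word or the value exceeds u128.
pub fn decode_transfer_amount(data: &[u8]) -> Option<u128> {
    if data.len() != 32 {
        return None;
    }
    let (high, low) = data.split_at(16);
    if high.iter().any(|b| *b != 0) {
        return None; // does not fit in 128 bits
    }
    let mut buf = [0u8; 16];
    buf.copy_from_slice(low);
    Some(u128::from_be_bytes(buf))
}

/// Extracts the 20-byte address from a 32-byte indexed topic
/// (addresses are left-padded with zeros).
pub fn address_from_topic(topic: &[u8; 32]) -> [u8; 20] {
    let mut address = [0u8; 20];
    address.copy_from_slice(&topic[12..]);
    address
}
```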

Phase III — Robustness:
- Bounded caches with FIFO eviction in RemoteVmDatabase
- AutopsyMetrics observability (RPC calls, cache hits, latency)
- 100k-step stress tests (<5s classification, <1s report)

Phase IV — Validation & Confidence:
- DetectedPattern wrapper with 0.0-1.0 confidence + evidence chains
- 10 mainnet exploit validation scaffolds (DAO, Euler, Curve, etc.)

Cross-crate: OpcodeRecorder::record_step now takes &Memory for LOG
data capture. Memory::current_base_offset() added to LEVM.

Tests: 145 passing + 10 ignored mainnet scaffolds (was 97).
Remove JIT compiler (tokamak-jit), benchmark harness (tokamak-bench),
dashboard, LEVM JIT module, L2 scaffolding, and related CI/docs from
the autopsy branch. These features live on feat/tokamak-three-pillars.

Add Phase H Sentinel real-time attack detection plan (SENTINEL-PLAN.md).
…H-3)

Implement the Sentinel system — a real-time hack detection pipeline that
monitors committed blocks for suspicious transactions and generates alerts.

Phase H-1: Pre-Filter Engine
- 7 receipt-based heuristics (flash loan signature, high-value revert,
  multiple ERC-20 transfers, known contract interaction, unusual gas,
  self-destruct indicators, oracle+swap pattern)
- SentinelConfig with configurable thresholds, 14 known mainnet addresses
- `sentinel` feature flag in tokamak-debugger
- 32 tests

Phase H-2: Deep Analysis Engine
- replay_tx_from_store: re-executes suspicious TX from local Store with
  OpcodeRecorder, executing preceding TXs to reconstruct correct state
- DeepAnalyzer: orchestrates replay → AttackClassifier → FundFlowTracer
  (reuses E-4 autopsy infrastructure via #[cfg(feature = "autopsy")])
- SentinelAlert, SentinelError (8 variants), AnalysisConfig types
- 20 tests (14 sentinel-only + 6 autopsy-gated)

Phase H-3: Block Processing Integration
- BlockObserver trait in ethrex-blockchain (DIP — avoids circular dep)
- SentinelService: background worker thread with mpsc channel, two-stage
  PreFilter → DeepAnalyzer pipeline, non-blocking on block processing
- Hooks in add_block/add_block_pipeline after store_block
- AlertHandler trait + LogAlertHandler default implementation
- Graceful shutdown via Drop (send signal + join worker thread)
- 11 tests

Architecture doc: docs/tokamak/SENTINEL-ARCHITECTURE.md

Total: 208 passing + 10 ignored, clippy clean all feature combinations.
…nd metrics (H-4/H-5)

H-4 Alert & Notification System:
- AlertDispatcher composite fan-out to multiple handlers
- JsonlFileAlertHandler (append-only JSONL), StdoutAlertHandler
- WebhookAlertHandler (HTTP POST + exponential backoff, autopsy-gated)
- AlertDeduplicator (block-window suppression), AlertRateLimiter (sliding-window)
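
Block-window suppression and sliding-window rate limiting could be sketched as below. The type names, the string dedup key, and the use of seconds as the time unit are assumptions for illustration; the PR's AlertDeduplicator and AlertRateLimiter may key and measure differently.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical deduplicator: suppresses a repeat of the same key within
/// `window_blocks` blocks of the last emitted alert for that key.
pub struct BlockWindowDeduplicator {
    window_blocks: u64,
    last_seen: HashMap<String, u64>,
}

impl BlockWindowDeduplicator {
    pub fn new(window_blocks: u64) -> Self {
        Self { window_blocks, last_seen: HashMap::new() }
    }

    pub fn is_duplicate(&mut self, key: &str, block: u64) -> bool {
        let last = self.last_seen.get(key).copied();
        match last {
            Some(prev) if block.saturating_sub(prev) < self.window_blocks => true,
            _ => {
                self.last_seen.insert(key.to_string(), block);
                false
            }
        }
    }
}

/// Hypothetical limiter: allows at most `max_alerts` per `window_secs`.
pub struct SlidingWindowLimiter {
    max_alerts: usize,
    window_secs: u64,
    sent: VecDeque<u64>,
}

impl SlidingWindowLimiter {
    pub fn new(max_alerts: usize, window_secs: u64) -> Self {
        Self { max_alerts, window_secs, sent: VecDeque::new() }
    }

    pub fn allow(&mut self, now_secs: u64) -> bool {
        // Forget timestamps that have slid out of the window.
        while let Some(&oldest) = self.sent.front() {
            if now_secs.saturating_sub(oldest) >= self.window_secs {
                self.sent.pop_front();
            } else {
                break;
            }
        }
        if self.sent.len() < self.max_alerts {
            self.sent.push_back(now_secs);
            true
        } else {
            false
        }
    }
}
```

In this sketch a suppressed duplicate does not extend the window, so a persistent condition is re-alerted once per window.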

H-5 Sentinel Dashboard:
- WsAlertBroadcaster: real-time WebSocket alert feed with dead-subscriber cleanup
- AlertHistory: JSONL-based query engine with pagination, filtering, sorting
- SentinelMetrics: Prometheus text exposition format (8 atomic counters)
- Dashboard UI: Astro+React sentinel page with AlertFeed, AlertCard,
  AlertHistoryTable, SentinelMetricsPanel components
- SentinelService instrumented with timing and counter metrics

Tests: 259 Rust (cli+autopsy+sentinel) + 97 dashboard, clippy clean
…mode

Add prefilter_alert_mode to AnalysisConfig for lightweight monitoring
without full Merkle trie state. When enabled, SentinelService emits
PreFilter-based alerts if deep analysis fails or returns nothing.

Three E2E tests prove the full pipeline: bytecode execution through
LEVM with opcode recording, AttackClassifier reentrancy detection
(confidence >= 0.7), PreFilter receipt-based heuristics, and
SentinelService background worker alert emission.

262 tests pass (+3 new), 10 ignored, clippy clean.
…2E test

6-phase E2E test with real bytecode execution through the entire
Sentinel pipeline (LEVM → AttackClassifier → FundFlowTracer →
SentinelService → alert validation). Also adds an executable demo
example that visualizes each phase.

- Phase 1: Deploy attacker+victim contracts, execute in LEVM (80 steps)
- Phase 2: Verify call depth >= 3 and SSTORE count >= 2
- Phase 3: AttackClassifier detects Reentrancy (confidence >= 70%)
- Phase 4: FundFlowTracer traces ETH transfers (victim → attacker)
- Phase 5: SentinelService processes real receipt, emits alert
- Phase 6: Validate alert content and metrics counters

Test count: 263 passing + 10 ignored (was 262+10), clippy clean.
- Add live reentrancy pipeline entry to Feature #21 and Phase H sections
- Update tokamak-debugger file/line counts (33→45 files, ~6,950→~13,900 lines)
- Update total Tokamak codebase count (~16,630→~23,980 lines)
- Test count: 263 passing + 10 ignored
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the Tokamak debugger's capabilities by introducing a comprehensive Smart Contract Autopsy Lab for forensic analysis and a Sentinel system for real-time attack detection. These new features are supported by robust infrastructure improvements, including enhanced RPC and P2P communication, optimized block processing, and detailed developer documentation. The changes aim to provide powerful tools for understanding and mitigating smart contract vulnerabilities, both reactively and proactively.

Highlights

  • Smart Contract Autopsy Lab (E-4): Introduced a post-hack forensic analysis tool that replays transactions through LEVM, classifies attack patterns (reentrancy, flash loan, price manipulation, access control bypass), traces fund flows (ETH + ERC-20), and generates verdict-first reports.
  • Autopsy Production Hardening: Implemented a 4-phase improvement focusing on RPC resilience (timeout and retry), enhanced ERC-20 amount decoding with 80+ known contract labels, bounded caches for memory management and observability, and confidence scoring with evidence chains.
  • Sentinel Real-Time Detection (H-1~H-5): Developed a real-time attack detection system featuring a receipt-based pre-filter (7 heuristics), opcode-level deep analysis, and an alert pipeline with deduplication/rate-limiting, JSONL/webhook/WebSocket handlers, Prometheus metrics, and a dashboard UI.
  • Live Reentrancy E2E Pipeline: Added a comprehensive 6-phase end-to-end test that proves real bytecode execution flows through the entire detection pipeline (LEVM → classifier → fund flow → sentinel → alert).
  • New Developer Tooling Documentation: Added extensive documentation for various developer modes and quality gates under .claude/commands, covering benchmarking, time-travel debugging, differential testing, EVM specialization, JIT compilation, L2 hooks, phase management, quality gates, upstream rebasing, Rust expertise, and safety reviews.
  • RPC and P2P Enhancements: Improved RPC handling by integrating tokamak-debugger RPC methods, and enhanced P2P networking with external IP detection via PONG voting and offloading transaction pool insertion to background tasks.
  • Smart Contract Batch Verification: Modified L2 contracts (OnChainProposer.sol, Timelock.sol) to support batch verification of proofs, improving efficiency for rollup operations.


Changelog
  • .claude/commands/bench.md
    • Added documentation for the benchmark runner, detailing execution steps and reporting format.
  • .claude/commands/debugger.md
    • Added documentation for the time-travel debugger developer mode, outlining its role, architecture, and workflow.
  • .claude/commands/diff-test.md
    • Added documentation for differential testing, explaining its purpose and execution sequence.
  • .claude/commands/evm.md
    • Added documentation for the EVM specialist developer mode, covering LEVM architecture and workflow.
  • .claude/commands/jit.md
    • Added documentation for the JIT compiler developer mode, detailing tiered execution and technical challenges.
  • .claude/commands/l2.md
    • Added documentation for the L2 Hook developer mode, describing the Hook architecture and implementation roadmap.
  • .claude/commands/phase.md
    • Added documentation for phase management, defining phases and their entry/exit criteria.
  • .claude/commands/quality-gate.md
    • Added documentation for the quality gate process, outlining execution steps and result determination.
  • .claude/commands/rebase-upstream.md
    • Added documentation for the upstream rebase workflow, including prerequisites and conflict resolution.
  • .claude/commands/rust.md
    • Added documentation for the Rust expert developer mode, detailing codebase context and coding conventions.
  • .claude/commands/safety-review.md
    • Added documentation for the safety review process, outlining principles and execution layers.
  • .github/actions/build-docker/action.yml
    • Updated the description for the 'variant' input to include 'tokamak' as an example.
  • .github/actions/snapsync-run/action.yml
    • Updated the default Docker image repository to ghcr.io/tokamak-network/ethrex.
    • Added a new input build_flags to allow additional cargo build flags.
    • Modified the assertoor test file path to use GITHUB_REPOSITORY for flexibility.
  • .gitignore
    • Added new entries to ignore dashboard-related build artifacts and dependencies.
  • CHANGELOG.md
    • Added new performance entries for February 2026, including LEVM interpreter loop dispatch expansion and precompile result caching.
  • Cargo.lock
    • Added crossbeam-channel and tokamak-debugger as new dependencies.
  • Cargo.toml
    • Added crates/tokamak-debugger to the workspace members.
    • Introduced a new jit-bench profile with specific LTO and codegen-units settings.
    • Added serial_test to workspace dependencies.
  • cmd/ethrex/Cargo.toml
    • Added tokamak-debugger as a feature for ethrex-vm.
  • cmd/ethrex/cli.rs
    • Added environment variable support for various CLI arguments such as bootnodes, syncmode, metrics.addr, dev, log.color, log.dir, mempool.maxsize, and precompute_witnesses.
    • Updated calls to blockchain.add_block_pipeline to include an optional BlockAccessList argument.
  • cmd/ethrex/initializers.rs
    • Updated calls to blockchain.add_block_pipeline to include an optional BlockAccessList argument.
  • cmd/ethrex/l2/command.rs
    • Updated calls to blockchain.add_block_pipeline to include an optional BlockAccessList argument.
  • cmd/ethrex/l2/options.rs
    • Added environment variable support for sponsorable-addresses and proof-coordinator.prover-timeout.
    • Included prover_timeout_ms in SequencerConfig and ProofCoordinatorOptions with a default value.
  • crates/blockchain/blockchain.rs
    • Introduced a BlockObserver trait for receiving notifications when blocks are committed.
    • Removed the Debug derive from Blockchain and implemented a custom Debug formatter.
    • Added BalStateWorkItem struct for BAL state trie shard workers.
    • Added a block_observer field to Blockchain to allow external components to subscribe to block commit events.
    • Implemented with_block_observer and set_block_observer methods to manage the block observer.
    • Modified execute_block_pipeline to accept an optional BlockAccessList for BAL-based pre-warming and merkleization.
    • Added handle_merkleization_bal for BAL-specific parallel storage root computation and state trie updates.
    • Updated add_block_pipeline to accept an optional BlockAccessList and notify the block_observer after successful block storage.
  • crates/blockchain/mempool.rs
    • Renamed clear_broadcasted_txs to remove_broadcasted_txs and modified it to remove specific transaction hashes instead of clearing the entire pool.
  • crates/blockchain/metrics/l2/metrics.rs
    • Added tx_hash as a label to the batch_verification_gas metric for more granular tracking.
  • crates/blockchain/tracing.rs
    • Refactored trace_transaction_calls to use a new prepare_state_for_tx helper function, which prepares the EVM state before transaction execution.
  • crates/common/types/block_access_list.rs
    • Added an all_storage_slots method to AccountChanges to iterate over all storage slots that need prefetching.
  • crates/common/types/transaction.rs
    • Removed rlp_encode_as_pooled_tx and rlp_length_as_pooled_tx methods from EIP4844Transaction.
  • crates/l2/based/README.md
    • Updated documentation to reflect the change from verifyBatch to verifyBatches.
  • crates/l2/contracts/src/l1/OnChainProposer.sol
    • Refactored verifyBatch into an internal _verifyBatchInternal function.
    • Introduced a new verifyBatches function to allow verification of multiple consecutive L2 batches in a single transaction.
  • crates/l2/contracts/src/l1/Timelock.sol
    • Updated the verifyBatch function to verifyBatches to align with the new multiple batch verification functionality.
  • crates/l2/contracts/src/l1/based/OnChainProposer.sol
    • Refactored verifyBatch into an internal _verifyBatchInternal function.
    • Introduced a new verifyBatches function to allow verification of multiple consecutive L2 batches in a single transaction.
  • crates/l2/contracts/src/l1/based/interfaces/IOnChainProposer.sol
    • Updated the interface from verifyBatch to verifyBatches to support multiple batch verification.
  • crates/l2/contracts/src/l1/interfaces/IOnChainProposer.sol
    • Updated the interface from verifyBatch to verifyBatches to support multiple batch verification.
  • crates/l2/contracts/src/l1/interfaces/ITimelock.sol
    • Updated the interface from verifyBatch to verifyBatches to support multiple batch verification.
  • crates/l2/sequencer/configs.rs
    • Added prover_timeout_ms to ProofCoordinatorConfig to configure the timeout for prover assignments.
  • crates/l2/sequencer/l1_committer.rs
    • Updated calls to blockchain.add_block_pipeline to include an optional BlockAccessList argument.
  • crates/l2/sequencer/l1_proof_sender.rs
    • Changed VERIFY_FUNCTION_SIGNATURE to VERIFY_BATCHES_FUNCTION_SIGNATURE to reflect batch verification.
    • Refactored proof sending logic to support sending multiple consecutive batch proofs in a single transaction.
    • Implemented try_delete_invalid_proof to delete invalid proofs from the store upon transaction revert.
    • Added finalize_batch_proof to update the latest sent batch proof and clean up checkpoint directories.
  • crates/l2/sequencer/proof_coordinator.rs
    • Added assignments (a mutex-protected HashMap) and prover_timeout fields to manage prover assignments and timeouts.
    • Implemented next_batch_to_assign to intelligently distribute proof generation work to provers, considering existing proofs and timeouts.
    • Removed request_timestamp metrics and related logic.
  • crates/networking/p2p/discv5/server.rs
    • Added external IP detection via PONG voting, including ip_votes, ip_vote_period_start, and first_ip_vote_round_completed fields.
    • Implemented record_ip_vote and finalize_ip_vote_round to manage the voting process and update the local ENR's IP.
    • Introduced is_private_ip to filter out private IP addresses from voting.
    • Updated handle_find_node to validate the sender's contact before responding.
    • Modified cleanup_stale_entries to include IP vote period checks.
  • crates/networking/p2p/rlpx/connection/handshake.rs
    • Optimized receive_auth and receive_ack to avoid cloning msg_bytes.
    • Simplified receive_handshake_msg to directly return the buffer, removing unnecessary size checks.
  • crates/networking/p2p/rlpx/connection/server.rs
    • Offloaded expensive transaction pool insertion to a background task when handling Transactions messages, improving ConnectionServer responsiveness.
  • crates/networking/p2p/rlpx/eth/transactions.rs
    • Updated NewPooledTransactionHashes to correctly calculate the size of EIP-4844 transactions, including the blobs bundle.
  • crates/networking/p2p/rlpx/l2/l2_connection.rs
    • Updated calls to blockchain.add_block_pipeline to include an optional BlockAccessList argument.
  • crates/networking/p2p/sync/full.rs
    • Updated calls to blockchain.add_block_pipeline to include an optional BlockAccessList argument.
  • crates/networking/p2p/tx_broadcaster.rs
    • Changed clear_broadcasted_txs to remove_broadcasted_txs to remove specific transaction hashes from the broadcast pool.
  • crates/networking/rpc/Cargo.toml
    • Added tokamak-debugger as an optional dependency with a path reference.
  • crates/networking/rpc/debug/mod.rs
    • Added time_travel module, gated by the tokamak-debugger feature.
  • crates/networking/rpc/debug/time_travel.rs
    • Added a new RPC handler for debug_timeTravel, enabling opcode-level transaction replay and state inspection.
  • crates/networking/rpc/engine/payload.rs
    • Updated handle_new_payload_v1_v2, handle_new_payload_v3, handle_new_payload_v4, and add_block to accept an optional BlockAccessList argument.
  • crates/networking/rpc/rpc.rs
    • Updated BlockWorkerMessage type and start_block_executor function to handle an optional BlockAccessList for block processing.
    • Mapped the new debug_timeTravel RPC method to its handler.
  • crates/tokamak-debugger/Cargo.toml
    • Added a new crate tokamak-debugger to the workspace.
    • Defined cli, autopsy, and sentinel features for conditional compilation of debugger components.
  • crates/tokamak-debugger/examples/reentrancy_demo.rs
    • Added a new example demonstrating the full 6-phase reentrancy detection pipeline, including bytecode execution, classification, fund flow tracing, and sentinel alerting.
  • crates/tokamak-debugger/src/autopsy/abi_decoder.rs
    • Added a new module for ABI-based storage slot decoding, enabling human-readable labels for storage slots.
  • crates/tokamak-debugger/src/autopsy/classifier.rs
    • Added a new module for attack pattern classification, detecting reentrancy, flash loans, price manipulation, and access control bypasses with confidence scores.
  • crates/tokamak-debugger/src/autopsy/enrichment.rs
    • Added a new module for post-hoc trace enrichment, specifically for filling in old_value for SSTORE operations.
  • crates/tokamak-debugger/src/autopsy/fund_flow.rs
    • Added a new module for fund flow tracing, extracting ETH and ERC-20 transfers from execution traces.
  • crates/tokamak-debugger/src/autopsy/metrics.rs
    • Added a new module for autopsy-specific metrics, tracking RPC calls, cache hits, and timing.
  • crates/tokamak-debugger/src/autopsy/mod.rs
    • Added a new module for the Smart Contract Autopsy Lab, providing post-hack analysis capabilities.
  • crates/tokamak-debugger/src/autopsy/remote_db.rs
    • Added a new module for a remote VM database, backed by archive node JSON-RPC with bounded caching and retry logic.
  • crates/tokamak-debugger/src/autopsy/report.rs
    • Added a new module for autopsy report generation, producing detailed JSON and Markdown reports with summaries, attack patterns, fund flows, and suggested fixes.
  • crates/tokamak-debugger/src/autopsy/rpc_client.rs
    • Added a new module for a thin JSON-RPC HTTP client with retry and backoff mechanisms.
  • crates/tokamak-debugger/src/autopsy/types.rs
    • Added new core data types for autopsy analysis, including AttackPattern, FundFlow, DetectedPattern, AnnotatedStep, and Severity.
  • crates/tokamak-debugger/src/bin/debugger.rs
    • Added a new binary entry point for the tokamak-debugger CLI application.
  • crates/tokamak-debugger/src/cli/commands.rs
    • Added a new module for debugger CLI commands, including step, continue, break, goto, info, stack, list, and help.
  • crates/tokamak-debugger/src/cli/formatter.rs
    • Added a new module for formatting debugger CLI output, including step details, stack, and breakpoints.
  • crates/tokamak-debugger/src/cli/mod.rs
    • Added a new module for the debugger CLI entry point, handling bytecode and autopsy modes.
  • crates/tokamak-debugger/src/cli/repl.rs
    • Added a new module for the interactive debugger REPL loop.
  • crates/tokamak-debugger/src/engine.rs
    • Added a new module for the replay engine, which records and allows navigation through opcode-level transaction traces.
  • crates/tokamak-debugger/src/error.rs
    • Added new error types for the debugger, including VM, CLI, RPC, and Report errors.
  • crates/tokamak-debugger/src/lib.rs
    • Added a new library for the Tokamak Time-Travel Debugger, exposing its core modules and features.
  • crates/tokamak-debugger/src/recorder.rs
    • Added a new module for an OpcodeRecorder implementation that captures StepRecords, including call values, storage writes, and log data.
  • crates/tokamak-debugger/src/sentinel/alert.rs
    • Added a new module for alert dispatching, deduplication, and rate limiting, providing composable handlers for the Sentinel system.
  • crates/tokamak-debugger/src/sentinel/analyzer.rs
    • Added a new module for the deep analysis engine, re-executing suspicious transactions with opcode recording and running the autopsy pipeline.
  • crates/tokamak-debugger/src/sentinel/history.rs
    • Added a new module for the historical alert query engine, reading alerts from JSONL files and providing filterable access.
  • crates/tokamak-debugger/src/sentinel/metrics.rs
    • Added a new module for Prometheus-compatible metrics collection for the Sentinel pipeline.
  • crates/tokamak-debugger/src/sentinel/mod.rs
    • Added a new module for the Sentinel real-time hack detection system.
  • crates/tokamak-debugger/src/sentinel/pre_filter.rs
    • Added a new module for the receipt-based pre-filter, using lightweight heuristics to flag suspicious transactions.
  • crates/tokamak-debugger/src/sentinel/replay.rs
    • Added a new module for transaction replay for sentinel deep analysis, re-executing transactions from local node state.
  • crates/tokamak-debugger/src/sentinel/service.rs
    • Added a new module for the sentinel background service, implementing BlockObserver to monitor committed blocks.
  • crates/tokamak-debugger/src/sentinel/types.rs
    • Added new sentinel-specific types for configuration, suspicious transactions, reasons, alert priority, and analysis configuration.
  • crates/tokamak-debugger/src/sentinel/webhook.rs
    • Added a new module for a webhook alert handler, posting serialized alerts to an HTTP endpoint with retry logic.
  • crates/tokamak-debugger/src/sentinel/ws_broadcaster.rs
    • Added a new module for a WebSocket alert broadcaster, providing a publish-subscribe layer for alerts.
  • crates/tokamak-debugger/src/tests/autopsy_tests.rs
    • Added new tests for the Smart Contract Autopsy Lab, including classifier, enrichment, fund flow, and report generation.
  • crates/tokamak-debugger/src/tests/basic_replay.rs
    • Added new tests for basic replay functionality, verifying step recording and opcode/PC values.
  • crates/tokamak-debugger/src/tests/cli_tests.rs
    • Added new tests for the CLI module, covering command parsing, formatting, and execution.
  • crates/tokamak-debugger/src/tests/error_handling.rs
    • Added new tests for error handling during replay recording, including REVERT, STOP, empty bytecode, and out-of-gas scenarios.
  • crates/tokamak-debugger/src/tests/gas_tracking.rs
    • Added new tests for gas tracking, verifying gas consumption and consistency with execution reports.
  • crates/tokamak-debugger/src/tests/helpers.rs
    • Added new shared test helpers for setting up test environments and transactions.
  • crates/tokamak-debugger/src/tests/mainnet_validation.rs
    • Added new tests for mainnet exploit validation, replaying real exploit transactions against an archive node.
  • crates/tokamak-debugger/src/tests/mod.rs
    • Added a new module for organizing debugger tests.
  • crates/tokamak-debugger/src/tests/navigation.rs
    • Added new tests for navigation functionality, including forward, backward, and goto cursor operations.
  • crates/tokamak-debugger/src/tests/nested_calls.rs
    • Added new tests for nested calls, verifying depth tracking through CALL and CREATE operations.
  • crates/tokamak-debugger/src/tests/recorder_edge_cases.rs
    • Added new tests for recorder edge cases, focusing on stack capture behavior with various stack sizes and configurations.
  • crates/tokamak-debugger/src/tests/serde_tests.rs
    • Added new tests for serialization round-trip of debugger types.
  • crates/tokamak-debugger/src/tests/stress_tests.rs
    • Added new stress tests for large traces, validating performance of classification, fund flow tracing, and report generation.
  • crates/tokamak-debugger/src/types.rs
    • Added new core data types for the time-travel debugger, including ReplayConfig, StorageWrite, StepRecord, and ReplayTrace.
  • crates/vm/Cargo.toml
    • Added tokamak-debugger as a feature for ethrex-levm.
  • crates/vm/backends/levm/mod.rs
    • Implemented warm_block_from_bal for BAL-based pre-warming of state by loading accounts and storage slots directly.
    • Made setup_env public to allow external modules to set up the EVM environment.
  • crates/vm/levm/Cargo.toml
    • Added rustc-hash and crossbeam-channel as dependencies.
    • Introduced test-utils and tokamak-debugger as new features.
  • crates/vm/levm/bench/revm_comparison/contracts/BitwiseOps.sol
    • Added a new Solidity contract for benchmarking bitwise operations.
  • crates/vm/levm/bench/revm_comparison/contracts/Exponentiation.sol
    • Added a new Solidity contract for benchmarking modular exponentiation.
  • crates/vm/levm/bench/revm_comparison/contracts/KeccakLoop.sol
    • Added a new Solidity contract for benchmarking chained Keccak256 operations.
  • crates/vm/levm/bench/revm_comparison/contracts/bin/BitwiseOps.bin-runtime
    • Added the binary runtime for the BitwiseOps contract.
  • crates/vm/levm/bench/revm_comparison/contracts/bin/Exponentiation.bin-runtime
    • Added the binary runtime for the Exponentiation contract.
  • crates/vm/levm/bench/revm_comparison/contracts/bin/KeccakLoop.bin-runtime
    • Added the binary runtime for the KeccakLoop contract.
  • crates/vm/levm/src/call_frame.rs
    • Added a peek method to the Stack struct, allowing access to stack values without popping.
  • crates/vm/levm/src/db/mod.rs
    • Extended the Database trait with an optional precompile_cache method.
    • Integrated PrecompileCache into CachingDatabase to share precompile results between warmer and executor threads.
  • crates/vm/levm/src/debugger_hook.rs
    • Added a new module defining the OpcodeRecorder trait for per-opcode recording, feature-gated by tokamak-debugger.
  • crates/vm/levm/src/lib.rs
    • Added the debugger_hook module, feature-gated by tokamak-debugger.
  • crates/vm/levm/src/memory.rs
    • Added current_base_offset method to Memory to retrieve the base offset of the current callframe's memory region.
  • crates/vm/levm/src/opcode_handlers/arithmetic.rs
    • Added #[inline] attribute to op_sub and op_mul for potential performance optimization.
  • crates/vm/levm/src/opcode_handlers/bitwise_comparison.rs
    • Added #[inline] attribute to op_lt, op_gt, op_eq, op_iszero, op_and, op_or, op_shl, and op_shr for potential performance optimization.
  • crates/vm/levm/src/opcode_handlers/environment.rs
    • Added #[inline] attribute to op_calldataload for potential performance optimization.
  • crates/vm/levm/src/opcode_handlers/push.rs
    • Added #[inline] attribute to op_push0 for potential performance optimization.
  • crates/vm/levm/src/opcode_handlers/stack_memory_storage_flow.rs
    • Added #[inline] attribute to op_pop, op_mstore, and op_sload for potential performance optimization.
  • crates/vm/levm/src/opcode_handlers/system.rs
    • Added #[inline] attribute to op_return for potential performance optimization.
    • Updated execute_precompile call to pass an optional PrecompileCache.
  • crates/vm/levm/src/opcodes.rs
    • Added #[inline] attribute to op_stop for potential performance optimization.
    • Added direct calls to various opcode handlers within the main interpreter loop for faster dispatch.
  • crates/vm/levm/src/precompiles.rs
    • Introduced PrecompileCache struct to cache precompile results, improving performance by avoiding redundant computations.
    • Modified execute_precompile to check and utilize the PrecompileCache.
  • crates/vm/levm/src/timings.rs
    • Added reset, raw_totals, and raw_counts methods to OpcodeTimings and PrecompilesTimings for better benchmark management.
  • crates/vm/levm/src/vm.rs
    • Optimized Substate methods (add_selfdestruct, add_accessed_slot, add_accessed_address, add_created_account) with early returns to check self-state before parent.
    • Added opcode_recorder field to VM for debugger integration, feature-gated by tokamak-debugger.
    • Refactored the main run_execution loop into a shared interpreter_loop to support bounded execution for JIT sub-calls.
    • Integrated opcode_recorder calls within the interpreter_loop to capture step records.
  • crates/vm/lib.rs
    • Exported PrecompileCache from ethrex_levm::precompiles.
  • crates/vm/tracing.rs
    • Added setup_env_for_tx method to Evm to build the execution environment for a transaction, useful for replaying.
  • dashboard/.gitignore
    • Added a new .gitignore file for the dashboard directory.
  • dashboard/astro.config.ts
    • Added a new Astro configuration file for the dashboard, including React and Tailwind integrations.
  • dashboard/fixtures/2026-02-20/818e015fe-bench.json
    • Added a new benchmark fixture for opcode timings on February 20, 2026.
  • dashboard/fixtures/2026-02-20/818e015fe-cross-client.json
    • Added a new cross-client benchmark fixture for various scenarios on February 20, 2026.
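The external IP detection listed for crates/networking/p2p/discv5/server.rs can be pictured roughly as follows. This is a minimal sketch, not the PR's code: the names record_ip_vote, finalize_ip_vote_round and is_private_ip come from the summary above, but their signatures, the IpVotes struct, and the min_votes threshold are assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Returns true for addresses that should not count as an external IP.
fn is_private_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_private() || v4.is_loopback() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => {
            let first = v6.segments()[0];
            v6.is_loopback()
                || v6.is_unspecified()
                || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
                || (first & 0xffc0) == 0xfe80 // link local, fe80::/10
        }
    }
}

/// Hypothetical tally of the IPs that peers report back in PONG messages.
#[derive(Default)]
struct IpVotes {
    votes: HashMap<IpAddr, u32>,
}

impl IpVotes {
    /// Counts one vote, ignoring private addresses.
    fn record_ip_vote(&mut self, reported: IpAddr) {
        if is_private_ip(reported) {
            return;
        }
        *self.votes.entry(reported).or_insert(0) += 1;
    }

    /// Ends the round: returns the most-voted IP if it reached `min_votes`,
    /// then clears the tally. Ties are broken arbitrarily in this sketch.
    fn finalize_ip_vote_round(&mut self, min_votes: u32) -> Option<IpAddr> {
        let winner = self
            .votes
            .iter()
            .max_by_key(|(_, count)| **count)
            .filter(|(_, count)| **count >= min_votes)
            .map(|(ip, _)| *ip);
        self.votes.clear();
        winner
    }
}
```

In this sketch a caller would update the local ENR only when finalize_ip_vote_round returns Some; how the PR handles vote periods and the first completed round is not shown here.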
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/pr-main_mdbook.yml
Activity
  • Updated STATUS.md with details on the live reentrancy E2E pipeline.
  • Added a comprehensive end-to-end test for the live reentrancy full detection pipeline.
  • Implemented a reentrancy E2E demo with prefilter alert mode.
  • Developed the Sentinel dashboard components (H-4/H-5).
  • Implemented the core Sentinel real-time attack detection features (H-1/H-2/H-3).
  • Refactored the autopsy branch to isolate it from the three-pillars architecture.
  • Implemented production hardening for the autopsy lab across four phases.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This is an impressive and substantial feature addition, introducing the Smart Contract Autopsy Lab and Sentinel real-time detection system. The changes are extensive, touching everything from the core blockchain logic and networking to the CI/CD pipeline, and introducing the comprehensive tokamak-debugger crate. The architectural changes, such as the BlockObserver trait for Sentinel integration and the highly parallelized state merkleization using Block Access Lists, are well-designed. The performance and robustness improvements, including the fast-path interpreter loop, precompile caching, and the enhanced proof coordinator logic, are excellent. My review focuses on a couple of opportunities to improve maintainability by reducing code duplication and simplifying some logic.

Comment on lines +1932 to +1948
// Clone block + receipts for observer before store_block consumes them
let observer_data = self
    .block_observer
    .as_ref()
    .map(|_| (block.clone(), res.receipts.clone()));

let merkleized = Instant::now();
let result = self.store_block(block, account_updates_list, res);
let stored = Instant::now();

// Notify observer after successful store
if result.is_ok()
    && let Some((block_clone, receipts)) = observer_data
    && let Some(observer) = &self.block_observer
{
    observer.on_block_committed(block_clone, receipts);
}

medium

This block of code for notifying the block_observer is duplicated in add_block_pipeline (lines 2034-2050). To improve maintainability and reduce redundancy, consider refactoring this logic into a private helper method.

For example, you could create a method like notify_observer(&self, block: &Block, receipts: &[Receipt]) that you can call from both places.
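One way the suggested helper could look is sketched below. Block, Receipt, Blockchain and RecordingObserver are simplified stand-ins rather than the real ethrex types, and observer_snapshot is a hypothetical second helper. The by-value on_block_committed signature is inferred from the snippet under review. Because store_block consumes the block there, this sketch clones before the store and notifies after it instead of taking references as in the example signature.

```rust
use std::sync::{Arc, Mutex};

// Stand-in types; the real Block and Receipt live in ethrex.
#[derive(Clone, Debug, PartialEq)]
struct Block {
    number: u64,
}

#[derive(Clone, Debug, PartialEq)]
struct Receipt {
    succeeded: bool,
}

trait BlockObserver: Send + Sync {
    fn on_block_committed(&self, block: Block, receipts: Vec<Receipt>);
}

struct Blockchain {
    block_observer: Option<Arc<dyn BlockObserver>>,
}

impl Blockchain {
    /// Clones block and receipts only when an observer is registered.
    fn observer_snapshot(&self, block: &Block, receipts: &[Receipt]) -> Option<(Block, Vec<Receipt>)> {
        self.block_observer
            .as_ref()
            .map(|_| (block.clone(), receipts.to_vec()))
    }

    /// Notifies the observer, but only if the store step succeeded.
    fn notify_observer<E>(&self, result: &Result<(), E>, snapshot: Option<(Block, Vec<Receipt>)>) {
        if result.is_err() {
            return;
        }
        if let (Some(observer), Some((block, receipts))) = (&self.block_observer, snapshot) {
            observer.on_block_committed(block, receipts);
        }
    }
}

/// Test observer that records the numbers of committed blocks.
#[derive(Default)]
struct RecordingObserver {
    seen: Mutex<Vec<u64>>,
}

impl BlockObserver for RecordingObserver {
    fn on_block_committed(&self, block: Block, _receipts: Vec<Receipt>) {
        self.seen.lock().unwrap().push(block.number);
    }
}
```

Both call sites would then shrink to one observer_snapshot call before the store and one notify_observer call after it.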

Comment on lines +441 to +467
    let risc0_bytes = proofs
        .get(&ProverType::RISC0)
        .map(|proof| proof.calldata())
        .unwrap_or(ProverType::RISC0.empty_calldata())
        .into_iter()
        .next()
        .unwrap_or(Value::Bytes(vec![].into()));
    risc0_array.push(risc0_bytes);

    let sp1_bytes = proofs
        .get(&ProverType::SP1)
        .map(|proof| proof.calldata())
        .unwrap_or(ProverType::SP1.empty_calldata())
        .into_iter()
        .next()
        .unwrap_or(Value::Bytes(vec![].into()));
    sp1_array.push(sp1_bytes);

    let tdx_bytes = proofs
        .get(&ProverType::TDX)
        .map(|proof| proof.calldata())
        .unwrap_or(ProverType::TDX.empty_calldata())
        .into_iter()
        .next()
        .unwrap_or(Value::Bytes(vec![].into()));
    tdx_array.push(tdx_bytes);
}

medium

The logic to extract the calldata bytes for each prover type is a bit complex and could be simplified for better readability and robustness. The chain of map, unwrap_or, into_iter, next, and another unwrap_or is hard to follow.

You could simplify this by using and_then and unwrap_or_else for a more direct approach. This would make the intent clearer and reduce the chance of panics if the behavior of empty_calldata() were to change in the future.

Suggested change
let risc0_bytes = proofs.get(&ProverType::RISC0)
    .and_then(|p| p.calldata().into_iter().next())
    .unwrap_or_else(|| Value::Bytes(Default::default()));
risc0_array.push(risc0_bytes);
let sp1_bytes = proofs.get(&ProverType::SP1)
    .and_then(|p| p.calldata().into_iter().next())
    .unwrap_or_else(|| Value::Bytes(Default::default()));
sp1_array.push(sp1_bytes);
let tdx_bytes = proofs.get(&ProverType::TDX)
    .and_then(|p| p.calldata().into_iter().next())
    .unwrap_or_else(|| Value::Bytes(Default::default()));
tdx_array.push(tdx_bytes);
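Since the three branches differ only in the prover type, the suggestion could be folded into one helper. The sketch below uses stand-in definitions of ProverType, Value and BatchProof (the real ones live in the ethrex L2 crates), and first_calldata_or_empty is a hypothetical name. Like the suggestion above, it assumes that falling back to empty bytes is an acceptable replacement for the first element of empty_calldata().

```rust
use std::collections::HashMap;

// Stand-ins for the real ethrex types, kept minimal for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum ProverType {
    RISC0,
    SP1,
    TDX,
}

#[derive(Clone, Debug, PartialEq)]
enum Value {
    Bytes(Vec<u8>),
}

struct BatchProof {
    calldata: Vec<Value>,
}

impl BatchProof {
    fn calldata(&self) -> Vec<Value> {
        self.calldata.clone()
    }
}

/// Returns the first calldata value of the given prover's proof,
/// or empty bytes when the proof is missing or has no calldata.
fn first_calldata_or_empty(proofs: &HashMap<ProverType, BatchProof>, prover: ProverType) -> Value {
    proofs
        .get(&prover)
        .and_then(|proof| proof.calldata().into_iter().next())
        .unwrap_or_else(|| Value::Bytes(Vec::new()))
}
```

Each call site would then read, for example, risc0_array.push(first_calldata_or_empty(&proofs, ProverType::RISC0)).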

@github-actions

github-actions bot commented Feb 28, 2026

Benchmark Results Comparison

No significant difference was registered for any benchmark run.

Detailed Results

Benchmark Results: BubbleSort

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| main_revm_BubbleSort | 2.981 ± 0.065 | 2.939 | 3.148 | 1.07 ± 0.02 |
| main_levm_BubbleSort | 2.777 ± 0.024 | 2.724 | 2.808 | 1.00 |
| pr_revm_BubbleSort | 2.972 ± 0.033 | 2.941 | 3.021 | 1.07 ± 0.01 |
| pr_levm_BubbleSort | 2.777 ± 0.036 | 2.744 | 2.857 | 1.00 ± 0.02 |

Benchmark Results: ERC20Approval

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_ERC20Approval | 990.9 ± 15.9 | 972.6 | 1014.1 | 1.01 ± 0.02 |
| main_levm_ERC20Approval | 1059.1 ± 15.3 | 1047.6 | 1101.4 | 1.08 ± 0.02 |
| pr_revm_ERC20Approval | 980.2 ± 14.9 | 967.9 | 1008.9 | 1.00 |
| pr_levm_ERC20Approval | 1050.9 ± 13.0 | 1028.1 | 1079.3 | 1.07 ± 0.02 |

Benchmark Results: ERC20Mint

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_ERC20Mint | 133.6 ± 1.5 | 131.5 | 136.2 | 1.01 ± 0.02 |
| main_levm_ERC20Mint | 161.2 ± 2.3 | 158.1 | 163.7 | 1.22 ± 0.02 |
| pr_revm_ERC20Mint | 132.0 ± 1.4 | 130.0 | 134.9 | 1.00 |
| pr_levm_ERC20Mint | 161.0 ± 2.2 | 158.9 | 165.5 | 1.22 ± 0.02 |

Benchmark Results: ERC20Transfer

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_ERC20Transfer | 236.3 ± 4.5 | 231.0 | 245.1 | 1.02 ± 0.02 |
| main_levm_ERC20Transfer | 270.2 ± 1.7 | 267.4 | 273.1 | 1.17 ± 0.01 |
| pr_revm_ERC20Transfer | 231.1 ± 1.4 | 227.8 | 232.7 | 1.00 |
| pr_levm_ERC20Transfer | 271.2 ± 3.7 | 265.1 | 278.3 | 1.17 ± 0.02 |

Benchmark Results: Factorial

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_Factorial | 229.0 ± 1.0 | 226.7 | 230.5 | 1.00 |
| main_levm_Factorial | 260.8 ± 3.0 | 256.9 | 267.4 | 1.14 ± 0.01 |
| pr_revm_Factorial | 230.2 ± 1.3 | 228.3 | 232.3 | 1.01 ± 0.01 |
| pr_levm_Factorial | 261.4 ± 3.1 | 258.7 | 268.9 | 1.14 ± 0.01 |

Benchmark Results: FactorialRecursive

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| main_revm_FactorialRecursive | 1.656 ± 0.060 | 1.544 | 1.728 | 1.00 |
| main_levm_FactorialRecursive | 9.620 ± 0.030 | 9.588 | 9.674 | 5.81 ± 0.21 |
| pr_revm_FactorialRecursive | 1.657 ± 0.050 | 1.544 | 1.723 | 1.00 ± 0.05 |
| pr_levm_FactorialRecursive | 9.586 ± 0.010 | 9.572 | 9.602 | 5.79 ± 0.21 |

Benchmark Results: Fibonacci

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_Fibonacci | 210.1 ± 1.5 | 206.6 | 211.3 | 1.00 |
| main_levm_Fibonacci | 236.9 ± 5.2 | 229.3 | 250.0 | 1.13 ± 0.03 |
| pr_revm_Fibonacci | 211.2 ± 2.2 | 208.1 | 216.8 | 1.00 ± 0.01 |
| pr_levm_Fibonacci | 236.1 ± 3.6 | 227.7 | 240.8 | 1.12 ± 0.02 |

Benchmark Results: FibonacciRecursive

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_FibonacciRecursive | 866.4 ± 11.7 | 850.4 | 887.0 | 1.24 ± 0.02 |
| main_levm_FibonacciRecursive | 698.0 ± 7.4 | 689.2 | 714.1 | 1.00 |
| pr_revm_FibonacciRecursive | 842.0 ± 10.0 | 828.9 | 864.6 | 1.21 ± 0.02 |
| pr_levm_FibonacciRecursive | 702.2 ± 19.4 | 682.8 | 754.1 | 1.01 ± 0.03 |

Benchmark Results: ManyHashes

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_ManyHashes | 8.8 ± 0.1 | 8.7 | 9.0 | 1.01 ± 0.01 |
| main_levm_ManyHashes | 10.1 ± 0.1 | 10.0 | 10.3 | 1.16 ± 0.02 |
| pr_revm_ManyHashes | 8.7 ± 0.1 | 8.6 | 8.9 | 1.00 |
| pr_levm_ManyHashes | 9.9 ± 0.1 | 9.8 | 10.3 | 1.14 ± 0.02 |

Benchmark Results: MstoreBench

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_MstoreBench | 270.7 ± 4.2 | 267.5 | 279.7 | 1.20 ± 0.02 |
| main_levm_MstoreBench | 225.9 ± 2.7 | 224.1 | 233.1 | 1.00 |
| pr_revm_MstoreBench | 271.8 ± 16.9 | 265.3 | 320.0 | 1.20 ± 0.08 |
| pr_levm_MstoreBench | 227.6 ± 3.0 | 224.4 | 233.9 | 1.01 ± 0.02 |

Benchmark Results: Push

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_Push | 295.7 ± 2.9 | 293.3 | 303.2 | 1.01 ± 0.01 |
| main_levm_Push | 298.2 ± 3.8 | 294.8 | 305.7 | 1.01 ± 0.01 |
| pr_revm_Push | 294.0 ± 0.7 | 293.2 | 295.2 | 1.00 |
| pr_levm_Push | 299.0 ± 4.0 | 294.8 | 305.8 | 1.02 ± 0.01 |

Benchmark Results: SstoreBench_no_opt

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| main_revm_SstoreBench_no_opt | 173.4 ± 14.9 | 164.9 | 215.2 | 1.58 ± 0.14 |
| main_levm_SstoreBench_no_opt | 109.8 ± 1.2 | 108.8 | 112.8 | 1.00 |
| pr_revm_SstoreBench_no_opt | 167.9 ± 2.5 | 164.5 | 171.6 | 1.53 ± 0.03 |
| pr_levm_SstoreBench_no_opt | 110.8 ± 1.7 | 109.1 | 113.3 | 1.01 ± 0.02 |

@Zena-park
Member

@cd4761 Please change the base branch of this PR from main to tokamak-dev.
Same issue as PR #2 — you can do this via the GitHub UI Edit button or by running gh pr edit 5 --base tokamak-dev.

…ptive pipeline, and auto-pause

Expand the Sentinel real-time attack detection system with four H-6
sub-features, plus security hardening from code review:

H-6a (CLI & Configuration):
- TOML SentinelFullConfig with 6 sub-configs (sentinel, analysis,
  alert, dedup, rate-limit, auto-pause)
- load_config/merge_cli_overrides/validate functions
- 6 --sentinel.* CLI flags in ethrex cmd
- init_sentinel() bootstrap returning SentinelComponents

H-6b (Mempool Monitoring):
- MempoolPreFilter with 5 calldata heuristics (flash-loan selector,
  high-value DeFi, high-gas known contract, suspicious creation,
  multicall pattern)
- MempoolObserver trait in ethrex-blockchain with hooks in
  add_transaction_to_pool/add_blob_transaction_to_pool
- MempoolAlert/MempoolSuspicionReason types

H-6c (Adaptive Pipeline):
- AnalysisStep trait + StepResult (Continue/Dismiss/AddSteps)
- FeatureVector (16 numerical features)
- 6 pipeline steps (FlashLoan/Reentrancy/PriceManip/AccessControl/
  FundFlow/AnomalyDetection)
- StatisticalAnomalyDetector (z-score + sigmoid confidence)
- AnalysisPipeline orchestrator with PipelineMetrics
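
The z-score plus sigmoid idea behind StatisticalAnomalyDetector can be sketched for a single feature as shown here. This is an illustration under assumptions, not the PR's implementation: RunningStats, the use of Welford's running variance, the z_threshold parameter, and the exact sigmoid mapping are all hypothetical, and the real detector works on a 16-feature FeatureVector.

```rust
/// Running mean and variance for one feature (Welford's algorithm).
#[derive(Default)]
struct RunningStats {
    count: u64,
    mean: f64,
    m2: f64, // sum of squared deviations from the running mean
}

impl RunningStats {
    /// Folds one observation into the running statistics.
    fn update(&mut self, x: f64) {
        self.count += 1;
        let delta = x - self.mean;
        self.mean += delta / self.count as f64;
        self.m2 += delta * (x - self.mean);
    }

    /// Sample standard deviation; 0.0 until two samples are seen.
    fn std_dev(&self) -> f64 {
        if self.count < 2 {
            0.0
        } else {
            (self.m2 / (self.count - 1) as f64).sqrt()
        }
    }

    /// Distance of `x` from the mean in standard deviations.
    fn z_score(&self, x: f64) -> f64 {
        let sd = self.std_dev();
        if sd == 0.0 {
            0.0
        } else {
            (x - self.mean) / sd
        }
    }
}

/// Maps |z| to a confidence in (0, 1); exactly 0.5 when |z| == z_threshold.
fn sigmoid_confidence(z: f64, z_threshold: f64) -> f64 {
    1.0 / (1.0 + (-(z.abs() - z_threshold)).exp())
}
```

With a threshold of 3.0, a value at the historical mean maps to a confidence near 0.05, while a value dozens of standard deviations away maps to a confidence close to 1.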

H-6d (Auto Pause):
- PauseController (AtomicBool + Condvar + auto-resume timer) in
  ethrex-blockchain with check_pause() in add_block
- AutoPauseHandler (AlertHandler circuit breaker with configurable
  score threshold)
- sentinel_resume/sentinel_status JSON-RPC endpoints
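
A pause gate built from AtomicBool and Condvar might look like the sketch below. It is a simplified assumption of how PauseController could work, not the PR's code: the field names and method signatures are hypothetical, auto-resume is modeled as a deadline checked by the waiting thread rather than a separate timer, and the fail-open path prints to stderr instead of using the node's logger.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Condvar, Mutex};
use std::time::{Duration, Instant};

struct PauseController {
    paused: AtomicBool,
    // Deadline after which a pause lifts itself; None while running.
    resume_at: Mutex<Option<Instant>>,
    resumed: Condvar,
}

impl PauseController {
    fn new() -> Self {
        Self {
            paused: AtomicBool::new(false),
            resume_at: Mutex::new(None),
            resumed: Condvar::new(),
        }
    }

    fn is_paused(&self) -> bool {
        self.paused.load(Ordering::SeqCst)
    }

    /// Pauses block processing until `resume` or the auto-resume deadline.
    fn pause(&self, auto_resume_after: Duration) {
        match self.resume_at.lock() {
            Ok(mut deadline) => {
                *deadline = Some(Instant::now() + auto_resume_after);
                self.paused.store(true, Ordering::SeqCst);
            }
            // Fail open: a poisoned lock must never leave the node paused.
            Err(_) => self.fail_open(),
        }
    }

    fn resume(&self) {
        self.paused.store(false, Ordering::SeqCst);
        if let Ok(mut deadline) = self.resume_at.lock() {
            *deadline = None;
        }
        self.resumed.notify_all();
    }

    /// Blocks while paused; returns once resumed or the deadline passes.
    fn check_pause(&self) {
        if !self.is_paused() {
            return; // fast path: one atomic load when not paused
        }
        let Ok(mut deadline) = self.resume_at.lock() else {
            return self.fail_open();
        };
        while self.is_paused() {
            let remaining = match *deadline {
                Some(at) => at.saturating_duration_since(Instant::now()),
                None => Duration::ZERO,
            };
            if remaining.is_zero() {
                // Auto-resume: the deadline has passed.
                self.paused.store(false, Ordering::SeqCst);
                *deadline = None;
                self.resumed.notify_all();
                break;
            }
            match self.resumed.wait_timeout(deadline, remaining) {
                Ok((guard, _)) => deadline = guard,
                Err(_) => return self.fail_open(),
            }
        }
    }

    fn fail_open(&self) {
        eprintln!("pause lock poisoned; unpausing (fail-open)");
        self.paused.store(false, Ordering::SeqCst);
    }
}
```

In this sketch the block-import path would call check_pause() before processing each block, and an alert handler or the sentinel_resume endpoint would call pause or resume.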

Security fixes from code review:
- Move sentinel_resume to authrpc-only (was exposed on public HTTP)
- Bound dynamic pipeline step queue to MAX_DYNAMIC_STEPS=64
- Use combined confidence score (max of prefilter, pipeline) in alerts
- PauseController fail-open on lock poisoning (log + unpause, no panic)
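
Two of the fixes above, the bounded dynamic step queue and the combined confidence score, are easy to show in miniature. The sketch below reuses the names AnalysisStep, StepResult, and MAX_DYNAMIC_STEPS from the commit message, but the trait signature, the `Context` type, and the `run_pipeline` function are hypothetical. The real orchestrator may be organised differently.

```rust
use std::collections::VecDeque;

// Upper bound on steps that running steps may enqueue, as in the fix above.
const MAX_DYNAMIC_STEPS: usize = 64;

// Hypothetical shared state passed through the pipeline.
struct Context {
    confidence: f64,
    steps_run: usize,
}

enum StepResult {
    Continue,
    Dismiss,
    AddSteps(Vec<Box<dyn AnalysisStep>>),
}

trait AnalysisStep {
    fn run(&self, ctx: &mut Context) -> StepResult;
}

// Run steps in FIFO order. Returns None when a step dismisses the
// transaction, otherwise the pipeline's confidence.
fn run_pipeline(initial: Vec<Box<dyn AnalysisStep>>, ctx: &mut Context) -> Option<f64> {
    let mut queue: VecDeque<Box<dyn AnalysisStep>> = initial.into();
    let mut dynamic_added = 0usize;
    while let Some(step) = queue.pop_front() {
        ctx.steps_run += 1;
        match step.run(ctx) {
            StepResult::Continue => {}
            StepResult::Dismiss => return None,
            StepResult::AddSteps(new_steps) => {
                for s in new_steps {
                    // Drop anything past the bound so a step that keeps
                    // scheduling follow-ups cannot grow the queue forever.
                    if dynamic_added >= MAX_DYNAMIC_STEPS {
                        break;
                    }
                    queue.push_back(s);
                    dynamic_added += 1;
                }
            }
        }
    }
    Some(ctx.confidence)
}

// Alerts carry the stronger of the two signals.
fn combined_confidence(prefilter: f64, pipeline: f64) -> f64 {
    prefilter.max(pipeline)
}
```

Without the bound, a step that always returns AddSteps would keep the loop alive indefinitely. With it, the worst case is the initial steps plus 64 dynamic ones.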

Tests: 310 passing + 10 ignored (debugger), 5 passing (PauseController)
Clippy: clean on tokamak-debugger, ethrex-blockchain, ethrex (default)

Add Tokamak-specific features to root README.md (debugger, autopsy lab,
sentinel, benchmarking) and update docs/tokamak/README.md with Feature
#4 (Autopsy Lab) and #5 (Sentinel) in the feature table and competitive
positioning.

Add sentinel_dashboard_demo.rs, a mini HTTP+WS server (Axum on port 3001)
that serves the 3 endpoints expected by the Astro+React dashboard:

- GET /sentinel/metrics  — 4-field JSON metrics snapshot
- GET /sentinel/history  — paginated alert history with filters
- GET /sentinel/ws       — WebSocket real-time alert feed

Key design:
- SuspicionReason remapping from Rust externally-tagged enum to
  dashboard's {type, details} format
- AlertQueryResult field mapping (total_count → total)
- Background block generator (3s cycle, 3 TX patterns)
- CORS permissive for cross-origin dashboard access
- axum(ws), tower-http(cors), tokio(full) added to sentinel feature
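
The SuspicionReason remapping in the design notes above converts an externally-tagged enum, where the variant name is the JSON key, into a flat {type, details} object. A std-only sketch of that conversion is shown below. The variants, their fields, and the `DashboardReason` type are hypothetical, and the JSON is rendered by hand only to keep the example free of dependencies.

```rust
// Hypothetical variants; the PR's actual SuspicionReason is not shown here.
#[derive(Debug)]
enum SuspicionReason {
    FlashLoanSignature { provider: String },
    HighValueWithRevert { value_wei: u128, gas_used: u64 },
    SelfDestructDetected,
}

// Shape the dashboard expects: a variant name plus a bag of details.
#[derive(Debug, PartialEq)]
struct DashboardReason {
    kind: &'static str,                   // emitted as "type"
    details: Vec<(&'static str, String)>, // values are pre-rendered JSON
}

fn remap(reason: &SuspicionReason) -> DashboardReason {
    match reason {
        SuspicionReason::FlashLoanSignature { provider } => DashboardReason {
            kind: "FlashLoanSignature",
            // Assumes a hex address, so no JSON escaping is applied.
            details: vec![("provider", format!("\"{}\"", provider))],
        },
        SuspicionReason::HighValueWithRevert { value_wei, gas_used } => DashboardReason {
            kind: "HighValueWithRevert",
            details: vec![
                // Wei amounts can exceed JavaScript's safe integer range,
                // so the value is sent as a string.
                ("value_wei", format!("\"{}\"", value_wei)),
                ("gas_used", gas_used.to_string()),
            ],
        },
        SuspicionReason::SelfDestructDetected => DashboardReason {
            kind: "SelfDestructDetected",
            details: Vec::new(),
        },
    }
}

impl DashboardReason {
    fn to_json(&self) -> String {
        let fields: Vec<String> = self
            .details
            .iter()
            .map(|(k, v)| format!("\"{}\":{}", k, v))
            .collect();
        format!(
            "{{\"type\":\"{}\",\"details\":{{{}}}}}",
            self.kind,
            fields.join(",")
        )
    }
}
```

In a serde-based codebase the same shape is what the adjacently tagged representation, `#[serde(tag = "type", content = "details")]`, produces. Whether the demo server uses that attribute or a manual mapping is not stated in the commit message.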
@jason-h23 force-pushed the feat/tokamak-autopsy branch from 9a7fe47 to 03fc185 on March 2, 2026 14:43
Zena-park added a commit that referenced this pull request Mar 16, 2026
…ntainability

- CORS: restrict origin to Tauri dev/prod allowlist (Copilot #1)
- open-url: use execFile with arg arrays instead of shell exec (Copilot #2)
- fs browse: restrict path traversal to home directory (Copilot #3)
- test-e2e-fork: move RPC URL to SEPOLIA_RPC_URL env var (Copilot #4)
- docker-remote: clear timeout on stream close, close stream on timeout (Copilot #5)
- docker-remote: add shell quoting (q()) and assertSafeName for all
  interpolated shell args to prevent injection (Copilot #6-8)
- genesis.rs: add ChainConfig::validate() for pre-startup checks (Copilot #9)
- listings.js: use named params (@id, @name, ...) instead of 30
  positional ? args for upsertListing (Gemini #1)
Zena-park added a commit that referenced this pull request Mar 17, 2026
- Return 503 instead of {exists:false} on check endpoint errors (#1)
- Sanitize all error messages — log internally, return generic to client (#2,#3,#4,#5)
- Add serverless rate limit limitation comment (#6)
- Add console.warn to all empty catch blocks in github-pr.ts (#9,#10,#11)
- Note: params as Promise is correct for Next.js 15 (#7,#8)
Zena-park added a commit that referenced this pull request Mar 17, 2026
* feat(platform): add appchain registry API routes to Next.js client

Port Express server appchain-registry endpoints to Next.js API routes
for Vercel deployment:
- GET /api/appchain-registry/check/[l1ChainId]/[stackType]/[identityAddress]
- POST /api/appchain-registry/submit
- GET /api/appchain-registry/status/[prNumber]

Shared logic in lib/appchain-registry.ts and lib/github-pr.ts.

* fix: address PR #67 code review feedback

- Return 503 instead of {exists:false} on check endpoint errors (#1)
- Sanitize all error messages — log internally, return generic to client (#2,#3,#4,#5)
- Add serverless rate limit limitation comment (#6)
- Add console.warn to all empty catch blocks in github-pr.ts (#9,#10,#11)
- Note: params as Promise is correct for Next.js 15 (#7,#8)

* fix: address additional Copilot PR #67 review feedback

- Add AbortSignal.timeout(15s) to all GitHub API fetch calls
- Fix authHeaders error message to list both accepted env vars
- Distinguish RPC errors (502) from permission denied (403) in ownership check
- Add typeof validation for metadata.signedBy
- Add unit test suggestion acknowledged (future work)

9 participants