fix(context): persist resume_at in call_with_interpreter#85

Open
SipengXie2024 wants to merge 1 commit into paradigmxyz:main from SipengXie:fix/resume-at-persistence

@SipengXie2024

Summary

call_with_interpreter creates a temporary EvmContext from the interpreter on each invocation. When JIT code suspends at a CALL or CREATE, it writes the resume point to ecx.resume_at. However, this value was never written back to the interpreter's bytecode PC before ecx was dropped.

On the next call, ResumeAt::load(pc, code) sees pc < code.len() (since PC was never updated) and returns 0, causing the JIT function to restart from the beginning instead of resuming at the correct point.

Impact

  • Invisible for simple contracts without nested calls (e.g., the existing bench.rs benchmarks)
  • Catastrophic performance degradation for real-world contracts with multiple CALL/CREATE operations

Uniswap V2 swap benchmark results:

| Metric | Before fix | After fix |
|---|---|---|
| JIT execution | ~921 µs | ~29.5 µs |
| vs interpreter | 17x slower | 1.86x faster |

Fix

Save ecx.resume_at before the borrow ends, then persist it via interpreter.bytecode.absolute_jump(resume_at) (+8 lines).
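
The shape of the fix can be sketched with stand-in types. This is a minimal illustration, not the actual diff: `Interpreter`, `Bytecode`, `EvmContext`, `jit_fn`, and the `suspend_at` parameter below are hypothetical simplifications, and only the names `resume_at` and `absolute_jump` come from the description above.

```rust
// Hypothetical stand-ins; these are NOT the real revm / revmc types.
struct Bytecode {
    pc: usize,
}

impl Bytecode {
    // Stand-in for the `absolute_jump` call named in the fix.
    fn absolute_jump(&mut self, target: usize) {
        self.pc = target;
    }
}

struct Interpreter {
    bytecode: Bytecode,
}

// Temporary context that mutably borrows the interpreter for one call.
struct EvmContext<'a> {
    resume_at: usize,
    // Held only to model the borrow; the sketch never reads through it.
    #[allow(dead_code)]
    interpreter: &'a mut Interpreter,
}

impl<'a> EvmContext<'a> {
    fn from_interpreter(interpreter: &'a mut Interpreter) -> Self {
        EvmContext { resume_at: 0, interpreter }
    }
}

// Stand-in for compiled code that suspends and records a resume point.
fn jit_fn(ecx: &mut EvmContext<'_>, suspend_at: usize) {
    ecx.resume_at = suspend_at;
}

fn call_with_interpreter(interpreter: &mut Interpreter, suspend_at: usize) {
    // Copy `resume_at` out while `ecx` is alive; the inner scope ends the
    // mutable borrow of `interpreter`.
    let resume_at = {
        let mut ecx = EvmContext::from_interpreter(interpreter);
        jit_fn(&mut ecx, suspend_at);
        ecx.resume_at
    };
    // With the borrow released, write the value back so it outlives `ecx`.
    interpreter.bytecode.absolute_jump(resume_at);
}
```

In this sketch, calling `call_with_interpreter(&mut interp, 42)` leaves `interp.bytecode.pc` at 42. Copying the value into a local before the scope closes is what lets the write-back happen without two overlapping mutable borrows.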

Root Cause Analysis

  1. JIT suspends at CALL/CREATE → writes resume block address to ecx.resume_at
  2. call_with_interpreter returns → ecx is dropped, resume_at is lost
  3. Next invocation → EvmContext::from_interpreter_with_stack → ResumeAt::load(pc, code) → pc < code.len() → returns 0
  4. JIT restarts from the beginning of the bytecode
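
The four steps above can be traced with a toy model. Everything here is an assumption made for illustration: `load_resume_at` only mirrors the one behavior described in step 3 (an in-range `pc` yields 0), the out-of-range encoding is invented so the model has something to return, and `invoke`, `encode_resume_at`, and `CODE_LEN` are hypothetical names.

```rust
// Toy model only: names and the resume encoding below are assumptions.
const CODE_LEN: usize = 64;

/// Assumed stand-in for `ResumeAt::load`: a pc inside the bytecode means
/// "nothing to resume", which maps to entry point 0 (step 3).
fn load_resume_at(pc: usize, code_len: usize) -> usize {
    if pc < code_len {
        0
    } else {
        pc - code_len // invented encoding, used only by this model
    }
}

/// Invented inverse of `load_resume_at`, so a resume point can be stored
/// in the pc and recovered on the next invocation.
fn encode_resume_at(resume_at: usize, code_len: usize) -> usize {
    code_len + resume_at
}

struct Interpreter {
    pc: usize,
}

/// One invocation. Returns the entry point the JIT function started from,
/// then "suspends" at `suspend_at`.
fn invoke(interp: &mut Interpreter, suspend_at: usize, persist: bool) -> usize {
    // Step 3: the entry point is derived from the interpreter's pc.
    let entry = load_resume_at(interp.pc, CODE_LEN);
    // Step 1: the JIT code records where to resume.
    let resume_at = suspend_at;
    if persist {
        // Fixed behavior: write the resume point back before returning.
        interp.pc = encode_resume_at(resume_at, CODE_LEN);
    }
    // Step 2: without `persist`, `resume_at` is simply lost here.
    entry
}
```

In this model, two invocations with `persist = false` both start from entry 0 (step 4), whereas with `persist = true` the second invocation starts from the point recorded by the first.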

Note: call() (low-level API used in bench.rs) is not affected because the caller manages EvmContext lifetime and ecx.resume_at persists across calls.

`call_with_interpreter` creates a temporary `EvmContext` from the
interpreter on each invocation. When JIT code suspends at a CALL or
CREATE, it writes the resume point to `ecx.resume_at`. However, this
value was never written back to the interpreter's bytecode PC before
`ecx` was dropped.

On the next call, `ResumeAt::load(pc, code)` would see `pc < code.len()`
(since the PC was never updated) and return 0, causing the JIT function
to restart from the beginning instead of resuming at the correct point.

This bug is invisible for simple contracts without nested calls, but
causes catastrophic performance degradation for real-world contracts
with multiple CALL/CREATE operations (e.g., Uniswap swaps). In our
benchmarks, fixing this improved JIT performance from ~921µs to ~29µs
for a Uniswap V2 swap (a ~31x speedup).

The fix saves `ecx.resume_at` before the borrow ends, then persists it
via `interpreter.bytecode.absolute_jump(resume_at)`.