Add pre-match queue guard for internal matching engine (T2)#1424
Add pre-match queue guard for internal matching engine (T2)#1424sehyunc wants to merge 1 commit intocodex/implement-internal-match-order-validity-guardfrom
Conversation
|
Claude encountered an error —— View job I'll analyze this and get back to you. |
|
Claude encountered an error —— View job I'll analyze this and get back to you. |
8bae126 to
6506171
Compare
|
Claude encountered an error —— View job I'll analyze this and get back to you. |
6506171 to
4336acd
Compare
|
Claude encountered an error —— View job I'll analyze this and get back to you. |
4336acd to
4b102ad
Compare
|
Claude encountered an error —— View job I'll analyze this and get back to you. |
|
Closing: Investigation determined that the pre-match queue guards are redundant. The v2 state layer already handles settlement races correctly through atomic multi-queue locking in The guards were also causing a bug where See investigation in renegade-map for details. |
Problem
After a successful internal match settlement,
RunMatchingEngineHookfires two matching engine jobs — one for each party's residual order. Both jobs race to find the same counterparty and attempt settlement. The loser hits"serial preemption not allowed"because the winner already preempted the queue:PR #1420 added a post-failure
order_still_validguard that catches this and stops retrying, but the matching engine still does full candidate search, match computation, and a failed settlement attempt before bailing out.Options
Option 1: Guard at enqueue time (match v1 exactly)
Add a queue-empty check inside
RunMatchingEngineHook::run()before sending the job to the channel. Mirrors v1'sshould_run_matching_engine()which gates inside the Raft transaction that pops the settlement task.Statethrough the hook trait, changing theTaskHookinterfaceOption 2: Guard at execution time (pre-match check)
Add a queue-length check at the top of
run_internal_matching_engine()— bail immediately if the account has tasks in flight. Also check the counterparty's queue intry_settle_match()before attempting settlement.Option 3: Rely solely on PR #1420 (no additional guard)
The post-failure
order_still_validcheck already catches the busy-queue case."serial preemption not allowed") on every successful partial fill.Rationale
Chose Option 2. It eliminates the wasted work and error-level log noise with a single queue-length read at the top of the matching engine — the same check
order_still_validalready performs, just earlier. The job entering the channel is negligible cost.This matches the semantics of v1's
should_run_matching_engine()(state/src/applicator/task_queue.rs:281) without requiring v1's mechanism (Raft-atomic gating at enqueue time). In v1, the matching engine could never run unless the wallet's serial queue was confirmed empty inside the same Raft write transaction that popped the settlement task. V2's hook architecture fires unconditionally, so we add the equivalent guard at execution time instead.Changes
In
internal_engine.rs:run_internal_matching_engine(): skip if own account'sserial_tasks_queue_len > 0try_settle_match(): skip if counterparty'sserial_tasks_queue_len > 0Testing
cargo test -p matching-engine-worker— 6 tests pass (5 from PR Remove internal match order validity guard #1420 + 1 new)