Skip to content

fix: handle replacement txs in TxStore#1074

Open
technicallyty wants to merge 10 commits intomainfrom
technicallyty/recheck-snapshot-invalidation
Open

fix: handle replacement txs in TxStore#1074
technicallyty wants to merge 10 commits intomainfrom
technicallyty/recheck-snapshot-invalidation

Conversation

@technicallyty
Copy link
Contributor

@technicallyty technicallyty commented Mar 11, 2026

Description

updates the TxStore to handle tx replacements. txs that get replaced as a result of an insert should invalidate all same-sender txs with a higher nonce value, as they were dependent on the original tx's execution.

Closes: STACK-2479


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • tackled an existing issue or discussed with a team member
  • left instructions on how to review the changes
  • targeted the main branch

@linear
Copy link

linear bot commented Mar 11, 2026

@greptile-apps
Copy link

greptile-apps bot commented Mar 11, 2026

Greptile Summary

This PR updates the TxStore in both the legacypool and the Cosmos recheck mempool to properly handle transaction replacements: when a tx at nonce N is replaced, all same-sender txs with nonce ≥ N are cascaded out of the store, preventing invalid dependent transactions from being included in block proposals.

Key changes:

  • legacypool/tx_store.go: RemoveTx now delegates to the new RemoveTxsFromNonce(addr, minNonce), which removes all txs for an address with nonce ≥ minNonce. t.total is correctly decremented for multi-tx removals. A minor dedup fix was also applied to AddTxs.
  • mempool/tx_store.go: CosmosTxStore gains a keys index (signer/nonce → position) and a new InvalidateFrom(tx) method that removes the replaced tx and all higher-nonce same-sender txs from the snapshot, then reindexes.
  • mempool/recheck_pool.go: Insert now calls markTxInserted instead of markTxRechecked. On a replacement, InvalidateFrom clears the stale snapshot entries and the replacement tx is intentionally left out of recheckedTxs until the next recheck cycle rebuilds it.
  • Tests added for both layers exercising the replacement invalidation and post-recheck snapshot rebuild.

One memory leak was identified in RemoveTxsFromNonce: the in-place txs[:0] filter leaves removed *types.Transaction pointers alive in the backing array's unreachable tail, preventing garbage collection for as long as the address remains in the map.

Confidence Score: 3/5

  • The PR is mostly safe to merge but contains a memory leak in RemoveTxsFromNonce that will accumulate under heavy replacement workloads.
  • The core logic for replacement invalidation is correct and well-tested across both the legacypool and Cosmos recheck mempool. The main concern is a memory leak in RemoveTxsFromNonce caused by reusing the backing array without zeroing the removed-pointer tail — this could grow unbounded in production if many replacements occur. The redundant lookup write in AddTxs is harmless but should be cleaned up.
  • mempool/txpool/legacypool/tx_store.go — the RemoveTxsFromNonce in-place filter and the redundant second lookup loop need attention.

Important Files Changed

Filename Overview
mempool/txpool/legacypool/tx_store.go Core change: RemoveTx now delegates to RemoveTxsFromNonce (cascade semantics). Memory leak found in RemoveTxsFromNonce from in-place slice filter not zeroing the dead tail. Also has a redundant double-write to the lookup map in AddTxs.
mempool/tx_store.go Adds InvalidateFrom and key-based indexing to CosmosTxStore; logic is correct and the duplicate-add guard is now a warn+return instead of panic. reindex rebuilds a fresh slice/map so no memory leak. Minor: txs with no derivable key are still appended to s.txs but not indexed, which is a silent no-op for replacement detection.
mempool/recheck_pool.go Insert path switched from markTxRechecked to markTxInserted; replacement txs invalidate same-sender higher-nonce entries in recheckedTxs but the replacement itself is intentionally not added until the next recheck cycle. Logic looks correct.
mempool/txpool/legacypool/legacypool_test.go Adds TestMarkTxRemovedInvalidatesPending which exercises the full replacement flow through the legacypool and verifies that pending txs are correctly re-promoted after reset.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant RecheckMempool
    participant CosmosTxStore
    participant LegacyTxStore
    participant ExtMempool

    Note over Caller,ExtMempool: Replacement tx insert flow
    Caller->>RecheckMempool: Insert(replacementTx)
    RecheckMempool->>RecheckMempool: ante handler check
    RecheckMempool->>ExtMempool: Insert(replacementTx)
    Note over ExtMempool: replaces old tx internally
    RecheckMempool->>RecheckMempool: markTxInserted(replacementTx)
    RecheckMempool->>CosmosTxStore: InvalidateFrom(replacementTx)
    CosmosTxStore->>CosmosTxStore: lookup key in s.keys (signer/nonce)
    alt key exists (replacement)
        CosmosTxStore->>CosmosTxStore: build nonce thresholds map
        CosmosTxStore->>CosmosTxStore: filter out all txs with nonce >= threshold
        CosmosTxStore->>CosmosTxStore: reindex remaining txs
        CosmosTxStore-->>RecheckMempool: removed > 0
        Note over RecheckMempool: replacement NOT added to snapshot yet
    else key missing (fresh insert)
        CosmosTxStore-->>RecheckMempool: 0
        RecheckMempool->>CosmosTxStore: AddTx(tx)
    end

    Note over Caller,ExtMempool: Legacypool replacement path
    Caller->>LegacyTxStore: RemoveTx(addr, oldTx)
    LegacyTxStore->>LegacyTxStore: RemoveTxsFromNonce(addr, oldTx.Nonce())
    LegacyTxStore->>LegacyTxStore: filter txs with nonce >= minNonce
    LegacyTxStore->>LegacyTxStore: delete from lookup map
    LegacyTxStore->>LegacyTxStore: total -= numRemoved

    Note over Caller,ExtMempool: Next recheck rebuilds snapshot
    Caller->>RecheckMempool: TriggerRecheckSync(newHead)
    RecheckMempool->>CosmosTxStore: StartNewHeight (clears store)
    loop each tx in ExtMempool
        RecheckMempool->>RecheckMempool: RecheckCosmos(tx)
        RecheckMempool->>CosmosTxStore: markTxRechecked (AddTx)
    end
    CosmosTxStore-->>Caller: recheckedTxs now includes replacement
Loading

Last reviewed commit: 64c1ccf

@technicallyty technicallyty marked this pull request as draft March 12, 2026 00:04
require.Equal(t, uint64(1), result[addr1][0].Tx.Nonce())
}

func TestTxStoreConcurrentRemove(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if this test makes sense to have anymore


result := store.Txs(txpool.PendingFilter{})
require.Len(t, result[addr1], 500)
require.Len(t, result[addr1], 0)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updating this bc i think the behavior we want now is if a tx is removed, all txs with a greater nonce should now be considered invalid and removed as well. they cannot be included in a block proposal.

@technicallyty technicallyty marked this pull request as ready for review March 12, 2026 04:06
@technicallyty
Copy link
Contributor Author

@greptile re-review

@codecov
Copy link

codecov bot commented Mar 12, 2026

Codecov Report

❌ Patch coverage is 83.15789% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.54%. Comparing base (43a1f17) to head (64c1ccf).

Files with missing lines Patch % Lines
mempool/tx_store.go 83.33% 4 Missing and 8 partials ⚠️
mempool/recheck_pool.go 71.42% 1 Missing and 1 partial ⚠️
mempool/txpool/legacypool/tx_store.go 87.50% 0 Missing and 2 partials ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1074      +/-   ##
==========================================
+ Coverage   65.45%   65.54%   +0.09%     
==========================================
  Files         331      331              
  Lines       23262    23346      +84     
==========================================
+ Hits        15225    15303      +78     
+ Misses       6894     6880      -14     
- Partials     1143     1163      +20     
Files with missing lines Coverage Δ
mempool/recheck_pool.go 82.10% <71.42%> (-2.24%) ⬇️
mempool/txpool/legacypool/tx_store.go 92.77% <87.50%> (-7.23%) ⬇️
mempool/tx_store.go 86.00% <83.33%> (-14.00%) ⬇️

... and 4 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@aljo242
Copy link
Contributor

aljo242 commented Mar 12, 2026

@greptile re-review

Comment on lines +153 to 170
next := txs[:0]
numRemoved := 0
for _, existing := range txs {
if existing.Nonce() >= minNonce {
delete(t.lookup, existing.Hash())
numRemoved++
continue
}
next = append(next, existing)
}
t.total -= uint64(numRemoved)

if len(next) == 0 {
delete(t.txs, addr)
return
}
t.txs[addr] = next
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Memory leak: removed pointers retained in backing array

next := txs[:0] reuses the same backing array as txs. After the in-place filter loop, positions [len(next):len(txs)] of that array still hold references to the removed *types.Transaction pointers. Because t.txs[addr] = next keeps the same underlying array alive (with cap == len(txs)), those pointer slots are never collected by the GC — they stay pinned for as long as the address has any txs in the map.

For example: if the store holds 6 txs for an address and a replacement at nonce 4 causes txs 4–6 to be removed, the backing array still holds 3 live *types.Transaction references that the GC can never reclaim.

Zero out the tail before returning to release those references:

Suggested change
next := txs[:0]
numRemoved := 0
for _, existing := range txs {
if existing.Nonce() >= minNonce {
delete(t.lookup, existing.Hash())
numRemoved++
continue
}
next = append(next, existing)
}
t.total -= uint64(numRemoved)
if len(next) == 0 {
delete(t.txs, addr)
return
}
t.txs[addr] = next
}
next := txs[:0]
numRemoved := 0
for _, existing := range txs {
if existing.Nonce() >= minNonce {
delete(t.lookup, existing.Hash())
numRemoved++
continue
}
next = append(next, existing)
}
// Zero out the dead tail so the GC can collect the removed tx pointers.
tail := txs[len(next):]
for i := range tail {
tail[i] = nil
}
t.total -= uint64(numRemoved)
if len(next) == 0 {
delete(t.txs, addr)
return
}
t.txs[addr] = next

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants