-
Notifications
You must be signed in to change notification settings - Fork 35
Description
Summary
There is a data race between BaseApp.Simulate() and BaseApp.Commit() when they access the IAVL tree concurrently. This race is not covered by the checkStateMu mutex added in #715, which only protects the CreateQueryContext() path.
Race trace
Reproduced by running go test -v -race -run TestV2SubmitMethods -count=3 github.com/celestiaorg/celestia-app/v8/pkg/user/v2 after bumping to cosmos-sdk v0.52.2.
Read side — Simulate() → getContextForTx() → GetConsensusParams() reads the IAVL tree:
Read at 0x00c005c0a420 by goroutine 1190:
github.com/cosmos/iavl.(*MutableTree).Get()
mutable_tree.go:174
cosmossdk.io/store/iavl.(*Store).Get()
iavl/store.go:209
cosmossdk.io/store/cachekv.(*Store).Get()
cachekv/store.go:61
cosmossdk.io/store/gaskv.(*Store).Get()
gaskv/store.go:37
github.com/cosmos/cosmos-sdk/runtime.coreKVStore.Get()
runtime/store.go:65
cosmossdk.io/collections.Map[...].Get()
map.go:81
cosmossdk.io/collections.Item[...].Get()
item.go:31
github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).GetConsensusParams()
baseapp/baseapp.go:552
github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).getContextForTx()
baseapp/baseapp.go:694
github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).runTx()
baseapp/baseapp.go:892
github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).Simulate()
baseapp/test_helpers.go:28
Write side — Commit() writes to the same IAVL tree:
Previous write at 0x00c005c0a420 by goroutine 505:
github.com/cosmos/iavl.(*MutableTree).SaveVersion()
mutable_tree.go:778
cosmossdk.io/store/iavl.(*Store).Commit()
iavl/store.go:135
cosmossdk.io/store/rootmulti.commitStores()
rootmulti/store.go:1193
cosmossdk.io/store/rootmulti.(*Store).Commit()
rootmulti/store.go:489
github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).Commit()
baseapp/abci.go:968
Root cause
PR #715 added checkStateMu to protect the CreateQueryContext() path from racing with Commit(). However, the Simulate() path goes through a different code path:
Simulate() → runTx(execModeSimulate) → getContextForTx() → GetConsensusParams()
getContextForTx() creates a context from checkState and then reads the IAVL store (via GetConsensusParams()) without holding checkStateMu. Meanwhile, Commit() holds checkStateMu while calling cms.Commit() which writes to the same IAVL tree via SaveVersion().
Suggested fix
Extend checkStateMu to also protect the Simulate() path, or at minimum protect the getContextForTx() / runTx() calls when they read from the IAVL store during simulation. For example, acquiring a read lock on checkStateMu in Simulate() would prevent it from running concurrently with Commit().
Upstream reference
- celestia-app issue: v2 SubmitMethods data race celestia-app#6776
- Prior fix for
CreateQueryContextrace: fix: protect lastCommitInfo race at baseapp level #715 - Prior fix for store-level
lastCommitInforace: fix: data race between gRPC queries and Commit #705