Conversation
Change-Id: Ic8ecdc370e8ff3eb7d99065c7b389b062a133f5e
📝 WalkthroughWalkthroughA new strict-wait mechanism is introduced to the StoreSet store dependency predictor. A configuration parameter enables the feature, which is propagated through the MemDepUnit to the StoreSet. The StoreSet tracks pending stores and strict-wait eligibility to refine load dependency predictions based on store-set strictness. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[Generated by GEM5 Performance Robot] Ideal BTB PerformanceOverall Score
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/cpu/o3/store_set.cc (1)
230-238: Clarify the strict-wait promotion logic.When
store_SSID == load_SSID(both already in the same store set), settingSSITStrict[load_index] = truepromotes the load to strict-wait mode. This means repeated violations on the same store set trigger the more conservative strict-wait behavior, which aligns with the paper's approach of escalating conservatism for problematic PCs.Consider adding a brief comment explaining this escalation policy for maintainability.
📝 Suggested documentation
} else { SSIT[load_index] = store_SSID; + // Escalate to strict-wait when load/store are already in the same + // store set (repeated violation on the same set). SSITStrict[load_index] = (store_SSID == load_SSID); DPRINTF(StoreSet, "StoreSet: Store had smaller store set: %i; " "for load %#x, store %#x\n", store_SSID, load_PC, store_PC); }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/cpu/o3/BaseO3CPU.pysrc/cpu/o3/mem_dep_unit.ccsrc/cpu/o3/store_set.ccsrc/cpu/o3/store_set.hh
🧰 Additional context used
🧬 Code graph analysis (2)
src/cpu/o3/mem_dep_unit.cc (1)
src/cpu/o3/mem_dep_unit.hh (1)
params(109-109)
src/cpu/o3/store_set.cc (3)
src/cpu/o3/store_set.hh (9)
StoreSet(71-71)StoreSet(74-74)StoreSet(77-77)load_PC(95-95)store_PC(84-84)store_PC(99-99)store_SSID(125-125)PC(105-105)squashed_num(113-113)src/cpu/o3/inst_queue.hh (1)
store_seq_num(208-208)src/cpu/o3/mem_dep_unit.hh (1)
squashed_num(152-152)
🪛 Ruff (0.14.10)
src/cpu/o3/BaseO3CPU.py
200-200: Param may be undefined, or defined from star imports
(F405)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Quick Build, Unit Tests & Smoke Test
🔇 Additional comments (16)
src/cpu/o3/mem_dep_unit.cc (2)
66-66: LGTM!The
setStrictWaitEnabledcall is appropriately placed after thedepPredobject is constructed, correctly propagating the configuration parameter to the store-set predictor.
101-103: LGTM!The
setStrictWaitEnabledcall ininit()mirrors the constructor call, ensuring the strict-wait setting is consistently applied afterdepPred.init()completes.src/cpu/o3/BaseO3CPU.py (1)
200-200: LGTM!The new
enable_storeSet_strict_waitparameter follows the established pattern and is logically grouped withenable_storeSet_train. The static analysis hint aboutParambeing undefined is a false positive—it's imported viafrom m5.params import *on line 40.src/cpu/o3/store_set.hh (3)
35-35: LGTM!The
<unordered_set>include is correctly added to support the newpendingStoresmember.
107-108: LGTM!The inline setter is simple and follows the pattern of other configuration toggles in the codebase.
152-160: LGTM!The
pendingStoresset andenableStrictWaitflag are well-documented and appropriately declared. The default value oftrueforenableStrictWaitmatches the Python configuration default.src/cpu/o3/store_set.cc (10)
57-62: LGTM!
SSITStrictis correctly resized and initialized alongsidevalidSSITin the constructor.
74-74: LGTM!
pendingStores.clear()ensures a clean state at construction.
118-127: LGTM!The
init()method correctly initializesSSITStrictand clearspendingStores, mirroring the constructor logic for the separate initialization path.
178-183: LGTM!Resetting
SSITStricttofalsewhen establishing new store-set entries is correct—strict mode should only be enabled after repeated violations on the same store set.
275-275: LGTM!Inserting into
pendingStoresbefore the SSIT validity check ensures all stores are tracked for strict-wait purposes, regardless of whether they have a valid store-set entry.
339-347: LGTM!The strict-wait logic correctly returns all pending stores when the feature is enabled and the instruction is marked strict. This provides the conservative waiting behavior described in the PR.
380-381: LGTM!Removing the store from
pendingStoresupon issue correctly maintains the set of outstanding stores.
450-450: LGTM!
SSITStrictis correctly reset alongsidevalidSSITinclear().
458-458: LGTM!Clearing
pendingStoresinclear()ensures the strict-wait tracking is reset along with other predictor state.
422-428: Current implementation is appropriate for gem5's C++17 requirement.The manual iterator-based erasure loop is the correct pattern for C++17. The
std::erase_ifsuggestion would require C++20, which is not the project's target standard.
🚀 Coremark Smoke Test Results
✅ Difftest smoke test passed! |
Change-Id: Ic8ecdc370e8ff3eb7d99065c7b389b062a133f5e
Summary by CodeRabbit
Release Notes
enable_storeSet_strict_waitconfiguration parameter to control strict-wait behavior in CPU memory dependency tracking✏️ Tip: You can customize this high-level summary in your review settings.