Split microtage perf by CJ362ff · Pull Request #694 · OpenXiangShan/GEM5

CJ362ff · 2026-01-06T08:39:56Z

Summary by CodeRabbit

New Features
- Full MicroTAGE predictor: selectable multi-table TAGE with richer history/table parameters, bank-aware update simulation, expanded runtime statistics, UNIT_TEST hooks, and improved prediction/update flow.
Refactor
- Replaced lightweight stub with a public, fully configurable MicroTAGE interface and updated predictor wiring to use the new type.
Chores
- Registered MicroTAGE in selectable predictor list and added build sources.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

Change-Id: I8b6a4bda945564cc961251fc1297a6bacd8636cb

coderabbitai · 2026-01-06T08:40:12Z

📝 Walkthrough

Walkthrough

Replaces a lightweight BTBTAGE stub with a full MicroTAGE TimedBaseBTBPredictor (new header + source), exposes many Params and C++ mappings, wires MicroTAGE into DecoupledBPUWithBTB (member type and Param), and adds folded-history, banking, allocation/update, and extensive statistics logic.

Changes

Cohort / File(s)	Summary
SimObject declaration & build `src/cpu/pred/BranchPredictor.py`, `src/cpu/pred/SConscript`	Add public `MicroTAGE` SimObject (`type`, `cxx_class`, `cxx_header`) with many new Params (histLengths, tableSizes, TTagBitSizes, numBanks, enableBankConflict, numDelay, etc.); add `'MicroTAGE'` to `sim_objects`; include `Source('btb/microtage.cc')`.
Decoupled BPU integration & headers `src/cpu/pred/btb/decoupled_bpred.hh`, `src/cpu/pred/BranchPredictor.py`	Change `DecoupledBPUWithBTB::microtage` member type from `BTBTAGE` → `MicroTAGE`; change factory Param from `Param.BTBTAGE(...)` → `Param.MicroTAGE(...)`; add/reorder includes to reference `microtage.hh`.
MicroTAGE implementation (header & source) `src/cpu/pred/btb/microtage.hh`, `src/cpu/pred/btb/microtage.cc`	New full MicroTAGE class derived from `TimedBaseBTBPredictor`: UNIT_TEST/runtime ctors, folded-history types, per-table entries, prediction paths (generateSinglePrediction/lookupHelper/putPCHistory), update/allocation and bank-conflict logic (`canResolveUpdate`/`doResolveUpdate`), commit/update APIs, helpers (index/tag/folding), and extensive statistics and tracing.

Sequence Diagram(s)

sequenceDiagram
  participant Fetch as FetchStream
  participant Dec as DecoupledBPUWithBTB
  participant MT as MicroTAGE
  participant BTB as BTB

  Note over Fetch,Dec: Fetch-time lookup
  Fetch->>Dec: provide FetchStream + BTB entries
  Dec->>MT: putPCHistory(startAddr, history) / lookupHelper(entries)
  MT->>MT: generateSinglePrediction (folded history -> index/tag -> main/alt)
  MT-->>Dec: predictions + predMeta
  Dec->>BTB: combine predictions with BTB entries -> stage preds

  Note over Dec,MT: Update & allocation flow
  Dec->>MT: update(fetchStream)
  MT->>MT: updatePredictorStateAndCheckAllocation / handleNewEntryAllocation
  MT->>MT: canResolveUpdate / doResolveUpdate (bank conflict resolution)
  MT-->>Dec: updated state / allocation decisions
  Dec->>MT: commitBranch on retirement

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

cpu-o3:Implement predwrongSource method #683 — Modifies decoupled BTB predictor interfaces; likely touches the same DecoupledBPUWithBTB member wiring.
cpu-o3: fix bugs https://github.com/OpenXiangShan/GEM5/pull/595 #613 — Related folded-history checks and calls to checkFoldedHist used by MicroTAGE.
align RTL: tage window method, block resolveQ if update failed due to bank conflict. #644 — Overlapping bank-aware resolve/update (canResolveUpdate/doResolveUpdate) and update flow changes.

Suggested labels

perf

Suggested reviewers

jensen-yan

Poem

🐇 I hopped in with tags and tables bright,
Folded histories dancing in moonlight,
Banks and counters tap a steady beat,
Allocations shuffle on tiny feet,
MicroTAGE hums — a rabbit's delight.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 58.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The PR title 'Split microtage perf' is vague and does not clearly convey the main changes; it uses abbreviated non-descriptive terms ('Split', 'perf') that lack meaningful specificity about what aspect of MicroTAGE is being modified.	Use a more descriptive title that specifies the key change, such as 'Adjust MicroTAGE prediction timing and update behavior' or 'Optimize MicroTAGE performance configuration'.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI Agents

In @src/cpu/pred/btb/microtage.hh:
- Line 271: The header declares two member functions—setTag(Addr &dest, Addr
src, int table) and recoverFoldedHist()—but they have no definitions, causing
linker errors; either implement both in microtage.cc using the exact signatures
(implement setTag to update the tag for the given table and dest/src semantics
used by the class and implement recoverFoldedHist to restore folded history
state consistent with the rest of the predictor state), or remove their
declarations from microtage.hh if they are unused; ensure the implemented
functions match the class name and argument types exactly to resolve linker
errors.

🧹 Nitpick comments (5)

src/cpu/pred/btb/microtage.cc (3)
384-389: Type mismatch in loop comparison.

getDelay() returns an unsigned value being assigned to int s, while stagePreds.size() returns size_t. This comparison between signed and unsigned could cause issues if getDelay() exceeds stagePreds.size(), though the loop would correctly not execute in that case.
🔎 Suggested fix
-    for (int s = getDelay(); s < stagePreds.size(); s++) {
+    for (unsigned s = getDelay(); s < stagePreds.size(); s++) {
774-776: Minor: Inconsistent spacing in condition.

Minor style inconsistency.
🔎 Suggested fix
-    if (getDelay() <2){
+    if (getDelay() < 2) {
1157-1157: Minor: Inconsistent brace spacing.
🔎 Suggested fix
-    }else{
+    } else {
src/cpu/pred/btb/microtage.hh (2)
61-61: Minor: Non-ASCII character in comment.

The comment contains a Chinese comma character (，) instead of an ASCII comma.
🔎 Suggested fix
-            short counter;  // Prediction counter (-4 to 3), 3bits， 0 and -1 are weak
+            short counter;  // Prediction counter (-4 to 3), 3 bits, 0 and -1 are weak
393-393: Redundant duplicate public: access specifiers.

Lines 393 and 398 both declare public: sections consecutively. The second one at line 398 is redundant and can be removed.
🔎 Suggested fix
 public:

     // Recover folded history after misprediction
     void recoverFoldedHist(const bitset& history);

-public:
-
Also applies to: 398-398

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ecce168 and 5cfbb4f.

📒 Files selected for processing (5)

src/cpu/pred/BranchPredictor.py
src/cpu/pred/SConscript
src/cpu/pred/btb/decoupled_bpred.hh
src/cpu/pred/btb/microtage.cc
src/cpu/pred/btb/microtage.hh

🧰 Additional context used

🧬 Code graph analysis (1)

src/cpu/pred/btb/microtage.hh (2)

src/cpu/pred/btb/microtage.cc (3)

MicroTAGE (38-1176)

MicroTAGE (71-149)

MicroTAGE (151-153)

src/cpu/pred/btb/btb_ubtb.hh (1)

meta (144-147)

🪛 Cppcheck (2.19.0)

src/cpu/pred/btb/microtage.cc

[error] 119-119: Reference to temporary returned.

(returnTempReference)

[error] 132-132: Reference to temporary returned.

(returnTempReference)

[error] 125-125: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

[error] 138-138: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

🪛 Ruff (0.14.10)

src/cpu/pred/BranchPredictor.py

1072-1072: Param may be undefined, or defined from star imports

(F405)

1073-1073: Param may be undefined, or defined from star imports

(F405)

1074-1074: Param may be undefined, or defined from star imports

(F405)

1075-1075: Param may be undefined, or defined from star imports

(F405)

1076-1076: VectorParam may be undefined, or defined from star imports

(F405)

1077-1077: VectorParam may be undefined, or defined from star imports

(F405)

1078-1078: VectorParam may be undefined, or defined from star imports

(F405)

1079-1079: Param may be undefined, or defined from star imports

(F405)

1081-1081: VectorParam may be undefined, or defined from star imports

(F405)

1082-1082: Param may be undefined, or defined from star imports

(F405)

1083-1083: Param may be undefined, or defined from star imports

(F405)

1084-1084: Param may be undefined, or defined from star imports

(F405)

1085-1085: Param may be undefined, or defined from star imports

(F405)

1086-1086: Param may be undefined, or defined from star imports

(F405)

1087-1087: Param may be undefined, or defined from star imports

(F405)

1088-1088: Param may be undefined, or defined from star imports

(F405)

1089-1089: Param may be undefined, or defined from star imports

(F405)

1090-1090: Param may be undefined, or defined from star imports

(F405)

1091-1091: Param may be undefined, or defined from star imports

(F405)

1092-1092: Param may be undefined, or defined from star imports

(F405)

1182-1182: Param may be undefined, or defined from star imports

(F405)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Quick Build, Unit Tests & Smoke Test
GitHub Check: perf_test / XS-GEM5 - Run performance test (spec06-0.8c)

🔇 Additional comments (7)

src/cpu/pred/SConscript (1)

52-53: LGTM!

The build configuration correctly registers MicroTAGE as a SimObject and includes the new source file for compilation. The placement in the sim_objects list alongside other BTB predictors is appropriate.

Also applies to: 105-105

src/cpu/pred/btb/decoupled_bpred.hh (1)

19-19: LGTM!

The header correctly updates the microtage member type from BTBTAGE* to MicroTAGE* and adds the necessary include for microtage.hh. This aligns with the Python configuration changes where the parameter type is now Param.MicroTAGE.

Also applies to: 26-27, 107-107

src/cpu/pred/BranchPredictor.py (2)

1066-1092: LGTM!

The MicroTAGE class is properly defined as a full TimedBaseBTBPredictor with appropriate defaults for a micro-sized predictor (e.g., numPredictors=1, tableSizes=[512], numDelay=0). The parameter set correctly mirrors BTBTAGE while being tuned for lower latency operation alongside uBTB.

The Ruff static analysis warnings (F405) are false positives—Param and VectorParam are correctly imported via from m5.params import * at line 29, which is standard gem5 Python configuration style.

1182-1182: LGTM!

The microtage parameter correctly uses Param.MicroTAGE(MicroTAGE(), ...) instead of the previous Param.BTBTAGE(...), aligning with the new dedicated MicroTAGE type.
src/cpu/pred/btb/microtage.cc (1)
116-137: Potential undefined behavior when tableIndexBits or tableTagBits >= 64.

The ceilLog2() result is used later in mask calculations like (1ULL << tableIndexBits[t]) - 1. If tableSizes[i] is 0 or extremely large, this could lead to shifting by 64+ bits, which is undefined behavior for 64-bit values.

While current configurations use safe values (e.g., tableSizes=[512] → ~9 bits), adding defensive bounds would prevent future misconfiguration issues.
🔎 Suggested defensive check
         tableIndexBits[i] = ceilLog2(tableSizes[i]);
+        assert(tableIndexBits[i] < 64 && "Table index bits must be < 64 to avoid UB");
         tableIndexMasks[i].resize(tableIndexBits[i], true);

         assert(histLengths.size() >= numPredictors);

         assert(tableTagBits.size() >= numPredictors);
+        assert(tableTagBits[i] < 64 && "Table tag bits must be < 64 to avoid UB");
         tableTagMasks[i].resize(tableTagBits[i], true);
src/cpu/pred/btb/microtage.hh (2)

42-446: Overall structure looks good.

The MicroTAGE class is well-structured with clear separation of concerns:

Nested types (TageEntry, TageTableInfo, TagePrediction, TageMeta) are well-defined

UNIT_TEST conditional compilation allows isolated testing

Statistics tracking via TageStats provides good observability

The public API matches the TimedBaseBTBPredictor interface appropriately

239-240: Verify allocLFSR usage in implementation: The claim that allocLFSR is unused needs confirmation by searching the implementation files (.cc/.cpp) for any invocations of this member, including initialization, method calls, or indirect references through the LFSR64 interface.

src/cpu/pred/btb/microtage.hh

…cases

coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI Agents

In @src/cpu/pred/btb/microtage.cc:
- Line 1154: Fix the typo in the comment inside microtage.cc: change "tage olnly
deals with conditional branches" to "tage only deals with conditional branches"
(look for the comment text "tage olnly" near the conditional-branch handling
code in microtage).
- Line 247: Fix the typo in the inline comment that reads "only one way can be
matched, aviod multi hit, TODO: RTL how to do this?" by changing "aviod" to
"avoid" so it reads "only one way can be matched, avoid multi hit, TODO: RTL how
to do this?"; locate the comment near the break; statement in microtage.cc (the
line containing that exact comment) and update only the comment text.
- Around line 48-49: The UNIT_TEST constructor for TimedBaseBTBPredictor leaves
the member blockSize uninitialized but immediately computes blockWidth using
floorLog2(blockSize) in the initializer list (via bankIdWidth and blockWidth),
so update the UNIT_TEST constructor to accept an explicit blockSize parameter
(or explicitly initialize blockSize in its own initializer entry) and pass that
value into TimedBaseBTBPredictor()’s initializer list so that blockSize is
initialized before calling floorLog2; reference the UNIT_TEST constructor,
TimedBaseBTBPredictor(), blockSize, blockWidth, bankIdWidth and floorLog2 when
making the change.

🧹 Nitpick comments (2)

src/cpu/pred/btb/microtage.cc (2)
274-276: Simplify complex nested ternary operator.

The triple-nested ternary operator is difficult to read and reason about. Consider refactoring into explicit if-else statements for clarity.
🔎 Proposed refactor
-    bool base_taken = getDelay() != 0 ? (usingBasetable ? baseTable[base_idx][branch_idx] >= 0 : btb_entry.ctr >= 0)
-                                                                                     : btb_entry.ctr >= 0;
-    //bool base_taken = btb_entry.ctr >= 0;
+    bool base_taken;
+    if (getDelay() != 0) {
+        base_taken = usingBasetable ? baseTable[base_idx][branch_idx] >= 0 
+                                    : btb_entry.ctr >= 0;
+    } else {
+        base_taken = btb_entry.ctr >= 0;
+    }
1115-1147: Remove unused LRU functions.

The updateLRU and getLRUVictim functions are defined but never called anywhere in the codebase. Comments throughout the code (lines 509, 516) explicitly state "No LRU maintenance", indicating these functions are dead code and should be removed.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5cfbb4f and 9469e8f.

📒 Files selected for processing (1)

src/cpu/pred/btb/microtage.cc

🧰 Additional context used

🪛 Cppcheck (2.19.0)

src/cpu/pred/btb/microtage.cc

[error] 119-119: Reference to temporary returned.

(returnTempReference)

[error] 132-132: Reference to temporary returned.

(returnTempReference)

[error] 125-125: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

[error] 138-138: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Quick Build, Unit Tests & Smoke Test
GitHub Check: perf_test / XS-GEM5 - Run performance test (spec06-0.8c)

🔇 Additional comments (3)

src/cpu/pred/btb/microtage.cc (3)
810-926: LGTM: Helper functions are well-implemented.

The helper functions for counter updates, tag/index calculations, and addressing are correctly implemented with proper bounds checking and saturation arithmetic. The use of warn_once for edge cases is good defensive programming.

947-1032: LGTM: History management is correctly implemented.

The branch history update and recovery mechanisms are well-designed, with proper separation of speculative execution paths and recovery logic. The conditional debug logging is a good practice to avoid performance overhead.

773-776: Add braces to single-statement if block.

The if statement lacks braces, which can lead to errors if additional statements are added during future maintenance.
🔎 Proposed fix
-    if (getDelay() <2){
+    if (getDelay() < 2) {
         checkUtageUpdateMisspred(stream);
+    }
-    }
Likely an incorrect or invalid review comment.

src/cpu/pred/btb/microtage.cc

Change-Id: I44b6175022a3a593ed385407bd95ec0c40c74642

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (5)

src/cpu/pred/btb/microtage.cc (5)
274-276: Consider simplifying nested ternary for readability.

The triple-nested ternary on lines 274-276 makes the base_taken logic harder to follow. Consider extracting this into a small helper method or using explicit if-else branches.
🔎 Suggested refactor
-    bool base_taken = getDelay() != 0 ? (usingBasetable ? baseTable[base_idx][branch_idx] >= 0 : btb_entry.ctr >= 0)
-                                                                                     : btb_entry.ctr >= 0;
-    //bool base_taken = btb_entry.ctr >= 0;
+    // Determine base prediction source based on configuration
+    bool base_taken;
+    if (getDelay() != 0 && usingBasetable) {
+        base_taken = baseTable[base_idx][branch_idx] >= 0;
+    } else {
+        base_taken = btb_entry.ctr >= 0;
+    }
774-774: Minor spacing inconsistency.

Line 774 has getDelay() <2 without a space before the < operator. Consider getDelay() < 2 for consistency with the rest of the codebase.

932-932: Use unsigned literal for bit shift to avoid potential overflow.

Line 932 uses (1 << bankIdWidth) which could theoretically overflow if bankIdWidth >= 31 (though unlikely given typical configurations). Use (1U << bankIdWidth) or (1ULL << bankIdWidth) for type safety.
🔎 Proposed fix
-    return (pc >> bankBaseShift) & ((1 << bankIdWidth) - 1);
+    return (pc >> bankBaseShift) & ((1U << bankIdWidth) - 1);
1115-1147: Consider removing unused LRU methods.

The updateLRU and getLRUVictim methods (lines 1115-1147) are defined but not used anywhere in the code. Comments throughout the file (e.g., line 243, 509, 516) explicitly state that LRU is not used. Consider removing these methods to reduce code maintenance burden, or add a clear comment explaining why they're kept (e.g., for future use or alternative configurations).

827-827: The shift operations at lines 827 and 852 are technically safe given the current Python parameter configurations, but lack runtime bounds validation as defensive programming. Consider adding assertions in the constructor to validate that tableTagBits[i] and tableIndexBits[i] never exceed 63 to prevent undefined behavior if parameters are misconfigured in the future. Similar validation would benefit line 932 in getTageIndex for bankIdWidth.

Also applies to: 852-852

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9469e8f and a089f57.

📒 Files selected for processing (1)

src/cpu/pred/btb/microtage.cc

🧰 Additional context used

🧬 Code graph analysis (1)

src/cpu/pred/btb/microtage.cc (1)

src/cpu/pred/btb/microtage.hh (41)

MicroTAGE (49-49)

MicroTAGE (114-114)

MicroTAGE (116-116)

entry (88-90)

entry (153-153)

entry (154-154)

entry (155-155)

entry (425-428)

TageTableInfo (85-85)

TageTableInfo (86-87)

taken (283-283)

TagePrediction (104-104)

TagePrediction (106-109)

pred (384-384)

history (131-131)

history (135-136)

history (140-143)

history (140-140)

history (145-149)

history (145-146)

history (164-164)

history (204-204)

history (396-396)

stream (158-158)

stream (280-280)

stream (422-422)

TageEntry (66-66)

TageEntry (68-69)

table (442-442)

table (443-443)

counter (70-72)

max (286-286)

pc (175-175)

pc (178-178)

pc (182-182)

pc (186-186)

pc (189-191)

pc (189-189)

pc (194-194)

pc (201-201)

pc (292-292)

🪛 Cppcheck (2.19.0)

src/cpu/pred/btb/microtage.cc

[error] 119-119: Reference to temporary returned.

(returnTempReference)

[error] 132-132: Reference to temporary returned.

(returnTempReference)

[error] 125-125: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

[error] 138-138: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Quick Build, Unit Tests & Smoke Test
GitHub Check: perf_test / XS-GEM5 - Run performance test (spec06-0.8c)

github-actions · 2026-01-06T09:45:06Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1689`	-
This PR	`2.1689`	➡️ `0.0000` (`0.00%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2026-01-06T10:56:52Z

[Generated by GEM5 Performance Robot]
commit: a089f57
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	19.91	19.90	+0.02 🟢

[Generated by GEM5 Performance Robot]
commit: a089f57
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	19.91	19.90	+0.03 🟢

Change-Id: I16ce79a7d8488d9a138d0a26f5500576ae54132e

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In @src/cpu/pred/btb/microtage.cc:
- Around line 268-274: The comment says to use the base table but the code
computes base_idx then still sets base_taken from btb_entry.ctr and
unconditionally assigns alt_pred = base_taken; either replace the use of
btb_entry.ctr with the base table entry (e.g., base_taken =
baseTable[base_idx][branch_idx] >= 0) and restore the conditional alt selection
(alt_pred = alt_provided ? alt_taken : base_taken), or if the current behavior
is intentional, update the comment to state that btb_entry.ctr is used and that
alt_pred always follows base_taken; modify the comment to reflect that. Ensure
you reference getBaseTableIndex, getBranchIndexInBlock, baseTable,
btb_entry.ctr, alt_provided, alt_taken, base_taken, and alt_pred in the change.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a089f57 and 3fa08b4.

📒 Files selected for processing (2)

src/cpu/pred/BranchPredictor.py
src/cpu/pred/btb/microtage.cc

🧰 Additional context used

🧬 Code graph analysis (2)

src/cpu/pred/BranchPredictor.py (4)

src/cpu/pred/btb/microtage.cc (3)

MicroTAGE (38-1186)

MicroTAGE (71-146)

MicroTAGE (148-150)

src/cpu/pred/btb/microtage.hh (3)

MicroTAGE (49-49)

MicroTAGE (114-114)

MicroTAGE (116-116)

src/cpu/pred/btb/timed_base_pred.hh (3)

TimedBaseBTBPredictor (48-48)

TimedBaseBTBPredictor (53-53)

numDelay (77-77)

src/cpu/pred/btb/timed_base_pred.cc (2)

TimedBaseBTBPredictor (15-15)

TimedBaseBTBPredictor (18-26)

src/cpu/pred/btb/microtage.cc (2)

src/cpu/pred/btb/microtage.hh (35)

entry (88-90)

entry (153-153)

entry (154-154)

entry (155-155)

entry (425-428)

TageTableInfo (85-85)

TageTableInfo (86-87)

taken (283-283)

TagePrediction (104-104)

TagePrediction (106-109)

pred (384-384)

history (131-131)

history (135-136)

history (140-143)

history (140-140)

history (145-149)

history (145-146)

history (164-164)

history (204-204)

history (396-396)

TageEntry (66-66)

TageEntry (68-69)

table (442-442)

table (443-443)

counter (70-72)

max (286-286)

pc (175-175)

pc (178-178)

pc (182-182)

pc (186-186)

pc (189-191)

pc (189-189)

pc (194-194)

pc (201-201)

pc (292-292)

src/cpu/pred/btb/folded_hist.hh (1)

PathFoldedHist (167-167)

🪛 Cppcheck (2.19.0)

src/cpu/pred/btb/microtage.cc

[error] 119-119: Reference to temporary returned.

(returnTempReference)

[error] 132-132: Reference to temporary returned.

(returnTempReference)

[error] 125-125: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

[error] 138-138: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

🪛 Ruff (0.14.10)

src/cpu/pred/BranchPredictor.py

1072-1072: Param may be undefined, or defined from star imports

(F405)

1073-1073: Param may be undefined, or defined from star imports

(F405)

1074-1074: Param may be undefined, or defined from star imports

(F405)

1075-1075: Param may be undefined, or defined from star imports

(F405)

1076-1076: VectorParam may be undefined, or defined from star imports

(F405)

1077-1077: VectorParam may be undefined, or defined from star imports

(F405)

1078-1078: VectorParam may be undefined, or defined from star imports

(F405)

1079-1079: Param may be undefined, or defined from star imports

(F405)

1081-1081: VectorParam may be undefined, or defined from star imports

(F405)

1082-1082: Param may be undefined, or defined from star imports

(F405)

1083-1083: Param may be undefined, or defined from star imports

(F405)

1084-1084: Param may be undefined, or defined from star imports

(F405)

1085-1085: Param may be undefined, or defined from star imports

(F405)

1086-1086: Param may be undefined, or defined from star imports

(F405)

1087-1087: Param may be undefined, or defined from star imports

(F405)

1088-1088: Param may be undefined, or defined from star imports

(F405)

1089-1089: Param may be undefined, or defined from star imports

(F405)

1090-1090: Param may be undefined, or defined from star imports

(F405)

1091-1091: Param may be undefined, or defined from star imports

(F405)

1181-1181: Param may be undefined, or defined from star imports

(F405)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Quick Build, Unit Tests & Smoke Test
GitHub Check: perf_test / XS-GEM5 - Run performance test (spec06-0.8c)

🔇 Additional comments (6)

src/cpu/pred/BranchPredictor.py (2)

1066-1091: LGTM - Well-structured micro-sized TAGE configuration.

The MicroTAGE class is properly defined with comprehensive parameters. The default values are appropriately sized for a lightweight predictor (2 predictors, 512-entry tables, 1-way, 0 delay). Note that enableBankConflict defaults to False (line 1090), which differs from BTBTAGE's True default. This is intentional for the micro variant and aligns with the lower latency design.

1181-1181: Correct type update for MicroTAGE integration.

The parameter type correctly changed from Param.BTBTAGE to Param.MicroTAGE, ensuring proper binding to the new MicroTAGE C++ class.

src/cpu/pred/btb/microtage.cc (4)

71-146: Well-structured constructor with proper initialization.

The constructor correctly initializes all predictor data structures from configuration parameters. The assertions (lines 118, 127, 129, 132) ensure parameter vectors are properly sized. The informational warning at lines 100-103 about bank simulation and updateOnRead is helpful for users tuning performance.

654-691: Well-designed bank conflict handling with probe-commit pattern.

The bank conflict simulation correctly uses a two-phase approach: canResolveUpdate probes for conflicts without mutation, and doResolveUpdate performs the actual update. The conflict detection (line 665) prevents simultaneous prediction and update to the same bank, with proper statistics tracking. The predBankValid flag management ensures correct conflict state propagation.

437-563: Robust update logic with proper TAGE mechanisms.

The update function correctly implements:

useAltOnNa counter updates for weak providers (lines 454-469)

Standard TAGE counter updates with saturation

Useful bit management with humility mechanism (lines 488-502)

Conservative allocation that avoids wasteful entries (lines 557-559)

Note that line 451 correctly uses baseTable[base_idx][branch_idx] for the base prediction, which should be consistent with the prediction path (see earlier comment about line 271).

575-649: Sound set-associative allocation with age penalties.

The allocation function implements proper TAGE allocation policies:

Prefers invalid or replaceable (not-useful AND weak) entries (lines 600-614)

Applies age penalty to strong entries to eventually make them replaceable (lines 618-627)

Periodic useful bit reset prevents allocation starvation (lines 633-643)

The allocation logic balances aggressiveness with conservation of well-performing entries.

src/cpu/pred/btb/microtage.cc

Change-Id: I8be037a1cdacd7151f4fe8e743a32cca90a85036

coderabbitai

Actionable comments posted: 4

🤖 Fix all issues with AI agents

In @src/cpu/pred/btb/microtage.cc:
- Around line 842-845: In MicroTAGE::getTageIndex the expression (1ULL <<
tableIndexBits[t]) can overflow if tableIndexBits[t] >= 64; add a defensive
check for tableIndexBits[t] to ensure it is < 64 (or assert) and compute mask
safely (for example, if bits >= 64 use all-ones mask (~0ULL) else
(1ULL<<bits)-1) so tableIndexBits[t] cannot trigger undefined behavior when
building mask.
- Around line 816-820: The expression in MicroTAGE::getTageTag that computes
Addr mask = (1ULL << tableTagBits[t]) - 1 can UB if tableTagBits[t] >= 64; add a
guard/assert and clamp to ensure tableTagBits[t] is in [0,63] before shifting
(or handle the special case tableTagBits[t]==0 and return 0 mask), e.g.
assert/tableTagBits validation in the beginning of getTageTag and use a safe
computation (e.g., if tableTagBits[t] >= 64 then set mask = ~0ULL) so shifts
never exceed 63 bits while preserving intended behavior for tableTagBits, and
update any places relying on mask accordingly.
- Around line 922-926: MicroTAGE::getBankId uses (1 << bankIdWidth) which can
overflow for large bankIdWidth; change the shift to use an unsigned long long
literal (e.g., 1ULL << bankIdWidth) so the mask is computed in 64-bit space,
then apply the mask and return the result (keeping the existing shift by
bankBaseShift and the return type Addr).
- Around line 268-273: The current code computes base_taken from btb_entry.ctr
and overrides alt_pred unconditionally; change base_taken to use the baseTable
entry at indices computed by getBaseTableIndex(startPC) and
getBranchIndexInBlock(btb_entry.pc, startPC) (i.e., base_taken =
baseTable[base_idx][branch_idx] >= 0) and restore the conditional alt logic:
alt_pred = alt_provided ? alt_taken : base_taken so the alternative prediction
is used when available.

🧹 Nitpick comments (7)

src/cpu/pred/btb/microtage.cc (7)
5-5: Consider removing unused include.

The <ctime> header is included but doesn't appear to be used anywhere in the implementation.
♻️ Proposed fix
 #include <algorithm>
 #include <cmath>
-#include <ctime>
133-135: Hardcoded maxShamt parameter in PathFoldedHist construction.

The maxShamt parameter is hardcoded to 16 for all folded history objects. Consider whether this should be configurable via parameters for better flexibility.

243-243: Fix typo in comment.

"aviod multi hit" should be "avoid multi hit".
📝 Proposed fix
-                break;  // only one way can be matched, aviod multi hit, TODO: RTL how to do this?
+                break;  // only one way can be matched, avoid multi hit, TODO: RTL how to do this?
537-546: Remove or re-enable commented-out code.

This large block of commented code should either be removed if obsolete, or re-enabled behind a feature flag if still needed. Leaving it commented out reduces code clarity.

632-632: Replace magic number with named constant.

The value 256 for the useful bit reset threshold should be defined as a named constant for better maintainability and clarity.
♻️ Proposed improvement

Define a constant at the class level:
static constexpr unsigned USEFUL_RESET_THRESHOLD = 256;
Then use it:
-    if (usefulResetCnt >= 256) {
+    if (usefulResetCnt >= USEFUL_RESET_THRESHOLD) {
319-325: Consider extracting counter confidence logic into helper methods.

The formula abs(pred.mainInfo.entry.counter*2 + 1) is repeated multiple times (lines 320, 322, 324, 484, 601, 619) to determine counter confidence levels. Extracting this into helper methods like isCounterSaturated(), isCounterWeak() would improve readability and maintainability.

1108-1140: Remove unused LRU functions or re-enable LRU replacement policy.

The updateLRU() and getLRUVictim() functions are fully implemented but never called. Multiple comments throughout the code explicitly state "Do not use LRU" (line 239) and "No LRU maintenance" (lines 504, 511). Either:

Remove these functions if LRU is definitively not used, or

Re-enable LRU and remove the conflicting comments

Keeping dead code reduces maintainability.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3fa08b4 and e665826.

📒 Files selected for processing (1)

src/cpu/pred/btb/microtage.cc

🧰 Additional context used

🧬 Code graph analysis (1)

src/cpu/pred/btb/microtage.cc (2)

src/cpu/pred/btb/microtage.hh (16)

MicroTAGE (49-49)

MicroTAGE (114-114)

MicroTAGE (116-116)

entry (88-90)

entry (153-153)

entry (154-154)

entry (155-155)

entry (425-428)

TageTableInfo (85-85)

TageTableInfo (86-87)

pred (384-384)

TageEntry (66-66)

TageEntry (68-69)

table (442-442)

table (443-443)

max (286-286)

src/cpu/pred/btb/folded_hist.hh (1)

PathFoldedHist (167-167)

🪛 Cppcheck (2.19.0)

src/cpu/pred/btb/microtage.cc

[error] 119-119: Reference to temporary returned.

(returnTempReference)

[error] 132-132: Reference to temporary returned.

(returnTempReference)

[error] 125-125: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

[error] 138-138: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Quick Build, Unit Tests & Smoke Test
GitHub Check: perf_test / XS-GEM5 - Run performance test (spec06-0.8c)

🔇 Additional comments (5)

src/cpu/pred/btb/microtage.cc (5)

948-951: Verify history update behavior for not-taken branches.

The function returns early without updating folded histories when a branch is not taken. Verify that this is the correct behavior for the MicroTAGE algorithm implementation, as some TAGE variants update history for all conditional branches regardless of direction.

599-626: LGTM - Well-structured allocation with aging mechanism.

The allocation logic properly prioritizes:

Invalid ways (line 602)

Weak and not-useful ways (line 602)

Applies age penalty to strong not-useful ways (lines 616-626)

This implements a sound replacement policy that balances freshness with history preservation.

664-674: LGTM - Correct bank conflict detection and deferral.

The bank conflict logic properly:

Detects when update and prediction access the same bank

Defers the update by returning false

Clears predBankValid to prevent repeated conflict detection

Updates statistics for monitoring

This ensures correct serialization of conflicting accesses.

1027-1106: LGTM - Comprehensive statistics implementation.

The statistics provide detailed visibility into:

Prediction accuracy (hits, misses, useAlt correctness)

Allocation success/failure rates

Per-bank access patterns and conflicts

Per-table hit distributions

This will be valuable for performance analysis and tuning.

38-54: No action required. Both blockSize and instShiftAmt are properly initialized before use in the constructor initializer list. blockSize is initialized by the base class TimedBaseBTBPredictor() constructor to 32, and instShiftAmt is a member variable with default initializer {1}. C++ construction order ensures both are initialized before the member initializer list executes at lines 49–50.

Likely an incorrect or invalid review comment.

coderabbitai · 2026-01-08T10:28:59Z

src/cpu/pred/btb/microtage.cc

+    Addr base_idx = getBaseTableIndex(startPC);
+    unsigned branch_idx = getBranchIndexInBlock(btb_entry.pc, startPC);
+    bool base_taken = btb_entry.ctr >= 0;
+
+    //bool alt_pred = alt_provided ? alt_taken : base_taken; // if alt provided, use alt prediction, otherwise use base
+    bool alt_pred = base_taken;


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n src/cpu/pred/btb/microtage.cc | sed -n '260,285p'

Repository: OpenXiangShan/GEM5

Length of output: 1213

🏁 Script executed:

rg -A 30 "Addr base_idx = getBaseTableIndex" src/cpu/pred/btb/microtage.cc

Repository: OpenXiangShan/GEM5

Length of output: 2587

🏁 Script executed:

rg -B 10 "bool base_taken = btb_entry.ctr" src/cpu/pred/btb/microtage.cc | head -30

Repository: OpenXiangShan/GEM5

Length of output: 456

🏁 Script executed:

rg "baseTable\[" src/cpu/pred/btb/microtage.cc | head -5

Repository: OpenXiangShan/GEM5

Length of output: 290

🏁 Script executed:

rg -B 5 "Generate final prediction" src/cpu/pred/btb/microtage.cc

Repository: OpenXiangShan/GEM5

Length of output: 256

🏁 Script executed:

rg "struct.*btb_entry|btb_entry.*=" src/cpu/pred/btb/microtage.cc | head -10

Repository: OpenXiangShan/GEM5

Length of output: 725

🏁 Script executed:

rg -B 20 "Generate final prediction" src/cpu/pred/btb/microtage.cc | grep -E "^\s*(TagePrediction|bool|void|public:|private:)" | tail -10

Repository: OpenXiangShan/GEM5

Length of output: 44

🏁 Script executed:

sed -n '240,275p' src/cpu/pred/btb/microtage.cc

Repository: OpenXiangShan/GEM5

Length of output: 1670

🏁 Script executed:

sed -n '320,360p' src/cpu/pred/btb/microtage.cc

Repository: OpenXiangShan/GEM5

Length of output: 1914

🏁 Script executed:

rg "struct.*BtbEntry|class.*BtbEntry" src/cpu/pred/btb/microtage.cc -A 10

Repository: OpenXiangShan/GEM5

Length of output: 44

🏁 Script executed:

find src/cpu/pred -name "*.h" -o -name "*.hh" | xargs grep -l "BtbEntry"

Repository: OpenXiangShan/GEM5

Length of output: 46

🏁 Script executed:

rg "BtbEntry" src/cpu/pred/btb/microtage.h -A 5

Repository: OpenXiangShan/GEM5

Length of output: 130

🏁 Script executed:

find src/cpu/pred -name "*microtage*"

Repository: OpenXiangShan/GEM5

Length of output: 121

🏁 Script executed:

rg "struct.*BtbEntry|class.*BtbEntry" src/cpu/pred/btb/microtage.hh -A 10

Repository: OpenXiangShan/GEM5

Length of output: 44

🏁 Script executed:

rg "BtbEntry" src/cpu/pred/btb/microtage.hh | head -20

Repository: OpenXiangShan/GEM5

Length of output: 44

🏁 Script executed:

rg "btb_entry" src/cpu/pred/btb/microtage.cc -B 2 | head -40

Repository: OpenXiangShan/GEM5

Length of output: 1767

Fix base_taken calculation to use baseTable instead of btb_entry.ctr

Line 270 should use baseTable[base_idx][branch_idx] >= 0 instead of btb_entry.ctr >= 0. The comment on line 267 ("Use base table instead of btb_entry.ctr") indicates the intended behavior, and base_idx/branch_idx are already calculated but unused. The correct pattern exists elsewhere in the codebase (see the update path at line 5-6 in the update function). Additionally, line 273 should restore the conditional logic for alt_pred: alt_pred = alt_provided ? alt_taken : base_taken; to properly use the alternative prediction when available, rather than unconditionally using base_taken.

🤖 Prompt for AI Agents

In @src/cpu/pred/btb/microtage.cc around lines 268 - 273, The current code computes base_taken from btb_entry.ctr and overrides alt_pred unconditionally; change base_taken to use the baseTable entry at indices computed by getBaseTableIndex(startPC) and getBranchIndexInBlock(btb_entry.pc, startPC) (i.e., base_taken = baseTable[base_idx][branch_idx] >= 0) and restore the conditional alt logic: alt_pred = alt_provided ? alt_taken : base_taken so the alternative prediction is used when available.

coderabbitai · 2026-01-08T10:28:59Z

src/cpu/pred/btb/microtage.cc

+Addr
+MicroTAGE::getTageTag(Addr pc, int t, uint64_t foldedHist, uint64_t altFoldedHist, Addr position)
+{
+    // Create mask for tableTagBits[t] to limit result size
+    Addr mask = (1ULL << tableTagBits[t]) - 1;


⚠️ Potential issue | 🟠 Major

Potential undefined behavior from bit-shift overflow.

If tableTagBits[t] is >= 64, the expression (1ULL << tableTagBits[t]) invokes undefined behavior. Add bounds checking or assertions to ensure tag/index bit widths are within safe ranges (< 64).

🛡️ Proposed fix with assertion

Addr MicroTAGE::getTageTag(Addr pc, int t, uint64_t foldedHist, uint64_t altFoldedHist, Addr position) { + assert(tableTagBits[t] < 64 && "Tag bits must be < 64 to avoid shift overflow"); // Create mask for tableTagBits[t] to limit result size Addr mask = (1ULL << tableTagBits[t]) - 1;

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Addr

MicroTAGE::getTageTag(Addr pc, int t, uint64_t foldedHist, uint64_t altFoldedHist, Addr position)

{

// Create mask for tableTagBits[t] to limit result size

Addr mask = (1ULL << tableTagBits[t]) - 1;

Addr

MicroTAGE::getTageTag(Addr pc, int t, uint64_t foldedHist, uint64_t altFoldedHist, Addr position)

{

assert(tableTagBits[t] < 64 && "Tag bits must be < 64 to avoid shift overflow");

// Create mask for tableTagBits[t] to limit result size

Addr mask = (1ULL << tableTagBits[t]) - 1;

🤖 Prompt for AI Agents

In @src/cpu/pred/btb/microtage.cc around lines 816 - 820, The expression in MicroTAGE::getTageTag that computes Addr mask = (1ULL << tableTagBits[t]) - 1 can UB if tableTagBits[t] >= 64; add a guard/assert and clamp to ensure tableTagBits[t] is in [0,63] before shifting (or handle the special case tableTagBits[t]==0 and return 0 mask), e.g. assert/tableTagBits validation in the beginning of getTageTag and use a safe computation (e.g., if tableTagBits[t] >= 64 then set mask = ~0ULL) so shifts never exceed 63 bits while preserving intended behavior for tableTagBits, and update any places relying on mask accordingly.

coderabbitai · 2026-01-08T10:29:00Z

src/cpu/pred/btb/microtage.cc

+MicroTAGE::getTageIndex(Addr pc, int t, uint64_t foldedHist)
+{
+    // Create mask for tableIndexBits[t] to limit result size
+    Addr mask = (1ULL << tableIndexBits[t]) - 1;


⚠️ Potential issue | 🟠 Major

Potential undefined behavior from bit-shift overflow.

If tableIndexBits[t] is >= 64, the expression (1ULL << tableIndexBits[t]) invokes undefined behavior. Add bounds checking or assertions to ensure index bit widths are within safe ranges (< 64).

🛡️ Proposed fix with assertion

Addr MicroTAGE::getTageIndex(Addr pc, int t, uint64_t foldedHist) { + assert(tableIndexBits[t] < 64 && "Index bits must be < 64 to avoid shift overflow"); // Create mask for tableIndexBits[t] to limit result size Addr mask = (1ULL << tableIndexBits[t]) - 1;

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

MicroTAGE::getTageIndex(Addr pc, int t, uint64_t foldedHist)

{

// Create mask for tableIndexBits[t] to limit result size

Addr mask = (1ULL << tableIndexBits[t]) - 1;

Addr

MicroTAGE::getTageIndex(Addr pc, int t, uint64_t foldedHist)

{

assert(tableIndexBits[t] < 64 && "Index bits must be < 64 to avoid shift overflow");

// Create mask for tableIndexBits[t] to limit result size

Addr mask = (1ULL << tableIndexBits[t]) - 1;

🤖 Prompt for AI Agents

In @src/cpu/pred/btb/microtage.cc around lines 842 - 845, In MicroTAGE::getTageIndex the expression (1ULL << tableIndexBits[t]) can overflow if tableIndexBits[t] >= 64; add a defensive check for tableIndexBits[t] to ensure it is < 64 (or assert) and compute mask safely (for example, if bits >= 64 use all-ones mask (~0ULL) else (1ULL<<bits)-1) so tableIndexBits[t] cannot trigger undefined behavior when building mask.

coderabbitai · 2026-01-08T10:29:00Z

src/cpu/pred/btb/microtage.cc

+MicroTAGE::getBankId(Addr pc) const
+{
+    // Extract bank ID bits after removing instruction alignment
+    return (pc >> bankBaseShift) & ((1 << bankIdWidth) - 1);
+}


🛠️ Refactor suggestion | 🟠 Major

Use ULL suffix to prevent potential overflow.

Line 925 uses (1 << bankIdWidth) with a plain int constant. If bankIdWidth >= 31, this could overflow. Use 1ULL for consistency and safety.

🛡️ Proposed fix

unsigned MicroTAGE::getBankId(Addr pc) const { // Extract bank ID bits after removing instruction alignment - return (pc >> bankBaseShift) & ((1 << bankIdWidth) - 1); + return (pc >> bankBaseShift) & ((1ULL << bankIdWidth) - 1); }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

MicroTAGE::getBankId(Addr pc) const

{

// Extract bank ID bits after removing instruction alignment

return (pc >> bankBaseShift) & ((1 << bankIdWidth) - 1);

}

MicroTAGE::getBankId(Addr pc) const

{

// Extract bank ID bits after removing instruction alignment

return (pc >> bankBaseShift) & ((1ULL << bankIdWidth) - 1);

}

🤖 Prompt for AI Agents

In @src/cpu/pred/btb/microtage.cc around lines 922 - 926, MicroTAGE::getBankId uses (1 << bankIdWidth) which can overflow for large bankIdWidth; change the shift to use an unsigned long long literal (e.g., 1ULL << bankIdWidth) so the mask is computed in 64-bit space, then apply the mask and return the result (keeping the existing shift by bankBaseShift and the return type Addr).

Change-Id: I6b9c92c18c574a12c19532ae0b894e64c1187342

coderabbitai

Actionable comments posted: 5

🤖 Fix all issues with AI agents

In @src/cpu/pred/btb/microtage.cc:
- Line 856: The expression computing alignedPC (const Addr alignedPC = startPC &
~(blockSize - 1);) can underflow when blockSize==0; add a defensive check: in
the Microtage constructor (where blockWidth/blockSize is set, referenced by the
ternary at the constructor) assert or throw if blockSize == 0 (e.g., ensure
blockSize > 0), or alternatively guard the alignedPC computation by handling
blockSize==0 early (return/error) before using ~(blockSize - 1). Update either
the constructor validation for blockSize or add the guard around the alignedPC
calculation to avoid the underflow.
- Around line 246-251: Remove the dead commented alternative-prediction lookup
block: delete the commented lines that reference alt_provided, alt_info,
TageTableInfo(...), matching_entry, matching_way and the unused second-match
logic inside the loop in microtage.cc so no vestigial alt prediction variables
or commented branches remain; ensure any remaining code does not reference
alt_provided or alt_info and compile-check the function containing this loop
(e.g., the Tage table lookup function) to confirm no dangling references.
- Line 203: Remove the leftover commented-out declaration "bool alt_provided =
true;" (symbol: alt_provided) since it is unused after removing alternative
prediction tables; delete that commented line and any other related
commented-out remnants to keep microtage implementation clean.
- Around line 259-267: Remove the vestigial commented-out base-table code and
the misleading comment in the block around main_taken/alt_pred; keep the actual
logic that computes alt_pred using btb_entry.ctr (the existing line bool
alt_pred = btb_entry.ctr >= 0;), and replace the incorrect "Use base table
instead of btb_entry.ctr" comment with a short, accurate comment like "alt_pred
derived from btb_entry.ctr (base table not used)". Ensure no other dead
commented lines (e.g., getBaseTableIndex, getBranchIndexInBlock, alt_taken,
base_taken) remain in that scope.
- Line 87: Validate and enforce that p.blockSize is never zero before using it:
in the constructor/initializer where blockWidth is computed (the expression
using blockWidth(p.blockSize ? floorLog2(p.blockSize) : 0)), either assert/throw
if p.blockSize == 0 or clamp it to a minimum of 1 so downstream uses are safe;
also update any mask computations that do (blockSize - 1) to use a defensive
expression like (p.blockSize ? p.blockSize - 1 : 0) or rely on the
validated/clamped value so mask logic cannot underflow.

🧹 Nitpick comments (3)

src/cpu/pred/btb/microtage.hh (1)
99-102: Clarify comments after removing alternative prediction.

The comment on line 100 states "Whether to use alternative prediction" but the alternative prediction mechanism has been removed per the PR objective. The mainprovided field now indicates whether the main prediction was found, and altPred (line 102) is simply the base table prediction from btb_entry.ctr.
📝 Suggested comment clarifications
-            bool mainprovided;    // Whether to use alternative prediction, true if main is weak or no main prediction
+            bool mainprovided;    // Whether main prediction was found; false means fallback to base
             bool taken;           // Final prediction outcome
-            bool altPred;          // Alternative prediction = alt_provided ? alt_taken : base_taken;
+            bool altPred;          // Base table prediction (fallback when main not found)
src/cpu/pred/btb/microtage.cc (2)

236-236: Address TODO: clarify multi-hit handling.

The TODO comment asks "how does RTL handle multi-hit?" The current code breaks after finding the first match (line 236), preventing multiple ways from matching.

Do you want me to search the codebase for RTL documentation or similar TAGE implementations to understand the expected multi-hit behavior? This could help validate whether the break statement is correct or if additional logic is needed.

432-447: Vestigial useAltOnNa logic after removing alternative prediction.

The useAltOnNa mechanism (lines 432-447) was designed to learn when to prefer the alternative predictor over a weak main predictor. However, since alternative prediction tables have been removed and alt_pred is now just the base table from btb_entry.ctr, this logic no longer serves its original purpose. The statistics are still tracked but the useAlt table (line 253 in header) is never actually consulted in the prediction path.

Consider either:

Removing the useAltOnNa mechanism entirely if not needed, or

Documenting that it's kept for potential future restoration of alternative prediction

Based on learnings, this appears to be technical debt from the incomplete removal of the alternative prediction mechanism.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e665826 and be0c88c.

📒 Files selected for processing (2)

src/cpu/pred/btb/microtage.cc
src/cpu/pred/btb/microtage.hh

🧰 Additional context used

🧬 Code graph analysis (2)

src/cpu/pred/btb/microtage.cc (1)

src/cpu/pred/btb/microtage.hh (27)

MicroTAGE (49-49)

MicroTAGE (113-113)

MicroTAGE (115-115)

entry (88-90)

entry (152-152)

entry (153-153)

entry (154-154)

entry (415-418)

TageTableInfo (85-85)

TageTableInfo (86-87)

taken (274-274)

TagePrediction (104-104)

TagePrediction (105-108)

pred (374-374)

TageEntry (66-66)

TageEntry (68-69)

table (432-432)

table (433-433)

max (277-277)

pc (174-174)

pc (177-177)

pc (181-181)

pc (185-185)

pc (188-190)

pc (188-188)

pc (197-197)

pc (283-283)

src/cpu/pred/btb/microtage.hh (2)

src/cpu/pred/BranchPredictor.py (1)

MicroTAGE (1066-1091)

src/cpu/pred/btb/btb_ubtb.hh (1)

meta (144-147)

🪛 Cppcheck (2.19.0)

src/cpu/pred/btb/microtage.cc

[error] 119-119: Reference to temporary returned.

(returnTempReference)

[error] 132-132: Reference to temporary returned.

(returnTempReference)

[error] 125-125: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

[error] 138-138: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Quick Build, Unit Tests & Smoke Test
GitHub Check: perf_test / XS-GEM5 - Run performance test (spec06-0.8c)

🔇 Additional comments (2)

src/cpu/pred/btb/microtage.cc (2)

587-587: Verify age penalty correctly weakens counter.

Line 587 applies age penalty: if (cand.counter > 0) cand.counter--; else cand.counter++;

This moves the counter toward zero from both directions (positive counters decrement, negative increment), which correctly weakens strong predictions over time. However, ensure this matches your intended aging policy—some implementations age only in one direction or use different thresholds.

If this aging behavior differs from your RTL or prior BTBTAGE implementation, please verify it's intentional.

905-908: This early return is correct for path history—no change needed.

The code updates folded path history, not global direction history. In TAGE, path history is intentionally updated only for taken branches to encode the PC footprint of taken branch targets, while global direction history (GHR) is updated for all branches. The implementation matches the TAGE specification.

coderabbitai · 2026-01-09T09:26:17Z

src/cpu/pred/btb/microtage.cc

+updateOnRead(p.updateOnRead),
+numBanks(p.numBanks),
+bankIdWidth(ceilLog2(p.numBanks)),
+blockWidth(p.blockSize ? floorLog2(p.blockSize) : 0),


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Search for MicroTAGE and blockSize parameter definitions find . -type f \( -name "*.py" -o -name "*.cc" -o -name "*.h" \) | xargs grep -l "blockSize" | head -20

Repository: OpenXiangShan/GEM5

Length of output: 822

🏁 Script executed:

# Look for parameter definitions in Python files rg -n "blockSize" --type py -A 3 -B 3 | head -100

Repository: OpenXiangShan/GEM5

Length of output: 4925

🏁 Script executed:

# Check the specific line 856 in microtage.cc sed -n '850,865p' src/cpu/pred/btb/microtage.cc

Repository: OpenXiangShan/GEM5

Length of output: 632

🏁 Script executed:

# Search for MicroTAGE parameter class definition rg -n "class MicroTAGE" --type py -A 20 | head -80

Repository: OpenXiangShan/GEM5

Length of output: 2151

Add validation to ensure blockSize is never zero.

The ternary operator at line 87 p.blockSize ? floorLog2(p.blockSize) : 0 indicates blockSize could be zero, but line 856's blockSize - 1 in a mask operation would produce incorrect alignment results if blockSize is 0. Either add a constraint to the blockSize parameter to enforce a minimum value, or add defensive checks at all usage sites similar to line 87.

🤖 Prompt for AI Agents

In @src/cpu/pred/btb/microtage.cc at line 87, Validate and enforce that p.blockSize is never zero before using it: in the constructor/initializer where blockWidth is computed (the expression using blockWidth(p.blockSize ? floorLog2(p.blockSize) : 0)), either assert/throw if p.blockSize == 0 or clamp it to a minimum of 1 so downstream uses are safe; also update any mask computations that do (blockSize - 1) to use a defensive expression like (p.blockSize ? p.blockSize - 1 : 0) or rely on the validated/clamped value so mask logic cannot underflow.

src/cpu/pred/btb/microtage.cc

Change-Id: I953965b8e6feb1c15a238ac832d65bc16b32f496

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In @src/cpu/pred/btb/microtage.cc:
- Line 241: Typo in the inline comment: change "aviod multi hit" to "avoid multi
hit" (or "avoid multiple hits") in the comment that reads "only one way can be
matched, aviod multi hit, TODO: RTL how to do this?" so update that comment in
microtage.cc (near the break; statement) to correct the spelling.

🧹 Nitpick comments (2)

src/cpu/pred/btb/microtage.cc (2)

508-517: Consider cleaning up commented code blocks.

Two large blocks of commented code exist:

Lines 508-517: Misprediction statistics tracking

Lines 718-734: Database trace writing for debugging

If these features are deferred or no longer needed, remove the comments to reduce clutter. If they're planned for future work, consider adding a TODO or tracking issue.

Also applies to: 718-734

1072-1102: Remove unused updateLRU and getLRUVictim functions.

These functions are defined but completely unused. Comments at lines 237 and 488 explicitly state "Do not use LRU" and "No LRU maintenance", and a codebase search confirms neither function is called anywhere. They should be removed to avoid confusion about the replacement policy. If LRU support is planned for future use, add an explanatory comment before removing them or mark them as deprecated.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between be0c88c and 28084ef.

📒 Files selected for processing (2)

src/cpu/pred/BranchPredictor.py
src/cpu/pred/btb/microtage.cc

🧰 Additional context used

🧬 Code graph analysis (2)

src/cpu/pred/BranchPredictor.py (3)

src/cpu/pred/btb/microtage.cc (3)

MicroTAGE (38-1147)

MicroTAGE (70-143)

MicroTAGE (145-147)

src/cpu/pred/btb/microtage.hh (3)

MicroTAGE (49-49)

MicroTAGE (113-113)

MicroTAGE (115-115)

src/cpu/pred/btb/timed_base_pred.hh (3)

TimedBaseBTBPredictor (48-48)

TimedBaseBTBPredictor (53-53)

numDelay (77-77)

src/cpu/pred/btb/microtage.cc (1)

src/cpu/pred/btb/microtage.hh (43)

MicroTAGE (49-49)

MicroTAGE (113-113)

MicroTAGE (115-115)

startPC (170-171)

startPC (421-428)

entry (88-90)

entry (152-152)

entry (153-153)

entry (154-154)

entry (415-418)

TageTableInfo (85-85)

TageTableInfo (86-87)

taken (274-274)

TagePrediction (104-104)

TagePrediction (105-108)

pred (374-374)

history (130-130)

history (134-135)

history (139-142)

history (139-139)

history (144-148)

history (144-145)

history (163-163)

history (200-200)

history (386-386)

stream (157-157)

stream (271-271)

stream (412-412)

TageEntry (66-66)

TageEntry (68-69)

table (432-432)

table (433-433)

startAddr (120-120)

startAddr (122-124)

pc (174-174)

pc (177-177)

pc (181-181)

pc (185-185)

pc (188-190)

pc (188-188)

pc (197-197)

pc (283-283)

branchPC (193-193)

🪛 Cppcheck (2.19.0)

src/cpu/pred/btb/microtage.cc

[error] 119-119: Reference to temporary returned.

(returnTempReference)

[error] 132-132: Reference to temporary returned.

(returnTempReference)

[error] 125-125: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

[error] 138-138: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

🪛 Ruff (0.14.10)

src/cpu/pred/BranchPredictor.py

1072-1072: Param may be undefined, or defined from star imports

(F405)

1073-1073: Param may be undefined, or defined from star imports

(F405)

1074-1074: Param may be undefined, or defined from star imports

(F405)

1076-1076: Param may be undefined, or defined from star imports

(F405)

1077-1077: VectorParam may be undefined, or defined from star imports

(F405)

1078-1078: VectorParam may be undefined, or defined from star imports

(F405)

1079-1079: VectorParam may be undefined, or defined from star imports

(F405)

1080-1080: Param may be undefined, or defined from star imports

(F405)

1082-1082: VectorParam may be undefined, or defined from star imports

(F405)

1083-1083: Param may be undefined, or defined from star imports

(F405)

1084-1084: Param may be undefined, or defined from star imports

(F405)

1085-1085: Param may be undefined, or defined from star imports

(F405)

1086-1086: Param may be undefined, or defined from star imports

(F405)

1087-1087: Param may be undefined, or defined from star imports

(F405)

1088-1088: Param may be undefined, or defined from star imports

(F405)

1089-1089: Param may be undefined, or defined from star imports

(F405)

1090-1090: Param may be undefined, or defined from star imports

(F405)

1091-1091: Param may be undefined, or defined from star imports

(F405)

1092-1092: Param may be undefined, or defined from star imports

(F405)

1182-1182: Param may be undefined, or defined from star imports

(F405)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Quick Build, Unit Tests & Smoke Test
GitHub Check: perf_test / XS-GEM5 - Run performance test (spec06-0.8c)

🔇 Additional comments (7)

src/cpu/pred/BranchPredictor.py (2)

1066-1092: LGTM - Well-structured MicroTAGE configuration.

The new MicroTAGE SimObject is properly defined with:

Correct inheritance from TimedBaseBTBPredictor

Vector parameter sizes consistent with numPredictors=2

Comprehensive parameter set covering all C++ constructor requirements

Note: The Python defaults (numPredictors=2, numWays=1, tableSizes=[512]*2) differ from the UNIT_TEST constructor defaults in microtage.cc (numPredictors=4, numWays=2, tableSize=1024). This appears intentional for a micro-sized variant, but verify this aligns with the "Micro" naming and performance objectives.

1182-1182: Integration updated correctly.

The parameter type change from Param.BTBTAGE to Param.MicroTAGE properly integrates the new predictor implementation into DecoupledBPUWithBTB.

src/cpu/pred/btb/microtage.cc (5)

38-143: Constructor implementation is solid.

Both constructors (UNIT_TEST and runtime) properly initialize:

TAGE table structures with correct dimensionality

Index/tag masks and bit widths

Path folded history objects for tag/index computation

Bank conflict simulation parameters

The warning at lines 97-100 about bank simulation working better with updateOnRead=true is helpful, especially since the Python config sets enableBankConflict=False by default.

625-661: Bank conflict simulation implemented correctly.

The two-phase update approach (canResolveUpdate probes, doResolveUpdate commits) properly simulates hardware bank conflicts:

Detects when prediction and update target the same bank

Defers conflicting updates to avoid unrealistic concurrent access

Tracks statistics per bank for analysis

This design aligns with the warning in the constructor about bank simulation working better with updateOnRead=true.

773-888: Helper functions are well-implemented.

The tag/index computation helpers correctly:

Apply saturating arithmetic for counters

XOR PC, folded history, and branch position for tags

Handle block alignment and position calculation

Provide detailed diagnostic warnings for edge cases

The warning at lines 868-877 is appropriately verbose for debugging misaligned or out-of-range branches.

903-987: History management implementation is correct.

The speculative history update and recovery logic properly:

Maintains three folded history types (index, tag, alt tag) for each predictor table

Updates histories speculatively on predictions

Recovers to checkpointed state on mispredictions

Includes debug validation via checkFoldedHist

This follows standard TAGE predictor design for managing path history efficiently.

991-1068: Comprehensive statistics tracking.

The TageStats implementation provides excellent observability:

Per-table hit/misprediction tracking to evaluate history length effectiveness

Per-bank access/conflict statistics for bank simulation validation

Allocation success/failure tracking

Useful bit reset monitoring

This detailed instrumentation will be valuable for tuning and debugging the predictor.

src/cpu/pred/btb/microtage.cc

github-actions · 2026-01-09T10:51:39Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1732`	-
This PR	`2.1662`	📉 `-0.0069` (`-0.32%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2026-01-09T12:04:02Z

[Generated by GEM5 Performance Robot]
commit: 28084ef
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	19.90	19.89	+0.02 🟢

[Generated by GEM5 Performance Robot]
commit: 28084ef
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	19.90	19.91	-0.05 🔴

Change-Id: I39cd471d452aa343c3dd741a80fdfa7d126e3a9f

coderabbitai

Actionable comments posted: 4

🤖 Fix all issues with AI agents

In @src/cpu/pred/btb/microtage.cc:
- Around line 758-769: The mask computation in MicroTAGE::getTageIndex uses
(1ULL << tableIndexBits[t]) - 1 which is undefined behavior when
tableIndexBits[t] >= 64; change the mask creation to guard the shift by checking
the bit count first (e.g., if tableIndexBits[t] >= sizeof(mask)*8 use an
all-ones mask, otherwise compute (1ULL << tableIndexBits[t]) - 1), ensure you
use an unsigned 64-bit type for the intermediate to avoid signed/width issues
and assign to Addr; update the mask variable logic accordingly so
pcBits/foldedBits masking is safe for all tableIndexBits[t] values.
- Around line 736-750: The code in MicroTAGE::getTageTag uses (1ULL <<
tableTagBits[t]) - 1 which is undefined when tableTagBits[t] >= 64; update
getTageTag to guard against large tag widths by checking tableTagBits[t] and
computing mask without shifting by >=64 (e.g., if tableTagBits[t] >= 64 set mask
to all-ones for the Addr width or compute mask as (tableTagBits[t] == 0 ? 0 :
((1ULL << tableTagBits[t] - 1) << 1) - 1) / or simply clamp tableTagBits[t] to
min(tableTagBits[t], 63) before shifting), ensuring tableTagBits[t] is still
referenced and the rest of the function (pcBits, foldedBits, XOR with position)
remains unchanged.

In @src/cpu/pred/btb/microtage.hh:
- Around line 55-73: In the TageEntry struct comment "short counter (-4 to 3),
3bits， 0 and -1 are weak" replace the non-ASCII comma (，) with a standard ASCII
comma so the comment reads e.g. "short counter (-4 to 3), 3 bits, 0 and -1 are
weak"; update the comment near the TageEntry definition to use the ASCII comma
and optionally add a space in "3bits" to "3 bits" for clarity.

🧹 Nitpick comments (7)

src/cpu/pred/btb/microtage.hh (3)
93-109: Clarify comment vs. field name mismatch.

Line 100 comment says "Whether to use alternative prediction, true if main is weak or no main prediction" but the field is named mainprovided. The semantics appear inverted—mainprovided being true means main was provided (i.e., don't use alternative). Consider renaming or updating the comment to avoid confusion.
-            bool mainprovided;    // Whether to use alternative prediction, true if main is weak or no main prediction
+            bool mainprovided;    // Whether main prediction was provided (false means fallback to base)
248-252: Dead code: useAlt vector declared but never initialized or used.

The useAlt vector is declared but never resized in either constructor. The only access (useAlt[uidx]) in the implementation is commented out. This is dead code that should be removed.

382-398: Minor style: Duplicate public: access specifiers and C-style typedef.

Lines 382 and 387 both declare public:. Consider consolidating. Also, the typedef struct TageMeta { ... } TageMeta; pattern (line 391) is C-style; modern C++ prefers struct TageMeta { ... };.
♻️ Suggested cleanup
-public:
-
-    // Recover folded history after misprediction
-    void recoverFoldedHist(const bitset& history);
-
-public:
-
-
-    // Metadata for TAGE prediction
-    typedef struct TageMeta
+public:
+    // Recover folded history after misprediction
+    void recoverFoldedHist(const bitset& history);
+
+    // Metadata for TAGE prediction
+    struct TageMeta
     {
         std::unordered_map<Addr, TagePrediction> preds;
         std::vector<PathFoldedHist> tagFoldedHist;
         std::vector<PathFoldedHist> indexFoldedHist;
         bitset history;     // for viewing
         TageMeta() {}
-    } TageMeta;
+    };
src/cpu/pred/btb/microtage.cc (4)
407-421: Commented-out code and incomplete logic.

Line 414 has updateCounter commented out, and while statistics are updated, the actual useAlt vector is never modified. If the use-alt-on-NA feature is intentionally disabled, consider removing the dead code path entirely rather than leaving partial implementations.

669-687: Large commented-out trace block should be removed or enabled.

This 17-line commented block adds noise. If it's for debugging, consider putting it behind a #ifdef DEBUG_TRACE or removing it entirely. The code also references recomputed.useAlt (line 682) which doesn't exist in TagePrediction.

928-935: Unused variables in loop.

Lines 930-931 declare buf2 and buf3 but never use them. These appear to be debug leftovers.
     for (int t = 0; t < numPredictors; t++) {
         for (int type = 0; type < 3; type++) {
-            std::string buf2, buf3;
             auto &foldedHist = type == 0 ? indexFoldedHist[t] : type == 1 ? tagFoldedHist[t] : altTagFoldedHist[t];
             foldedHist.check(hist);
         }
     }
1018-1050: Remove unused LRU functions updateLRU and getLRUVictim.

These functions (lines 1018–1050) are implemented but never called. The design intentionally avoids LRU maintenance, as documented in comments: "Do not use LRU; keep logic simple and align with CBP-style replacement" (line 235) and "No LRU maintenance" (line 458). Remove these unused functions to reduce code clutter.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 28084ef and e57b012.

📒 Files selected for processing (2)

src/cpu/pred/btb/microtage.cc
src/cpu/pred/btb/microtage.hh

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2026-01-12T02:49:08.514Z

Learnt from: jensen-yan
Repo: OpenXiangShan/GEM5 PR: 686
File: src/cpu/pred/btb/btb_mgsc.cc:226-230
Timestamp: 2026-01-12T02:49:08.514Z
Learning: In src/cpu/pred/btb/btb_mgsc.cc, the calculateScaledPercsum function returns the unscaled percsum (weight scaling disabled) to align with RTL implementation. The RTL currently does not have a weight table, so setting weight to 1 (equivalent to disabling the weight table) maintains hardware/software consistency with minimal performance impact.

Applied to files:

src/cpu/pred/btb/microtage.cc

🧬 Code graph analysis (2)

src/cpu/pred/btb/microtage.cc (1)

src/cpu/pred/btb/microtage.hh (35)

MicroTAGE (49-49)

MicroTAGE (113-113)

MicroTAGE (115-115)

TageTableInfo (85-85)

TageTableInfo (86-87)

taken (273-273)

TagePrediction (104-104)

TagePrediction (105-108)

pred (373-373)

history (130-130)

history (134-135)

history (139-142)

history (139-139)

history (144-148)

history (144-145)

history (163-163)

history (199-199)

history (385-385)

stream (157-157)

stream (270-270)

stream (410-410)

TageEntry (66-66)

TageEntry (68-69)

table (430-430)

table (431-431)

counter (70-72)

pc (173-173)

pc (176-176)

pc (180-180)

pc (184-184)

pc (187-189)

pc (187-187)

pc (196-196)

pc (282-282)

expected (258-258)

src/cpu/pred/btb/microtage.hh (2)

src/cpu/pred/btb/microtage.cc (7)

MicroTAGE (38-1095)

MicroTAGE (70-141)

MicroTAGE (143-145)

specUpdatePHist (886-891)

specUpdatePHist (887-887)

recoverPHist (906-916)

recoverPHist (907-908)

src/cpu/pred/btb/btb_ubtb.hh (1)

meta (144-147)

🪛 Cppcheck (2.19.0)

src/cpu/pred/btb/microtage.cc

[error] 119-119: Reference to temporary returned.

(returnTempReference)

[error] 132-132: Reference to temporary returned.

(returnTempReference)

[error] 125-125: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

[error] 138-138: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

🔇 Additional comments (5)

src/cpu/pred/btb/microtage.hh (2)

1-46: LGTM: Header structure and includes are well organized.

The conditional compilation for UNIT_TEST vs production paths is cleanly handled, and the namespace structure follows the gem5 conventions.

225-232: Remove this review comment: altTagFoldedHist is actively used in update and check operations.

The variable is read in src/cpu/pred/btb/microtage.cc at lines 866 and 931 within ternary operator expressions that select it when type == 2. The selected reference is then passed to .update() and .check() methods, making it an active part of the prediction logic, not dead code.

Likely an incorrect or invalid review comment.

src/cpu/pred/btb/microtage.cc (3)

189-265: LGTM: Core prediction logic is well-structured.

The generateSinglePrediction method cleanly implements TAGE table lookup with proper table priority (high to low), tag matching, and fallback to base prediction. Good use of position encoding in the tag.

497-571: LGTM: Allocation logic with aging is sound.

The set-associative allocation with preference for invalid/weak entries and age penalties aligns with standard TAGE allocation strategies. The useful bit reset mechanism at 256 failures is reasonable.

36-67: No action needed. The blockSize member is properly initialized in UNIT_TEST mode through the base class TimedBaseBTBPredictor() default constructor, which sets it to 32. The code at line 48 that uses floorLog2(blockSize) is safe.

Likely an incorrect or invalid review comment.

src/cpu/pred/btb/microtage.cc

src/cpu/pred/btb/microtage.hh

github-actions · 2026-01-12T03:54:16Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1691`	-
This PR	`2.1691`	➡️ `0.0000` (`0.00%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2026-01-12T05:12:18Z

[Generated by GEM5 Performance Robot]
commit: e57b012
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	19.90	19.90	+0.02 🟢

[Generated by GEM5 Performance Robot]
commit: e57b012
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	19.90	19.90	+0.01 🟢

Change-Id: I3567ae93652aac218c5b4646003abadddaf7cf32

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In @src/cpu/pred/btb/microtage.cc:
- Around line 96-100: The vector tableTagBits is being overwritten after
initialization from p.TTagBitSizes; remove the unconditional
resize(tableTagBits.resize(numPredictors)) so the configured tag bit sizes are
preserved, or if a resize is required only for testing, wrap that resize in an
#ifdef UNIT_TEST guard; update the code around the constructor where
tableTagBits, p.TTagBitSizes, numPredictors, and the other table* vectors
(tageTable, tableIndexBits, tableIndexMasks, tableTagMasks) are handled to
ensure tableTagBits retains the parameter values.

🧹 Nitpick comments (5)

src/cpu/pred/btb/microtage.cc (2)
329-334: Minor: signed/unsigned comparison.

On line 329, s is int but stagePreds.size() returns size_t (unsigned). This could cause issues if stagePreds.size() exceeds INT_MAX, though unlikely in practice.
♻️ Suggested fix
-    for (int s = getDelay(); s < stagePreds.size(); s++) {
+    for (size_t s = getDelay(); s < stagePreds.size(); s++) {
427-429: Consider documenting the magic number.

The value 5 in (abs(2 * entry.ctr + 1) == 5) represents maximum counter strength for a 3-bit saturating counter. A named constant or comment would improve readability.
+        // Check if base counter is at maximum strength (|2*ctr+1| == 5 for 3-bit counter with range [-4,3])
+        constexpr int kMaxCounterStrength = 5;
         bool base_is_correct_and_strong =
                                      (base_taken == actual_taken) &&
-                                     (abs(2 * entry.ctr + 1) == 5);
+                                     (abs(2 * entry.ctr + 1) == kMaxCounterStrength);
src/cpu/pred/btb/microtage.hh (3)
100-101: Misleading comment on mainprovided.

The comment says "true if main is weak or no main prediction" but the variable name and usage suggest the opposite: mainprovided should be true when a main prediction IS provided. Looking at generateSinglePrediction (line 256 in .cc), mainprovided is set to provided which is true when a matching entry is found.
-            bool mainprovided;    // Whether to use alternative prediction, true if main is weak or no main prediction
+            bool mainprovided;    // Whether a main prediction from TAGE tables was found
354-357: Uninitialized members in UNIT_TEST path.

The bankIdx, numPredictors, and numBanks members are declared but not initialized in the UNIT_TEST path (there's no constructor for that case). While these may not be used during testing, accessing them would cause undefined behavior.

Consider adding inline initialization or a simple UNIT_TEST constructor:
         int bankIdx;
         int numPredictors;
         int numBanks;
+
+#ifdef UNIT_TEST
+        TageStats() : bankIdx(0), numPredictors(0), numBanks(0) {}
+#else
370-375: Minor: redundant public: labels.

Lines 370 and 375 both declare public: sections with nothing between them. This is harmless but can be consolidated.
-public:
-
-    // Recover folded history after misprediction
-    void recoverFoldedHist(const bitset& history);
-
 public:
+    // Recover folded history after misprediction
+    void recoverFoldedHist(const bitset& history);

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e57b012 and b8fc3d6.

📒 Files selected for processing (3)

src/cpu/pred/BranchPredictor.py
src/cpu/pred/btb/microtage.cc
src/cpu/pred/btb/microtage.hh

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2026-01-12T02:49:08.514Z

Learnt from: jensen-yan
Repo: OpenXiangShan/GEM5 PR: 686
File: src/cpu/pred/btb/btb_mgsc.cc:226-230
Timestamp: 2026-01-12T02:49:08.514Z
Learning: In src/cpu/pred/btb/btb_mgsc.cc, the calculateScaledPercsum function returns the unscaled percsum (weight scaling disabled) to align with RTL implementation. The RTL currently does not have a weight table, so setting weight to 1 (equivalent to disabling the weight table) maintains hardware/software consistency with minimal performance impact.

Applied to files:

src/cpu/pred/btb/microtage.cc

🧬 Code graph analysis (1)

src/cpu/pred/BranchPredictor.py (3)

src/cpu/pred/btb/microtage.cc (3)

MicroTAGE (38-1074)

MicroTAGE (67-135)

MicroTAGE (137-139)

src/cpu/pred/btb/microtage.hh (3)

MicroTAGE (49-49)

MicroTAGE (113-113)

MicroTAGE (115-115)

src/cpu/pred/btb/timed_base_pred.cc (2)

TimedBaseBTBPredictor (15-15)

TimedBaseBTBPredictor (18-26)

🪛 Cppcheck (2.19.0)

src/cpu/pred/btb/microtage.cc

[error] 119-119: Reference to temporary returned.

(returnTempReference)

[error] 132-132: Reference to temporary returned.

(returnTempReference)

[error] 125-125: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

[error] 138-138: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

🪛 Ruff (0.14.10)

src/cpu/pred/BranchPredictor.py

1072-1072: Param may be undefined, or defined from star imports

(F405)

1073-1073: Param may be undefined, or defined from star imports

(F405)

1074-1074: Param may be undefined, or defined from star imports

(F405)

1076-1076: Param may be undefined, or defined from star imports

(F405)

1077-1077: VectorParam may be undefined, or defined from star imports

(F405)

1078-1078: VectorParam may be undefined, or defined from star imports

(F405)

1079-1079: VectorParam may be undefined, or defined from star imports

(F405)

1080-1080: Param may be undefined, or defined from star imports

(F405)

1082-1082: VectorParam may be undefined, or defined from star imports

(F405)

1083-1083: Param may be undefined, or defined from star imports

(F405)

1084-1084: Param may be undefined, or defined from star imports

(F405)

1085-1085: Param may be undefined, or defined from star imports

(F405)

1086-1086: Param may be undefined, or defined from star imports

(F405)

1087-1087: Param may be undefined, or defined from star imports

(F405)

1088-1088: Param may be undefined, or defined from star imports

(F405)

1089-1089: Param may be undefined, or defined from star imports

(F405)

1090-1090: Param may be undefined, or defined from star imports

(F405)

1091-1091: Param may be undefined, or defined from star imports

(F405)

1092-1092: Param may be undefined, or defined from star imports

(F405)

1182-1182: Param may be undefined, or defined from star imports

(F405)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Quick Build, Unit Tests & Smoke Test
GitHub Check: perf_test / XS-GEM5 - Run performance test (spec06-0.8c)

🔇 Additional comments (19)

src/cpu/pred/btb/microtage.cc (14)
1-35: LGTM for includes and namespace setup.

The conditional includes for UNIT_TEST vs production builds are properly structured, and the namespace hierarchy is correctly defined.

125-127: Potential undefined behavior: bit shifting with large widths.

Static analysis flags potential undefined behavior when shifting by 64+ bits. The PathFoldedHist constructor receives tableTagBits[i] and tableIndexBits[i] as width parameters. If any of these values are >= 64, bit shifting operations inside PathFoldedHist could trigger undefined behavior.

Verify that the configuration parameters ensure tableTagBits and tableIndexBits values remain below 64. Consider adding assertions:
assert(tableTagBits[i] < 64 && "Tag bits must be < 64 to avoid UB");
assert(tableIndexBits[i] < 64 && "Index bits must be < 64 to avoid UB");
192-257: LGTM for generateSinglePrediction.

The TAGE table lookup logic correctly iterates from highest to lowest table, finds matching entries by tag, and properly handles the main prediction selection. The early break on match per table is appropriate for set-associative lookup.

266-281: LGTM for lookupHelper.

The function correctly iterates through BTB entries, generates predictions for valid conditional branches, stores metadata, and updates statistics.

349-375: LGTM for prepareUpdateEntries.

The function correctly prepares BTB entries for update, handles new entry insertion for BTB misses, and properly filters to only conditional branches using the erase-remove idiom.

488-562: LGTM for handleNewEntryAllocation.

The allocation logic correctly implements:

Priority for invalid or weak/not-useful ways

Age penalty for strong entries when no allocation slot available

Global useful bit reset mechanism to prevent entry starvation

The implementation aligns with standard TAGE allocation policies.

568-604: LGTM for bank conflict handling.

The canResolveUpdate and doResolveUpdate functions correctly implement bank conflict detection and deferral logic, allowing updates to be retried when prediction and update target the same bank.

611-682: LGTM for update function.

The update flow correctly:

Retrieves prediction metadata

Re-reads predictions when updateOnRead is enabled

Updates predictor state and handles allocation

The commented trace code (lines 660-678) is acceptable for debugging purposes.

684-712: LGTM for checkUtageUpdateMisspred.

The function correctly identifies mispredictions by sorting predictions by PC and comparing the first predicted-taken branch against actual execution.

800-806: Verify modulo clamping behavior is intentional.

Line 805 uses position %= maxBranchPositions which wraps around (e.g., position 33 becomes 1 for maxBranchPositions=32). This could cause incorrect tag computation since the wrapped position doesn't reflect the actual branch location.

If the intent is to clamp to the maximum valid position, consider:
position = maxBranchPositions - 1;
If wrap-around is intentional (to match RTL behavior), please add a comment explaining the rationale.

830-851: LGTM for doUpdateHist.

The function correctly updates all folded histories (index, tag, alt-tag) for each predictor table. The debug flag check before the expensive to_string operation is a good optimization.

916-962: LGTM for TageStats constructor.

Comprehensive statistics initialization covering prediction accuracy, allocation success/failure, bank conflicts, and per-table/per-bank breakdowns.

998-1029: LRU functions appear unused.

The updateLRU and getLRUVictim functions are implemented but the allocation logic in handleNewEntryAllocation uses useful bits and counter strength instead of LRU. The comment at line 227 also states "Do not use LRU; keep logic simple and align with CBP-style replacement."

If LRU is intentionally not used, consider removing these functions or marking them for future use. If they should be integrated, the allocation logic may need updates.

1032-1068: LGTM for commitBranch.

The function correctly updates commit-time statistics, distinguishing between prediction hits/misses and correct/incorrect predictions.
src/cpu/pred/btb/microtage.hh (3)

1-26: LGTM for header guards and includes.

Proper include guards and well-organized conditional includes for UNIT_TEST vs production builds.

56-73: LGTM for TageEntry struct.

Clean struct definition with appropriate fields for TAGE entry representation. The short type for counter is appropriate with the 3-bit constraint enforced by the saturation logic.

178-184: LGTM for getTageTag declarations.

Properly declared overloads with default position parameter for backward compatibility.

src/cpu/pred/BranchPredictor.py (2)

1066-1092: LGTM for MicroTAGE SimObject definition.

The MicroTAGE class is correctly defined with:

Proper inheritance from TimedBaseBTBPredictor

Correct cxx_class and cxx_header references

Sensible default parameters that are consistent with the C++ implementation

Good documentation in the docstring

The Ruff warnings about star imports (F405) are false positives - from m5.params import * is standard gem5 practice and correctly provides Param, VectorParam, etc.

1182-1182: LGTM for DecoupledBPUWithBTB integration.

The microtage parameter correctly uses Param.MicroTAGE(MicroTAGE(), ...) to match the new class hierarchy where MicroTAGE directly extends TimedBaseBTBPredictor rather than BTBTAGE.

coderabbitai · 2026-01-13T03:11:42Z

src/cpu/pred/btb/microtage.cc

+    tageTable.resize(numPredictors);
+    tableIndexBits.resize(numPredictors);
+    tableIndexMasks.resize(numPredictors);
+    tableTagBits.resize(numPredictors);
+    tableTagMasks.resize(numPredictors);


⚠️ Potential issue | 🟠 Major

Bug: tableTagBits is overwritten after being set from params.

On line 71, tableTagBits is initialized from p.TTagBitSizes. Then on line 99, it's resized to numPredictors with default values, discarding the parameter values. This appears to be a bug that would cause the tag bit sizes from configuration to be lost.

🐛 Proposed fix

tageTable.resize(numPredictors); tableIndexBits.resize(numPredictors); tableIndexMasks.resize(numPredictors); - tableTagBits.resize(numPredictors); tableTagMasks.resize(numPredictors);

The tableTagBits vector is already initialized from params on line 71, so it should not be resized here. If this resize is needed for the UNIT_TEST path only, consider guarding it with #ifdef UNIT_TEST.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

tageTable.resize(numPredictors);

tableIndexBits.resize(numPredictors);

tableIndexMasks.resize(numPredictors);

tableTagBits.resize(numPredictors);

tableTagMasks.resize(numPredictors);

tageTable.resize(numPredictors);

tableIndexBits.resize(numPredictors);

tableIndexMasks.resize(numPredictors);

tableTagMasks.resize(numPredictors);

🤖 Prompt for AI Agents

In @src/cpu/pred/btb/microtage.cc around lines 96 - 100, The vector tableTagBits is being overwritten after initialization from p.TTagBitSizes; remove the unconditional resize(tableTagBits.resize(numPredictors)) so the configured tag bit sizes are preserved, or if a resize is required only for testing, wrap that resize in an #ifdef UNIT_TEST guard; update the code around the constructor where tableTagBits, p.TTagBitSizes, numPredictors, and the other table* vectors (tageTable, tableIndexBits, tableIndexMasks, tableTagMasks) are handled to ensure tableTagBits retains the parameter values.

github-actions · 2026-01-13T03:17:11Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1691`	-
This PR	`2.1691`	➡️ `0.0000` (`0.00%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2026-01-13T04:12:53Z

[Generated by GEM5 Performance Robot]
commit: b8fc3d6
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	19.83	19.90	-0.33 🔴

[Generated by GEM5 Performance Robot]
commit: b8fc3d6
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	19.83	19.90	-0.34 🔴

Change-Id: If7c9d3aa68a23c36dde74d8cf3a286c9c48f3e3c

github-actions · 2026-01-13T07:22:13Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1691`	-
This PR	`2.1691`	➡️ `0.0000` (`0.00%`)

✅ Difftest smoke test passed!

Change-Id: Ib9daaa3a60777c83fed779dd0169d17002a44d43

github-actions · 2026-01-16T04:21:41Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1691`	-
This PR	`2.1691`	➡️ `0.0000` (`0.00%`)

✅ Difftest smoke test passed!

github-actions · 2026-01-16T04:32:01Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1691`	-
This PR	`2.1691`	➡️ `0.0000` (`0.00%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2026-01-16T05:26:14Z

[Generated by GEM5 Performance Robot]
commit: 3450ffd
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	19.89	20.15	-1.31 🔴

[Generated by GEM5 Performance Robot]
commit: 3450ffd
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	19.89	19.87	+0.08 🟢

Change-Id: Ia5934e526e8d6bea03bd0711eb9cf6947eecc2aa

github-actions · 2026-01-16T06:15:37Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1691`	-
This PR	`2.1691`	➡️ `0.0000` (`0.00%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2026-01-16T07:22:43Z

[Generated by GEM5 Performance Robot]
commit: c7b8871
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	19.90	20.15	-1.27 🔴

[Generated by GEM5 Performance Robot]
commit: c7b8871
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	19.90	19.89	+0.04 🟢

Change-Id: Ie3f6b39d8f6e989018fc477a7b55d3879657338b

github-actions · 2026-01-16T08:55:27Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1691`	-
This PR	`2.1691`	➡️ `0.0000` (`0.00%`)

✅ Difftest smoke test passed!

Change-Id: I31aa12e8e2f6dd2b5421f9c4deab11dddcb53eda

github-actions · 2026-01-16T09:21:02Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1691`	-
This PR	`2.1691`	➡️ `0.0000` (`0.00%`)

✅ Difftest smoke test passed!

github-actions · 2026-01-16T09:22:16Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1691`	-
This PR	`2.1691`	➡️ `0.0000` (`0.00%`)

✅ Difftest smoke test passed!

Change-Id: I6c27be3fbd0db5110e0b1948ac3c39c1400b9b64

github-actions · 2026-01-16T09:39:07Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1691`	-
This PR	`2.1691`	➡️ `0.0000` (`0.00%`)

✅ Difftest smoke test passed!

Change-Id: I21054d48efddcff13308e1164109dfbbc2d034e2

Change-Id: I84d82d8499186909ee5f79e398b8a9ff07d5bb72

github-actions · 2026-01-16T09:54:24Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1691`	-
This PR	`2.1691`	➡️ `0.0000` (`0.00%`)

✅ Difftest smoke test passed!

github-actions · 2026-01-16T10:03:47Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1691`	-
This PR	`2.1691`	➡️ `0.0000` (`0.00%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2026-01-16T10:55:34Z

[Generated by GEM5 Performance Robot]
commit: d1b20ee
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	19.78	20.15	-1.87 🔴

[Generated by GEM5 Performance Robot]
commit: d1b20ee
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	19.78	19.78	-0.02 🔴

Change-Id: Ie92f7106b798596ba7b0ed86cc9346fa3e0f9c38

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@src/cpu/pred/btb/microtage.cc`:
- Around line 519-522: The comment for the weakish condition is wrong: the
expression std::abs(cand.counter * 2 + 1) <= 3 is true only for cand.counter ∈
{-2,-1,0,1}, not -3..2; update the inline comment next to the weakish definition
(variable weakish and expression using cand.counter) to the correct range (e.g.
"// counter ∈ {-2, -1, 0, 1}" or similar) or, if the original intent was to
cover -3..2, change the expression accordingly so the comment and condition
match.

♻️ Duplicate comments (5)

src/cpu/pred/btb/microtage.cc (5)
739-769: Potential undefined behavior from bit-shift when bit width ≥ 64.

Lines 743 and 762 compute (1ULL << tableTagBits[t]) and (1ULL << tableIndexBits[t]). If either value is ≥ 64, this is UB. While typical configurations use smaller values, defensive bounds checking is recommended.

826-831: Use 1ULL to prevent potential overflow in mask calculation.

Line 830 uses (1 << bankIdWidth) with a plain int constant. If bankIdWidth >= 31, this overflows.

801-824: Potential underflow if blockSize is zero.

Line 804 computes ~(blockSize - 1) which underflows to ~(0xFFFF...FFFF) if blockSize == 0. While the Python config defaults blockSize=32, the constructor at line 81 handles blockSize=0 defensively. Consider adding a runtime check here.

229-232: Fix typo: "aviod" → "avoid".

96-100: Bug: tableTagBits is overwritten after being set from params.

Line 99 calls tableTagBits.resize(numPredictors) which discards the values already initialized from p.TTagBitSizes at line 71. The resize with default values will reset all tag bit sizes to 0, causing tag matching to fail.
🐛 Proposed fix: Remove the redundant resize
     tageTable.resize(numPredictors);
     tableIndexBits.resize(numPredictors);
     tableIndexMasks.resize(numPredictors);
-    tableTagBits.resize(numPredictors);
     tableTagMasks.resize(numPredictors);
The tableTagBits vector is already initialized from p.TTagBitSizes at line 71, so this resize is both unnecessary and destructive. If this resize is needed only for the UNIT_TEST path where tableTagBits is not initialized from params, guard it with #ifdef UNIT_TEST.

🧹 Nitpick comments (1)

src/cpu/pred/btb/microtage.cc (1)
727-736: Add validation for width parameter to prevent shift overflow.

If width is 0, (width-1) underflows. If width >= 32, the shift overflows since 1 is an int. Currently only called with width=3 (line 424), but the function is fragile for reuse.
♻️ Proposed defensive fix
 void
 MicroTAGE::updateCounter(bool taken, unsigned width, short &counter) {
+    assert(width > 0 && width <= 16 && "Counter width must be in range [1, 16]");
     int max = (1 << (width-1)) - 1;
     int min = -(1 << (width-1));

coderabbitai · 2026-01-23T02:20:10Z

src/cpu/pred/btb/microtage.cc

+        for (unsigned way = 0; way < numWays; ++way) {
+            auto &cand = set[way];
+            const bool weakish = std::abs(cand.counter * 2 + 1) <= 3; // -3,-2,-1,0,1,2
+            if (!cand.valid || (!cand.useful && weakish)) {


⚠️ Potential issue | 🟡 Minor

Fix misleading comment for weakish condition.

The comment says // -3,-2,-1,0,1,2 but the formula std::abs(cand.counter * 2 + 1) <= 3 evaluates to true only for counter ∈ [-2, -1, 0, 1].

📝 Proposed fix

- const bool weakish = std::abs(cand.counter * 2 + 1) <= 3; // -3,-2,-1,0,1,2 + const bool weakish = std::abs(cand.counter * 2 + 1) <= 3; // counter in [-2, -1, 0, 1]

🤖 Prompt for AI Agents

In `@src/cpu/pred/btb/microtage.cc` around lines 519 - 522, The comment for the weakish condition is wrong: the expression std::abs(cand.counter * 2 + 1) <= 3 is true only for cand.counter ∈ {-2,-1,0,1}, not -3..2; update the inline comment next to the weakish definition (variable weakish and expression using cand.counter) to the correct range (e.g. "// counter ∈ {-2, -1, 0, 1}" or similar) or, if the original intent was to cover -3..2, change the expression accordingly so the comment and condition match.

github-actions · 2026-01-23T02:31:02Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1981`	-
This PR	`2.1981`	➡️ `0.0000` (`0.00%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2026-01-23T03:21:14Z

[Generated by GEM5 Performance Robot]
commit: 14c53b4
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	19.89	20.29	-1.99 🔴

[Generated by GEM5 Performance Robot]
commit: 14c53b4
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	19.89	19.78	+0.57 🟢

Change-Id: Id6d0cea61bda99e19f3e6bee63e86797b98f4149

github-actions · 2026-01-23T03:32:53Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1981`	-
This PR	`2.1981`	➡️ `0.0000` (`0.00%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2026-01-23T04:51:44Z

[Generated by GEM5 Performance Robot]
commit: 38f4c13
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	19.89	20.29	-1.99 🔴

[Generated by GEM5 Performance Robot]
commit: 38f4c13
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	19.89	19.89	-0.00

Cao Jiaming added 2 commits January 6, 2026 16:02

cpu-o3: split microtage out of tage

8ada7dd

cpu-o3: reorganize includes in microtage files for clarity

5cfbb4f

Change-Id: I8b6a4bda945564cc961251fc1297a6bacd8636cb

coderabbitai bot reviewed Jan 6, 2026

View reviewed changes

src/cpu/pred/btb/microtage.hh Show resolved Hide resolved

cpu-o3: enhance branch index calculation in MicroTAGE to handle edge …

9469e8f

…cases

coderabbitai bot reviewed Jan 6, 2026

View reviewed changes

src/cpu/pred/btb/microtage.cc Show resolved Hide resolved

src/cpu/pred/btb/microtage.cc Outdated Show resolved Hide resolved

cpu-o3: update blockWidth calculation to handle zero blockSize

a089f57

Change-Id: I44b6175022a3a593ed385407bd95ec0c40c74642

coderabbitai bot reviewed Jan 6, 2026

View reviewed changes

cpu-o3: adjust MicroTAGE parameters and clean up code for clarity

3fa08b4

Change-Id: I16ce79a7d8488d9a138d0a26f5500576ae54132e

coderabbitai bot reviewed Jan 8, 2026

View reviewed changes

src/cpu/pred/btb/microtage.cc Outdated Show resolved Hide resolved

cpu-o3: remove unused variable 'usingBasetable'

e665826

Change-Id: I8be037a1cdacd7151f4fe8e743a32cca90a85036

coderabbitai bot reviewed Jan 8, 2026

View reviewed changes

cpu-o3: MicroTAGE to remove base table and alternative prediction

be0c88c

Change-Id: I6b9c92c18c574a12c19532ae0b894e64c1187342

coderabbitai bot reviewed Jan 9, 2026

View reviewed changes

cpu-o3: update MicroTAGE parameters for consistency

28084ef

Change-Id: I953965b8e6feb1c15a238ac832d65bc16b32f496

coderabbitai bot reviewed Jan 9, 2026

View reviewed changes

src/cpu/pred/btb/microtage.cc Show resolved Hide resolved

cpu-o3: remove alternative prediction logic

e57b012

Change-Id: I39cd471d452aa343c3dd741a80fdfa7d126e3a9f

coderabbitai bot reviewed Jan 12, 2026

View reviewed changes

src/cpu/pred/btb/microtage.cc Show resolved Hide resolved

src/cpu/pred/btb/microtage.cc Show resolved Hide resolved

src/cpu/pred/btb/microtage.cc Show resolved Hide resolved

src/cpu/pred/btb/microtage.hh Show resolved Hide resolved

cpu-o3: refine MicroTAGE parameters and remove unused variables

b8fc3d6

Change-Id: I3567ae93652aac218c5b4646003abadddaf7cf32

coderabbitai bot reviewed Jan 13, 2026

View reviewed changes

cpu-o3: update MicroTAGE history length parameter to 16

9a98605

Change-Id: If7c9d3aa68a23c36dde74d8cf3a286c9c48f3e3c

cpu-o3: update MicroTAGE history length parameter to 20

3450ffd

Change-Id: Ib9daaa3a60777c83fed779dd0169d17002a44d43

cpu-o3: update MicroTAGE history length parameter to 18

c7b8871

Change-Id: Ia5934e526e8d6bea03bd0711eb9cf6947eecc2aa

cpu-o3: modify MicroTAGE to include alternative tag history

60879ec

Change-Id: Ie3f6b39d8f6e989018fc477a7b55d3879657338b

Cao Jiaming added 2 commits January 16, 2026 17:05

cpu-o3: enhance MicroTAGE statistics to track utage prediction accuracy

4e09462

cpu-o3: update MicroTAGE prediction latency to 1 cycle

ae37932

Change-Id: I31aa12e8e2f6dd2b5421f9c4deab11dddcb53eda

cpu-o3: enable update on read and adjust prediction latency

1e17791

Change-Id: I6c27be3fbd0db5110e0b1948ac3c39c1400b9b64

Cao Jiaming added 2 commits January 16, 2026 17:45

cpu-o3: disable update on read for MicroTAGE predictor

48a8238

Change-Id: I21054d48efddcff13308e1164109dfbbc2d034e2

cpu-o3: update MicroTAGE prediction latency to 1 cycle

d1b20ee

Change-Id: I84d82d8499186909ee5f79e398b8a9ff07d5bb72

cpu-o3: add recovery for alternative tag folded history

14c53b4

Change-Id: Ie92f7106b798596ba7b0ed86cc9346fa3e0f9c38

coderabbitai bot reviewed Jan 23, 2026

View reviewed changes

cpu-o3:change utage delay to 0

38f4c13

Change-Id: Id6d0cea61bda99e19f3e6bee63e86797b98f4149

CJ362ff added align-kmhv3 and removed align-kmhv3 labels Jan 23, 2026

Conversation

CJ362ff commented Jan 6, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Jan 6, 2026

🚀 Coremark Smoke Test Results

Uh oh!

XiangShanRobot commented Jan 6, 2026

Ideal BTB Performance

Overall Score

Ideal BTB Performance

Overall Score

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 9, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions bot commented Jan 9, 2026

🚀 Coremark Smoke Test Results

Uh oh!

XiangShanRobot commented Jan 9, 2026

Ideal BTB Performance

Overall Score

Ideal BTB Performance

Overall Score

Uh oh!

coderabbitai bot left a comment

CJ362ff commented Jan 6, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 6, 2026 •

edited

Loading