Skip to content

Split microtage1 align#734

Open
CJ362ff wants to merge 4 commits intoxs-devfrom
split-microtage1-align
Open

Split microtage1 align#734
CJ362ff wants to merge 4 commits intoxs-devfrom
split-microtage1-align

Conversation

@CJ362ff
Copy link
Collaborator

@CJ362ff CJ362ff commented Jan 26, 2026

Summary by CodeRabbit

  • New Features
    • Adds a fully configurable MicroTAGE branch predictor with extensive tuning (table sizing, history lengths, banking, allocation) and detailed statistics.
    • Integrates MicroTAGE into the predictor ensemble and build, replacing the prior compact predictor configuration.
    • Enables the MicroTAGE path in an example configuration to activate the new predictor for relevant setups.

✏️ Tip: You can customize this high-level summary in your review settings.

Cao Jiaming added 2 commits January 26, 2026 14:52
@coderabbitai
Copy link

coderabbitai bot commented Jan 26, 2026

📝 Walkthrough

Walkthrough

Adds a new public MicroTAGE SimObject (Python params + C++ bindings), implements its C++ header and implementation with full TAGE prediction/update, folded-history and bank-conflict handling, updates build config, and switches DecoupledBPUWithBTB to use MicroTAGE.

Changes

Cohort / File(s) Summary
Python config & build
src/cpu/pred/BranchPredictor.py, src/cpu/pred/SConscript
Introduce public MicroTAGE SimObject with parameters and C++ bindings; add btb/microtage.cc to build and insert MicroTAGE into predictor list.
Public header
src/cpu/pred/btb/microtage.hh
Add MicroTAGE class declaration (derives TimedBaseBTBPredictor) and public data structures: TageEntry, TageTableInfo, TagePrediction, TageMeta, TageStats; expose folded-history, banking, table configs, and public APIs.
Implementation
src/cpu/pred/btb/microtage.cc
Add full MicroTAGE implementation: constructors/destructor, prediction/update flows (generateSinglePrediction, lookupHelper, putPCHistory, getPredictionMeta), folded-history ops, allocation/LRU, bank-conflict handling, resolve/update/commit, and detailed statistics.
Integration header
src/cpu/pred/btb/decoupled_bpred.hh
Change member type from BTBTAGE *microtage to MicroTAGE *microtage and adjust includes to reference microtage.hh.
Config example
configs/example/kmhv3.py
Enable microtage for DecoupledBPUWithBTB in example config (flag changed to True).

Sequence Diagram(s)

sequenceDiagram
    participant FetchStream
    participant DecoupledBPU
    participant MicroTAGE
    participant BTB
    FetchStream->>DecoupledBPU: provide fetch entries
    DecoupledBPU->>MicroTAGE: putPCHistory(startPC, history)
    MicroTAGE->>MicroTAGE: compute folded histories / indices / tags
    MicroTAGE->>BTB: probe BTB entries (lookupHelper)
    BTB-->>MicroTAGE: BTBEntry list
    MicroTAGE->>DecoupledBPU: return predictions & meta
    DecoupledBPU->>MicroTAGE: later update(stream) / doResolveUpdate()
    MicroTAGE->>MicroTAGE: update predictor state, allocate entries, update stats
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested labels

align-kmhv3

Suggested reviewers

  • Yakkhini
  • jensen-yan

Poem

🐰 I hopped from header into code,
Tables folded down the road,
Banks and tags all in a line,
LRU hums in clever time,
MicroTAGE — a rabbit's ode. 🥕

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'Split microtage1 align' is vague and does not clearly convey the primary changes in the pull request. Consider using a more descriptive title that clearly indicates the main objective, such as 'Refactor MicroTAGE to a public class with configurable parameters' or 'Extract MicroTAGE predictor into dedicated public class'.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@CJ362ff CJ362ff added the perf label Jan 26, 2026
@github-actions
Copy link

🚀 Performance test triggered: spec06-0.8c

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/cpu/pred/btb/microtage.cc`:
- Around line 723-729: The code currently reads recomputed =
predMeta->preds[btb_entry.pc] when updateOnRead is false, which can create a
default prediction for missing BTB entries and corrupt training; instead, check
whether predMeta->preds actually contains an entry for btb_entry.pc (or
otherwise whether predMeta has a valid prediction for that PC) and only use it
if present, otherwise call generateSinglePrediction(btb_entry, startAddr,
predMeta) to produce a proper recomputed prediction; update the branch in the
block using the identifiers recomputed, updateOnRead, predMeta, preds,
btb_entry.pc, and generateSinglePrediction accordingly so missing metadata does
not insert defaults.
- Around line 823-859: The code in getTageTag and getTageIndex constructs masks
with (1ULL << tableTagBits[t]) - 1 and (1ULL << tableIndexBits[t]) - 1 which can
UB if the bit count equals or exceeds 64; modify both functions (getTageTag
overloads and getTageIndex) to guard the shift by testing tableTagBits[t] /
tableIndexBits[t] and, if >= (sizeof(Addr)*8) or >= 64, set mask to all-ones
(e.g., ~0ULL or the max value for Addr), otherwise compute the mask with the
shift; reference tableTagBits, tableIndexBits, getTageTag, and getTageIndex when
making the change.
🧹 Nitpick comments (1)
src/cpu/pred/btb/microtage.hh (1)

4-9: Add explicit standard headers for std::unordered_map and std::shared_ptr.

This header uses both types; relying on transitive includes is brittle.

♻️ Proposed fix
 `#include` <cstdint>
 `#include` <deque>
+#include <memory>
 `#include` <map>
+#include <unordered_map>
 `#include` <utility>
 `#include` <vector>

Comment on lines +723 to +729
TagePrediction recomputed;
if (updateOnRead) { // if update on read is enabled, re-read providers using snapshot
// Re-read providers using snapshot (do not rely on prediction-time main/alt)
recomputed = generateSinglePrediction(btb_entry, startAddr, predMeta);
} else { // otherwise, use the prediction from the prediction-time main/alt
recomputed = predMeta->preds[btb_entry.pc];
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Handle missing prediction metadata when updateOnRead is false.

predMeta->preds[btb_entry.pc] inserts a default prediction when the entry wasn’t predicted (e.g., new BTB entry), which can skip base-table updates and distort training.

🛠 Proposed fix
-        if (updateOnRead) { // if update on read is enabled, re-read providers using snapshot
-            // Re-read providers using snapshot (do not rely on prediction-time main/alt)
-            recomputed = generateSinglePrediction(btb_entry, startAddr, predMeta);
-        } else { // otherwise, use the prediction from the prediction-time main/alt
-            recomputed = predMeta->preds[btb_entry.pc];
-        }
+        if (updateOnRead) { // if update on read is enabled, re-read providers using snapshot
+            // Re-read providers using snapshot (do not rely on prediction-time main/alt)
+            recomputed = generateSinglePrediction(btb_entry, startAddr, predMeta);
+        } else { // otherwise, use the prediction from the prediction-time main/alt
+            auto it = predMeta->preds.find(btb_entry.pc);
+            if (it != predMeta->preds.end()) {
+                recomputed = it->second;
+            } else {
+                // Fallback: recompute to avoid default-constructed prediction
+                recomputed = generateSinglePrediction(btb_entry, startAddr, predMeta);
+            }
+        }
🤖 Prompt for AI Agents
In `@src/cpu/pred/btb/microtage.cc` around lines 723 - 729, The code currently
reads recomputed = predMeta->preds[btb_entry.pc] when updateOnRead is false,
which can create a default prediction for missing BTB entries and corrupt
training; instead, check whether predMeta->preds actually contains an entry for
btb_entry.pc (or otherwise whether predMeta has a valid prediction for that PC)
and only use it if present, otherwise call generateSinglePrediction(btb_entry,
startAddr, predMeta) to produce a proper recomputed prediction; update the
branch in the block using the identifiers recomputed, updateOnRead, predMeta,
preds, btb_entry.pc, and generateSinglePrediction accordingly so missing
metadata does not insert defaults.

Comment on lines 823 to 859
Addr
MicroTAGE::getTageTag(Addr pc, int t, uint64_t foldedHist, uint64_t altFoldedHist, Addr position)
{
// Create mask for tableTagBits[t] to limit result size
Addr mask = (1ULL << tableTagBits[t]) - 1;

// Extract lower bits of PC directly (remove instruction alignment bits)
Addr pcBits = (pc >> bankBaseShift) & mask;

// Extract and prepare folded history bits
Addr foldedBits = foldedHist & mask;

// Extract alt tag bits and shift left by 1
Addr altTagBits = (altFoldedHist << 1) & mask;

// XOR all components together, including position (like RTL)
return pcBits ^ foldedBits ^ altTagBits ^ position;
}

Addr
MicroTAGE::getTageTag(Addr pc, int t, Addr position)
{
return getTageTag(pc, t, tagFoldedHist[t].get(), altTagFoldedHist[t].get(), position);
}

Addr
MicroTAGE::getTageIndex(Addr pc, int t, uint64_t foldedHist)
{
// Create mask for tableIndexBits[t] to limit result size
Addr mask = (1ULL << tableIndexBits[t]) - 1;

const unsigned pcShift = enableBankConflict ? indexShift : bankBaseShift;
Addr pcBits = (pc >> pcShift) & mask;
Addr foldedBits = foldedHist & mask;

return pcBits ^ foldedBits;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the actual file around the mentioned lines
head -n 870 src/cpu/pred/btb/microtage.cc | tail -n 60

Repository: OpenXiangShan/GEM5

Length of output: 1780


🏁 Script executed:

# Let's also check the file structure and see more context
wc -l src/cpu/pred/btb/microtage.cc

Repository: OpenXiangShan/GEM5

Length of output: 96


🏁 Script executed:

# Search for tableTagBits and tableIndexBits declarations to understand their types and constraints
rg "tableTagBits|tableIndexBits" src/cpu/pred/btb/microtage.cc -B 2 -A 2

Repository: OpenXiangShan/GEM5

Length of output: 1927


🏁 Script executed:

# Search for the header file to see member declarations
fd "microtage.hh" src/

Repository: OpenXiangShan/GEM5

Length of output: 91


🏁 Script executed:

# Let's examine the header file to see the member declarations
cat -n src/cpu/pred/btb/microtage.hh | head -n 200

Repository: OpenXiangShan/GEM5

Length of output: 8918


🏁 Script executed:

# Search for the declarations of tableTagBits and tableIndexBits in the header
rg "tableTagBits|tableIndexBits" src/cpu/pred/btb/microtage.hh -B 1 -A 1

Repository: OpenXiangShan/GEM5

Length of output: 249


🏁 Script executed:

# Check what ceilLog2 returns and its maximum value
rg "ceilLog2" src/ -A 5 -B 1 | head -n 50

Repository: OpenXiangShan/GEM5

Length of output: 2667


🏁 Script executed:

# Look at the initialization context more carefully
sed -n '600,750p' src/cpu/pred/btb/microtage.cc

Repository: OpenXiangShan/GEM5

Length of output: 5906


🏁 Script executed:

# Let's find the ceilLog2 function definition to understand its return type and constraints
rg "ceilLog2" src/base/ -A 10 -B 2 | head -n 80

Repository: OpenXiangShan/GEM5

Length of output: 1594


🏁 Script executed:

# Check if there are any guards or assertions around tableTagBits/tableIndexBits values
rg "tableTagBits|tableIndexBits" src/cpu/pred/btb/microtage.cc | head -n 30

Repository: OpenXiangShan/GEM5

Length of output: 954


🏁 Script executed:

# Check what types Addr is
rg "typedef.*Addr|using Addr" src/base/types.hh -A 2 -B 2

Repository: OpenXiangShan/GEM5

Length of output: 165


🏁 Script executed:

# Let's check if there are any existing guards or validations for tableTagBits/tableIndexBits
grep -n "tableTagBits\|tableIndexBits" src/cpu/pred/btb/microtage.cc | grep -E "assert|check|if|guard|clamp|min|max"

Repository: OpenXiangShan/GEM5

Length of output: 119


🏁 Script executed:

# Check the PathFoldedHist constructor to see if it has any constraints on the bits parameter
grep -n "PathFoldedHist" src/cpu/pred/btb/ -r | head -20

Repository: OpenXiangShan/GEM5

Length of output: 2189


🏁 Script executed:

# Let's look at the folded_hist.hh file to understand constraints
cat -n src/cpu/pred/btb/folded_hist.hh | head -100

Repository: OpenXiangShan/GEM5

Length of output: 4091


🏁 Script executed:

# Check what happens with ceilLog2 and what maximum values it can return for realistic table sizes
python3 << 'EOF'
# Simulate ceilLog2 for different values
def ceil_log2(n):
    if n == 1:
        return 0
    # This is equivalent to floor(log2(n-1)) + 1
    return (n - 1).bit_length()

# Check for typical table sizes
for size_bits in [8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 48, 62, 63, 64]:
    table_size = 2 ** size_bits
    result = ceil_log2(table_size)
    print(f"ceilLog2(2^{size_bits}) = {result}")

# Check if we can ever get 64
print("\nCan ceilLog2 return 64?")
for val in [2**63, 2**63 + 1, 2**64 - 1]:
    result = ceil_log2(val)
    print(f"ceilLog2({val}) = {result}")
EOF

Repository: OpenXiangShan/GEM5

Length of output: 148


🏁 Script executed:

# Check for the maximum size value allowed in configuration
rg "tableSizes" src/cpu/pred/btb/microtage.cc -B 2 -A 2 | head -40

Repository: OpenXiangShan/GEM5

Length of output: 821


🏁 Script executed:

# Check if there are range checks or assertions around tableTagBits/tableIndexBits values before they are used in shifts
rg "tableTagBits|tableIndexBits" src/cpu/pred/btb/microtage.cc -B 5 -A 5 | grep -E "assert|check|if.*<|clamp|max|min" | head -20

Repository: OpenXiangShan/GEM5

Length of output: 345


🏁 Script executed:

# Check what maximum values tableSizes can have (which determines tableIndexBits via ceilLog2)
rg "tableSizes" src/cpu/pred/btb/ -r | grep -E "resize|push_back|=" | head -20

Repository: OpenXiangShan/GEM5

Length of output: 475


🏁 Script executed:

# Check the params file to understand configuration constraints
fd "MicroTAGE.py" src/

Repository: OpenXiangShan/GEM5

Length of output: 44


🏁 Script executed:

# Let me check the actual usage context more carefully - look at the complete function implementations
sed -n '823,870p' src/cpu/pred/btb/microtage.cc

Repository: OpenXiangShan/GEM5

Length of output: 1437


🏁 Script executed:

# Verify the constraint in PathFoldedHist more carefully by checking actual assertions
sed -n '82,91p' src/cpu/pred/btb/folded_hist.hh

Repository: OpenXiangShan/GEM5

Length of output: 537


🏁 Script executed:

# Check if there are any runtime checks or asserts that validate tableTagBits/tableIndexBits before usage
rg "getTageTag|getTageIndex" src/cpu/pred/btb/microtage.cc -B 3 -A 10 | grep -E "assert|check|if" | head -20

Repository: OpenXiangShan/GEM5

Length of output: 532


🏁 Script executed:

# Check the actual maximum practical value - look at initialization in detail
sed -n '110,145p' src/cpu/pred/btb/microtage.cc

Repository: OpenXiangShan/GEM5

Length of output: 1472


🏁 Script executed:

# Check if there's any explicit assert or guard right before or at the shift operation
sed -n '823,859p' src/cpu/pred/btb/microtage.cc | grep -E "assert|if.*tableTagBits|if.*tableIndexBits"

Repository: OpenXiangShan/GEM5

Length of output: 44


🏁 Script executed:

# Look for any documentation that explains the constraint on these values
rg "tableTagBits|tableIndexBits" src/cpu/pred/btb/microtage.hh -B 3 -A 1

Repository: OpenXiangShan/GEM5

Length of output: 330


🏁 Script executed:

# Check if ceilLog2 has any upper bound constraints documented
sed -n '1,100p' src/base/intmath.hh

Repository: OpenXiangShan/GEM5

Length of output: 3539


Guard against shift‑by‑64 UB when building tag/index masks.

The shift operations 1ULL << tableTagBits[t] and 1ULL << tableIndexBits[t] lack explicit bounds checks. While an implicit constraint via PathFoldedHist initialization currently prevents UB, explicit guards at the shift site follow defensive programming best practices and protect against future refactoring.

🛠 Proposed fix
Addr
MicroTAGE::getTageTag(Addr pc, int t, uint64_t foldedHist, uint64_t altFoldedHist, Addr position)
{
    // Create mask for tableTagBits[t] to limit result size
-   Addr mask = (1ULL << tableTagBits[t]) - 1;
+   unsigned tagBits = tableTagBits[t];
+   assert(tagBits < 64);
+   Addr mask = (1ULL << tagBits) - 1;

    // Extract lower bits of PC directly (remove instruction alignment bits)
    Addr pcBits = (pc >> bankBaseShift) & mask;

    // Extract and prepare folded history bits
    Addr foldedBits = foldedHist & mask;

    // Extract alt tag bits and shift left by 1
    Addr altTagBits = (altFoldedHist << 1) & mask;

    // XOR all components together, including position (like RTL)
    return pcBits ^ foldedBits ^ altTagBits ^ position;
}

Addr
MicroTAGE::getTageIndex(Addr pc, int t, uint64_t foldedHist)
{
    // Create mask for tableIndexBits[t] to limit result size
-   Addr mask = (1ULL << tableIndexBits[t]) - 1;
+   unsigned idxBits = tableIndexBits[t];
+   assert(idxBits < 64);
+   Addr mask = (1ULL << idxBits) - 1;

    const unsigned pcShift = enableBankConflict ? indexShift : bankBaseShift;
    Addr pcBits = (pc >> pcShift) & mask;
    Addr foldedBits = foldedHist & mask;

    return pcBits ^ foldedBits;
}
🤖 Prompt for AI Agents
In `@src/cpu/pred/btb/microtage.cc` around lines 823 - 859, The code in getTageTag
and getTageIndex constructs masks with (1ULL << tableTagBits[t]) - 1 and (1ULL
<< tableIndexBits[t]) - 1 which can UB if the bit count equals or exceeds 64;
modify both functions (getTageTag overloads and getTageIndex) to guard the shift
by testing tableTagBits[t] / tableIndexBits[t] and, if >= (sizeof(Addr)*8) or >=
64, set mask to all-ones (e.g., ~0ULL or the max value for Addr), otherwise
compute the mask with the shift; reference tableTagBits, tableIndexBits,
getTageTag, and getTageIndex when making the change.

@github-actions
Copy link

🚀 Coremark Smoke Test Results

Branch IPC Change
Base (xs-dev) 2.1727 -
This PR 2.1727 ➡️ 0.0000 (0.00%)

✅ Difftest smoke test passed!

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@src/cpu/pred/btb/microtage.cc`:
- Around line 17-25: The UNIT_TEST build fails because ceilLog2/floorLog2 are
used in the UNIT_TEST constructor but base/intmath.hh is only included inside
the `#ifndef` UNIT_TEST block; move or add `#include` "base/intmath.hh" so it is
available for UNIT_TEST builds (ensure the header is included alongside other
headers outside the `#ifndef` or added inside the UNIT_TEST branch), referencing
the use sites of ceilLog2 and floorLog2 in the microtage constructor to verify
compilation.
- Around line 96-123: The code currently resizes tableTagBits (and tablePcShifts
earlier) before validating that the provided configuration arrays are at least
numPredictors, which silently pads them and defeats the subsequent asserts;
instead, validate all config vectors (tableTagBits, tableSizes, histLengths,
tablePcShifts) for size >= numPredictors up front (or fail fast) before
performing any resize or default-padding, then proceed to initialize tageTable,
tableIndexBits, tableIndexMasks, tableTagMasks, etc.; reference tableTagBits,
tableTagMasks, tablePcShifts, tableSizes, histLengths, numPredictors, and
tageTable when locating where to add the pre-checks and reorder the resize calls
so padding only happens after validation and explicit defaults are intended.

In `@src/cpu/pred/btb/microtage.hh`:
- Around line 4-9: The header microtage.hh is missing standard includes for
types it uses; add `#include` <memory> and `#include` <unordered_map> alongside the
existing includes so references to std::shared_ptr and std::unordered_map
resolve regardless of include order; update the include block at the top of
microtage.hh near the existing `#include` <vector>, `#include` <map>, etc., and
ensure no other symbols (e.g., any typedefs or member declarations that use
shared_ptr or unordered_map) need forward declarations after adding these
headers.
- Around line 178-189: Remove the unused declarations from microtage.hh: delete
the getTageTag overload declared as getTageTag(Addr pc, int table, Addr position
= 0) and the setTag() declaration (the no-definition/no-usage symbols referenced
in the review). Locate these declarations in the MicroTAGE class in microtage.hh
(the getTageTag overload around the existing getTageTag signatures and the
setTag declaration near other tag-related methods) and remove them so there are
no orphaned declarations that could cause link errors; leave the remaining
getTageTag(uint64_t foldedHist, ...) overload intact.
♻️ Duplicate comments (2)
src/cpu/pred/btb/microtage.cc (2)

640-645: Avoid default‑inserting missing prediction metadata when updateOnRead=false.

predMeta->preds[btb_entry.pc] inserts a default prediction if the entry wasn’t predicted, which can distort training. Use a find + fallback to recompute when missing.

🛠 Proposed fix
-        } else { // otherwise, use the prediction from the prediction-time main/alt
-            recomputed = predMeta->preds[btb_entry.pc];
+        } else { // otherwise, use the prediction from the prediction-time main/alt
+            auto it = predMeta->preds.find(btb_entry.pc);
+            if (it != predMeta->preds.end()) {
+                recomputed = it->second;
+            } else {
+                // Fallback: recompute to avoid default-constructed prediction
+                recomputed = generateSinglePrediction(btb_entry, startAddr, predMeta);
+            }
         }

740-763: Guard against shift‑by‑64 UB when building tag/index masks.

(1ULL << tableTagBits[t]) and (1ULL << tableIndexBits[t]) are UB if the bit count reaches 64. Add bounds checks before shifting.

🛠 Proposed fix
-    Addr mask = (1ULL << tableTagBits[t]) - 1;
+    unsigned tagBits = tableTagBits[t];
+    assert(tagBits < 64);
+    Addr mask = (1ULL << tagBits) - 1;
...
-    Addr mask = (1ULL << tableIndexBits[t]) - 1;
+    unsigned idxBits = tableIndexBits[t];
+    assert(idxBits < 64);
+    Addr mask = (1ULL << idxBits) - 1;

Comment on lines +17 to +25
#ifndef UNIT_TEST
#include "base/debug_helper.hh"
#include "base/intmath.hh"
#include "base/trace.hh"
#include "base/types.hh"
#include "cpu/o3/dyn_inst.hh"
#include "debug/TAGE.hh"

#endif
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

UNIT_TEST build needs base/intmath.hh for ceilLog2/floorLog2.

ceilLog2/floorLog2 are used in the UNIT_TEST constructor, but base/intmath.hh is only included in the non‑UNIT_TEST block. UNIT_TEST builds will fail to compile.

🛠 Suggested fix
+#include "base/intmath.hh"
 `#ifndef` UNIT_TEST
 `#include` "base/debug_helper.hh"
-#include "base/intmath.hh"
 `#include` "base/trace.hh"
 `#include` "base/types.hh"
 `#include` "cpu/o3/dyn_inst.hh"
 `#include` "debug/TAGE.hh"
 
 `#endif`
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#ifndef UNIT_TEST
#include "base/debug_helper.hh"
#include "base/intmath.hh"
#include "base/trace.hh"
#include "base/types.hh"
#include "cpu/o3/dyn_inst.hh"
#include "debug/TAGE.hh"
#endif
`#include` "base/intmath.hh"
`#ifndef` UNIT_TEST
`#include` "base/debug_helper.hh"
`#include` "base/trace.hh"
`#include` "base/types.hh"
`#include` "cpu/o3/dyn_inst.hh"
`#include` "debug/TAGE.hh"
`#endif`
🤖 Prompt for AI Agents
In `@src/cpu/pred/btb/microtage.cc` around lines 17 - 25, The UNIT_TEST build
fails because ceilLog2/floorLog2 are used in the UNIT_TEST constructor but
base/intmath.hh is only included inside the `#ifndef` UNIT_TEST block; move or add
`#include` "base/intmath.hh" so it is available for UNIT_TEST builds (ensure the
header is included alongside other headers outside the `#ifndef` or added inside
the UNIT_TEST branch), referencing the use sites of ceilLog2 and floorLog2 in
the microtage constructor to verify compilation.

Comment on lines +96 to +123
tageTable.resize(numPredictors);
tableIndexBits.resize(numPredictors);
tableIndexMasks.resize(numPredictors);
tableTagBits.resize(numPredictors);
tableTagMasks.resize(numPredictors);
// Ensure PC shift vector has entries for all predictors (fallback default = 1)
if (tablePcShifts.size() < numPredictors) {
tablePcShifts.resize(numPredictors, 1);
}

// Initialize base table for fallback predictions
for (unsigned int i = 0; i < numPredictors; ++i) {
//initialize ittage predictor
assert(tableSizes.size() >= numPredictors);
tageTable[i].resize(tableSizes[i]);
for (unsigned int j = 0; j < tableSizes[i]; ++j) {
tageTable[i][j].resize(numWays);
}

tableIndexBits[i] = ceilLog2(tableSizes[i]);
tableIndexMasks[i].resize(tableIndexBits[i], true);

assert(histLengths.size() >= numPredictors);

assert(tableTagBits.size() >= numPredictors);
tableTagMasks[i].resize(tableTagBits[i], true);

assert(tablePcShifts.size() >= numPredictors);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid padding tableTagBits before validating config size.

tableTagBits.resize(numPredictors) runs before any validation, so an undersized config is silently zero‑padded and the later assert becomes ineffective. This can cascade into invalid tag widths (and tableTagBits[i]-1).

🛠 Suggested fix
-    tableTagBits.resize(numPredictors);
+    assert(tableTagBits.size() >= numPredictors);
+    if (tableTagBits.size() > numPredictors) {
+        tableTagBits.resize(numPredictors);
+    }
...
-        assert(tableTagBits.size() >= numPredictors);
+        // size already validated above
🧰 Tools
🪛 Cppcheck (2.19.0)

[error] 119-119: Reference to temporary returned.

(returnTempReference)

🤖 Prompt for AI Agents
In `@src/cpu/pred/btb/microtage.cc` around lines 96 - 123, The code currently
resizes tableTagBits (and tablePcShifts earlier) before validating that the
provided configuration arrays are at least numPredictors, which silently pads
them and defeats the subsequent asserts; instead, validate all config vectors
(tableTagBits, tableSizes, histLengths, tablePcShifts) for size >= numPredictors
up front (or fail fast) before performing any resize or default-padding, then
proceed to initialize tageTable, tableIndexBits, tableIndexMasks, tableTagMasks,
etc.; reference tableTagBits, tableTagMasks, tablePcShifts, tableSizes,
histLengths, numPredictors, and tageTable when locating where to add the
pre-checks and reorder the resize calls so padding only happens after validation
and explicit defaults are intended.

Comment on lines +4 to +9
#include <cstdint>
#include <deque>
#include <map>
#include <utility>
#include <vector>

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add missing standard headers for std::shared_ptr and std::unordered_map.

This header uses both types but doesn’t include <memory> or <unordered_map>, which can break compilation depending on include order.

🛠 Suggested fix
 `#include` <cstdint>
 `#include` <deque>
+#include <memory>
 `#include` <map>
+#include <unordered_map>
 `#include` <utility>
 `#include` <vector>
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#include <cstdint>
#include <deque>
#include <map>
#include <utility>
#include <vector>
`#include` <cstdint>
`#include` <deque>
`#include` <memory>
`#include` <map>
`#include` <unordered_map>
`#include` <utility>
`#include` <vector>
🤖 Prompt for AI Agents
In `@src/cpu/pred/btb/microtage.hh` around lines 4 - 9, The header microtage.hh is
missing standard includes for types it uses; add `#include` <memory> and `#include`
<unordered_map> alongside the existing includes so references to std::shared_ptr
and std::unordered_map resolve regardless of include order; update the include
block at the top of microtage.hh near the existing `#include` <vector>, `#include`
<map>, etc., and ensure no other symbols (e.g., any typedefs or member
declarations that use shared_ptr or unordered_map) need forward declarations
after adding these headers.

Comment on lines +178 to +189
// Calculate TAGE tag for a given PC and table
// position: branch position within the block (xored into tag like RTL)
Addr getTageTag(Addr pc, int table, Addr position = 0);

// Calculate TAGE tag with folded history (uint64_t version for performance)
// position: branch position within the block (xored into tag like RTL)
Addr getTageTag(Addr pc, int table, uint64_t foldedHist, uint64_t altFoldedHist, Addr position = 0);

// Get offset within a block for a given PC
Addr getOffset(Addr pc) {
return (pc & (blockSize - 1)) >> 1;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for MicroTAGE::getTageTag and MicroTAGE::setTag definitions
rg -n "MicroTAGE::getTageTag" -C2
echo "---"
rg -n "MicroTAGE::setTag" -C2

Repository: OpenXiangShan/GEM5

Length of output: 487


🏁 Script executed:

#!/bin/bash
# Search for the simpler getTageTag overload (without folded history)
rg -n "getTageTag\(Addr pc, int" src/cpu/pred/btb/microtage.cc -C2

# Search for setTag definitions or declarations
rg -n "setTag" src/cpu/pred/btb/ -C2

# Check the header file at lines 178-189 and 254-256
echo "=== Header file context ==="
sed -n '175,195p' src/cpu/pred/btb/microtage.hh
echo "==="
sed -n '250,260p' src/cpu/pred/btb/microtage.hh

Repository: OpenXiangShan/GEM5

Length of output: 2563


🏁 Script executed:

#!/bin/bash
# Search for setTag definition in microtage.cc
rg -n "^MicroTAGE::setTag" src/cpu/pred/btb/microtage.cc -A5

# Search for the simpler getTageTag overload (without folded history params) definition
rg -n "getTageTag\(Addr pc, int.*Addr position" src/cpu/pred/btb/microtage.cc -A5

# Also check if the simpler overload exists as an inline definition in header
grep -n "getTageTag(Addr pc, int table, Addr position" src/cpu/pred/btb/microtage.hh -A3

Repository: OpenXiangShan/GEM5

Length of output: 604


🏁 Script executed:

#!/bin/bash
# Search for calls to setTag
echo "=== setTag calls ==="
rg -n "setTag\(" src/cpu/pred/btb/ --type cpp --type h

# Search for calls to getTageTag with just 3 parameters (simpler overload)
echo "=== getTageTag calls ==="
rg -n "getTageTag\(" src/cpu/pred/btb/ --type cpp --type h | head -20

Repository: OpenXiangShan/GEM5

Length of output: 2440


🏁 Script executed:

#!/bin/bash
# Check the actual calls in microtage.cc around lines 210 and 513
sed -n '208,215p' src/cpu/pred/btb/microtage.cc

echo "==="
sed -n '510,520p' src/cpu/pred/btb/microtage.cc

# Also check if setTag is actually used anywhere (not just declared)
echo "==="
rg -n "setTag" src/cpu/pred/btb/ --type cpp --type h -i | grep -v "^\s*void setTag" | grep -v "^\s*//"

Repository: OpenXiangShan/GEM5

Length of output: 1333


🏁 Script executed:

#!/bin/bash
# Check the class definition and inheritance in microtage.hh
sed -n '1,50p' src/cpu/pred/btb/microtage.hh | grep -A20 "class MicroTAGE"

# Check if btb_tage has definitions for these
echo "=== btb_tage.cc definitions ==="
rg -n "^BTBTAGE::setTag" src/cpu/pred/btb/btb_tage.cc -A5

rg -n "^BTBTAGE::getTageTag\(Addr pc, int t, Addr position" src/cpu/pred/btb/btb_tage.cc -A5

# Check if MicroTAGE inherits from BTBTAGE
grep -n "class MicroTAGE" src/cpu/pred/btb/microtage.hh -A2

Repository: OpenXiangShan/GEM5

Length of output: 674


🏁 Script executed:

#!/bin/bash
# Verify if setTag is defined anywhere for MicroTAGE or inherited classes
rg -n "setTag" src/cpu/pred/btb/microtage.cc -A3

# Check if there's any actual usage of these functions by searching actual function calls (not just declarations)
echo "=== Checking for actual calls to the simple getTageTag ==="
rg -n "getTageTag\(.*\)" src/cpu/pred/btb/microtage.cc | grep -v "getTageTag(startPC" | grep -v "getTageTag(pc, t, uint64_t"

echo "=== Checking for actual calls to setTag ==="
rg -n "\.setTag\(" src/cpu/pred/btb/microtage.cc
rg -n "setTag(" src/cpu/pred/btb/microtage.cc

Repository: OpenXiangShan/GEM5

Length of output: 348


🏁 Script executed:

#!/bin/bash
# Fix the regex escaping
echo "=== Checking for actual calls to setTag ==="
rg -n "setTag" src/cpu/pred/btb/microtage.cc

# Check if getTageTag(Addr, int, Addr) is ever called in microtage
echo "=== All getTageTag calls in microtage.cc ==="
rg -n "getTageTag" src/cpu/pred/btb/microtage.cc

# Check what calls getTageTag with exactly 3 parameters (no folded history)
echo "=== Searching for the simple 3-param getTageTag call pattern ==="
grep -n "getTageTag(startPC, i," src/cpu/pred/btb/microtage.cc

Repository: OpenXiangShan/GEM5

Length of output: 709


🏁 Script executed:

#!/bin/bash
# Let's verify the full extent of the problem by checking if MicroTAGE is a complete class
# Check if it inherits from any class that might provide these implementations
sed -n '42,60p' src/cpu/pred/btb/microtage.hh

# Also check the entire setTag search more carefully
echo "=== Full setTag search ==="
rg "setTag" src/cpu/pred/btb/ -n

# Verify line 254-256 context shown in review
echo "=== Lines 254-256 context ==="
sed -n '252,258p' src/cpu/pred/btb/microtage.hh

Repository: OpenXiangShan/GEM5

Length of output: 1183


Remove unused function declarations.

The simple getTageTag(Addr pc, int table, Addr position) overload at line 180 and setTag() at line 255 are declared but have no definitions in microtage.cc and are never called in the codebase. Remove them to prevent potential link errors.

🤖 Prompt for AI Agents
In `@src/cpu/pred/btb/microtage.hh` around lines 178 - 189, Remove the unused
declarations from microtage.hh: delete the getTageTag overload declared as
getTageTag(Addr pc, int table, Addr position = 0) and the setTag() declaration
(the no-definition/no-usage symbols referenced in the review). Locate these
declarations in the MicroTAGE class in microtage.hh (the getTageTag overload
around the existing getTageTag signatures and the setTag declaration near other
tag-related methods) and remove them so there are no orphaned declarations that
could cause link errors; leave the remaining getTageTag(uint64_t foldedHist,
...) overload intact.

Change-Id: Id59dc54bf51794bbe3ac067829ca10ca8014be9f
@github-actions
Copy link

🚀 Coremark Smoke Test Results

Branch IPC Change
Base (xs-dev) 2.1727 -
This PR 2.1887 📈 +0.0160 (+0.74%)

✅ Difftest smoke test passed!

@XiangShanRobot
Copy link

[Generated by GEM5 Performance Robot]
commit: e5ce14a
workflow: gem5 Align BTB Performance Test(0.3c)

Align BTB Performance

Overall Score

PR Master Diff(%)
Score 17.62 17.56 +0.33 🟢

[Generated by GEM5 Performance Robot]
commit: e5ce14a
workflow: gem5 Align BTB Performance Test(0.3c)

Align BTB Performance

Overall Score

PR Previous Commit Diff(%)
Score 17.62 17.56 +0.33 🟢

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants