mem-cache: berti add l2 ahead prefetch by Maxpicca-Li · Pull Request #695 · OpenXiangShan/GEM5

Maxpicca-Li · 2026-01-07T12:55:15Z

Change-Id: I01d9003bf78eaabc3619349a60c3f80a24afe58c
Config: idealkmhv3, 0.8 coverage
Result: a large improvement in GemsFDTD.

Raise the threshold from 2 to 3 to balance L1_PREF and L2_PREF.

Summary by CodeRabbit

Performance Enhancements
- Implemented multi-level cache prefetching strategy supporting coordinated operations across different cache hierarchy levels
- Enhanced prefetch metadata tracking and optimization mechanisms for improved memory access patterns
- Refined prefetch filtering to dynamically leverage different cache level strategies for better performance

coderabbitai · 2026-01-07T12:55:25Z

📝 Walkthrough

The changes implement multi-level prefetch hierarchy in BertiPrefetcher by introducing an ahead_level parameter to distinguish L1 and L2 prefetches. A new filterL2 member tracks L2-level prefetch addresses, and AddrPriority entries are augmented with metadata fields (pfahead_host, pfahead) recording prefetch depth information.

Changes

Cohort / File(s)	Summary
Header Definition `src/mem/cache/prefetch/berti.hh`	Added public `filterL2` LRU cache member; updated `sendPFWithFilter()` signature to include `int ahead_level` parameter for level-aware filtering.
Core Prefetch Logic `src/mem/cache/prefetch/berti.cc`	Implemented `ahead_level`-based filter selection (L1 vs L2); added L2 prefetch path with delta offset; augmented `AddrPriority` with `pfahead_host` and `pfahead` fields; refined L1 PF tracing to trigger only when `ahead_level == 1`.
Integration `src/mem/cache/prefetch/sms.cc`	Assigned `berti->filterL2` to reference SMS's `pfPageLRUFilterL2` filter within null-check block.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

tastynoob
Ergou-ren
jensen-yan

Poem

🐰 Hoppy hops with filters two,
L1 and L2 both in view,
Ahead levels guide the way,
Prefetching faster every day!
Cache hierarchies now aligned,
With metadata refined. ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'mem-cache: berti add l2 ahead prefetch' clearly summarizes the main change: adding L2 ahead prefetch functionality to the Berti prefetcher, which is directly reflected in the codebase modifications across all three files.

Change-Id: I01d9003bf78eaabc3619349a60c3f80a24afe58c

github-actions · 2026-01-07T13:02:05Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.1689`	-
This PR	`2.1689`	➡️ `0.0000` (`0.00%`)

✅ Difftest smoke test passed!

Copilot

Pull request overview

This PR adds L2 ahead prefetch capability to the BERTI prefetcher. When BERTI predicts a memory access pattern with sufficient confidence, it now generates two prefetch requests: one for L1 cache and another for L2 cache at a further distance (4x the delta).

- Adds a filterL2 member to support separate filtering for L2 prefetch requests
- Extends sendPFWithFilter with an ahead_level parameter to distinguish between L1 and L2 prefetch requests
- Generates L2 ahead prefetch requests in non-aggressive mode by multiplying the best delta by 4 (shift left by 2)
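
The address arithmetic described above can be sketched in isolation. This is a minimal model, not the code from berti.cc: `AheadConfig`, `aheadAddr`, and `depthShift` are hypothetical names, and the sketch assumes the L1 prefetch sits one learned delta ahead of the demand access while the L2 prefetch uses a shift of 2 (4x the delta). It multiplies by a power of two instead of shifting the signed delta directly, so negative deltas behave the same under every C++ standard.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

// Hypothetical stand-in for the prefetcher settings used by the address math.
struct AheadConfig {
    bool useByteAddr;   // true: delta counts bytes; false: delta counts cache blocks
    unsigned lBlkSize;  // log2 of the block size (6 for 64-byte blocks)
};

// Target address for a prefetch that is (delta << depthShift) ahead of `demand`.
// depthShift = 0 models the L1 prefetch, depthShift = 2 the L2 ahead prefetch.
Addr aheadAddr(const AheadConfig &cfg, Addr demand, std::int64_t delta,
               unsigned depthShift)
{
    assert(depthShift < 16 && cfg.lBlkSize < 64);
    // Scale by multiplication so that negative deltas are handled portably.
    const std::int64_t scaled = delta * (std::int64_t{1} << depthShift);
    if (cfg.useByteAddr) {
        return demand + static_cast<Addr>(scaled);
    }
    // Block mode: work on block indices, then convert back to a byte address.
    const Addr blockIndex = demand >> cfg.lBlkSize;
    return (blockIndex + static_cast<Addr>(scaled)) << cfg.lBlkSize;
}
```

With 64-byte blocks, a demand access at 0x1000, and a delta of 3 blocks, this gives 0x10C0 for the L1 prefetch and 0x1300 for the L2 ahead prefetch.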

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File	Description
src/mem/cache/prefetch/sms.cc	Initializes filterL2 for berti prefetcher instance when created by XSCompositePrefetcher
src/mem/cache/prefetch/berti.hh	Adds filterL2 member variable and updates sendPFWithFilter signature with ahead_level parameter
src/mem/cache/prefetch/berti.cc	Implements L2 ahead prefetch generation and filter selection logic based on ahead_level

Copilot · 2026-01-09T06:23:08Z

src/mem/cache/prefetch/berti.cc

+    boost::compute::detail::lru_cache<Addr, Addr>* filterHere;
+    filterHere = ahead_level > 1 ? filterL2 : filter;


Potential null pointer dereference: filterL2 is used without checking if it's null. While filterL2 is initialized in sms.cc for the berti instance, if BertiPrefetcher is instantiated independently outside of XSCompositePrefetcher, filterL2 will be null. This could cause a crash when ahead_level > 1. Consider adding a null check before using filterL2, similar to how XSStridePrefetcher handles this by checking ahead_level first and then using the appropriate filter.
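
One way to picture the guard being asked for here is the standalone sketch below. It is an illustration only: `Filter` is a `std::unordered_set` standing in for the boost LRU cache, and `FilterSet`, `selectFilter`, and `admitPrefetch` are hypothetical names that do not exist in berti.cc. Dropping the prefetch when the filter is missing is one possible policy; falling back to the L1 filter would be another.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

using Addr = std::uint64_t;
// Stand-in for boost::compute::detail::lru_cache<Addr, Addr>; no eviction here.
using Filter = std::unordered_set<Addr>;

// Hypothetical holder for the two filter pointers kept by the prefetcher.
struct FilterSet {
    Filter *l1;  // always expected to be set
    Filter *l2;  // may be null when no composite prefetcher wires it up
};

// Pick the filter for the requested level; the result may be null.
Filter *selectFilter(const FilterSet &filters, int aheadLevel)
{
    assert(aheadLevel >= 1);
    return aheadLevel > 1 ? filters.l2 : filters.l1;
}

// Returns true when the prefetch should be issued: the filter exists and the
// address was not recorded recently. A missing filter drops the prefetch
// instead of dereferencing a null pointer.
bool admitPrefetch(const FilterSet &filters, Addr addr, int aheadLevel)
{
    Filter *filter = selectFilter(filters, aheadLevel);
    if (!filter) {
        return false;
    }
    return filter->insert(addr).second;  // false if addr was already present
}
```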

Copilot · 2026-01-09T06:23:08Z

src/mem/cache/prefetch/berti.cc

+                uint64_t l2_depth_ratio = 2;
+                Addr pf_addr_l2 = useByteAddr ?
+                    pfi.getAddr() + (entry->bestDelta.delta << l2_depth_ratio) :
+                    (blockIndex(pfi.getAddr()) + (entry->bestDelta.delta << l2_depth_ratio)) << lBlkSize;


Magic number without explanation: The value 2 for l2_depth_ratio is a magic number that controls how far ahead L2 prefetching occurs. While the PR description mentions improving the threshold from 2 to 3, this code still uses 2. Consider either making this a configurable parameter or adding a comment explaining why 2 is the appropriate multiplier for L2 ahead distance.

Suggested change

-                uint64_t l2_depth_ratio = 2;
-                Addr pf_addr_l2 = useByteAddr ?
-                    pfi.getAddr() + (entry->bestDelta.delta << l2_depth_ratio) :
-                    (blockIndex(pfi.getAddr()) + (entry->bestDelta.delta << l2_depth_ratio)) << lBlkSize;
+                // L2 depth ratio controls how far ahead L2 prefetching occurs
+                // relative to the learned delta. A value of 2 preserves the
+                // original behavior of this prefetcher (using delta << 2 for L2).
+                constexpr uint64_t l2DepthRatio = 2;
+                Addr pf_addr_l2 = useByteAddr ?
+                    pfi.getAddr() + (entry->bestDelta.delta << l2DepthRatio) :
+                    (blockIndex(pfi.getAddr()) + (entry->bestDelta.delta << l2DepthRatio)) << lBlkSize;

Copilot · 2026-01-09T06:23:09Z

src/mem/cache/prefetch/berti.hh

@@ -166,6 +166,7 @@ class BertiPrefetcher : public Queued
  public:

    boost::compute::detail::lru_cache<Addr, Addr> *filter;


Missing documentation: The new filterL2 member variable lacks a comment explaining its purpose. Consider adding a comment to clarify that it's used for filtering L2-level prefetch requests, similar to how other member variables in the class are documented.

Suggested change

     boost::compute::detail::lru_cache<Addr, Addr> *filter;
+    // LRU filter used to suppress redundant L2-level prefetch requests.

Copilot · 2026-01-09T06:23:09Z

src/mem/cache/prefetch/berti.hh

@@ -187,7 +188,7 @@ class BertiPrefetcher : public Queued
    int getBestDelta() { return lastUsedBestDelta; }



Missing documentation: The new ahead_level parameter lacks documentation in the function signature. Consider adding a parameter comment to explain what the different ahead_level values represent (e.g., 1 for L1, 2 for L2, etc.).

Suggested change

+    /**
+     * Decide whether to send a prefetch request with filtering.
+     *
+     * @param ahead_level Prefetch ahead level (e.g., 1 for L1, 2 for L2, etc.),
+     *                    indicating the cache level or distance ahead for this prefetch.
+     */

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

src/mem/cache/prefetch/berti.cc (1)
256-266: Misleading variable name and missing address validation.

Variable naming issue: l2_depth_ratio is used as a shift amount (delta << l2_depth_ratio), making the actual multiplier 4, not 2. This is confusing. Consider renaming to l2_depth_shift or changing the value to 4 and using multiplication.

Missing validation: The L2 prefetch address calculation could overflow or produce invalid addresses. Consider adding bounds checking.

Hardcoded values: The shift amount (2) and coverage threshold (8) are magic numbers that could be configuration parameters.
♻️ Suggested improvements
-                uint64_t l2_depth_ratio = 2;
+                // L2 prefetches at 4x the delta distance
+                const uint64_t l2_depth_multiplier = 4;
                 Addr pf_addr_l2 = useByteAddr ?
-                    pfi.getAddr() + (entry->bestDelta.delta << l2_depth_ratio) :
-                    (blockIndex(pfi.getAddr()) + (entry->bestDelta.delta << l2_depth_ratio)) << lBlkSize;
+                    pfi.getAddr() + (entry->bestDelta.delta * l2_depth_multiplier) :
+                    (blockIndex(pfi.getAddr()) + (entry->bestDelta.delta * l2_depth_multiplier)) << lBlkSize;
+                // Validate address is within reasonable bounds before prefetching
                 sendPFWithFilter(pfi, pf_addr_l2, addresses, 32, PrefetchSourceType::Berti,
                                  entry->bestDelta.coverageCounter >= 8, 2);
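
The "validate address" comment in the suggestion is left as a placeholder. A rough sketch of what such a check could look like follows; `checkedAheadAddr` is a hypothetical helper, not part of gem5, and it only guards against arithmetic wrap-around in byte-address mode. Whether the result falls inside a mapped physical range would still need a separate check.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

using Addr = std::uint64_t;

// Computes base + delta * multiplier into `out`. Returns false, leaving `out`
// untouched, if the product overflows int64_t or the sum leaves the Addr range.
bool checkedAheadAddr(Addr base, std::int64_t delta, std::int64_t multiplier,
                      Addr &out)
{
    assert(multiplier > 0);
    const std::int64_t maxI = std::numeric_limits<std::int64_t>::max();
    const std::int64_t minI = std::numeric_limits<std::int64_t>::min();
    // Reject deltas whose scaled value would not fit in int64_t.
    if (delta > maxI / multiplier || delta < minI / multiplier) {
        return false;
    }
    const std::int64_t offset = delta * multiplier;
    if (offset >= 0) {
        const Addr up = static_cast<Addr>(offset);
        if (up > std::numeric_limits<Addr>::max() - base) {
            return false;  // would wrap past the top of the address space
        }
        out = base + up;
        return true;
    }
    // Magnitude of the negative offset, computed in unsigned arithmetic.
    const Addr down = Addr{0} - static_cast<Addr>(offset);
    if (down > base) {
        return false;  // would wrap below address zero
    }
    out = base - down;
    return true;
}
```

A caller would skip the L2 prefetch when the helper returns false, rather than sending a wrapped address to sendPFWithFilter.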

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between ecce168 and 5769d2d.

📒 Files selected for processing (3)

src/mem/cache/prefetch/berti.cc
src/mem/cache/prefetch/berti.hh
src/mem/cache/prefetch/sms.cc

🔇 Additional comments (6)

src/mem/cache/prefetch/berti.hh (2)

190-191: LGTM - signature extension supports multi-level prefetch.

The addition of the ahead_level parameter enables routing prefetches to different cache levels (L1 vs L2), which aligns with the PR objective of adding L2 ahead prefetch capability.

169-169: No action required. The filterL2 pointer is safely initialized before use within the XSCompositePrefetcher constructor (sms.cc lines 64-67) and cannot be dereferenced without initialization. The initialization happens immediately in the constructor body after the berti object is received via the parameter, guaranteeing that filterL2 is assigned before any call to sendPFWithFilter. The existing conditional check if (berti) ensures that when the BertiPrefetcher object exists and is non-null, filterL2 is always initialized.

src/mem/cache/prefetch/sms.cc (1)

64-67: LGTM - proper L2 filter initialization.

The addition of braces and the filterL2 assignment follows the existing pattern used for Sstride and properly guards against null berti. This enables L2-level prefetch filtering.

src/mem/cache/prefetch/berti.cc (3)

242-252: LGTM - aggressive prefetch path correctly updated.

The ahead_level=1 parameter properly marks this as an L1 prefetch request, consistent with the multi-level prefetch hierarchy being introduced.

296-297: LGTM - proper metadata assignment for multi-level prefetch.

The metadata fields pfahead_host and pfahead correctly track the target prefetch level and whether it's ahead of the current cache, enabling downstream routing decisions.
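
The relationship between the two fields can be shown with a small model. `PrefetchCandidate` and `tagCandidate` are hypothetical simplifications of `AddrPriority` and the tagging done in sendPFWithFilter; the field types are assumptions, and only the two assignments from the diff are reproduced.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

// Simplified stand-in for AddrPriority; the real struct has more fields.
struct PrefetchCandidate {
    Addr addr;
    int priority;
    int pfahead_host;  // cache level intended to hold the prefetched line
    bool pfahead;      // true when that level lies beyond the issuing cache
};

// Mirrors the two assignments in the diff:
//   pfahead_host = ahead_level
//   pfahead      = ahead_level > cache->level()
PrefetchCandidate tagCandidate(Addr addr, int priority, int aheadLevel,
                               int issuingCacheLevel)
{
    assert(aheadLevel >= 1 && issuingCacheLevel >= 1);
    return PrefetchCandidate{addr, priority, aheadLevel,
                             aheadLevel > issuingCacheLevel};
}
```

For a prefetcher attached to the L1 cache, an `ahead_level` of 1 yields `pfahead == false`, and an `ahead_level` of 2 yields `pfahead == true` with `pfahead_host == 2`.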

254-271: Clarify if threshold change from 2 to 3 was applied or incomplete.

The code in berti.hh line 84 still shows info.coverageCounter >= 2 for L2_PREF assignment, not >= 3. If the PR description claims this threshold was changed from 2 to 3, this change does not appear to be reflected in the codebase.

Additionally, the lines under review (254-271) use different thresholds (>= 8 for L2 prefetches, > 5 for PHT trigger), which don't relate to the 2→3 change mentioned. Confirm whether the threshold change is incomplete, refers to a different parameter, or the PR description is inaccurate.

coderabbitai · 2026-01-09T06:23:08Z

src/mem/cache/prefetch/berti.cc

 bool
 BertiPrefetcher::sendPFWithFilter(const PrefetchInfo &pfi, Addr addr, std::vector<AddrPriority> &addresses, int prio,
-                                  PrefetchSourceType src, bool using_best_delta_and_confident)
+                                  PrefetchSourceType src, bool using_best_delta_and_confident, int ahead_level)
 {
-    if (archDBer && cache->level() == 1) {
+    if (archDBer && cache->level() == 1 && ahead_level == 1) {
        archDBer->l1PFTraceWrite(curTick(), pfi.getPC(), pfi.getAddr(), addr, src);
    }
    if (using_best_delta_and_confident) {
        lastUsedBestDelta = blockIndex(addr) - blockIndex(pfi.getAddr());
    }
-    if (filter->contains(addr)) {
+    boost::compute::detail::lru_cache<Addr, Addr>* filterHere;
+    filterHere = ahead_level > 1 ? filterL2 : filter;
+    if (filterHere->contains(addr)) {
        DPRINTF(BertiPrefetcher, "Skip recently prefetched: %lx\n", addr);
        return false;
    } else {
        int64_t blk_delta = (int64_t)blockIndex(addr) - blockIndex(pfi.getAddr());
        topDeltas[blk_delta] = topDeltas.count(blk_delta) ? topDeltas[blk_delta] + 1 : 1;
        DPRINTF(BertiPrefetcher, "Send pf: %lx\n", addr);
-        filter->insert(addr, 0);
+        filterHere->insert(addr, 0);
        addresses.push_back(AddrPriority(addr, prio, src));
+        addresses.back().pfahead_host = ahead_level;
+        addresses.back().pfahead = ahead_level > cache->level();
        return true;
    }
 }


⚠️ Potential issue | 🔴 Critical

Critical: Potential null pointer dereference of filterL2.

At lines 286-287, filterL2 is selected when ahead_level > 1, but it is dereferenced at line 287 without a null check. While sms.cc initializes filterL2, other instantiation paths might not, which would lead to a segmentation fault.

🐛 Proposed fix with null check

 BertiPrefetcher::sendPFWithFilter(const PrefetchInfo &pfi, Addr addr, std::vector<AddrPriority> &addresses, int prio,
                                   PrefetchSourceType src, bool using_best_delta_and_confident, int ahead_level)
 {
     if (archDBer && cache->level() == 1 && ahead_level == 1) {
         archDBer->l1PFTraceWrite(curTick(), pfi.getPC(), pfi.getAddr(), addr, src);
     }
     if (using_best_delta_and_confident) {
         lastUsedBestDelta = blockIndex(addr) - blockIndex(pfi.getAddr());
     }
     boost::compute::detail::lru_cache<Addr, Addr>* filterHere;
     filterHere = ahead_level > 1 ? filterL2 : filter;
+    if (!filterHere) {
+        DPRINTF(BertiPrefetcher, "Filter not initialized for ahead_level %d\n", ahead_level);
+        return false;
+    }
     if (filterHere->contains(addr)) {
         DPRINTF(BertiPrefetcher, "Skip recently prefetched: %lx\n", addr);
         return false;

mem-cache: berti add l2 ahead prefetch

5769d2d

Change-Id: I01d9003bf78eaabc3619349a60c3f80a24afe58c

Maxpicca-Li marked this pull request as ready for review January 9, 2026 06:19

Maxpicca-Li requested review from Copilot, jueshiwenli and shinezyy and removed request for Copilot and shinezyy January 9, 2026 06:19

Copilot started reviewing on behalf of Maxpicca-Li January 9, 2026 06:20

Maxpicca-Li marked this pull request as draft January 9, 2026 07:00

Maxpicca-Li wants to merge 1 commit into xs-dev from perf-berti-l2ahead
