Conversation
|
🚀 Performance test triggered: spec06-0.8c |
📝 WalkthroughWalkthroughIssueQueue and Scheduler now use per-port inflight pipelines and method-based ready-queue insertion; TimeBuffer was refactored to use typed contiguous storage with copy/move semantics. Scheduler centralizes advancing per-port inflight buffers; IssueQue exposes setIssuePipe for wiring. Changes
Sequence Diagram(s)sequenceDiagram
actor CPU
participant IssueQue
participant Scheduler
participant TimeBuffer
CPU->>IssueQue: submit instruction
IssueQue->>IssueQue: readyQInsert(inst) (per-port)
IssueQue->>TimeBuffer: toIssue[port].enqueue(inst) %% via wire
Scheduler->>TimeBuffer: tick() -> advance per-port pipes
TimeBuffer-->>Scheduler: ready entries appear at wire
Scheduler->>IssueQue: deliver toFu[port] entries
IssueQue->>CPU: issue to functional unit
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In @src/cpu/timebuf.hh:
- Around line 150-170: The class TimeBuffer defines a destructor, copy ctor and
move ctor but lacks copy and move assignment operators; implement TimeBuffer&
operator=(const TimeBuffer<T>&) and TimeBuffer& operator=(TimeBuffer<T>&&) to
satisfy the Rule of 5: in copy-assignment guard against self-assignment, free
existing resources held in datas, deep-copy other's metadata (_id, past, future,
size, base) and allocate/copy elements into datas (handling non-POD via
placement-new or proper element copy), and in move-assignment guard
self-assignment, free existing resources, transfer metadata and datas pointer
from other to this, set other.datas = nullptr and other.size = 0, and return
*this; ensure consistency with the existing destructor and move/copy ctors to
avoid double-free or leaks.
- Line 150: The default constructor TimeBuffer() leaves size==0 and
datas==nullptr which makes methods like advance() and access() undefined; either
remove the default constructor or initialize members to a safe state and add
runtime guards: initialize size to a valid capacity and datas to an allocated
buffer (or explicitly set a flag), and/or update advance() and access() to check
size>0 and datas!=nullptr (or assert/throw std::runtime_error with a clear
message) before operating; reference the TimeBuffer constructor, advance(),
access(), size and datas members when making the change.
🧹 Nitpick comments (4)
src/cpu/o3/issue_queue.hh (1)
210-213: Consider adding bounds checking insetIssuePipe.The method directly indexes into
toIssueandtoFuvectors without verifyingpi < outports. While callers may ensure valid indices, defensive bounds checking would prevent subtle bugs.🔧 Suggested bounds check
inline void setIssuePipe(TimeBuffer<DynInstPtr>& issuepipe, int pi) { + assert(pi >= 0 && pi < outports && "Invalid port index"); toIssue[pi] = issuepipe.getWire(scheduleToExecDelay); toFu[pi] = issuepipe.getWire(0); }src/cpu/o3/issue_queue.cc (2)
1386-1393: Minor: Add space afterforkeyword for consistency.The iteration logic correctly nullifies canceled instructions across all inflight pipes.
🔧 Formatting suggestion
for (auto& it : inflightIssues) { - for(int i = 0; i < it.getSize(); i++) { + for (int i = 0; i < it.getSize(); i++) { auto& inst = it[i]; if (inst && inst->canceled()) { inst = nullptr; } } }
1491-1499: Same formatting nit as inloadCancel.The squash logic correctly nullifies squashed instructions. Consider consistent formatting.
🔧 Formatting suggestion
for (auto& it : inflightIssues) { - for(int i = 0; i < it.getSize(); i++) { + for (int i = 0; i < it.getSize(); i++) { auto& inst = it[i]; if (inst && inst->isSquashed()) { inst = nullptr; } } }src/cpu/timebuf.hh (1)
161-166: Move constructor should reset source's size to prevent double destruction.The move constructor nulls the source's
dataspointer but doesn't resetsize. If the destructor is later called on the moved-from object, it will skip deletion (sincedatasis null with the proposed fix), but for clarity and to maintain invariants,sizeshould also be reset.♻️ Suggested improvement
TimeBuffer(TimeBuffer<T> &&other) noexcept - : _id(other._id), past(other.past), future(other.future), size(other.size), base(other.base), datas(other.datas) + : _id(other._id), past(other.past), future(other.future), size(other.size), base(other.base), datas(other.datas) { // Null out the other datas pointer to avoid double deletion other.datas = nullptr; + other.size = 0; }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/cpu/o3/issue_queue.ccsrc/cpu/o3/issue_queue.hhsrc/cpu/timebuf.hh
🧰 Additional context used
🧬 Code graph analysis (1)
src/cpu/o3/issue_queue.cc (1)
src/cpu/timebuf.hh (9)
i(82-87)i(82-82)i(113-118)i(126-131)TimeBuffer(138-148)TimeBuffer(150-150)TimeBuffer(152-159)TimeBuffer(161-166)TimeBuffer(168-171)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Run SPEC Test / XS-GEM5 - Run performance test (spec06-0.8c)
- GitHub Check: Quick Build, Unit Tests & Smoke Test
🔇 Additional comments (13)
src/cpu/o3/issue_queue.hh (4)
109-117: LGTM on visibility reorganization.The separation of public members (
inports,outports,iqsize,replayQsize,scheduleToExecDelay,iqname) from private implementation details is a reasonable design choice for the per-port wiring mechanism.
139-140: Per-port wire vectors look correct.The change from single wires to per-port vectors aligns with the new per-port inflight tracking design. Ensure these are properly sized before use in
setIssuePipe.
200-206:readyQInsertimplementation is correct.Using
std::lower_boundwithselect_policymaintains sorted order for the ready queue. The method correctly:
- Sets the instruction's
inReadyQflag- Looks up the correct queue via
readyQclassify- Inserts at the sorted position
307-308: Centralized per-port inflight management looks good.Moving
inflightIssuesto a flatstd::vector<TimeBuffer<DynInstPtr>>at the scheduler level (one TimeBuffer per port across all IQs) simplifies lifecycle management and aligns with the per-port wiring viasetIssuePipe.src/cpu/o3/issue_queue.cc (7)
191-192: LGTM on vector initialization.Resizing
toIssueandtoFutooutportsensures proper sizing beforesetIssuePipeis called.
384-400: Per-port iteration inissueToFuis correct.The logic correctly:
- Iterates through each output port
- Dereferences the
toFuwire to access the instruction- Uses
readyQInsertfor re-queuing failed issues
517-517: Consistent use ofreadyQInsert.Replacing the macro with the inline method improves readability and maintainability.
620-625: Per-port scheduling looks correct.The instruction is assigned to the appropriate port's issue pipe via
*toIssue[pi] = inst, which will propagate through the TimeBuffer delay.
833-839: TimeBuffer construction per outport.Each IQ's outports get a dedicated
TimeBuffer<DynInstPtr>with past=0 and future=scheduleToExecDelay. This correctly models the pipeline delay.
882-888: Per-port wiring logic is correct.The global
portidcounter correctly maps each IQ's local port index to the globalinflightIssuesvector index.
1017-1020: Advancing all inflight pipes intick().The centralized advance of all
inflightIssuesensures pipeline progression happens uniformly after all IQs have ticked.src/cpu/timebuf.hh (2)
192-194:advance()correctly manages element lifecycle.The sequence of explicitly calling the destructor, zeroing memory, then placement-new is correct for resetting the oldest slot in the circular buffer.
220-234: LGTM on access pattern changes.The direct indexing into
datasarray is correct and aligns with the new storage model.
| TimeBuffer() {} | ||
|
|
||
| TimeBuffer(const TimeBuffer<T> &other) | ||
| : _id(other._id), past(other.past), future(other.future), size(other.size), base(other.base) | ||
| { | ||
| datas = new T[size]; | ||
| for (unsigned i = 0; i < size; i++) { | ||
| datas[i] = other.datas[i]; // must use explicit copy to handle non-POD types | ||
| } | ||
| } | ||
|
|
||
| TimeBuffer(TimeBuffer<T> &&other) noexcept | ||
| : _id(other._id), past(other.past), future(other.future), size(other.size), base(other.base), datas(other.datas) | ||
| { | ||
| // Null out the other datas pointer to avoid double deletion | ||
| other.datas = nullptr; | ||
| } | ||
|
|
||
| ~TimeBuffer() | ||
| { | ||
| for (unsigned i = 0; i < size; ++i) | ||
| (reinterpret_cast<T *>(index[i]))->~T(); | ||
| delete [] data; | ||
| delete [] datas; |
There was a problem hiding this comment.
Missing copy and move assignment operators violates Rule of 5.
The class defines a destructor, copy constructor, and move constructor, but lacks copy assignment and move assignment operators. This can lead to:
- Double-free if default copy assignment is used
- Resource leaks if default move assignment is used
🐛 Proposed addition of assignment operators
TimeBuffer& operator=(const TimeBuffer<T>& other)
{
if (this != &other) {
// Clean up existing resources
if (datas) {
for (unsigned i = 0; i < size; i++) {
datas[i].~T();
}
delete[] reinterpret_cast<char*>(datas);
}
// Copy from other
_id = other._id;
past = other.past;
future = other.future;
size = other.size;
base = other.base;
if (size > 0) {
datas = (T*)new char[sizeof(T) * size];
for (unsigned i = 0; i < size; i++) {
new (datas + i) T(other.datas[i]);
}
} else {
datas = nullptr;
}
}
return *this;
}
TimeBuffer& operator=(TimeBuffer<T>&& other) noexcept
{
if (this != &other) {
// Clean up existing resources
if (datas) {
for (unsigned i = 0; i < size; i++) {
datas[i].~T();
}
delete[] reinterpret_cast<char*>(datas);
}
// Move from other
_id = other._id;
past = other.past;
future = other.future;
size = other.size;
base = other.base;
datas = other.datas;
other.datas = nullptr;
other.size = 0;
}
return *this;
}🤖 Prompt for AI Agents
In @src/cpu/timebuf.hh around lines 150 - 170, The class TimeBuffer defines a
destructor, copy ctor and move ctor but lacks copy and move assignment
operators; implement TimeBuffer& operator=(const TimeBuffer<T>&) and TimeBuffer&
operator=(TimeBuffer<T>&&) to satisfy the Rule of 5: in copy-assignment guard
against self-assignment, free existing resources held in datas, deep-copy
other's metadata (_id, past, future, size, base) and allocate/copy elements into
datas (handling non-POD via placement-new or proper element copy), and in
move-assignment guard self-assignment, free existing resources, transfer
metadata and datas pointer from other to this, set other.datas = nullptr and
other.size = 0, and return *this; ensure consistency with the existing
destructor and move/copy ctors to avoid double-free or leaks.
🚀 Coremark Smoke Test Results
✅ Difftest smoke test passed! |
|
[Generated by GEM5 Performance Robot] Ideal BTB PerformanceOverall Score
|
85d0254 to
a3dcd57
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/cpu/o3/issue_queue.hh`:
- Around line 210-213: setIssuePipe currently writes to toIssue[pi] and toFu[pi]
without validating pi; add a bounds check at the start of setIssuePipe to ensure
0 <= pi < N (where N is the size/capacity of toIssue/toFu) and early-return or
assert/log an error if out of range, then only call
issuepipe.getWire(scheduleToExecDelay) and issuepipe.getWire(0) when pi is
valid; reference setIssuePipe, toIssue, toFu, issuepipe.getWire, and
scheduleToExecDelay when implementing the check.
♻️ Duplicate comments (3)
src/cpu/timebuf.hh (3)
138-148: Memory allocation/deallocation mismatch (previously flagged).The constructor allocates with
new char[sizeof(T) * size]and casts toT*, but the destructor at line 170 usesdelete[] datas. This is undefined behavior because:
- Memory allocated as
char[]is deleted asT[]- Element destructors are not explicitly called before deallocation
This was already flagged in a previous review with a proposed fix.
150-159: Copy constructor allocation inconsistency and missing assignment operators (previously flagged).The copy constructor uses
new T[size](line 155) which differs from the main constructor'snew char[]approach. This creates inconsistent behavior and the destructor will behave differently for copied vs originally-constructed objects.Additionally, the class defines destructor, copy constructor, and move constructor but lacks copy and move assignment operators (Rule of 5 violation).
These issues were flagged in previous reviews with proposed fixes.
161-170: Move constructor and destructor.The move constructor correctly transfers ownership by copying the
dataspointer and nullingother.datas. However, the destructor issue (deletingchar[]-allocated memory asT[]) remains unresolved as noted in previous reviews.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/cpu/o3/issue_queue.ccsrc/cpu/o3/issue_queue.hhsrc/cpu/timebuf.hh
🧰 Additional context used
🧬 Code graph analysis (1)
src/cpu/o3/issue_queue.cc (1)
src/cpu/timebuf.hh (9)
i(82-87)i(82-82)i(113-118)i(126-131)TimeBuffer(138-148)TimeBuffer(150-150)TimeBuffer(152-159)TimeBuffer(161-166)TimeBuffer(168-171)
🔇 Additional comments (14)
src/cpu/o3/issue_queue.hh (4)
109-117: Visibility changes look intentional for per-port wiring.The exposure of
inports,outports,iqsize,replayQsize,scheduleToExecDelay, andiqnameas public members enables external access needed for the centralized per-port inflight management inScheduler. This aligns with the architectural changes in this PR.
139-140: Per-port wire vectors correctly replace single inflight tracking.The change from a single
IssueStreamto per-portstd::vector<TimeBuffer<DynInstPtr>::wire>for bothtoIssueandtoFuproperly supports the new per-port inflight pipeline architecture.
200-205: Helper method for ready queue insertion looks correct.The
readyQInsertmethod properly:
- Marks the instruction as in the ready queue
- Looks up the correct ready queue by opClass
- Uses binary search for ordered insertion
307-308: Centralized inflight management in Scheduler.The
std::vector<TimeBuffer<DynInstPtr>> inflightIssuescentralizes inflight issue tracking at the Scheduler level, which is then wired to individual IQs viasetIssuePipe. This design properly separates ownership (Scheduler owns the buffers) from access (IQs use wires to read/write).src/cpu/timebuf.hh (1)
192-194: Placement new in advance() is correct pattern.The
advance()method properly destructs the old element, zeroes memory, and constructs a new element in place. This is consistent with the main constructor's approach.src/cpu/o3/issue_queue.cc (9)
191-192: Per-port wire vectors properly initialized.The
toIssueandtoFuvectors are resized to matchoutports, preparing them for the per-port wiring performed later byScheduler::setIssuePipe.
384-400: Per-port iteration inissueToFulooks correct.The loop properly iterates over each outport's
toFuwire and handles issue failures by reinserting into the ready queue viareadyQInsert.
517-517: Consistent use ofreadyQInserthelper.The helper method replaces what was likely a macro, improving code clarity.
620-625: Schedule path correctly uses per-port wiring.The arbitration retry path uses
readyQInsert, and scheduled instructions are written to the per-porttoIssue[pi]wire.
833-839: TimeBuffer construction for per-port inflight management.Each IQ outport gets its own
TimeBuffer<DynInstPtr>(0, iq->scheduleToExecDelay)which creates a buffer withpast=0andfuture=scheduleToExecDelay. This allows accessing wire positions from 0 (current) toscheduleToExecDelay(future), matching the issue-to-execute pipeline depth.
882-888: Per-port wiring connects Scheduler's buffers to IQ wires.The loop iterates through all IQs and their outports, calling
setIssuePipeto wire each inflight buffer to the corresponding IQ port. Theportidcounter correctly tracks the global port index across all IQs.
1017-1020: Centralized inflight buffer advancement.Moving the
advance()calls toScheduler::tick()centralizes timing control. All inflight buffers are advanced together after individual IQ ticks complete.
1386-1393: Load cancel clears canceled instructions from inflight buffers.The iteration properly traverses all inflight buffers and their time slots, nulling any canceled instructions to prevent them from proceeding through the pipeline.
1491-1499: Squash clears squashed instructions from inflight buffers.Similar to
loadCancel, this properly clears squashed instructions from all inflight buffer slots to maintain pipeline consistency after a squash event.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| inline void setIssuePipe(TimeBuffer<DynInstPtr>& issuepipe, int pi) { | ||
| toIssue[pi] = issuepipe.getWire(scheduleToExecDelay); | ||
| toFu[pi] = issuepipe.getWire(0); | ||
| } |
There was a problem hiding this comment.
Missing bounds check in setIssuePipe.
The method accesses toIssue[pi] and toFu[pi] without validating that pi is within bounds. If called with an invalid port index, this will cause undefined behavior.
🐛 Proposed fix to add bounds checking
inline void setIssuePipe(TimeBuffer<DynInstPtr>& issuepipe, int pi) {
+ assert(pi >= 0 && pi < outports);
toIssue[pi] = issuepipe.getWire(scheduleToExecDelay);
toFu[pi] = issuepipe.getWire(0);
}🤖 Prompt for AI Agents
In `@src/cpu/o3/issue_queue.hh` around lines 210 - 213, setIssuePipe currently
writes to toIssue[pi] and toFu[pi] without validating pi; add a bounds check at
the start of setIssuePipe to ensure 0 <= pi < N (where N is the size/capacity of
toIssue/toFu) and early-return or assert/log an error if out of range, then only
call issuepipe.getWire(scheduleToExecDelay) and issuepipe.getWire(0) when pi is
valid; reference setIssuePipe, toIssue, toFu, issuepipe.getWire, and
scheduleToExecDelay when implementing the check.
🚀 Coremark Smoke Test Results
✅ Difftest smoke test passed! |
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.