Conversation
There was a problem hiding this comment.
Pull request overview
This pull request adds a 68k CPU cycle profiler for the Genesis Plus GX emulator test harness. The profiler hooks into the emulator's native CPU execution to collect per-function cycle counts and call statistics without requiring ROM modifications.
Changes:
- Added CPU cycle profiler with support for symbol loading from ELF files
- Implemented sample-based profiling for reduced overhead (configurable 1/N sampling rate)
- Added 13 comprehensive tests covering lifecycle, accuracy, sampling, and edge cases
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| vendor/genplusgx/debug/cpuhook.h | Changed cpu_hook from implicit definition to extern declaration for proper linkage |
| include/profiler.h | New profiler API header with documentation for ProfileMode, ProfileOptions, and Profiler class |
| src/profiler.cpp | Core profiler implementation with symbol loading, cycle tracking, and reporting |
| tests/profiler_test.cpp | Comprehensive test suite covering basic functionality, sampling, accumulation, consistency, and edge cases |
| BUILD.bazel | Build configuration updates to include CPU hooks, profiler sources, and platform-specific linking |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Add ProfileOptions struct with mode and sample_rate - Add Start(ProfileOptions) overload - When sample_rate > 1, only do function lookup every Nth instruction - Cycles are scaled up to account for skipped samples Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Tests for CPU cycle profiler using prime_sieve ROM: - Start/stop state management - Cycle counting accuracy - Function stats population - Reset clears stats - Sample-based profiling (rate reporting, cycle accuracy, overhead reduction) - Accumulation across multiple runs - Consistent deterministic results - Edge cases (no symbols, double start, stop without start) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add validation for invalid function ranges (end_addr <= start_addr) - Fix off-by-one error in ROM boundary check for ReadWord - Add overflow guards for address arithmetic - Fix sampling to accumulate actual cycles instead of scaling - Add max call stack depth (256) to prevent unbounded growth - Shell-quote paths in popen calls to prevent command injection - Update Bazel deps for Bazel 9.0 compatibility (rules_cc load statement) - Add noinline to prime_sieve functions for real symbol addresses - Update profiler_test to use real ELF-extracted function addresses - Fix documentation (LoadSymbolsFromELF method name, format comments) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (2)
src/profiler.cpp:199
- The Reset() method should also reset sample_counter_ to ensure consistent behavior when profiling is restarted after a reset. Without this, the sampling phase will be out of sync if Reset() is called during profiling.
void Profiler::Reset() {
for (auto& kv : stats_) {
kv.second = FunctionStats();
}
call_stack_.clear();
total_cycles_ = 0;
pending_cycles_ = 0;
last_pc_ = 0;
if (running_) {
last_cycles_ = m68k.cycles;
}
}
src/profiler.cpp:271
- When sampling is enabled, skipping instructions without updating last_pc_ can cause incorrect call count attribution. The call detection logic at line 268 will compare the current PC with a PC from multiple instructions ago, potentially counting normal control flow within a function as function entries. This could artificially inflate call_count values. Consider either: (1) updating last_pc_ on every call to OnExecute regardless of sampling, or (2) disabling call counting when sampling is active.
// Sampling: only do expensive work every Nth instruction
if (sample_rate_ > 1) {
sample_counter_++;
if (sample_counter_ < sample_rate_) {
return; // Don't update last_pc_ - preserve it for call detection
}
sample_counter_ = 0;
// Use accumulated cycles since last sample (more accurate than scaling)
delta = pending_cycles_;
pending_cycles_ = 0;
}
// Attribute cycles to current function
const FunctionDef* func = LookupFunction(pc);
if (func) {
auto& s = stats_[func->start_addr];
s.cycles_exclusive += delta;
// Count function entry (PC moved into this function from outside)
if (last_pc_ != 0) {
const FunctionDef* last_func = LookupFunction(last_pc_);
if (last_func != func) {
s.call_count++;
}
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Bazel 9.0 removed built-in cc_library/cc_test rules, but googletest 1.14.0 hasn't been updated yet. Pin to Bazel 8.5.1 until the ecosystem catches up. - Add .bazelversion file pinning to 8.5.1 - Revert MODULE.bazel to original dependency versions - Remove rules_cc load statement (not needed in Bazel 8.x) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Move pending_cycles_ accumulation inside sampling branch to prevent unbounded growth when sampling is disabled (sample_rate_ == 1) - Update last_pc_ on every instruction, even skipped samples, so CallStack mode can correctly read opcodes for call/return detection - Add comments documenting that call counts and CallStack inclusive timing are less accurate when sampling is enabled Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Explicitly include headers for types directly used: - <chrono> for std::chrono - <iostream> for std::cout - <vector> for std::vector Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/profiler.cpp
Outdated
| // Note: This differs from nm output (used by LoadSymbolsFromELF) which uses | ||
| // hex for both address and size. |
There was a problem hiding this comment.
The comment about parsing format differs between LoadSymbolsFromELF and LoadSymbolsFromFile. In LoadSymbolsFromELF (line 91), the comment says size is hex, while in LoadSymbolsFromFile (lines 124-130), it explicitly states size is decimal. However, LoadSymbolsFromFile uses %u (decimal) while LoadSymbolsFromELF uses %x (hex) for size. The documentation is correct but could be clearer to emphasize this critical difference to avoid confusion when maintaining or using these functions.
| // Note: This differs from nm output (used by LoadSymbolsFromELF) which uses | |
| // hex for both address and size. | |
| // | |
| // Parsing note: | |
| // - This function parses the address as hex (%x) and the size as decimal (%u). | |
| // - This intentionally differs from LoadSymbolsFromELF (nm output), which uses | |
| // hex for both address and size (%x for each). |
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The comment now explicitly notes that LoadSymbolsFromFile uses hex for address and decimal for size, while LoadSymbolsFromELF (nm output) uses hex for both. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
nmFeatures
Profiler::AddFunction()- manually add function rangesProfiler::LoadSymbolsFromELF()- auto-load symbols from ELFProfiler::Start(ProfileOptions)- start with optional sampling rateProfiler::GetTotalCycles()- total 68k cycles countedProfiler::GetAllStats()- per-function cycle and call statsProfiler::PrintReport()- formatted profiling reportTest coverage
Added 13 tests covering:
Other changes
LoadSymbolsFromELF(shell-quote paths)-lmlinking (Linux only, not needed on macOS)🤖 Generated with Claude Code