feat: tti-bench parity with run-profile.sh (persistent storage, profiling, flame graphs) #1
ElodinLaarz wants to merge 2 commits into main from
Conversation
… to tti-bench
Phase 1 — Storage & Script Integration:
- Persist jobs to data/jobs/{id}/results.json; reload on server restart
- Interrupted jobs marked as error on reload (not silently lost)
- Add StoreConfig.DataDir flag (--data, default ./data)
- Integrate run-profile.sh via UseProfileScript config: invokes script with
--output-dir, reads tti_ms files after completion, stores ProfileDir per run
- Capture SystemInfo (OS, kernel, CPU, RAM, Node, Go) once per job
Phase 2 — Profile Parsers (bench/internal/profile/):
- cpu.go: parse V8 .cpuprofile into FlameNode tree via parent-map + sample accumulation
- wall.go: parse Chrome trace events (X/B/E phases) into nested FlameNode tree
- memory.go: parse mem_trace.json into RSS/heap timeline + per-module KB attribution
- API routes: GET /benchmark/{id}/profile/cpu|wall|memory
Phase 3 — Streaming & UI:
- SSE endpoint GET /benchmark/{id}/events pushes update/job-done events
- Multi-tab detail page: TTI, CPU Flame, Wall Time, Memory (tabs hidden unless profiling)
- Canvas flame graph renderer: depth-colored rects, click-to-zoom, hover tooltip
- Memory timeline chart (RSS/HeapUsed/HeapTotal lines) + module attribution table
- Form gains "Use Node.js profiling hooks" checkbox + script path field
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
/gemini review
Code Review
This pull request adds significant new functionality, including persistent job storage, Node.js profiling, and a rich UI with flame graphs for detailed performance analysis. The implementation is comprehensive, covering backend data capture, persistence, new API endpoints, and a sophisticated frontend visualization.
The review has identified several critical issues, primarily concerning data integrity and correctness. There are critical errors in the persistence layer where file I/O errors are ignored, potentially leading to silent data loss. Additionally, the memory profile parsing logic contains bugs related to JSON unmarshalling and unit conversion that will prevent the memory timeline from functioning correctly.
Other feedback includes suggestions for improving portability, handling errors more robustly, and optimizing performance in a few areas. Addressing the critical issues is essential for the stability and correctness of these new features.
```go
_ = os.MkdirAll(dir, 0755)
_ = os.WriteFile(filepath.Join(dir, "results.json"), data, 0644)
```
Errors from os.MkdirAll and os.WriteFile are being ignored. This is critical for a persistence layer. If the directory cannot be created or the file cannot be written (e.g., due to permissions or a full disk), the job data will not be saved, but the application will continue as if it were successful. This can lead to silent data loss. These errors must be handled, at a minimum by logging them.
```go
HeapUsed  int64   `json:"heapUsed"`
HeapTotal int64   `json:"heapTotal"`
External  int64   `json:"external"`
Timestamp float64 `json:"timestamp"`
```
The JSON tag for the Timestamp field is timestamp. However, the mem_trace.json file generated by the profiling script uses the key timestamp_ms. This mismatch will cause the timestamp to always be unmarshalled as 0, breaking the memory timeline feature.
```diff
- Timestamp float64 `json:"timestamp"`
+ Timestamp float64 `json:"timestamp_ms"`
```
```go
startTs := snapshots[0].Timestamp
for _, s := range snapshots {
	result.Timeline = append(result.Timeline, TimePoint{
		TsMs: (s.Timestamp - startTs) / 1e6, // ns -> ms (if nanoseconds)
```
The timestamp from the profile is divided by 1e6, with a comment suggesting a conversion from nanoseconds to milliseconds. However, the profiling script (run-profile.sh) uses performance.now(), which provides timestamps in milliseconds. This incorrect conversion will result in time values that are a million times smaller than they should be. The division is unnecessary.
```diff
- TsMs: (s.Timestamp - startTs) / 1e6, // ns -> ms (if nanoseconds)
+ TsMs: (s.Timestamp - startTs), // timestamp is already in ms
```
```go
kb, _ := strconv.ParseInt(fields[1], 10, 64)
info.TotalRAMMB = kb / 1024
}
```
The error returned by strconv.ParseInt is being ignored. If parsing fails for any reason (e.g., unexpected format in /proc/meminfo), kb will be 0, and TotalRAMMB will be silently set to 0. This could lead to incorrect system information being displayed without any indication of an error. The error should be checked and handled.
```go
if kb, err := strconv.ParseInt(fields[1], 10, 64); err == nil {
	info.TotalRAMMB = kb / 1024
}
```
```go
func (s *Store) loadFromDisk() {
	pattern := filepath.Join(s.dataDir, "jobs", "*", "results.json")
	matches, _ := filepath.Glob(pattern)
```
```go
// CPU info from /proc/cpuinfo
if data, err := os.ReadFile("/proc/cpuinfo"); err == nil {
	lines := strings.Split(string(data), "\n")
	for _, line := range lines {
		if strings.HasPrefix(line, "model name") && info.CPUModel == "" {
			parts := strings.SplitN(line, ":", 2)
			if len(parts) == 2 {
				info.CPUModel = strings.TrimSpace(parts[1])
			}
		}
		if strings.HasPrefix(line, "processor") {
			info.CPUCores++
		}
	}
}

// RAM from /proc/meminfo
if data, err := os.ReadFile("/proc/meminfo"); err == nil {
	for _, line := range strings.Split(string(data), "\n") {
		if strings.HasPrefix(line, "MemTotal:") {
			fields := strings.Fields(line)
			if len(fields) >= 2 {
				kb, _ := strconv.ParseInt(fields[1], 10, 64)
				info.TotalRAMMB = kb / 1024
			}
			break
		}
	}
}
```
The captureSystemInfo function relies on reading /proc/cpuinfo and /proc/meminfo and running uname. This makes it specific to Linux-based systems. On other operating systems like macOS or Windows, this will fail (silently, as errors are ignored) and result in incomplete system information. For better portability, consider using a cross-platform library (like github.com/shirou/gopsutil) to gather system metrics, or implement OS-specific logic using build tags.
```go
// Walk up from leaf to root, adding time to each ancestor
visited := make(map[int]bool)
cur := sampleID
for cur != 0 {
	if visited[cur] {
		break
	}
	visited[cur] = true
	totalTime[cur] += dt
	cur = parent[cur]
}
```
```go
for i := 1; i < len(spans); i++ {
	for j := i; j > 0; j-- {
		a, b := spans[j-1], spans[j]
		if b.ts < a.ts || (b.ts == a.ts && b.dur > a.dur) {
			spans[j-1], spans[j] = spans[j], spans[j-1]
		} else {
			break
		}
	}
}
```
This block implements a sorting algorithm for traceSpans using nested loops, which is an O(n^2) insertion sort. For large trace files, this can become a performance bottleneck on the server. It's more idiomatic and efficient in Go to use the sort.Slice function from the standard library, which provides an optimized O(n log n) sort.
Example using sort.Slice:
```go
sort.Slice(spans, func(i, j int) bool {
	if spans[i].ts != spans[j].ts {
		return spans[i].ts < spans[j].ts
	}
	return spans[i].dur > spans[j].dur
})
```

```go
func (h *Handler) sendSSEUpdate(w http.ResponseWriter, job *benchmark.BenchmarkResult) {
	data, _ := json.Marshal(job)
```
The error returned from json.Marshal is ignored. While marshalling this struct is unlikely to fail, it's a good practice to handle all errors. If an error did occur, an empty or partial message would be sent to the client, which could cause confusion or client-side errors. The error should be checked and logged.
Pull request overview
This PR enhances the bench web server to support persistent benchmark jobs, optional Node/V8 profiling ingestion, and richer job detail visualization (flame graphs, memory timeline), aligning tti-bench behavior with run-profile.sh.
Changes:
- Add persistent job storage under `data/jobs/{id}/results.json`, with reload-on-start behavior.
- Introduce profiling ingestion + new JSON API routes for CPU/wall/memory profile data, and an SSE endpoint for job updates.
- Update the web UI to support profiling configuration and add tabbed, lazy-loaded profile visualizations.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| bench/internal/web/templates/index.html | Adds profiling form controls and tabbed profile UI with canvas renderers. |
| bench/internal/web/handler.go | Adds SSE + profile JSON endpoints and wires profiling availability into templates. |
| bench/internal/profile/cpu.go | Parses V8 .cpuprofile into a flame tree structure. |
| bench/internal/profile/wall.go | Parses wall_trace.json (Chrome trace) into a flame tree structure. |
| bench/internal/profile/memory.go | Parses mem_trace.json into a timeline + module attribution summary. |
| bench/internal/benchmark/store.go | Adds on-disk persistence, subscription-based update notifications, and system info persistence. |
| bench/internal/benchmark/runner.go | Adds system info capture and optional run-profile.sh execution mode. |
| bench/internal/benchmark/result.go | Extends job config/results with profiling + system info fields. |
| bench/cmd/server/main.go | Adds -data flag and initializes the persistent store with StoreConfig. |
Comments suppressed due to low confidence (1)
bench/internal/benchmark/store.go:281
`Store.Delete` only removes the job from the in-memory map; it leaves `data/jobs/{id}/results.json` (and related per-job data) on disk. After a server restart, `loadFromDisk()` will reload the "deleted" job, so DELETE isn't durable and the data dir will grow without bound. Consider removing the per-job directory on disk (and ideally cleaning up any listener entries) as part of Delete, and/or persisting a tombstone state.
```go
func (s *Store) Delete(id string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.jobs[id]
	delete(s.jobs, id)
	return ok
```
```go
func (h *Handler) sendSSEUpdate(w http.ResponseWriter, job *benchmark.BenchmarkResult) {
	data, _ := json.Marshal(job)
	fmt.Fprintf(w, "event: update\ndata: %s\n\n", data)
}
```
```go
// firstProfileDir returns the ProfileDir of the first completed run that has one.
func firstProfileDir(job *benchmark.BenchmarkResult) string {
	for _, run := range job.Runs {
		if run.ProfileDir != "" {
			if _, err := os.Stat(run.ProfileDir); err == nil {
				return run.ProfileDir
			}
		}
	}
	return ""
```
```go
// Accumulate total time: for each sample, walk up the call stack
totalTime := make(map[int]float64, len(p.Nodes))
for i, sampleID := range p.Samples {
	var dt float64
	if i < len(p.TimeDeltas) {
		dt = float64(p.TimeDeltas[i]) / 1000.0
	}
	// Walk up from leaf to root, adding time to each ancestor
	visited := make(map[int]bool)
	cur := sampleID
	for cur != 0 {
		if visited[cur] {
			break
		}
		visited[cur] = true
		totalTime[cur] += dt
		cur = parent[cur]
	}
}
```
Code Review
This pull request introduces significant new functionality to the TTI benchmarker, including persistent job storage, Node.js profiling with flame graph visualization, and live UI updates via SSE. The changes are extensive, adding new backend logic for running and parsing profiles, persistence mechanisms, and a rich frontend for data visualization. My review focuses on improving error handling for the new persistence layer to prevent silent data loss, enhancing the real-time update mechanism, and suggestions for improving the structure and maintainability of the new frontend and backend code.
```go
matches, _ := filepath.Glob(pattern)
for _, path := range matches {
	data, err := os.ReadFile(path)
	if err != nil {
		continue
	}
	var d diskResult
	if err := json.Unmarshal(data, &d); err != nil {
		continue
	}
```
Errors from filepath.Glob, os.ReadFile, and json.Unmarshal are ignored. This can cause silent failures where corrupted or unreadable job files are simply skipped without any notification. This could be confusing for users who expect their jobs to be present after a restart. At a minimum, these errors should be logged to provide visibility into loading problems.
```go
job2, _ := r.store.Get(id)
stats := computeStats(job2.Runs)
r.store.Finish(id, stats, "")
```
```go
if strings.HasPrefix(line, "MemTotal:") {
	fields := strings.Fields(line)
	if len(fields) >= 2 {
		kb, _ := strconv.ParseInt(fields[1], 10, 64)
```
The error returned by strconv.ParseInt is being ignored. If parsing fails (e.g., if /proc/meminfo has an unexpected format), kb will be 0, leading to TotalRAMMB being incorrectly reported as 0 on the next line without any error indication. The error should be checked and handled, for instance by logging it.
```go
for i := 1; i < len(spans); i++ {
	for j := i; j > 0; j-- {
		a, b := spans[j-1], spans[j]
		if b.ts < a.ts || (b.ts == a.ts && b.dur > a.dur) {
			spans[j-1], spans[j] = spans[j], spans[j-1]
		} else {
			break
		}
	}
}
```
This custom insertion sort implementation can be replaced with the more idiomatic and potentially more performant sort.Slice from the standard library. This would make the code more readable and concise.
```go
sort.Slice(spans, func(i, j int) bool {
	if spans[i].ts != spans[j].ts {
		return spans[i].ts < spans[j].ts
	}
	return spans[i].dur > spans[j].dur
})
```

```html
hx-get="/benchmark/{{.Job.ID}}/results"
hx-trigger="every 500ms"
hx-swap="outerHTML"
```
The page is configured to poll for updates every 500ms. Since an SSE endpoint (/benchmark/{id}/events) has been implemented to provide real-time updates, it would be more efficient to use that instead of polling. HTMX provides an SSE extension that can be used to connect to the event stream and update the UI, which would reduce server load and latency. You would need to include the extension script (<script src="https://unpkg.com/htmx.org/dist/ext/sse.js"></script>) and then replace the hx-get and hx-trigger attributes on this element with hx-ext="sse", sse-connect="...", and sse-swap="update".
```html
<script>
(function() {
  // Lazy-load profile data and render flame graphs only when tab is activated
  var profileLoaded = {};
  var jobID = "{{.JobID}}";

  function activateTab(tabName) {
    document.querySelectorAll('.tab-btn').forEach(function(b) { b.classList.toggle('active', b.dataset.tab === tabName); });
    document.querySelectorAll('.tab-pane').forEach(function(p) { p.classList.toggle('active', p.id === 'tab-' + tabName); });
    if (!profileLoaded[tabName]) {
      profileLoaded[tabName] = true;
      loadTab(tabName);
    }
  }

  function loadTab(name) {
    if (name === 'cpu') loadFlameGraph('cpu', '/benchmark/' + jobID + '/profile/cpu', 'flame-cpu');
    else if (name === 'wall') loadFlameGraph('wall', '/benchmark/' + jobID + '/profile/wall', 'flame-wall');
    else if (name === 'memory') loadMemory('/benchmark/' + jobID + '/profile/memory');
  }

  window.activateTab = activateTab;

  // Flame graph renderer
  function loadFlameGraph(name, url, canvasId) {
    var el = document.getElementById(canvasId + '-container');
    if (!el) return;
    el.innerHTML = '<div class="flame-loading">Loading...</div>';
    fetch(url).then(function(r) {
      if (!r.ok) { el.innerHTML = '<div class="flame-loading" style="color:#f87171">No profile data available.</div>'; return; }
      return r.json();
    }).then(function(tree) {
      if (!tree) return;
      el.innerHTML = '';
      var canvas = document.createElement('canvas');
      canvas.className = 'flame-canvas';
      el.appendChild(canvas);
      var tip = document.getElementById('flame-tooltip');
      renderFlameGraph(canvas, tree, tip);
    }).catch(function(e) {
      el.innerHTML = '<div class="flame-loading" style="color:#f87171">Error: ' + e.message + '</div>';
    });
  }

  function loadMemory(url) {
    var el = document.getElementById('mem-container');
    if (!el) return;
    el.innerHTML = '<div class="flame-loading">Loading...</div>';
    fetch(url).then(function(r) {
      if (!r.ok) { el.innerHTML = '<div class="flame-loading" style="color:#f87171">No memory data available.</div>'; return; }
      return r.json();
    }).then(function(data) {
      if (!data) return;
      el.innerHTML = '';
      if (data.timeline && data.timeline.length > 0) {
        var canvas = document.createElement('canvas');
        canvas.className = 'mem-canvas';
        canvas.height = 200;
        el.appendChild(canvas);
        var legend = document.createElement('div');
        legend.className = 'mem-legend';
        legend.innerHTML = '<div class="mem-legend-item"><div class="mem-legend-dot" style="background:#60a5fa"></div>RSS</div>' +
          '<div class="mem-legend-item"><div class="mem-legend-dot" style="background:#4ade80"></div>Heap Used</div>' +
          '<div class="mem-legend-item"><div class="mem-legend-dot" style="background:#facc15"></div>Heap Total</div>';
        el.appendChild(legend);
        renderMemChart(canvas, data.timeline);
      }
      if (data.attribution && data.attribution.length > 0) {
        var h = document.createElement('h2');
        h.style.marginTop = '1.5rem';
        h.textContent = 'Module Memory Attribution';
        el.appendChild(h);
        var tbl = document.createElement('table');
        tbl.innerHTML = '<thead><tr><th>Module</th><th>Delta KB</th></tr></thead>';
        var tbody = document.createElement('tbody');
        data.attribution.slice(0, 30).forEach(function(m) {
          var tr = document.createElement('tr');
          tr.innerHTML = '<td style="font-family:monospace">' + escHtml(m.module) + '</td><td>' + m.deltaKB + ' KB</td>';
          tbody.appendChild(tr);
        });
        tbl.appendChild(tbody);
        el.appendChild(tbl);
      }
    }).catch(function(e) {
      el.innerHTML = '<div class="flame-loading" style="color:#f87171">Error: ' + e.message + '</div>';
    });
  }

  function escHtml(s) {
    return String(s).replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
  }

  // ─── Flame Graph Canvas Renderer ─────────────────────────────────────────
  var COLORS = ['#2563eb','#1d4ed8','#1e40af','#1e3a8a','#3b82f6','#60a5fa','#93c5fd'];
  var ROW_HEIGHT = 20;

  function renderFlameGraph(canvas, root, tooltip) {
    // Flatten tree into rows
    var rows = [];
    var maxDepth = 0;

    function traverse(node, depth, x, w) {
      if (!rows[depth]) rows[depth] = [];
      rows[depth].push({node: node, x: x, w: w, depth: depth});
      if (depth > maxDepth) maxDepth = depth;
      if (!node.children || node.children.length === 0) return;
      var total = node.totalMs || 0;
      if (total === 0) {
        node.children.forEach(function(c) { total += (c.totalMs || 0); });
      }
      if (total === 0) return;
      var cx = x;
      node.children.forEach(function(child) {
        var cw = w * ((child.totalMs || 0) / total);
        if (cw > 0.5) traverse(child, depth + 1, cx, cw);
        cx += cw;
      });
    }

    traverse(root, 0, 0, 1.0);

    var dpr = window.devicePixelRatio || 1;
    var W = canvas.offsetWidth || 800;
    canvas.width = W * dpr;
    canvas.height = (maxDepth + 1) * ROW_HEIGHT * dpr;
    canvas.style.height = ((maxDepth + 1) * ROW_HEIGHT) + 'px';

    var ctx = canvas.getContext('2d');
    ctx.scale(dpr, dpr);

    var zoom = {x: 0, w: 1.0};

    function draw() {
      ctx.clearRect(0, 0, W, canvas.height / dpr);
      rows.forEach(function(row, depth) {
        row.forEach(function(item) {
          var nx = (item.x - zoom.x) / zoom.w;
          var nw = item.w / zoom.w;
          if (nx + nw < 0 || nx > 1) return;
          var px = nx * W;
          var pw = nw * W;
          if (pw < 1) return;
          var py = depth * ROW_HEIGHT;
          ctx.fillStyle = COLORS[depth % COLORS.length];
          ctx.fillRect(px + 1, py + 1, pw - 2, ROW_HEIGHT - 2);
          if (pw > 30) {
            ctx.fillStyle = '#fff';
            ctx.font = '11px monospace';
            ctx.save();
            ctx.beginPath();
            ctx.rect(px + 2, py, pw - 4, ROW_HEIGHT);
            ctx.clip();
            var label = item.node.name || '(unknown)';
            if (item.node.totalMs != null) label += ' ' + item.node.totalMs.toFixed(1) + 'ms';
            ctx.fillText(label, px + 4, py + 14);
            ctx.restore();
          }
        });
      });
    }

    draw();

    // Zoom on click
    canvas.addEventListener('click', function(e) {
      var rect = canvas.getBoundingClientRect();
      var mx = (e.clientX - rect.left) / rect.width;
      var my = e.clientY - rect.top;
      var depth = Math.floor(my / ROW_HEIGHT);
      var clicked = null;
      if (rows[depth]) {
        rows[depth].forEach(function(item) {
          var nx = (item.x - zoom.x) / zoom.w;
          var nw = item.w / zoom.w;
          if (mx >= nx && mx <= nx + nw) clicked = item;
        });
      }
      if (clicked) {
        if (clicked.x === zoom.x && clicked.w === zoom.w) {
          zoom = {x: 0, w: 1.0};
        } else {
          zoom = {x: clicked.x, w: clicked.w};
        }
        draw();
      }
    });

    // Tooltip on mousemove
    if (tooltip) {
      canvas.addEventListener('mousemove', function(e) {
        var rect = canvas.getBoundingClientRect();
        var mx = (e.clientX - rect.left) / rect.width;
        var my = e.clientY - rect.top;
        var depth = Math.floor(my / ROW_HEIGHT);
        var found = null;
        if (rows[depth]) {
          rows[depth].forEach(function(item) {
            var nx = (item.x - zoom.x) / zoom.w;
            var nw = item.w / zoom.w;
            if (mx >= nx && mx <= nx + nw) found = item;
          });
        }
        if (found) {
          var n = found.node;
          tooltip.style.display = 'block';
          tooltip.style.left = (e.clientX + 12) + 'px';
          tooltip.style.top = (e.clientY - 8) + 'px';
          tooltip.textContent = (n.name || '(unknown)') +
            (n.totalMs != null ? ' — total: ' + n.totalMs.toFixed(2) + 'ms' : '') +
            (n.selfMs != null ? ', self: ' + n.selfMs.toFixed(2) + 'ms' : '');
        } else {
          tooltip.style.display = 'none';
        }
      });
      canvas.addEventListener('mouseleave', function() { tooltip.style.display = 'none'; });
    }
  }

  // ─── Memory Timeline Chart ────────────────────────────────────────────────
  function renderMemChart(canvas, timeline) {
    var dpr = window.devicePixelRatio || 1;
    var W = canvas.offsetWidth || 800;
    var H = 200;
    canvas.width = W * dpr;
    canvas.height = H * dpr;
    var ctx = canvas.getContext('2d');
    ctx.scale(dpr, dpr);

    var maxVal = 0;
    timeline.forEach(function(p) {
      if (p.rss > maxVal) maxVal = p.rss;
      if (p.heapTotal > maxVal) maxVal = p.heapTotal;
    });
    if (maxVal === 0) return;

    var maxTs = timeline[timeline.length - 1].tsMs || 1;

    function px(ts) { return (ts / maxTs) * (W - 40) + 20; }
    function py(val) { return H - 20 - (val / maxVal) * (H - 30); }

    // Grid
    ctx.strokeStyle = '#2a2a2a';
    ctx.lineWidth = 1;
    for (var i = 0; i <= 4; i++) {
      var y = H - 20 - (i / 4) * (H - 30);
      ctx.beginPath(); ctx.moveTo(20, y); ctx.lineTo(W - 20, y); ctx.stroke();
      ctx.fillStyle = '#555'; ctx.font = '10px monospace';
      ctx.fillText(((maxVal / 1024 / 1024) * (i / 4)).toFixed(0) + 'MB', 2, y + 3);
    }

    function drawLine(key, color) {
      ctx.beginPath(); ctx.strokeStyle = color; ctx.lineWidth = 1.5;
      timeline.forEach(function(p, i) {
        var x = px(p.tsMs), y = py(p[key]);
        if (i === 0) ctx.moveTo(x, y); else ctx.lineTo(x, y);
      });
      ctx.stroke();
    }

    drawLine('rss', '#60a5fa');
    drawLine('heapUsed', '#4ade80');
    drawLine('heapTotal', '#facc15');
  }

  // Auto-activate TTI tab initially
  document.addEventListener('DOMContentLoaded', function() {
    activateTab('tti');
  });
})();
</script>
<div id="flame-tooltip" class="flame-tooltip"></div>
```
```html
<script>
(function() {
  // Lazy-load profile data and render flame graphs only when tab is activated
  var profileLoaded = {};
```
The JavaScript code uses var for variable declarations. Modern best practice is to use let and const (introduced in ES6) for block scoping and to prevent accidental re-declarations or hoisting-related bugs. Refactoring to let and const would improve code quality and clarity.
```js
const profileLoaded = {};
```
/gemini review
Code Review
This pull request adds significant new functionality for profiling and persistent storage to the TTI benchmark tool. The changes include persistent job storage, integration with a profiling script to capture Node.js CPU, wall-time, and memory profiles, new API endpoints for profile data, and a multi-tab UI with lazy-loading and custom canvas-based flame graph rendering. The implementation is comprehensive. I've identified a few areas for improvement regarding memory efficiency, parsing robustness, and frontend security, which are detailed in the comments.
```go
if data, err := os.ReadFile("/proc/cpuinfo"); err == nil {
	lines := strings.Split(string(data), "\n")
	for _, line := range lines {
		if strings.HasPrefix(line, "model name") && info.CPUModel == "" {
			parts := strings.SplitN(line, ":", 2)
			if len(parts) == 2 {
				info.CPUModel = strings.TrimSpace(parts[1])
			}
		}
		if strings.HasPrefix(line, "processor") {
			info.CPUCores++
		}
	}
}
```
For better memory efficiency, especially when dealing with potentially large proc files, consider using a bufio.Scanner to read the file line by line instead of reading the whole file into memory with os.ReadFile and then splitting it. This is particularly relevant for a performance-oriented tool.
```go
if f, err := os.Open("/proc/cpuinfo"); err == nil {
	defer f.Close()
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		line := scanner.Text()
		if strings.HasPrefix(line, "model name") && info.CPUModel == "" {
			parts := strings.SplitN(line, ":", 2)
			if len(parts) == 2 {
				info.CPUModel = strings.TrimSpace(parts[1])
			}
		}
		if strings.HasPrefix(line, "processor") {
			info.CPUCores++
		}
	}
}
```
```go
if data, err := os.ReadFile("/proc/meminfo"); err == nil {
	for _, line := range strings.Split(string(data), "\n") {
		if strings.HasPrefix(line, "MemTotal:") {
			fields := strings.Fields(line)
			if len(fields) >= 2 {
				if kb, err := strconv.ParseInt(fields[1], 10, 64); err == nil {
					info.TotalRAMMB = kb / 1024
				}
			}
			break
		}
	}
}
```
Similar to the `/proc/cpuinfo` parsing, using `bufio.Scanner` here would be more memory-efficient than `os.ReadFile` followed by `strings.Split`. Reading line by line avoids loading the entire file into memory.
```go
if f, err := os.Open("/proc/meminfo"); err == nil {
	defer f.Close()
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		line := scanner.Text()
		if strings.HasPrefix(line, "MemTotal:") {
			fields := strings.Fields(line)
			if len(fields) >= 2 {
				if kb, err := strconv.ParseInt(fields[1], 10, 64); err == nil {
					info.TotalRAMMB = kb / 1024
				}
			}
			break
		}
	}
}
```

```go
if len(beginStack) > 0 {
	begin := beginStack[len(beginStack)-1]
	beginStack = beginStack[:len(beginStack)-1]
	dur := e.Ts - begin.Ts
	spans = append(spans, traceSpan{name: begin.Name, ts: begin.Ts, endTs: e.Ts, dur: dur})
}
```
The current logic for handling 'E' (end) events assumes strict LIFO nesting and pairs each 'E' with whatever 'B' (begin) event is on top of the stack. This can produce incorrect durations when the trace events are not perfectly nested (e.g. B(A), B(B), E(A), E(B)). A more robust implementation would verify that the name of the 'B' event on the stack matches the 'E' event before pairing them.
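One way to make the pairing name-aware (a sketch only, not the PR's implementation; `traceEvent`/`traceSpan` fields are simplified from the snippet above, and the mismatch-unwinding policy is one possible choice):

```go
package main

import "fmt"

type traceEvent struct {
	Name string
	Ph   string  // "B" or "E"
	Ts   float64 // microseconds
}

type traceSpan struct {
	name  string
	ts    float64
	endTs float64
}

// pairSpans pairs each "E" event with the nearest "B" event of the same
// name on the stack, so imperfectly nested events like
// B(A), B(B), E(A), E(B) still get correct durations. Unmatched "E"
// events are dropped rather than paired with the wrong frame.
func pairSpans(events []traceEvent) []traceSpan {
	var stack []traceEvent
	var spans []traceSpan
	for _, e := range events {
		switch e.Ph {
		case "B":
			stack = append(stack, e)
		case "E":
			// Search from the top of the stack for a matching name.
			for i := len(stack) - 1; i >= 0; i-- {
				if stack[i].Name == e.Name {
					b := stack[i]
					stack = append(stack[:i], stack[i+1:]...) // remove only the match
					spans = append(spans, traceSpan{name: b.Name, ts: b.Ts, endTs: e.Ts})
					break
				}
			}
		}
	}
	return spans
}

func main() {
	evs := []traceEvent{
		{Name: "A", Ph: "B", Ts: 0},
		{Name: "B", Ph: "B", Ts: 1},
		{Name: "A", Ph: "E", Ts: 5}, // out-of-order end still pairs with B(A)
		{Name: "B", Ph: "E", Ts: 7},
	}
	fmt.Println(pairSpans(evs)) // [{A 0 5} {B 1 7}]
}
```

Removing only the matched frame (rather than popping everything above it) keeps sibling spans intact when ends arrive interleaved.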
```javascript
}).catch(function(e) {
  el.innerHTML = '<div class="flame-loading" style="color:#f87171">Error: ' + e.message + '</div>';
});
```
Displaying error messages by concatenating strings into `innerHTML` is a security risk (cross-site scripting) if the error message ever contains malicious HTML. While `e.message` from `fetch` is usually safe, it's a security best practice to render potentially untrusted content with `textContent`. This also applies to the error handler in the `loadMemory` function.
Consider creating a helper function to display errors safely and avoid code duplication.
```javascript
}).catch(function(e) {
  el.innerHTML = '';
  const errorDiv = document.createElement('div');
  errorDiv.className = 'flame-loading';
  errorDiv.style.color = '#f87171';
  errorDiv.textContent = 'Error: ' + e.message;
  el.appendChild(errorDiv);
});
```

```javascript
}).catch(function(e) {
  el.innerHTML = '<div class="flame-loading" style="color:#f87171">Error: ' + e.message + '</div>';
});
```
Displaying error messages via `innerHTML` string concatenation is a security risk (cross-site scripting). As noted in another comment, it is safer to use `textContent` when displaying error messages. Reusing a shared error-display helper would be ideal.
```javascript
}).catch(function(e) {
  el.innerHTML = '';
  const errorDiv = document.createElement('div');
  errorDiv.className = 'flame-loading';
  errorDiv.style.color = '#f87171';
  errorDiv.textContent = 'Error: ' + e.message;
  el.appendChild(errorDiv);
});
```
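The duplicated catch handlers could share one small helper. A sketch (the name `showError` and the returned element are assumptions, not code from this PR):

```javascript
// Hypothetical shared helper: renders an error message into a container
// via textContent, so the message is never parsed as HTML (no XSS).
function showError(el, message) {
  el.innerHTML = '';                      // clear any previous content
  const errorDiv = document.createElement('div');
  errorDiv.className = 'flame-loading';
  errorDiv.style.color = '#f87171';
  errorDiv.textContent = 'Error: ' + message; // textContent never executes markup
  el.appendChild(errorDiv);
  return errorDiv;
}
```

Each handler then collapses to `}).catch(function(e) { showError(el, e.message); });`.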
Pull request overview
Adds persistent, profile-aware benchmark job handling to tti-bench, including disk-backed job storage, optional Node profiling ingestion (CPU/wall/memory), and UI support for viewing flame graphs and timelines.
Changes:
- Persist benchmark jobs/results to disk and reload them on server start (with interrupted jobs marked as error).
- Add profiling ingestion endpoints (`/profile/*`) and parsers for V8 CPU, wall trace, and memory traces.
- Enhance the detail UI with lazy-loaded multi-tab profile views and add an SSE updates endpoint.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| bench/internal/web/templates/index.html | Adds profiling form controls, system info display, tabbed profile UI, and Canvas renderers. |
| bench/internal/web/handler.go | Adds SSE + profile JSON routes and profile directory selection/validation. |
| bench/internal/profile/cpu.go | Implements V8 .cpuprofile parsing into a flame-tree JSON model. |
| bench/internal/profile/wall.go | Implements wall_trace.json parsing into a flame-tree JSON model. |
| bench/internal/profile/memory.go | Implements mem_trace.json parsing into timeline + attribution JSON. |
| bench/internal/benchmark/store.go | Introduces disk persistence, reload-on-start, and SSE subscription notifications. |
| bench/internal/benchmark/runner.go | Adds run-profile.sh execution mode and system info capture. |
| bench/internal/benchmark/result.go | Extends job/run structs for profiling + system info + on-disk paths. |
| bench/cmd/server/main.go | Adds -data flag and passes configured storage dir to the store. |
```diff
 func (s *Store) Delete(id string) bool {
 	s.mu.Lock()
 	defer s.mu.Unlock()
-	_, ok := s.jobs[id]
+	r, ok := s.jobs[id]
 	if !ok {
 		return false
 	}
 	delete(s.jobs, id)
-	return ok
+	// Remove persisted data so the job doesn't reload on restart.
+	if r.DataDir != "" {
+		if err := os.RemoveAll(r.DataDir); err != nil {
+			log.Printf("store: remove data dir %s: %v", r.DataDir, err)
+		}
+	}
```
```go
func toMem(d diskResult) *BenchmarkResult {
	return &BenchmarkResult{
		ID: d.ID,
		Config: BenchmarkConfig{
			Binary:            d.Config.Binary,
			Args:              d.Config.Args,
			PromptPatterns:    d.Config.PromptPatterns,
			Runs:              d.Config.Runs,
			Timeout:           time.Duration(d.Config.TimeoutSec) * time.Second,
			CooldownMs:        d.Config.CooldownMs,
			UseProfileScript:  d.Config.UseProfileScript,
			ProfileScriptPath: d.Config.ProfileScriptPath,
		},
		Runs:       d.Runs,
		Status:     d.Status,
		StartedAt:  d.StartedAt,
		FinishedAt: d.FinishedAt,
		Stats:      d.Stats,
		SystemInfo: d.SystemInfo,
		DataDir:    d.DataDir,
	}
```
```go
cmd.Dir = filepath.Dir(absScript)
out, err := cmd.CombinedOutput()
if err != nil {
	r.store.Finish(id, nil, fmt.Sprintf("script failed: %v\n%s", err, out))
```
```javascript
  const tip = document.getElementById('flame-tooltip');
  renderFlameGraph(canvas, tree, tip);
}).catch(function(e) {
  el.innerHTML = '<div class="flame-loading" style="color:#f87171">Error: ' + e.message + '</div>';
});
```
```javascript
    el.appendChild(tbl);
  }
}).catch(function(e) {
  el.innerHTML = '<div class="flame-loading" style="color:#f87171">Error: ' + e.message + '</div>';
```
```go
		rootID = n.ID
		break
	}
}
```
```go
parent := stack[len(stack)-1].node
parent.Children = append(parent.Children, node)
parent.SelfMs -= node.TotalMs
```
```go
	if run.TTIMs == nil && !run.TimedOut && run.Error == "" {
		r.Runs[i].Error = "interrupted by server restart"
	}
}
```
```javascript
(function() {
  // Lazy-load profile data and render flame graphs only when tab is activated
  const profileLoaded = {};
  const jobID = "{{.JobID}}";

  function activateTab(tabName) {
    document.querySelectorAll('.tab-btn').forEach(function(b) { b.classList.toggle('active', b.dataset.tab === tabName); });
    document.querySelectorAll('.tab-pane').forEach(function(p) { p.classList.toggle('active', p.id === 'tab-' + tabName); });
    if (!profileLoaded[tabName]) {
      profileLoaded[tabName] = true;
      loadTab(tabName);
    }
  }
```
```go
func buildMemoryData(snapshots []MemSnapshot) *MemoryData {
	result := &MemoryData{}

	// Build timeline; timestamps from performance.now() are already in milliseconds.
	startTs := snapshots[0].Timestamp
	for _, s := range snapshots {
		result.Timeline = append(result.Timeline, TimePoint{
			TsMs:      s.Timestamp - startTs,
			RSS:       s.RSS,
			HeapUsed:  s.HeapUsed,
			HeapTotal: s.HeapTotal,
			External:  s.External,
		})
	}

	// Per-module heap attribution: sum positive deltas grouped by module name.
	modDelta := make(map[string]int64)
	for i := 1; i < len(snapshots); i++ {
		prev, cur := snapshots[i-1], snapshots[i]
		delta := (cur.HeapUsed - prev.HeapUsed) / 1024 // bytes -> KB
		if delta > 0 && cur.Module != "" {
			modDelta[cur.Module] += delta
		}
	}
```
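To illustrate what the attribution loop above computes, here is a self-contained sketch of the same positive-delta rule on toy data (struct fields are trimmed to the two the loop uses; module names are made up):

```go
package main

import "fmt"

// MemSnapshot mirrors only the fields the attribution loop reads.
type MemSnapshot struct {
	HeapUsed int64  // bytes
	Module   string // module being loaded when the snapshot was taken
}

// attribute credits each positive heap-growth delta (in KB) to the module
// recorded on the later snapshot; shrinkage (e.g. a GC) is ignored.
func attribute(snaps []MemSnapshot) map[string]int64 {
	mod := make(map[string]int64)
	for i := 1; i < len(snaps); i++ {
		delta := (snaps[i].HeapUsed - snaps[i-1].HeapUsed) / 1024 // bytes -> KB
		if delta > 0 && snaps[i].Module != "" {
			mod[snaps[i].Module] += delta
		}
	}
	return mod
}

func main() {
	snaps := []MemSnapshot{
		{HeapUsed: 10 << 20, Module: ""},
		{HeapUsed: 12 << 20, Module: "lodash"}, // +2 MB while loading lodash
		{HeapUsed: 11 << 20, Module: "chalk"},  // heap shrank (GC): ignored
		{HeapUsed: 13 << 20, Module: "lodash"}, // +2 MB more for lodash
	}
	fmt.Println(attribute(snaps)) // map[lodash:4096]
}
```

A consequence of this rule is that allocations made and freed between two adjacent snapshots are invisible, so the attribution is a lower bound per module, not an exact accounting.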
Summary
- Jobs persist to `data/jobs/{id}/results.json`; interrupted jobs are marked as `error` on reload
- `UseProfileScript` mode invokes `run-profile.sh` and reads per-run V8 CPU, wall-time, and memory profiles
- New `bench/internal/profile/` package parses `.cpuprofile`, `wall_trace.json`, and `mem_trace.json` into Go structs served via JSON API
- SSE endpoint `GET /benchmark/{id}/events` pushes live `update` events alongside existing HTMX polling

New routes
- `GET /benchmark/{id}/profile/cpu`
- `GET /benchmark/{id}/profile/wall`
- `GET /benchmark/{id}/profile/memory`
- `GET /benchmark/{id}/events`

Test plan
- Build: `cd bench && go build -o ../tti-bench ./cmd/server/`
- Run a job, restart the server, and verify it reloads (`data/jobs/*/results.json` present)
- Run with a `run-profile.sh` path — verify profile tabs appear and flame graphs render
- Subscribe to SSE (`/benchmark/{id}/events`) — verify `update` events fire as runs complete
- Interrupt a job mid-run and verify it reloads marked as `error`

🤖 Generated with Claude Code