A complete profiling testbed that measures where time, CPU, and memory are spent from the moment a user types gemini to the point the prompt is ready for input.
Produces an interactive HTML dashboard with zoomable flame graphs for CPU, wall time, and memory — plus a memory timeline and cross-run comparison chart.
| Metric | How | Output |
|---|---|---|
| Wall time (TTI) | PTY monitoring for prompt-ready pattern | Milliseconds from exec to interactive |
| CPU flame graph | V8 `--cpu-prof` at 100μs intervals | `.cpuprofile` → interactive flame chart |
| Wall-time flame | `require()` monkey-patch + GC observer + event loop lag | Module load tree with timing |
| Memory flame | Heap snapshots per module load via `process.memoryUsage()` | Attribution by module/package |
| Memory timeline | Periodic `v8.getHeapStatistics()` snapshots | RSS, heap, external over time |
run-profile.sh ← Orchestrator: launches gemini with profiling flags
├── _require_hook.cjs ← Injected via --require: traces every require() call
├── _memory_hook.cjs ← Injected via --require: snapshots heap per module
└── node --cpu-prof ← V8 CPU profiler via NODE_OPTIONS
└── gemini CLI ← The actual target being profiled
src/generate-report.js ← Parses all profile data → single interactive HTML
└── output/index.html ← Dashboard with canvas flame graphs
The harness monitors terminal output for patterns indicating Gemini CLI is ready for input:
- `[INSERT]` / `NORMAL` mode: editor mode indicators in the input area
- `for shortcuts` / `shift+tab to accept`: UI chrome visible when the prompt is ready
- `Enter a prompt` / `Type a message`: fallback text patterns
This gives the true time-to-interactive (TTI) — what the user actually experiences.
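The detection step described above can be sketched in a few lines of Node.js. This is an illustration only: the harness keeps its real patterns in `PROMPT_PATTERNS` inside `run-profile.sh`, and the regexes, `stripAnsi`, and `createTtiDetector` below are hypothetical names chosen for this example.

```javascript
// Hypothetical patterns mirroring the indicators listed above; the real
// PROMPT_PATTERNS in run-profile.sh may be spelled differently.
const READY_PATTERNS = [
  /\[INSERT\]/,
  /\bNORMAL\b/,
  /for shortcuts/,
  /shift\+tab to accept/,
  /Enter a prompt/,
  /Type a message/,
];

// Remove ANSI CSI sequences (colors, cursor moves) so they cannot split the
// text being matched. Other escape families (e.g. OSC) are not handled here.
function stripAnsi(text) {
  return text.replace(/\x1b\[[0-?]*[ -\/]*[@-~]/g, '');
}

// True once the accumulated terminal output contains any ready pattern.
function isPromptReady(rawOutput) {
  const clean = stripAnsi(rawOutput);
  return READY_PATTERNS.some((re) => re.test(clean));
}

// Returns a chunk handler that accumulates PTY output and reports the elapsed
// milliseconds at the first match (null until then).
function createTtiDetector(startMs) {
  let buffer = '';
  let ttiMs = null;
  return function onChunk(chunk, nowMs) {
    if (ttiMs !== null) return ttiMs; // already detected; keep first result
    buffer += chunk;
    if (isPromptReady(buffer)) ttiMs = nowMs - startMs;
    return ttiMs;
  };
}
```

Matching against the accumulated buffer rather than each chunk matters because a PTY can split an indicator such as `[INSERT]` across two reads.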
# Prerequisites: Node.js 18+, gemini CLI installed and authenticated
npm install -g @google/gemini-cli
gemini # complete auth setup before benchmarking
# Run 5 profiling iterations (default) and open the report
bash run-profile.sh
open output/index.html # macOS
xdg-open output/index.html # Linux
# Serve the report locally (auto-opens browser)
bash run-profile.sh --serve
# Faster iteration: fewer runs, explicit timeout
bash run-profile.sh --runs 3 --timeout 60
# Regenerate the report from existing profiles without re-running
node src/generate-report.js --input ./profiles --output ./output
open output/index.html
--runs N Number of profiling runs (default: 5)
--cpu-only Only collect CPU profile
--mem-only Only collect memory/heap profile
--wall-only Only collect wall-time trace
--timeout N Seconds to wait for prompt ready (default: 120)
--gemini-path P Path to gemini binary (default: auto-detect)
--output-dir D Output directory (default: ./profiles)
--cold Drop filesystem caches between runs (requires sudo)
--no-report Skip HTML report generation
--serve Start a local server and open the report in a browser
--port N Port for the local server (default: 8080)
Note: Complete `gemini` authentication before benchmarking. An unauthenticated or slow-auth session will inflate TTI numbers with network round-trips unrelated to startup performance.
- Click any frame to zoom in
- Right-click to zoom back out
- Search to highlight matching frames
- Hover for detailed tooltips (self time, total time, file location)
- Breadcrumb trail shows zoom path
- Color-coded by category (Node builtins, app code, dependencies, GC)
- RSS, Heap Total, Heap Used, External memory plotted over startup duration
- Shows exactly when memory spikes occur and which phase causes them
- Bar chart of TTI across all profiling runs
- Average line with variance indication
- Color-coded: green (fast), blue (normal), red (slow)
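As a rough illustration of how the comparison chart's average, variance, and color buckets could be derived from per-run TTI values, the sketch below classifies each run relative to the mean. The one-standard-deviation cutoff and the `summarizeRuns` name are assumptions made for this example; the report generator may use different thresholds.

```javascript
// Summarize TTI values (milliseconds) across runs.
// ASSUMPTION: "fast" and "slow" are defined as more than one standard
// deviation below or above the mean; the actual dashboard may differ.
function summarizeRuns(ttiValuesMs) {
  if (ttiValuesMs.length === 0) {
    throw new Error('at least one run is required');
  }
  const count = ttiValuesMs.length;
  const mean = ttiValuesMs.reduce((sum, ms) => sum + ms, 0) / count;
  // Population variance: every run is part of the data set being described.
  const variance =
    ttiValuesMs.reduce((sum, ms) => sum + (ms - mean) ** 2, 0) / count;
  const stddev = Math.sqrt(variance);

  const runs = ttiValuesMs.map((ms, index) => {
    let category = 'normal'; // blue
    if (ms < mean - stddev) category = 'fast'; // green
    else if (ms > mean + stddev) category = 'slow'; // red
    return { run: index + 1, ms, category };
  });

  return { mean, stddev, runs };
}
```

With only a handful of runs the standard deviation is a coarse signal, so a single outlier can dominate it; more runs give steadier buckets.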
The orchestrator (run-profile.sh) launches gemini with:
- V8 CPU profiler (`--cpu-prof --cpu-prof-interval=100`): 100μs sampling
- Require hook (`--require _require_hook.cjs`): monkey-patches `Module._load` to time every `require()` call and builds a trace event timeline
- Memory hook (`--require _memory_hook.cjs`): snapshots `process.memoryUsage()` + `v8.getHeapStatistics()` around each module load
- PTY monitor: watches terminal output via the `script` command to detect the prompt-ready moment
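The require and memory hooks share one technique: wrapping `Module._load`. The sketch below shows that technique in a single file and is not the contents of `_require_hook.cjs` or `_memory_hook.cjs`. The event shape and the `stopTracing` helper are hypothetical, and `Module._load` is an internal, undocumented Node.js API that may change between releases.

```javascript
// trace-hook-sketch.cjs: illustrative only, loaded with `node --require`.
const Module = require('node:module');
const { performance } = require('node:perf_hooks');

const events = []; // one entry per completed require() call
let depth = 0; // nesting depth of the require() currently in flight

const originalLoad = Module._load; // internal API; treat as unstable

Module._load = function tracedLoad(request, parent, isMain) {
  const startMs = performance.now();
  const heapBefore = process.memoryUsage().heapUsed;
  depth += 1;
  try {
    // Delegate to the real loader; nested require() calls re-enter this wrapper.
    return originalLoad.apply(this, arguments);
  } finally {
    depth -= 1;
    events.push({
      request, // the string passed to require()
      parent: parent && parent.filename ? parent.filename : null,
      depth, // 0 for a top-level require
      startMs,
      durationMs: performance.now() - startMs, // inclusive of children
      heapDeltaBytes: process.memoryUsage().heapUsed - heapBefore,
    });
  }
};

// Restore the original loader and return everything recorded so far.
function stopTracing() {
  Module._load = originalLoad;
  return events;
}
```

Durations and heap deltas recorded this way include everything a module loads in turn, so a tree builder has to subtract child values to get self cost. A garbage collection during a load can also make a heap delta negative.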
src/generate-report.js reads all collected profiles and:
- Parses V8 `.cpuprofile` JSON into a d3-flamegraph-compatible tree
- Converts the require trace events into a hierarchical wall-time tree
- Groups memory snapshots by module/package into a memory attribution tree
- Generates a single self-contained HTML file with Canvas-based interactive flame graphs
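To make the first step concrete, here is a minimal sketch of turning a parsed `.cpuprofile` object into a nested `name`/`value`/`children` tree. It relies on the `nodes`, `samples`, and `timeDeltas` fields of the V8 profile format. The function name and output fields are assumptions for illustration and need not match what `src/generate-report.js` does.

```javascript
// Convert a parsed V8 .cpuprofile object into a nested tree where `value`
// is inclusive time in microseconds.
function cpuProfileToTree(profile) {
  const nodesById = new Map(profile.nodes.map((node) => [node.id, node]));

  // Accumulate self time per node. Each sample names the node on top of the
  // stack. Crediting timeDeltas[i] to samples[i] is a simplification.
  const selfMicrosById = new Map();
  const samples = profile.samples || [];
  const deltas = profile.timeDeltas || [];
  for (let i = 0; i < samples.length; i += 1) {
    const id = samples[i];
    selfMicrosById.set(id, (selfMicrosById.get(id) || 0) + (deltas[i] || 0));
  }

  // The root is the only node that never appears as someone's child.
  const childIds = new Set();
  for (const node of profile.nodes) {
    for (const childId of node.children || []) childIds.add(childId);
  }
  const root = profile.nodes.find((node) => !childIds.has(node.id));

  function build(node) {
    const children = (node.children || []).map((id) => build(nodesById.get(id)));
    const selfMicros = selfMicrosById.get(node.id) || 0;
    const value = children.reduce((sum, child) => sum + child.value, selfMicros);
    const { functionName, url, lineNumber } = node.callFrame;
    return {
      name: functionName || '(anonymous)',
      url,
      lineNumber,
      selfMicros,
      value,
      children,
    };
  }

  return build(root);
}
```

A caller would typically pass `JSON.parse(fs.readFileSync(file, 'utf8'))` for a profile written by `--cpu-prof`. The recursion is fine for a sketch, but very deep stacks may call for an iterative traversal.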
- Wide frames at top = functions taking most CPU overall
- Red/orange = high self-time (CPU spent in that function, not children)
- Yellow = mostly child time (orchestration functions)
- Look for unexpected hot spots in startup (e.g., JSON schema validation, crypto)
- Wide frames = slow-loading modules (disk I/O, compilation, initialization)
- Blue tones = Google packages, Green = Node builtins, Purple = UI (React/Ink)
- Red = GC pauses, Yellow = event loop lag
- The widest `require()` chains show the critical startup path
- Size = heap bytes attributed to each module
- Identify which dependencies consume the most memory at startup
- Look for unexpectedly large allocations from utility packages
Edit PROMPT_PATTERNS in run-profile.sh if Gemini CLI changes its prompt indicator.
Add more V8 flags in the NODE_*_FLAGS variables:
- `--trace-gc-verbose` for detailed GC logs
- `--max-old-space-size=256` to test under memory pressure
- `--prof` for V8 tick profiler (lower-level than cpu-prof)
bash run-profile.sh --runs 3 --no-report --timeout 30
# Check profiles/combined/run_*/tti_ms for regression detection
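A regression check over those files could look like the sketch below. It assumes each `tti_ms` file contains a single number in milliseconds, which this README does not specify, and the budget value and function names are hypothetical.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Collect TTI values from every run_*/tti_ms under combinedDir.
// ASSUMPTION: each tti_ms file holds one number (milliseconds) as text.
function readTtiValues(combinedDir) {
  return fs
    .readdirSync(combinedDir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory() && entry.name.startsWith('run_'))
    .map((entry) => path.join(combinedDir, entry.name, 'tti_ms'))
    .filter((file) => fs.existsSync(file))
    .map((file) => Number.parseFloat(fs.readFileSync(file, 'utf8').trim()))
    .filter((ms) => Number.isFinite(ms)); // skip empty or malformed files
}

// Median is less sensitive than the mean to one slow outlier run.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Compare the median TTI against a caller-supplied budget in milliseconds.
function checkRegression(combinedDir, budgetMs) {
  const values = readTtiValues(combinedDir);
  if (values.length === 0) {
    return { ok: false, reason: 'no tti_ms values found', runs: 0 };
  }
  const medianMs = median(values);
  return { ok: medianMs <= budgetMs, medianMs, runs: values.length };
}
```

In a CI job, a wrapper could call `checkRegression('profiles/combined', 3000)` and set a non-zero exit code when `ok` is false. The 3000 ms budget is a placeholder to be replaced with a baseline measured on the CI hardware.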