feat: support token usage diff and multiple comparison runs in audit diff command by Copilot · Pull Request #24544 · github/gh-aw

Copilot · 2026-04-04T17:45:48Z

Summary

Enhances the audit diff command with two improvements:

1. Detailed token usage diff

Adds a TokenUsageDiff struct populated from RunSummary.TokenUsage (the firewall proxy token-usage.jsonl data) with a per-field breakdown:

Input tokens, output tokens, cache read/write tokens, effective tokens
Pre-computed percentage change strings for each field
Total API requests and change count
Cache efficiency per run

This is surfaced in both markdown (table) and pretty console output. The existing Run.TokenUsage (int) field in RunMetricsDiff is preserved for backward compatibility.
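
A minimal sketch of how such a per-field diff could be computed follows. It is not the code from pkg/cli/audit_diff.go: the type `tokenTotals`, the struct `tokenUsageDiffSketch`, its field names, and the helper `percentChange` are all hypothetical, and only two token fields are shown. The sketch also assumes that a zero baseline yields an empty change string so that `omitempty` drops the field from JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tokenTotals is a hypothetical stand-in for the per-run totals read from
// token-usage.jsonl; the real RunSummary.TokenUsage type is not shown here.
type tokenTotals struct {
	InputTokens  int
	OutputTokens int
}

// tokenUsageDiffSketch mirrors the shape described above with two token
// fields only. All field names and JSON tags are assumptions.
type tokenUsageDiffSketch struct {
	Run1InputTokens    int    `json:"run1_input_tokens"`
	Run2InputTokens    int    `json:"run2_input_tokens"`
	InputTokensChange  string `json:"input_tokens_change,omitempty"`
	Run1OutputTokens   int    `json:"run1_output_tokens"`
	Run2OutputTokens   int    `json:"run2_output_tokens"`
	OutputTokensChange string `json:"output_tokens_change,omitempty"`
}

// percentChange returns a signed relative change such as "+50%".
// Assumption: a zero baseline makes the percentage undefined, so the
// function returns "" and omitempty removes the field from JSON output.
func percentChange(before, after int) string {
	if before == 0 {
		return ""
	}
	pct := float64(after-before) / float64(before) * 100
	return fmt.Sprintf("%+.0f%%", pct)
}

// computeTokenUsageDiffSketch fills the raw values and the pre-computed
// change strings for both token fields.
func computeTokenUsageDiffSketch(run1, run2 tokenTotals) tokenUsageDiffSketch {
	return tokenUsageDiffSketch{
		Run1InputTokens:    run1.InputTokens,
		Run2InputTokens:    run2.InputTokens,
		InputTokensChange:  percentChange(run1.InputTokens, run2.InputTokens),
		Run1OutputTokens:   run1.OutputTokens,
		Run2OutputTokens:   run2.OutputTokens,
		OutputTokensChange: percentChange(run1.OutputTokens, run2.OutputTokens),
	}
}

func main() {
	d := computeTokenUsageDiffSketch(
		tokenTotals{InputTokens: 1000, OutputTokens: 200},
		tokenTotals{InputTokens: 1500, OutputTokens: 150},
	)
	out, err := json.MarshalIndent(d, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Pre-computing the change strings keeps the markdown and pretty renderers free of arithmetic: both can print the same string.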

2. Multiple comparison runs from a single base

Changes the command signature from diff <run-id-1> <run-id-2> to diff <base-run-id> <compare-run-id-1> [<compare-run-id-2>...]:

The base run summary is downloaded/loaded once and reused for all comparisons
Each comparison run is diffed against the base independently
Self-comparison and duplicate run IDs are rejected with clear error messages
For single comparisons: JSON output remains a single object (backward compatible)
For multiple comparisons: pretty/markdown outputs are separated with dividers; JSON outputs an array
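
The argument validation described above could look roughly like the following sketch. The function name `parseRunIDs` and the error wording are hypothetical; the actual parsing in pkg/cli/audit_diff_command.go is not reproduced in this thread.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parseRunIDs treats the first argument as the base run ID and the rest as
// comparison run IDs. It rejects self-comparison and duplicate IDs.
// Hypothetical helper, not the function used in the PR.
func parseRunIDs(args []string) (int64, []int64, error) {
	if len(args) < 2 {
		return 0, nil, errors.New("need a base run ID and at least one comparison run ID")
	}
	base, err := strconv.ParseInt(args[0], 10, 64)
	if err != nil {
		return 0, nil, fmt.Errorf("invalid base run ID %q: %w", args[0], err)
	}
	seen := make(map[int64]bool)
	compare := make([]int64, 0, len(args)-1)
	for _, arg := range args[1:] {
		id, err := strconv.ParseInt(arg, 10, 64)
		if err != nil {
			return 0, nil, fmt.Errorf("invalid comparison run ID %q: %w", arg, err)
		}
		if id == base {
			return 0, nil, fmt.Errorf("cannot compare run %d with itself", id)
		}
		if seen[id] {
			return 0, nil, fmt.Errorf("duplicate comparison run ID %d", id)
		}
		seen[id] = true
		compare = append(compare, id)
	}
	return base, compare, nil
}

func main() {
	base, compare, err := parseRunIDs([]string{"12345", "12346", "12347", "12348"})
	fmt.Println(base, compare, err)

	_, _, err = parseRunIDs([]string{"12345", "12345"})
	fmt.Println(err)
}
```

Validating every ID before any download starts means a bad argument fails fast, before the base run summary is fetched.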

Changes

pkg/cli/audit_diff.go — New TokenUsageDiff struct + field on RunMetricsDiff; computeTokenUsageDiff; updated computeRunMetricsDiff
pkg/cli/audit_diff_command.go — MinimumNArgs(2), multi-run parsing with validation, base-run loaded once
pkg/cli/audit_diff_render.go — Multi-diff render functions, token usage detail sections in markdown/pretty
pkg/cli/audit_diff_test.go — Tests for computeTokenUsageDiff, RunMetricsDiff with token details, and multi-run diff

Examples

# Two-run diff (backward compatible)
gh aw audit diff 12345 12346

# Base vs multiple runs
gh aw audit diff 12345 12346 12347 12348

# Markdown output with token details
gh aw audit diff 12345 12346 --format markdown

pelikhan · 2026-04-04T17:49:34Z

@copilot use console pkg helpers

Copilot · 2026-04-04T17:58:28Z

@copilot use console pkg helpers

Updated in dceb448. All pretty section renderers now use console.FormatSectionHeader() for section headings (e.g. "Firewall Changes", "MCP Tool Changes", "Run Metrics", "Token Usage Details") and console.FormatInfoMessage() for sub-section labels (e.g. "New Domains (2):", "Changed Tools (1):"), consistent with the pattern in audit_report_render.go.

pelikhan · 2026-04-04T18:15:42Z

@copilot collect all diff data in tables and use console table rendering helpers

Copilot · 2026-04-04T18:26:19Z

@copilot collect all diff data in tables and use console table rendering helpers

Done in 453a36b. All diff sections in the pretty console renderer now use console.RenderTable:

Firewall: separate tables for new domains (Domain/Status/Requests/Anomaly), removed domains (Domain/Previous Status/Previous Requests), status changes (Domain/Before/After/Anomaly), and volume changes (Domain/Requests before/Requests after/Change)
MCP Tools: separate tables for new tools (Server/Tool/Calls/Anomaly), removed tools (Server/Tool/Previous Calls), and changed tools (Server/Tool/Calls before/Calls after/Change/Errors before/Errors after/Anomaly)
Run Metrics: table with Metric/Run#N/Run#M/Change rows
Token Usage Details: table with Token Type/Run#N/Run#M/Change rows

Copilot

Pull request overview

Enhances the audit diff command to provide a richer token-usage breakdown (from firewall proxy token-usage.jsonl summaries) and to support diffing multiple comparison runs against a single base run in one invocation.

Changes:

Added a detailed TokenUsageDiff and integrated it into RunMetricsDiff computation.
Updated CLI command parsing/execution to accept base-run-id + 1..N comparison run IDs and reuse the base summary across comparisons.
Updated pretty/markdown/JSON renderers to support rendering multiple diffs and to display token-usage detail sections; added tests covering the new diff logic.
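
The JSON behavior (a single object for one comparison, an array for several) can be illustrated with a short sketch. The type `auditDiffSketch`, its fields, and the function `marshalDiffs` are hypothetical placeholders for whatever the renderer in pkg/cli/audit_diff_render.go actually uses.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// auditDiffSketch is a hypothetical, heavily reduced diff payload.
type auditDiffSketch struct {
	Run1ID int64 `json:"run1_id"`
	Run2ID int64 `json:"run2_id"`
}

// marshalDiffs emits a bare object when there is exactly one diff, which
// keeps the pre-existing two-run output shape, and an array otherwise.
func marshalDiffs(diffs []auditDiffSketch) ([]byte, error) {
	if len(diffs) == 1 {
		return json.Marshal(diffs[0])
	}
	return json.Marshal(diffs)
}

func main() {
	one, _ := marshalDiffs([]auditDiffSketch{{Run1ID: 12345, Run2ID: 12346}})
	fmt.Println(string(one))

	many, _ := marshalDiffs([]auditDiffSketch{
		{Run1ID: 12345, Run2ID: 12346},
		{Run1ID: 12345, Run2ID: 12347},
	})
	fmt.Println(string(many))
}
```

Consumers that script against the JSON output need to handle both shapes, since the shape depends on how many comparison runs were passed.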

Summary per file

File	Description
pkg/cli/audit_diff.go	Adds `TokenUsageDiff` and computes detailed token usage diffs as part of run metrics.
pkg/cli/audit_diff_command.go	Changes command signature and loads the base run once for multiple comparisons.
pkg/cli/audit_diff_render.go	Updates JSON/markdown/pretty output to handle multiple diffs and render token usage details.
pkg/cli/audit_diff_test.go	Adds unit tests for token usage diffing and for metrics diff integration.

Copilot's findings

Files reviewed: 4/4 changed files
Comments generated: 2

Copilot · 2026-04-04T18:42:30Z

pkg/cli/audit_diff.go

+	EffectiveTokensChange  string  `json:"effective_tokens_change,omitempty"`
+	Run1TotalRequests      int     `json:"run1_total_requests"`
+	Run2TotalRequests      int     `json:"run2_total_requests"`
+	RequestsChange         string  `json:"requests_change,omitempty"`


RequestsChange is formatted via formatCountChange (absolute delta like "+4") while the other "Change" fields in TokenUsageDiff are percentage strings (e.g. "+50%"). This mixes units in the same diff/table output and can be misleading; consider either formatting requests as a percentage (consistent with token fields) or renaming the field/output to make it explicit that it's an absolute request delta.

Suggested change

-	RequestsChange         string  `json:"requests_change,omitempty"`
+	RequestsDelta          string  `json:"requests_delta,omitempty"` // Absolute request-count delta, e.g. "+4"
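
To make the unit mismatch concrete, here is a small sketch that formats the same pair of request counts both ways. The helpers `countDelta` and `relativeChange` are hypothetical; they only imitate the behavior the comment attributes to formatCountChange and to the percentage fields.

```go
package main

import "fmt"

// countDelta formats an absolute difference, e.g. "+4".
// Hypothetical stand-in for the behavior attributed to formatCountChange.
func countDelta(before, after int) string {
	return fmt.Sprintf("%+d", after-before)
}

// relativeChange formats a relative difference, e.g. "+50%".
// Assumption: a zero baseline returns "" because the ratio is undefined.
func relativeChange(before, after int) string {
	if before == 0 {
		return ""
	}
	return fmt.Sprintf("%+.0f%%", float64(after-before)/float64(before)*100)
}

func main() {
	run1Requests, run2Requests := 8, 12
	fmt.Println("absolute delta: ", countDelta(run1Requests, run2Requests))
	fmt.Println("relative change:", relativeChange(run1Requests, run2Requests))
}
```

The two strings describe the same change in different units, which is why a column that mixes them without labeling is easy to misread.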

Copilot · 2026-04-04T18:42:31Z

pkg/cli/audit_diff.go

+	Run1CacheEfficiency    float64 `json:"run1_cache_efficiency"`
+	Run2CacheEfficiency    float64 `json:"run2_cache_efficiency"`
+}


TokenUsageDiff includes per-run cache efficiency values, but there is no corresponding change field, so renderers currently leave the "Change" column blank for that row. If cache efficiency is intended to be part of the diff (per PR description), consider adding a CacheEfficiencyChange (percentage-point delta or relative %) or rendering it in a way that doesn't imply a missing value.

pelikhan · 2026-04-04T18:47:04Z

@copilot review comments

Copilot · 2026-04-04T18:54:25Z

@copilot review comments

Both review issues addressed in 3f36c01:

RequestsChange → RequestsDelta — renamed field and JSON tag (requests_delta) to make it explicit that it's an absolute count delta (e.g. +4), not a relative percentage like the other *Change fields.
CacheEfficiencyChange — added field computed as a percentage-point delta via a new formatPercentagePointChange helper (e.g. -1.5pp). The Change column in the cache efficiency table row is now populated in both markdown and pretty output.
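
A percentage-point delta differs from a relative percentage: it is the plain difference between two rates. The sketch below shows one way such a helper might work. The function `percentagePointChange` is hypothetical, and it assumes cache efficiency is stored as a fraction between 0 and 1; the real formatPercentagePointChange may take different inputs.

```go
package main

import "fmt"

// percentagePointChange formats the difference between two rates as
// percentage points, e.g. "-1.5pp".
// Assumption: both inputs are fractions in [0, 1], not percentages.
func percentagePointChange(before, after float64) string {
	return fmt.Sprintf("%+.1fpp", (after-before)*100)
}

func main() {
	// 42.5% cache efficiency in the base run, 41.0% in the comparison run.
	fmt.Println(percentagePointChange(0.425, 0.41))
	// 50% to 75% is +25.0pp, even though the relative change is +50%.
	fmt.Println(percentagePointChange(0.50, 0.75))
}
```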

pelikhan · 2026-04-04T19:00:53Z

@copilot fix https://github.com/github/gh-aw/actions/runs/23985383782/job/69956569649#step:10:1

Copilot · 2026-04-04T19:07:28Z

@copilot fix https://github.com/github/gh-aw/actions/runs/23985383782/job/69956569649#step:10:1

Fixed in 87402be. TestArgumentNamingConventions was failing because the Use string "diff <base-run-id> <compare-run-id> [<compare-run-id-2>...]" left a <compare-run-id-2> token (after the test's prefix/suffix stripping) that didn't match ^[a-z][a-z0-9-]*$. Changed to the standard variadic form "diff <base-run-id> <compare-run-id>..." consistent with add_command.go and others.
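
The naming pattern quoted above can be checked in isolation. This sketch does not reproduce TestArgumentNamingConventions or its prefix/suffix stripping, which are not shown in this thread; it only demonstrates that a bare lowercase, hyphenated name passes the pattern while a token that still carries delimiter characters fails it.

```go
package main

import (
	"fmt"
	"regexp"
)

// argName is the pattern quoted in the fix description.
var argName = regexp.MustCompile(`^[a-z][a-z0-9-]*$`)

func main() {
	tokens := []string{
		"base-run-id",        // bare name: matches
		"compare-run-id",     // bare name: matches
		"<compare-run-id-2>", // leftover angle brackets: no match
		"compare-run-id...",  // leftover ellipsis: no match
	}
	for _, tok := range tokens {
		fmt.Printf("%-22q %v\n", tok, argName.MatchString(tok))
	}
}
```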

- Update gh aw audit diff signature to reflect multi-run support: accepts one base run and one or more comparison runs (from #24544)
- Document token usage diff breakdown (input/output/cache tokens, API requests, cache efficiency) added in #24544
- Document JSON array output and divider behavior for multiple diffs
- Add lock file header section to workflow-structure.md documenting the gh-aw-metadata first-line placement and the Secrets/Custom actions manifest sections added in #24495

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat: support token usage diff and multiple comparison runs in audit diff command

9670d21

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/c2cbe169-434c-4f94-8c73-7af8797b7c5d
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI assigned Copilot and pelikhan Apr 4, 2026

Copilot created this pull request from a session on behalf of pelikhan April 4, 2026 17:46 View session

Copilot AI requested a review from pelikhan April 4, 2026 17:46

Copilot finished work on behalf of pelikhan April 4, 2026 17:46

Copilot started work on behalf of pelikhan April 4, 2026 17:49 View session

refactor: use console pkg helpers for section headers and sub-section labels in audit diff pretty render

dceb448

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/2e45f51d-dc34-4f35-a127-3c42a1b0ea44
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan April 4, 2026 17:59

Copilot started work on behalf of pelikhan April 4, 2026 18:15 View session

refactor: use console.RenderTable for all diff section pretty renderers

453a36b

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/3ec821db-7712-419a-ba3e-3bbeda023d5e
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot stopped work on behalf of pelikhan due to an error April 4, 2026 18:26
The "Setup uv" custom setup step from your .github/workflows/copilot-setup-steps.yml file failed.

github-actions bot mentioned this pull request Apr 4, 2026

[PR Triage Report] PR Triage Report - 2026-04-04 (Run #23984798749) #24548

Open

pelikhan marked this pull request as ready for review April 4, 2026 18:37

Copilot AI review requested due to automatic review settings April 4, 2026 18:37

Copilot started reviewing on behalf of pelikhan April 4, 2026 18:38 View session

Copilot AI reviewed Apr 4, 2026

View reviewed changes

Copilot started work on behalf of pelikhan April 4, 2026 18:47 View session

This was referenced Apr 4, 2026

Smoke Test: Claude - 23985081412 #24551

Closed

Smoke Test: Copilot - 23985224521 #24555

Closed

Smoke Test: Claude - 23985224528 #24557

Closed

fix: rename RequestsChange to RequestsDelta and add CacheEfficiencyChange

3f36c01

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/2a4a1640-5be5-4d7d-9dbe-61f73600db6b
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan April 4, 2026 18:55

Copilot started work on behalf of pelikhan April 4, 2026 19:01 View session

fix: use conventional variadic syntax in audit diff Use string

87402be

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/edc57949-d44b-4550-b62b-31a85eab8f5c
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan April 4, 2026 19:07

pelikhan merged commit cc344d1 into main Apr 4, 2026

pelikhan deleted the copilot/support-audit-diff-token-usage branch April 4, 2026 19:13

github-actions bot mentioned this pull request Apr 4, 2026

Smoke Test: Claude - 23985734470 #24565

Closed

This was referenced Apr 4, 2026

[docs] Update documentation for features from 2026-04-04 #24574

Merged

[aw] No-Op Runs #24437

Open

pelikhan merged 5 commits into main from copilot/support-audit-diff-token-usage
