⚡ Copilot Token Optimization 2026-04-03 - Documentation Maintainer #1643
Description
Target Workflow: doc-maintainer
Source report: #1642
Estimated cost per run: $3.91
Total tokens per run: ~3,003K
Cache hit rate: 48.7%
LLM turns: 32
Model: claude-sonnet-4.6 (via Copilot provider)
Current Configuration
| Setting | Value |
|---|---|
| Tools loaded | github (22 tools — context,repos,issues,pull_requests), edit, bash, safeoutputs (4 tools) — 28 total |
| Tools actually used | bash (git/file ops), edit (file edits), safeoutputs.create_pull_request — 3 effective |
| GitHub MCP tools called | 0 — zero GitHub MCP tool invocations observed in run logs |
| Network groups | defaults only (appropriate) |
| Pre-agent steps | ❌ No — agent discovers git history and doc files at runtime |
| Post-agent steps | ❌ No |
| max-turns | Not set (no guard; ran 32 turns) |
| Prompt size | ~4,161 chars in .md source |
Key finding: The MCP gateway logs for run §23936885217 show exactly one [ROUTED] tool call: create_pull_request (safeoutputs). The 22 GitHub MCP tools (list_commits, get_file_contents, issue_read, etc.) were loaded but never called — the agent used bash (git/find/cat) for all discovery and file reading work.
Recommendations
1. Remove the github: toolset entirely
Estimated savings: ~422K tokens/run (~14%)
The agent uses bash for all git and file operations, and safeoutputs.create_pull_request for the final PR. The 22 GitHub MCP tools are dead weight in the system prompt.
Current (.github/workflows/doc-maintainer.md):
```
tools:
  github:
    toolsets: [default]
  edit:
  bash: true
```
Change to:
```
tools:
  edit:
  bash: true
```
Each GitHub tool schema adds ~600 tokens to the system prompt on every turn. Removing 22 tools saves ~13,200 tokens/turn × 32 turns = ~422K tokens/run.
At claude-sonnet-4.6 pricing ($3/M input), this alone saves **$1.27/run** — reducing per-run cost from $3.91 to ~$2.64.
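The estimate above is plain arithmetic, and a minimal Python sketch can reproduce it. The inputs (22 tools, ~600 tokens per schema, 32 turns, $3/M input) are the figures quoted in this report; the function name is illustrative only, and the calculation assumes every removed token would have been billed at the uncached input rate.

```python
def tool_removal_savings(tools_removed, tokens_per_tool, turns, usd_per_million_input):
    """Estimate tokens and dollars saved per run by dropping unused tool schemas."""
    tokens_per_turn = tools_removed * tokens_per_tool
    tokens_per_run = tokens_per_turn * turns
    usd_per_run = tokens_per_run / 1_000_000 * usd_per_million_input
    return tokens_per_turn, tokens_per_run, usd_per_run


# Figures taken from the report: 22 tools, ~600 tokens each, 32 turns, $3/M input.
per_turn, per_run, usd = tool_removal_savings(22, 600, 32, 3.00)
print(f"{per_turn:,} tokens/turn, {per_run:,} tokens/run, ${usd:.2f}/run")
```

Since a large share of input tokens in this run were cache reads, which are billed at a lower rate, the realized dollar saving may be lower than this upper-bound figure.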
⚠️ If future workflow changes need GitHub API access (e.g., checking PR existence), consider adding `toolsets: [context]` (3 tools) rather than `[default]` (22 tools).
2. Add a max-turns guard
Estimated savings: Prevents worst-case runaway; ~0–20% depending on run
The workflow has timeout-minutes: 15 but no max-turns. The observed run used 32 turns. A guard prevents cost explosions if the agent iterates unexpectedly.
Add to frontmatter:
```
max-turns: 25
timeout-minutes: 15
```
Setting max-turns: 25 allows a healthy completion margin while capping extreme cases. For a doc-sync task reviewing one week of changes, 25 turns is generous.
If this had been set to 20 on today's run, it would have saved ~12 turns × ~50K tokens avg = ~600K tokens (20%). Note: we can't know if 32 turns was normal or excessive without more run history — use a conservative cap initially.
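The what-if estimate above can be sketched for any candidate cap. This is a rough model only: it assumes a flat ~50K tokens per turn (the average quoted above) and that the run would still have finished its work within the cap. The function name is hypothetical.

```python
def capped_run_savings(observed_turns, max_turns, avg_tokens_per_turn, total_tokens):
    """Return (turns cut, tokens saved, fraction of the run's total tokens saved)."""
    turns_cut = max(0, observed_turns - max_turns)
    tokens_saved = turns_cut * avg_tokens_per_turn
    return turns_cut, tokens_saved, tokens_saved / total_tokens


# Observed run: 32 turns, ~3,003K total tokens, ~50K tokens per turn on average.
for cap in (20, 25):
    cut, saved, frac = capped_run_savings(32, cap, 50_000, 3_003_000)
    print(f"max-turns={cap}: cuts {cut} turns, ~{saved:,} tokens ({frac:.0%})")
```

Under the same assumptions, the recommended cap of 25 would have trimmed 7 turns from this run, about 350K tokens.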
3. Pre-compute git changes in steps: and inject into prompt
Estimated savings: ~150–250K tokens/run (~5–8%)
The agent's early turns are spent running git log --since=7days, git show <sha>, and discovering which files changed. These are fully deterministic and can be done before the agent starts.
Add a steps: section to pre-compute and inject via template variable:
```
steps:
  - name: Gather recent git changes
    id: git_changes
    run: |
      # Write a multi-line output; the command's stdout must be appended to $GITHUB_OUTPUT too
      echo "RECENT_COMMITS<<EOF" >> "$GITHUB_OUTPUT"
      git log --since="7 days ago" --name-only --format="%H %s" | head -100 >> "$GITHUB_OUTPUT"
      echo "EOF" >> "$GITHUB_OUTPUT"
  - name: List documentation files
    id: doc_files
    run: |
      echo "DOC_FILES<<EOF" >> "$GITHUB_OUTPUT"
      # Group both find commands so the combined list is sorted and captured
      { find docs/ -name "*.md"; find . -maxdepth 1 -name "*.md"; } | sort >> "$GITHUB_OUTPUT"
      echo "EOF" >> "$GITHUB_OUTPUT"
```
Then reference in the prompt body with:
```
## Recent Changes (Pre-computed)
<details>
<summary>Git log — past 7 days</summary>
\{\{ steps.git_changes.outputs.RECENT_COMMITS }}
</details>
## Documentation Files
\{\{ steps.doc_files.outputs.DOC_FILES }}
```
This eliminates the agent's discovery phase (typically 3–5 turns), saving ~5 turns × 38K avg tokens = ~190K tokens/run.
ℹ️ See AGENTS.md: "gh-aw workflows render the agent prompt in the activation job; .md `steps:` run later in the agent job, so step outputs can't be injected into the prompt unless computed in activation." Confirm the `steps:` outputs approach matches the gh-aw version in use (v0.47.0 per lock file header).
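For reference, the multi-line output format that the steps above rely on (`NAME<<DELIMITER`, the value, then the delimiter on its own line, appended to the file named by `GITHUB_OUTPUT`) can be sketched in Python. The helper names here are hypothetical, and this only shows how a step would write the value; whether the value can reach the agent prompt still depends on the gh-aw caveat noted above.

```python
import os
import subprocess
import uuid


def write_multiline_output(name, value, output_path):
    """Append a multi-line step output in the NAME<<DELIM ... DELIM file format."""
    delimiter = f"EOF_{uuid.uuid4().hex}"  # random, so the value cannot contain it by accident
    body = value.rstrip("\n")
    with open(output_path, "a", encoding="utf-8") as fh:
        fh.write(f"{name}<<{delimiter}\n{body}\n{delimiter}\n")


def recent_commits(max_lines=100):
    """Return up to max_lines lines of the last 7 days of git history."""
    result = subprocess.run(
        ["git", "log", "--since=7 days ago", "--name-only", "--format=%H %s"],
        capture_output=True, text=True, check=True,
    )
    return "\n".join(result.stdout.splitlines()[:max_lines])


if __name__ == "__main__":
    # GITHUB_OUTPUT is only set inside a GitHub Actions step; do nothing elsewhere.
    output_path = os.environ.get("GITHUB_OUTPUT")
    if output_path:
        write_multiline_output("RECENT_COMMITS", recent_commits(), output_path)
```

A random delimiter is a defensive choice: a fixed `EOF` would end the value early if any commit subject or file name happened to be exactly `EOF`.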
4. Improve cache warming — move static content earlier in the prompt
Estimated savings: Converts ~33K cold tokens on turn 1 to cached reads; ~$0.10–0.15/run
The first LLM turn has 0 cache reads (cold start costs ~$3/M for input vs $0.30/M cached). The system prompt, tool schemas, and static instructions should be as stable as possible to maximize cache reuse across daily runs.
- Ensure the prompt body doesn't include run-specific data at the top (dates, run IDs)
- Move variable content (pre-computed git changes) to the end of the injected prompt so the static prefix is as long as possible
- The system prompt + tool schemas = ~20K tokens that should warm the cache by turn 2
Currently working: turns 2–32 have a ~48.7% cache hit rate (good). Turn 1 is always cold, which is unavoidable for the static content. What matters is keeping that cold cost limited to the ~33K tokens of turn 1, rather than letting dynamic content leak into the prefix and shift the cache boundary.
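The ordering advice above can be illustrated with a small Python sketch. It compares how much of the prompt two runs share as a common prefix, depending on whether run-specific data comes first or last. The prompt strings and function names are invented for illustration, and character length stands in for tokens, which is only an approximation of how a provider's prefix cache matches content.

```python
import os

# Hypothetical static instructions, identical on every run.
STATIC_INSTRUCTIONS = "You maintain project documentation.\nFollow the style guide.\n"


def build_prompt(static_text, dynamic_text, dynamic_first=False):
    """Assemble a prompt with the run-specific section either first or last."""
    return dynamic_text + static_text if dynamic_first else static_text + dynamic_text


def shared_prefix_len(a, b):
    """Length of the longest common prefix of two prompts, in characters."""
    return len(os.path.commonprefix([a, b]))


# Hypothetical run-specific sections from two consecutive runs.
run_a = "Run date: 2026-04-03\nCommits: abc123\n"
run_b = "Run date: 2026-04-04\nCommits: def456\n"

static_first = shared_prefix_len(
    build_prompt(STATIC_INSTRUCTIONS, run_a), build_prompt(STATIC_INSTRUCTIONS, run_b)
)
dynamic_first = shared_prefix_len(
    build_prompt(STATIC_INSTRUCTIONS, run_a, dynamic_first=True),
    build_prompt(STATIC_INSTRUCTIONS, run_b, dynamic_first=True),
)
print(f"shared prefix, static first: {static_first} chars")
print(f"shared prefix, dynamic first: {dynamic_first} chars")
```

With the static text first, the whole instruction block stays in the shared prefix. With the date first, the prompts diverge within the first line and nothing after that point can be reused.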
Expected Impact
| Metric | Current | Projected | Savings |
|---|---|---|---|
| Total tokens/run | 3,003K | ~2,200K | ~−27% |
| Input tokens/run | 1,535K | ~1,050K | ~−32% |
| Cost/run | $3.91 | ~$2.55 | ~−35% |
| LLM turns | 32 | ≤25 (capped) | −22% worst case |
| GitHub tools loaded | 22 | 0 | −100% |
Cost projection combines Rec #1 (~14% savings from tool removal) + Rec #3 (~8% from pre-steps) + partial cache improvement. Rec #2 (max-turns) is a safety cap that may save more on "long" runs.
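As a sanity check, the Savings column can be recomputed from the Current and Projected columns. This sketch uses only the values in the table; the dictionary layout and function name are illustrative.

```python
def pct_change(current, projected):
    """Percentage change from current to projected (negative means a reduction)."""
    return (projected - current) / current * 100


# (current, projected) pairs copied from the Expected Impact table.
rows = {
    "Total tokens/run (K)": (3003, 2200),
    "Input tokens/run (K)": (1535, 1050),
    "Cost/run ($)": (3.91, 2.55),
    "LLM turns": (32, 25),
}

for metric, (current, projected) in rows.items():
    print(f"{metric}: {pct_change(current, projected):.0f}%")
```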
Implementation Checklist
- Edit `.github/workflows/doc-maintainer.md`: remove `github: toolsets: [default]` from `tools:` section
- Edit `.github/workflows/doc-maintainer.md`: add `max-turns: 25` to frontmatter
- Edit `.github/workflows/doc-maintainer.md`: add `steps:` block for git changes and doc file discovery
- Edit `.github/workflows/doc-maintainer.md`: update prompt body to reference `\{\{ steps.*.outputs.* }}` template variables
- Recompile: `gh aw compile .github/workflows/doc-maintainer.md`
- Post-process: `npx tsx scripts/ci/postprocess-smoke-workflows.ts`
- Verify lock file updated correctly (`doc-maintainer.lock.yml` should no longer reference GitHub MCP server config)
- Trigger a `workflow_dispatch` run and compare token-usage.jsonl vs baseline
- Confirm PR creation still works (safeoutputs `create_pull_request` is unaffected by tool changes)
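For the baseline comparison step, a minimal Python sketch could total the tokens in two token-usage.jsonl files. The schema of that file is not shown in this report, so the field names `input_tokens` and `output_tokens` below are assumptions and need adjusting to whatever the real records contain.

```python
import json


def total_tokens(jsonl_text, fields=("input_tokens", "output_tokens")):
    """Sum the given numeric fields over all JSON Lines records (field names are assumed)."""
    total = 0
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        total += sum(int(record.get(field, 0)) for field in fields)
    return total


def compare_runs(baseline_text, candidate_text):
    """Return (baseline total, candidate total, percentage change)."""
    baseline = total_tokens(baseline_text)
    candidate = total_tokens(candidate_text)
    change = (candidate - baseline) / baseline * 100 if baseline else 0.0
    return baseline, candidate, change


# Tiny made-up records, only to show the call shape.
baseline_sample = '{"input_tokens": 100, "output_tokens": 20}\n{"input_tokens": 80, "output_tokens": 0}\n'
candidate_sample = '{"input_tokens": 120, "output_tokens": 30}\n'
print(compare_runs(baseline_sample, candidate_sample))
```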
Generated by Daily Copilot Token Optimization Advisor