[copilot-cli-research] Copilot CLI Deep Research - April 2026 #24174

2026-04-02T21:14:38Z

github-actions[bot]
bot Apr 2, 2026

🔍 Copilot CLI usage analysis across 88 Copilot workflows (out of 179 total).

Analysis Date: 2026-04-02 · Triggered by: @pelikhan · Run: §23921862570

📊 Executive Summary

This is the first comprehensive analysis of GitHub Copilot CLI feature utilization in this repository. Out of 179 total agentic workflow files, 88 (49%) use the Copilot engine. The repository shows strong adoption of GitHub's MCP tooling, safe-outputs, and the shared imports system — but several powerful Copilot-specific features remain largely untapped.

Top 3 findings:

🔴 ~30 workflows use bash: ["*"] (allow-all-tools) without AWF firewall sandbox — a significant security posture gap.
🟡 Autopilot mode (max-continuations) is used in only 1 workflow despite being Copilot-exclusive and well-suited for complex multi-step tasks.
🟡 Custom agent files are used in only 3 workflows despite 10+ workflows having prompts over 500 lines that would benefit from dedicated agent identities.

🔴 Critical Findings

1. Bash Wildcard Without Sandbox (30 Workflows)

Approximately 30 workflows use bash: ["*"] (which triggers --allow-all-tools in the Copilot CLI) without sandbox: agent: awf. This means the AI has unrestricted shell access on the unprotected runner.

Example vulnerable pattern:

engine: copilot
tools:
  bash:
    - "*"
  # No sandbox: section

Affected workflows include: daily-regulatory.md, daily-firewall-report.md, org-health-report.md, cli-consistency-checker.md, cli-version-checker.md, terminal-stylist.md, stale-repo-identifier.md, daily-integrity-analysis.md, deep-report.md, functional-pragmatist.md, daily-architecture-diagram.md, and ~19 more.

Recommended fix:

engine: copilot
sandbox:
  agent: awf  # Firewall + container isolation
tools:
  bash:
    - "*"
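
The vulnerable and fixed patterns above can be told apart mechanically. The following is a minimal Python sketch, not the compiler's actual logic: it assumes frontmatter is available as plain YAML text laid out as in the two examples, and the function names are hypothetical. It uses regular expressions rather than a YAML parser, so it will miss inline forms such as a flow-style sandbox mapping.

```python
import re

# Assumption: keys are laid out as in the report's examples (block-style YAML).
WILDCARD_INLINE = re.compile(r'^\s*bash:\s*\[\s*["\']\*["\']\s*\]', re.MULTILINE)
WILDCARD_BLOCK = re.compile(r'^\s*bash:\s*\n\s*-\s*["\']\*["\']', re.MULTILINE)
# Top-level "sandbox:" followed by an indented "agent: awf" entry.
SANDBOX_AWF = re.compile(
    r'^sandbox:\s*\n(?:[ \t]+.*\n)*?[ \t]+agent:\s*awf\b', re.MULTILINE
)


def uses_bash_wildcard(frontmatter: str) -> bool:
    """True if bash is configured with the "*" wildcard (inline or block list)."""
    return bool(
        WILDCARD_INLINE.search(frontmatter) or WILDCARD_BLOCK.search(frontmatter)
    )


def has_awf_sandbox(frontmatter: str) -> bool:
    """True if a top-level sandbox section sets agent: awf."""
    return bool(SANDBOX_AWF.search(frontmatter))


def is_unsandboxed_wildcard(frontmatter: str) -> bool:
    """True for the risky combination: bash wildcard and no AWF sandbox."""
    return uses_bash_wildcard(frontmatter) and not has_awf_sandbox(frontmatter)
```

Running `is_unsandboxed_wildcard` over each workflow's frontmatter would give a first-pass list to review by hand; a real audit should parse the YAML and resolve shared imports.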

2. network.blocked Domain Feature Essentially Unused

Only 1 workflow uses network.blocked: to deny specific domains, despite this feature being available for all engines. This feature can harden workflows against exfiltration to known bad domains or scope-creep to unexpected services.

Recommended usage:

network:
  allowed:
    - defaults
  blocked:
    - "~*"  # Block all non-allowlisted domains

🟡 Medium Priority Opportunities

3. Autopilot Mode (max-continuations) Severely Underutilized

Only smoke-copilot.md uses max-continuations: 2. This Copilot-exclusive feature enables the --autopilot --max-autopilot-continues N flags, allowing complex tasks to span multiple consecutive runs automatically.

Best candidates for autopilot:

repository-quality-improver.md (565 lines of instructions)
daily-safe-output-integrator.md (739 lines, complex multi-phase task)
release.md (657 lines, multi-step release process)

Example:

engine:
  id: copilot
  max-continuations: 3  # Agent can run up to 3 consecutive autopilot sessions

4. Custom Agent Files Underutilized

Only 3 of the 9 custom agent files in .github/agents/ are referenced by workflows:

technical-doc-writer.agent.md — used by 2 workflows
ci-cleaner.agent.md — used by 1 workflow
agentic-workflows.agent.md — available but rarely referenced

Yet 10+ workflows exceed 500 lines of prompt text. Custom agent files provide a stable identity and shared system prompts, and they can be reused across workflows without duplication.

Top candidates for custom agent files:

Workflow	Lines	Suggested Agent
`functional-pragmatist.md`	1,475	`fp-analyst.agent.md`
`bot-detection.md`	925	`bot-detection.agent.md`
`daily-security-red-team.md`	835	`security-red-team.agent.md`
`repo-audit-analyzer.md`	779	`repo-auditor.agent.md`
`daily-safe-output-integrator.md`	739	(already complex enough)
`agent-performance-analyzer.md`	582	`performance-analyst.agent.md`

Example:

engine:
  id: copilot
  agent: security-red-team
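
The candidate list above comes from a line-count survey. A minimal Python sketch of that kind of survey follows; the function name and the 500-line default are illustrative assumptions, and the count covers the whole file (frontmatter included), much as `wc -l` would.

```python
from pathlib import Path


def long_prompt_workflows(
    workflow_dir: str, min_lines: int = 500
) -> list[tuple[str, int]]:
    """Return (filename, line_count) pairs above min_lines, longest first."""
    results = []
    for path in Path(workflow_dir).glob("*.md"):
        line_count = len(path.read_text(encoding="utf-8").splitlines())
        if line_count > min_lines:
            results.append((path.name, line_count))
    # Longest prompts first, so the strongest candidates lead the list.
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Calling `long_prompt_workflows(".github/workflows")` from the repository root would list every workflow file over the threshold.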

5. Model Selection Not Used Strategically

Only 6 workflows override the default model. Many simple, high-frequency workflows (daily reports, summaries, triage) run on the default flagship model when a lighter, faster model such as gpt-5.1-codex-mini would suffice, which would reduce cost and latency.

Simple workflows that could use lighter models:

poem-bot.md ✅ (already uses gpt-5)
daily-fact.md ✅ (already uses gpt-5.1-codex-mini)
Candidates: ai-moderator.md, blog-auditor.md, sub-issue-closer.md, step-name-alignment.md

Example:

engine:
  id: copilot
  model: gpt-5.1-codex-mini  # Faster, cheaper for simple tasks

6. web-search Severely Underutilized (3 Workflows)

Only 3 workflows use web-search: vs 17 that use web-fetch:. The Copilot engine supports web search through MCP. Research-oriented workflows (deep-report.md, research.md, prompt-clustering-analysis.md) that rely on pre-fetched data could benefit from real-time web search.

Example:

tools:
  web-search:   # Enable web search via MCP
  web-fetch:    # Also keep direct URL fetching

🟢 Low Priority / Nice-to-Haves

7. tracker-id Coverage (65/179 = 36%)

tracker-id: enables run lineage tracking via the episode graph. Only 36% of workflows have it. Adding it to all workflows would improve the observability kit's analysis quality and enable better trend detection.

8. mcp-scripts in Main Workflows

mcp-scripts: are used in 5 shared files (shared/go-make.md, shared/gh.md, etc.) but rarely defined in main workflows. This powerful feature allows typed, schema-validated tool definitions that give agents structured access to CLI tools instead of raw bash.

Example use case: Instead of bash: ["go", "make"], define:

mcp-scripts:
  run-tests:
    description: "Run the test suite with optional filter"
    inputs:
      pattern:
        type: string
        description: "Test pattern to run (e.g. TestCompile)"
    run: |
      go test -v -run "$INPUT_PATTERN" ./...

9. GitHub MCP Toolsets Not Optimized

Many workflows use broad toolsets like toolsets: [default] (47 workflows) or even toolsets: [all] (3 workflows) when they only need 1-2 specific APIs. Over-broad toolsets mean more potential for unintended tool use.

Toolset usage breakdown:

Toolset Config	Count
`[default]`	47
`[default, discussions]`	10
`[default, actions]`	5
`[repos, pull_requests]`	3
`[all]`	3 ← overly broad

📈 Feature Usage Matrix

Feature	Available	Workflows Using	Usage Rate	Notes
`sandbox: agent: awf`	✅	12	14% of copilot	Security gap
`engine.agent` (custom file)	✅	4	5%	Very underused
`engine.model`	✅	6	7%	Underused for cost savings
`engine.max-continuations`	✅	1	1%	Barely used
`engine.args`	✅	9	10%	Reasonable
`engine.api-target`	✅	0	0%	Enterprise-only feature
`engine.env`	✅	20	23%	Good adoption
`network.blocked`	✅	1	1%	Essentially unused
`safe-outputs`	✅	153	85%	🌟 Great adoption
`cache-memory`	✅	65	36%	Good adoption
`mcp-scripts`	✅	5	6%	Via shared files
`tools.web-fetch`	✅	17	19%	Moderate
`tools.web-search`	✅	3	3%	Very low
`tools.playwright`	✅	12	14%	Good for browser tasks
`bash: ["*"]` (allow-all)	✅	34	39%	30 without sandbox ⚠️
`features.copilot-requests`	✅	41	47%	Good adoption
`imports`	✅	175	98%	🌟 Excellent adoption
`tracker-id`	✅	57	32%	Could be higher
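
The usage rates in the matrix appear to use two different denominators: Copilot-specific rows match a base of 88 Copilot workflows, while engine-agnostic rows such as `safe-outputs` and `imports` match a base of all 179 workflows. This is an inference from the numbers, not something the report states. A short Python sketch of the arithmetic:

```python
TOTAL_WORKFLOWS = 179    # all agentic workflow files (from the report)
COPILOT_WORKFLOWS = 88   # workflows using the Copilot engine (from the report)


def usage_rate(count: int, denominator: int) -> int:
    """Return adoption as a whole-number percentage."""
    return round(100 * count / denominator)


# Copilot-specific features, measured against the 88 Copilot workflows.
sandbox_rate = usage_rate(12, COPILOT_WORKFLOWS)     # sandbox: agent: awf
wildcard_rate = usage_rate(34, COPILOT_WORKFLOWS)    # bash: ["*"]

# Engine-agnostic features, measured against all 179 workflows.
safe_outputs_rate = usage_rate(153, TOTAL_WORKFLOWS)
imports_rate = usage_rate(175, TOTAL_WORKFLOWS)

print(sandbox_rate, wildcard_rate, safe_outputs_rate, imports_rate)
# prints: 14 39 85 98
```

Keeping the denominator explicit per row makes it easier to compare rates across Copilot-only and repository-wide features.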

Copilot CLI Capabilities Inventory (Full)

Available CLI Flags (compiler-generated)

--add-dir — Grant file access to directories (auto-configured)
--agent — Use a custom agent file (via engine.agent:)
--allow-all-paths — Allow write to all paths (when tools.edit: enabled)
--allow-all-tools — Bypass tool allowlist (when bash: ["*"])
--allow-tool — Granular tool permission (auto-configured per tool)
--autopilot — Enable autopilot mode (via engine.max-continuations:)
--disable-builtin-mcps — Disable built-in MCP servers (always set)
--log-dir — Log output directory (auto-configured)
--log-level — Log verbosity (always all)
--max-autopilot-continues — Max autopilot sessions (via max-continuations)
--prompt — Workflow prompt (auto-configured)

Engine Config Options

id — Engine identifier (copilot)
version — CLI version pinning (default: latest)
model — Override LLM model (e.g., gpt-5.1-codex-mini)
max-continuations — Autopilot mode continuation count
args — Custom extra CLI arguments
agent — Custom agent file reference (.github/agents/)
api-target — Enterprise/GHES API endpoint override
env — Custom environment variables
command — Custom binary path (skip installation)
concurrency — Job-level concurrency

Feature Flags (via `features:`)

copilot-requests — Enable S2S token mode
mcp-gateway — Enable MCP gateway proxy
disable-xpia-prompt — Disable XPIA injection protection prompt

Available Custom Agent Files (`.github/agents/`)

technical-doc-writer.agent.md
ci-cleaner.agent.md
agentic-workflows.agent.md
contribution-checker.agent.md
create-safe-output-type.agent.md
custom-engine-implementation.agent.md
grumpy-reviewer.agent.md
interactive-agent-designer.agent.md
w3c-specification-writer.agent.md

Workflow-Specific Recommendations

`functional-pragmatist.md` (1,475 lines)

Issue: Monolithic 1,475-line prompt without custom agent identity
Recommendation: Extract core persona/instructions to fp-analyst.agent.md, keep workflow-specific context in the .md file
Benefit: Reusability, cleaner diffs when updating instructions

`bot-detection.md` (925 lines)

Issue: Complex security analysis without sandbox
Recommendation: Add sandbox: agent: awf + create bot-detection.agent.md
Benefit: Security isolation + reusable agent identity

`daily-news.md`

Current: Uses AWF sandbox + bash wildcard + web-fetch + edit (good!)
Recommendation: Already well-configured; use it as a model for other complex workflows
Note: This workflow is a good reference implementation

`smoke-copilot.md`

Current: Uses max-continuations, cache-memory, playwright, web-fetch, network (most feature-rich workflow)
Status: Already a reference implementation for Copilot features

`agentic-observability-kit.md`

Current: Well-configured (strict mode, tracker-id, safe-outputs, specific toolsets)
Recommendation: Could add tracker-id to more workflows it monitors to improve data quality

`daily-cli-tools-tester.md`

Issue: Uses bash: ["*"] + agentic-workflows tool without sandbox
Recommendation: Add sandbox: agent: awf

Research workflows (`deep-report.md`, `research.md`)

Recommendation: Add tools: web-search: for real-time search capability instead of relying only on pre-fetched data

🎯 Action Items

Immediate (Security):

Audit the ~30 workflows with bash: ["*"] and no sandbox — add sandbox: agent: awf where appropriate
Consider enabling network.blocked: ["~*"] as a deny-by-default posture for high-sensitivity workflows

Short-term (Developer Experience):

Create custom agent files for the top 5 longest-prompt workflows
Evaluate max-continuations: 2-3 for complex multi-step workflows (repository-quality-improver.md, release.md)
Add tracker-id: to the ~120 workflows that lack it

Medium-term (Optimization):

Audit workflows using toolsets: [default] and narrow to specific toolsets where possible
Evaluate gpt-5.1-codex-mini for high-frequency simple workflows (candidates: ai-moderator.md, blog-auditor.md, sub-issue-closer.md, step-name-alignment.md; daily-fact.md already uses it)
Add web-search: to research-oriented workflows that currently only use web-fetch:

🔬 Research Methodology

Code analysis: Reviewed pkg/workflow/copilot_engine*.go, pkg/workflow/copilot_mcp.go, pkg/constants/ for available features
Workflow survey: Analyzed all 179 .md workflow files in .github/workflows/
Pattern matching: Used grep -r to count feature adoption across all workflows
Security analysis: Cross-referenced bash: ["*"] usage against sandbox: presence
Size analysis: wc -l on all workflow files to identify candidates for custom agents
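
The pattern-matching step can be approximated in a few lines of Python. This is a hedged sketch rather than the script the agent actually ran: the pattern table and function name are hypothetical, and it counts a feature once per file whenever the key appears at the expected indentation. Like a plain `grep -r`, it does not follow shared imports and could match example YAML inside a prompt body.

```python
import re
from pathlib import Path

# Hypothetical pattern table: one regex per frontmatter key of interest.
FEATURE_PATTERNS = {
    "tracker-id": re.compile(r"^tracker-id:", re.MULTILINE),
    "web-search": re.compile(r"^[ \t]+web-search:", re.MULTILINE),
    "web-fetch": re.compile(r"^[ \t]+web-fetch:", re.MULTILINE),
    "max-continuations": re.compile(r"^[ \t]+max-continuations:", re.MULTILINE),
}


def count_feature_adoption(workflow_dir: str) -> dict[str, int]:
    """Count how many .md workflow files mention each feature key."""
    counts = {name: 0 for name in FEATURE_PATTERNS}
    for path in Path(workflow_dir).glob("*.md"):
        text = path.read_text(encoding="utf-8")
        for name, pattern in FEATURE_PATTERNS.items():
            if pattern.search(text):
                counts[name] += 1  # at most one hit per file
    return counts
```

Dividing each count by the number of files scanned gives an adoption rate comparable to the figures in the usage matrix.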

AI generated by Copilot CLI Deep Research Agent

expires on Apr 3, 2026, 9:14 PM UTC
