[prompt-clustering] Copilot Agent Prompt Clustering Analysis - December 14, 2025 #6449
🔬 Copilot Agent Prompt Clustering Analysis
Analysis Date: 2025-12-14
Summary
This analysis applies NLP-based clustering to 879 task prompts from Copilot agent PRs over the last 30 days, identifying 5 distinct clusters of task types. The overall success rate is 77.1%, with an average task duration of 2.0 hours.
Key Findings: The most common task type is "Miscellaneous Tasks" (350 tasks), while "New Features & Implementation" tasks have the highest success rate at 80.0%. Tasks change 14.2 files on average and take about 2 hours to complete.
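The report does not state which vectorizer or clustering algorithm the workflow uses, so the following is a minimal pure-Python sketch of how TF-IDF scores (the basis of the "Top TF-IDF Terms" lists in the cluster analysis below) can be computed. The toy corpus, function name, and scoring details are illustrative assumptions, not the workflow's actual implementation.

```python
import math
from collections import Counter

# Toy corpus standing in for task prompts; the real analysis
# covered 879 prompts, which are not reproduced here.
prompts = [
    "fix failing tests in the docs",
    "fix lint errors in javascript files",
    "create a daily agentic workflow",
    "update the agentic workflow to run daily",
]

def tfidf_top_terms(docs, k=3):
    """Return the k highest-scoring TF-IDF terms for each document."""
    tokenized = [doc.split() for doc in docs]
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # TF-IDF = term frequency * log inverse document frequency;
        # terms appearing in every document score zero.
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        results.append(sorted(scores, key=scores.get, reverse=True)[:k])
    return results

for prompt, terms in zip(prompts, tfidf_top_terms(prompts)):
    print(f"{prompt!r} -> {terms}")
```

A production pipeline would typically also lowercase, strip stop words, and include bigrams; the bigram terms reported below ("agentic workflow", "details original issue") suggest the workflow vectorizes with an n-gram range of at least (1, 2).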
Full Analysis Report
General Insights
Most Common Task Type: Miscellaneous Tasks (350 tasks, 39.8% of total)
Highest Success Rate: New Features & Implementation (80.0%)
Most Time-Consuming: Bug Fixes & Error Resolution (avg 3.2 hours)
Cluster Analysis
Cluster 1: Miscellaneous Tasks
Top TF-IDF Terms: details original, details original issue, original issue resolve, issue resolve, section details
Characteristics:
This cluster performs above average with a 78.9% success rate.
Tasks in this cluster complete faster than average.
These tasks typically involve fewer file changes.
Example PRs:
Sample Prompts
Cluster 2: New Features & Implementation
Top TF-IDF Terms: update, github, file, run, code
Common Keywords: update, add, test, create, ui
Characteristics:
This cluster performs below average with a 74.0% success rate.
Tasks in this cluster complete faster than average.
These tasks typically involve more extensive file changes.
Example PRs:
Sample Prompts
Cluster 3: New Features & Implementation
Top TF-IDF Terms: workflow, agentic, agentic workflow, daily, create
Common Keywords: create, update, add, ui, test
Characteristics:
This cluster performs above average with an 80.0% success rate.
Tasks in this cluster complete faster than average.
These tasks typically involve fewer file changes.
Example PRs:
Sample Prompts
Cluster 4: Bug Fixes & Error Resolution
Top TF-IDF Terms: add, command, firewall, compile, field
Common Keywords: add, update, test, ui, error
Characteristics:
This cluster performs below average with a 76.5% success rate.
Tasks in this cluster take longer than average.
These tasks typically involve fewer file changes.
Example PRs:
Sample Prompts
Cluster 5: Bug Fixes & Error Resolution
Top TF-IDF Terms: fix, tests, docs, javascript, files
Common Keywords: fix, test, lint, ui, error
Characteristics:
This cluster performs above average with an 80.0% success rate.
Tasks in this cluster complete faster than average.
These tasks typically involve fewer file changes.
Example PRs:
Sample Prompts
Success Rate by Cluster
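The per-cluster rates above can be recomputed from the raw PR records. The report does not show the artifact schema, so the record fields, sample data, and function name below are illustrative assumptions:

```python
from collections import defaultdict

# Hypothetical records; the real analysis covered 879 PRs whose raw
# data is only available as workflow artifacts.
records = [
    {"cluster": "Miscellaneous Tasks", "success": True},
    {"cluster": "Miscellaneous Tasks", "success": False},
    {"cluster": "Bug Fixes & Error Resolution", "success": True},
    {"cluster": "Bug Fixes & Error Resolution", "success": True},
]

def success_rate_by_cluster(rows):
    """Percentage of successful tasks per cluster, rounded to 0.1%."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for row in rows:
        totals[row["cluster"]] += 1
        wins[row["cluster"]] += row["success"]  # True counts as 1
    return {c: round(100 * wins[c] / totals[c], 1) for c in totals}

print(success_rate_by_cluster(records))
# -> {'Miscellaneous Tasks': 50.0, 'Bug Fixes & Error Resolution': 100.0}
```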
Key Findings
1. Task Distribution: The largest cluster is "Miscellaneous Tasks" with 350 tasks (39.8% of all tasks). This suggests that Copilot agents are most frequently assigned tasks that do not fit a single well-defined category.
2. Success Patterns: 3 out of 5 clusters have success rates above 78%. Tasks in the "New Features & Implementation" cluster (Cluster 3) show the highest success rate at 80.0%, tied with Cluster 5, indicating these types of tasks are well-suited for agent automation.
3. Complexity Analysis: The "Bug Fixes & Error Resolution" cluster takes the longest to complete (avg 3.2 hours), suggesting higher complexity. However, its success rate of 76.5% shows that even complex tasks can be successfully automated.
Recommendations
1. Optimize for High-Success Task Types: Tasks in the "New Features & Implementation" category (Cluster 3) show an 80.0% success rate. Consider templating or documenting best practices for these types of prompts to maintain high quality.
2. Improve Lower-Performing Feature Tasks: With a 74.0% success rate, tasks in Cluster 2 (also "New Features & Implementation") may benefit from more detailed prompts, better context, or task decomposition.
3. Optimize Time-Intensive Tasks: "Bug Fixes & Error Resolution" tasks take 3.2 hours on average. Consider breaking these into smaller subtasks or providing more specific guidance to reduce iteration time.
4. Prompt Engineering Insights: Analysis of TF-IDF terms reveals that successful prompts often include specific technical terms and clear action verbs. Encourage prompt writers to be explicit about the desired outcome and provide relevant context.
Visualizations
The following visualizations are available as workflow artifacts:
Analysis generated by NLP Prompt Clustering workflow on 2025-12-14 19:20 UTC