[prompt-clustering] Copilot Agent Prompt Clustering Analysis - December 2025 #6291
Note: This discussion was marked as outdated by a newer Copilot Agent Prompt Clustering Analysis.
Daily NLP-based clustering analysis of Copilot agent task prompts to identify patterns, trends, and optimization opportunities.
Summary
Analyzed 1,892 Copilot agent tasks from the last 30 days using NLP clustering (TF-IDF vectorization + K-means). The analysis identified six distinct clusters representing different task types with varying success rates and complexity characteristics.
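The TF-IDF + K-means pipeline described above can be sketched as follows. This is a minimal illustration, not the report's actual analysis code: the prompt texts are hypothetical placeholders, and the real run used k=6 over 1,892 prompts.

```python
# Sketch of the clustering approach: vectorize prompts with TF-IDF,
# then group them with K-means. Prompts below are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

prompts = [
    "add a new step to the javascript agent workflow",
    "update the github action run configuration",
    "fix duplicate code flagged by the analysis tool",
    "create a shared agentic workflow for issue triage",
    "document the cli version command",
    "add tests for the mcp server json output",
]

# Uni- and bigrams mirror the multi-word keywords reported per
# cluster (e.g. "agentic workflow", "mcp server").
vectorizer = TfidfVectorizer(ngram_range=(1, 2), stop_words="english")
X = vectorizer.fit_transform(prompts)

# The report clusters into k=6; k=3 here to fit the toy data.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)
print(labels)
```

Each prompt receives a cluster label; cluster themes are then inferred by inspecting the highest-weighted terms in each cluster.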
Overall Performance:
Key Findings
🏆 Most Common Task Type: New Features (868 tasks, 45.9%)
⚠️ Lowest Success Rate: CI/CD & Workflows (65.8%)
⭐ Highest Success Rate: New Features - Agentic Workflows (79.8%)
📊 Average Code Changes: +1,289/-535 lines per task
Full Cluster Analysis
Cluster 1: New Features - General
Size: 868 tasks (45.9% of total)
Performance:
Characteristics:
Top Keywords: add, update, agent, javascript, run, github, step
Example Tasks:
Cluster 2: Documentation
Size: 375 tasks (19.8% of total)
Performance:
Characteristics:
Top Keywords: aw, gh, gh aw, githubnext, githubnext gh, githubnext gh aw, comments
Example Tasks:
Cluster 3: New Features - Agentic Workflows
Size: 247 tasks (13.1% of total)
Performance:
Characteristics:
Top Keywords: agentic, agentic workflow, workflow, workflows, update, shared, create
Example Tasks:
Cluster 4: Documentation & CLI
Size: 203 tasks (10.7% of total)
Performance:
Characteristics:
Top Keywords: cli, version, comments, issue_title, issue, section, issue_description
Example Tasks:
Cluster 5: CI/CD & Workflows
Size: 114 tasks (6.0% of total)
Performance:
Characteristics:
Top Keywords: mcp, server, mcp server, safe, tool, github, json
Example Tasks:
Cluster 6: Testing & Code Quality
Size: 85 tasks (4.5% of total)
Performance:
Characteristics:
Top Keywords: code, duplicate, duplicate code, analysis, tests, fix, commit
Example Tasks:
Success Rate by Cluster
Key Insights
1. Agentic Workflow Tasks Perform Best
Tasks focused on creating or modifying agentic workflows (Cluster 3) have the highest success rate at 79.8%. These tasks typically:
2. CI/CD & MCP Integration Tasks Are Most Complex
Tasks involving CI/CD and MCP server integration (Cluster 5) show:
3. Complexity Correlates with Lower Success
Analysis shows an inverse correlation between task complexity and success rate:
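One standard way to quantify such an inverse relationship is a Pearson correlation between a complexity proxy (e.g. lines changed per task) and cluster success rate. The figures below are illustrative stand-ins, not the report's actual per-cluster data; only the 79.8% and 65.8% endpoints come from the text above.

```python
import numpy as np

# Illustrative per-cluster figures: average lines changed per task
# vs. success rate. Only the best (0.798) and worst (0.658) success
# rates are taken from the report; the rest are placeholders.
lines_changed = np.array([800, 1200, 600, 900, 2400, 1500])
success_rate = np.array([0.76, 0.72, 0.798, 0.74, 0.658, 0.70])

# Pearson correlation coefficient; a negative value indicates that
# higher complexity tends to accompany lower success.
r = np.corrcoef(lines_changed, success_rate)[0, 1]
print(round(r, 3))
```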
4. Documentation Tasks Have Moderate Success
Documentation-focused tasks (Clusters 2 & 4) show 73.9-77.1% success rates:
Recommendations
Based on this clustering analysis, we recommend:
1. Focus on High-Success Patterns
The 'New Features - Agentic Workflows' cluster shows 79.8% success rate. When creating new tasks:
2. Break Down Complex CI/CD Tasks
CI/CD & MCP integration tasks (65.8% success) should be split into smaller, focused subtasks:
3. Manage Task Complexity
Clusters with high file changes show more iterations and lower success:
4. Leverage Successful Prompt Patterns
The most successful clusters share common characteristics:
5. Consider Task Type When Setting Expectations
Different task types have different baseline success rates:
Methodology
Data Collection:
Analysis Technique:
Cluster Themes: Inferred from top keywords and manual review of sample tasks
Analysis Tools: Python (scikit-learn, pandas), TF-IDF vectorization, K-means clustering
Date: 2025-12-12
Data Period: Last 30 days (1,892 tasks analyzed)