[prompt-analysis] Copilot PR Prompt Analysis - December 3, 2025 #5394

2025-12-03T09:15:20Z

github-actions[bot]
bot Dec 3, 2025

Summary

Analysis of Copilot-generated PRs over the last 30 days reveals a 79.8% merge success rate across 997 completed PRs. The data shows that prompts with code references (98% in merged vs 87% in closed), file-specific mentions (86% vs 78%), and test-related keywords achieve the highest success rates.

Analysis Period: Last 30 days (Nov 3 - Dec 3, 2025)
Total PRs: 1,000 | Merged: 796 (79.8%) | Closed: 201 (20.2%) | Open: 3 (0.3%)

Full Analysis Report

Prompt Categories and Success Rates

Category	Total	Merged	Closed	Success Rate
Test	815	659	156	80.9%
Refactor	414	333	81	80.4%
Bug Fix	977	783	194	80.1%
Feature	955	763	192	79.9%
Documentation	908	724	184	79.7%

All categories show similar success rates (79.7-80.9%), suggesting category choice is less important than prompt quality.

Prompt Characteristics Analysis

✅ Successful Prompt Patterns (Merged PRs)

Common characteristics:

Average length: 460 words (detailed but focused)
Code references: 98.2% include code blocks or inline code
File references: 85.7% mention specific files or extensions
Error context: 50.3% include error messages or bug descriptions
Top action verbs: fix (775), change (741), add (679), update (604), resolve (387)

Key differentiators for merged PRs:

Code references +11%: 98% vs 87% in closed PRs
File specificity +8%: 86% vs 78% in closed PRs
Error context +6%: 50% vs 44% in closed PRs

❌ Unsuccessful Prompt Patterns (Closed PRs)

Common characteristics:

Average length: 449 words (similar to merged)
Code references: 87.1% (11% lower than merged)
File references: 78.1% (8% lower than merged)
Error context: 44.3% (6% lower than merged)
Top action verbs: fix (194), add (167), change (157), update (137), create (104)

Warning signs observed:

WIP indicators: PRs with "[WIP]" in title often closed
Generic placeholders: "Thanks for asking me to work on this..."
Checklist-heavy: Task lists without detailed context
Less specific: Fewer file/code references

Example Comparisons

✅ Successful Prompt Example

PR #5373 - Document safe-outputs requirements in schema with $comment

Users writing workflows with operations that require safe-outputs get no upfront guidance—schema validation passes, but runtime compilation fails with a missing safe-outputs error. Adds $comment...

Why it succeeded:

✅ Specific problem statement with technical context
✅ Code reference (safe-outputs)
✅ Clear error description
✅ Implementation approach mentioned

✅ Successful Prompt Example

PR #5372 - Convert stale repository identifier to agentic workflow

Replaces the basic GitHub Action that immediately creates issues for stale repositories with an AI agent that performs deep research before reporting...

Why it succeeded:

✅ Clear before/after description
✅ Specific use case
✅ Detailed implementation plan
✅ File structure mentioned

❌ Unsuccessful Prompt Example

PR #5377 - Update workflow to review only public repos [WIP]

Thanks for asking me to work on this. I will get started on it and keep this PR's description up to date as I form a plan and make progress...

Why it failed:

❌ WIP tag indicates incomplete work
❌ Generic placeholder text
❌ No specific problem or solution
❌ No code or file references

❌ Unsuccessful Prompt Example

PR #5355 - Address network firewall warnings [WIP]

Analyze the issue and identify affected workflows

Build project and run gh aw compile --verbose to identify warnings

Confirmed 6 workflows (not 7) have firewall warnings...

Why it failed:

❌ WIP tag
❌ Checklist format without context
❌ Progress notes rather than problem/solution
❌ Less structured description

Key Insights

Based on 997 completed PRs, the data reveals three critical success factors:

1. Code Specificity Matters Most (+11% success difference)

Merged PRs are 11% more likely to include code references (98% vs 87%). Using backticks for code, function names, and technical terms signals concrete implementation focus.

Pattern: Successful prompts show the code, not just describe it.

2. File-Level Detail Improves Outcomes (+8% success difference)

Merged PRs mention specific files 8% more often (86% vs 78%). Referencing .go, .js, .yaml, or exact file paths demonstrates understanding of the codebase structure.

Pattern: "Fix the authentication handler" → "Fix pkg/auth/handler.go authentication logic"

3. Error Context Provides Direction (+6% success difference)

While only 50% of merged PRs include error messages, they still outperform closed PRs (44%). Including error text, stack traces, or bug descriptions helps Copilot understand the problem domain.

Pattern: Including the actual error message guides implementation better than generic "fix bug" requests.

Recommendations

Based on this analysis, follow these best practices for Copilot PR prompts:

✅ DO: Write Specific, Code-Focused Prompts

Include code references: Use backticks for functions, variables, error messages
- Example: "Fix the validateInput() function in pkg/parser/validator.go"
Reference specific files: Mention exact file paths when known
- Example: "Update the schema in pkg/parser/schemas/workflow.json"
Provide error context: Include actual error messages or bug descriptions
- Example: "Resolves 'undefined index' error when processing empty arrays"
Detail the implementation: Describe what should change, not just what's broken
- Example: "Add null check before array access at line 42"

❌ AVOID: Generic or Incomplete Prompts

Avoid WIP indicators: Don't use "[WIP]" tags—complete the implementation first
Skip placeholders: Replace "I will work on this..." with actual problem descriptions
Minimize checklists: Task lists without context provide insufficient guidance
Don't be vague: "Improve performance" → "Optimize the database query in getUserProfile()"

📝 Prompt Template for Success

[Specific problem statement with technical context]

**Current behavior**: [What's wrong, including error messages]
**Expected behavior**: [What should happen]
**Implementation**: [Specific changes needed, referencing files/functions]

[Optional: Code snippets, file references, or error traces]

Statistical Summary

Prompt Quality Indicators (Merged vs Closed):

Code references: 98.2% vs 87.1% (+11.1% difference)
File references: 85.7% vs 78.1% (+7.6% difference)
Error context: 50.3% vs 44.3% (+6.0% difference)
Average length: 460 vs 449 words (+2.4% difference)

Key Takeaway: Specificity (code/file references) matters more than length or category.

Category Performance Insights

While all categories show similar success rates (79.7-80.9%), slight variations exist:

Test-related prompts lead: 80.9% success (likely due to clear acceptance criteria)
Refactoring second: 80.4% success (well-defined scope)
Bug fixes: 80.1% success (clear problem statements)
Features slightly lower: 79.9% success (broader scope, more complexity)
Documentation: 79.7% success (subjective quality criteria)

Insight: The type of work matters less than how you describe it. A well-written feature prompt outperforms a vague bug fix prompt.

Methodology Notes

Data Collection:

Source: GitHub search API for Copilot-generated PRs in githubnext/gh-aw
Period: Last 30 days (997 completed PRs + 3 open)
Analysis: Python-based text analysis of PR bodies and metadata

Categorization:

PRs may belong to multiple categories (e.g., bug fix + test)
Success rate = merged / (merged + closed), excluding open PRs
Keyword extraction uses lowercase matching and common programming terms

Limitations:

Analysis based on PR body text at creation time
Does not account for post-creation edits or discussions
External factors (reviewer availability, project priorities) not captured

Analysis Date: December 3, 2025
Generated by: Copilot PR Prompt Analysis Workflow
Run ID: §19888397028

AI generated by Copilot PR Prompt Pattern Analysis

2025-12-04T09:17:48Z

github-actions[bot]
bot Dec 4, 2025
Author

⚓ Avast! This discussion be marked as outdated by Copilot PR Prompt Pattern Analysis.
🗺️ A newer treasure map awaits ye at Discussion #5509.
Fair winds, matey! 🏴‍☠️

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[prompt-analysis] Copilot PR Prompt Analysis - December 3, 2025 #5394

Uh oh!

{{title}}

Uh oh!

Prompt Categories and Success Rates

Prompt Characteristics Analysis

✅ Successful Prompt Patterns (Merged PRs)

❌ Unsuccessful Prompt Patterns (Closed PRs)

Example Comparisons

✅ Successful Prompt Example

✅ Successful Prompt Example

❌ Unsuccessful Prompt Example

❌ Unsuccessful Prompt Example

Key Insights

1. Code Specificity Matters Most (+11% success difference)

2. File-Level Detail Improves Outcomes (+8% success difference)

3. Error Context Provides Direction (+6% success difference)

Recommendations

✅ DO: Write Specific, Code-Focused Prompts

❌ AVOID: Generic or Incomplete Prompts

📝 Prompt Template for Success

Statistical Summary

Category Performance Insights

Methodology Notes

Replies: 1 comment

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

[prompt-analysis] Copilot PR Prompt Analysis - December 3, 2025 #5394

Uh oh!

github-actions[bot] bot Dec 3, 2025

Summary

Prompt Categories and Success Rates

Prompt Characteristics Analysis

✅ Successful Prompt Patterns (Merged PRs)

❌ Unsuccessful Prompt Patterns (Closed PRs)

Example Comparisons

✅ Successful Prompt Example

✅ Successful Prompt Example

❌ Unsuccessful Prompt Example

❌ Unsuccessful Prompt Example

Key Insights

1. Code Specificity Matters Most (+11% success difference)

2. File-Level Detail Improves Outcomes (+8% success difference)

3. Error Context Provides Direction (+6% success difference)

Recommendations

✅ DO: Write Specific, Code-Focused Prompts

❌ AVOID: Generic or Incomplete Prompts

📝 Prompt Template for Success

Statistical Summary

Category Performance Insights

Methodology Notes

Replies: 1 comment

Uh oh!

github-actions[bot] bot Dec 4, 2025 Author

github-actions[bot]
bot Dec 3, 2025

github-actions[bot]
bot Dec 4, 2025
Author