📊 Agentic Workflow Lock File Statistics - December 3, 2025 #5351
This discussion was automatically closed because it was created by an agentic workflow more than 3 days ago.
This comprehensive statistical analysis examines all 100 .lock.yml files in the githubnext/gh-aw repository, revealing usage patterns, architectural decisions, and structural characteristics of agentic workflows.
Key Findings
Repository lockfiles show strong standardization: 82% of workflows use issues as a trigger, 77% support manual dispatch, and 62% run on schedules. The GitHub MCP server dominates with 3,695 tool references across workflows. Safe outputs favor discussions (37 workflows) over direct issue creation (23 workflows), indicating a preference for threaded conversations. The most common workflow pattern combines issues, schedule, and workflow_dispatch triggers (51 workflows), suggesting a robust multi-activation strategy for automation.
Complete Statistical Analysis
Executive Summary
File Size Distribution
Size Statistics:
Distribution Insights:
Trigger Analysis
Primary Trigger Distribution
Most Common Trigger Combinations
Trigger Insights
Safe Outputs Analysis
Safe Output Types Distribution
Example Workflows:
daily-news, weekly-issue-summary, audit-workflows
daily-fact, issue-triage-agent, grumpy-reviewer
ci-doctor, breaking-change-checker, security-fix-pr
tidy, repository-quality-improver, semantic-function-refactor
Discussion Categories Used
Key Insight: The "audits" category (combined case variations: 17 total) is the most popular destination for workflow outputs, indicating gh-aw is heavily used for automated auditing and quality checks.
Safe Output Patterns
Structural Characteristics
Job Complexity
Based on statistical analysis across all 100 workflows:
Job Distribution Insights:
Typical Lock File Structure
Based on median and average values, a typical .lock.yml file has:
Timeout Distribution
Insight: 10 minutes is by far the most common timeout (290 occurrences), indicating that most jobs are expected to complete quickly. Only 33 timeout settings (7%) allow 30 minutes or more.
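The occurrence counts above exceed the number of lock files, which is consistent with timeouts being tallied per setting rather than per workflow. A minimal sketch of such a tally, assuming the counts come from matching the GitHub Actions `timeout-minutes` key in the raw YAML text (the helper names and the sample input are hypothetical, not taken from the analysis script):

```python
import re
from collections import Counter

# Matches a `timeout-minutes: <integer>` line at any indentation level.
TIMEOUT_RE = re.compile(r"^\s*timeout-minutes:\s*(\d+)\s*$", re.MULTILINE)


def timeout_histogram(yaml_texts):
    """Count every timeout-minutes occurrence across the given YAML strings."""
    counts = Counter()
    for text in yaml_texts:
        counts.update(int(value) for value in TIMEOUT_RE.findall(text))
    return counts


def share_at_least(counts, minutes):
    """Fraction of occurrences whose timeout is at least `minutes`."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n for value, n in counts.items() if value >= minutes) / total


# Hypothetical two-file sample, not data from the report.
sample = [
    "jobs:\n  agent:\n    timeout-minutes: 10\n    steps:\n"
    "      - run: echo hi\n        timeout-minutes: 10\n",
    "jobs:\n  agent:\n    timeout-minutes: 30\n",
]
hist = timeout_histogram(sample)
print(dict(hist))
print(round(share_at_least(hist, 30), 3))
```

Because one file can contribute several matches, percentages derived this way describe timeout settings, not workflows.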
Permission Patterns
Most Common Permissions
Permission Analysis
Security Posture: The repository demonstrates excellent security practices with minimal write permissions and predominant use of read-only access patterns.
Tool & MCP Server Usage
MCP Server Distribution
Dominance of GitHub MCP: The GitHub MCP server accounts for 95.3% of all MCP tool references, showing workflows are heavily GitHub-focused.
Top 20 GitHub MCP Tools (Each: 66 uses)
All of the following tools appear exactly 66 times across workflows, indicating they are part of a standard toolset configuration:
download_workflow_run_artifact
get_code_scanning_alert
get_commit
get_dependabot_alert
get_discussion
get_discussion_comments
get_file_contents
get_job_logs
get_label
get_latest_release
get_me
get_notification_details
get_pull_request
get_pull_request_comments
get_pull_request_diff
get_pull_request_files
get_pull_request_review_comments
get_pull_request_reviews
get_pull_request_status
get_release_by_tag
Standardization Insight: The uniform count of 66 uses across all these tools suggests they are configured as a comprehensive toolset bundle that 66 workflows include by default, rather than being individually selected.
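One way to spot a bundled toolset is to count each tool once per workflow and then group tools by identical counts. The sketch below assumes tool names have already been extracted per workflow; `workflow_tools` and its contents are hypothetical sample data, not the repository's actual configuration.

```python
from collections import Counter, defaultdict


def tool_usage(workflow_tools):
    """Count how many workflows reference each tool (once per workflow)."""
    counts = Counter()
    for tools in workflow_tools.values():
        counts.update(set(tools))
    return counts


def group_by_count(counts):
    """Map each usage count to the sorted list of tools sharing that count."""
    groups = defaultdict(list)
    for tool, n in counts.items():
        groups[n].append(tool)
    return {n: sorted(tools) for n, tools in groups.items()}


# Hypothetical sample: two workflows share a bundle, a third picks one tool.
bundle = ["get_commit", "get_label", "get_me"]
workflow_tools = {
    "workflow-a": bundle,
    "workflow-b": bundle,
    "workflow-c": ["get_me"],
}
groups = group_by_count(tool_usage(workflow_tools))
print(groups)
```

A large group of tools sharing one count is a hint, not proof, that they come from a shared default configuration.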
MCP Server Specialization
Engine Distribution
Note: Engine data is extracted from workflow comments and may not represent all workflows. Most workflows likely use a default engine not explicitly specified in the lockfile.
Interesting Findings
1. Standardized Toolset Pattern
66 workflows share an identical GitHub MCP toolset with exactly 20 tools, indicating a well-established template or base configuration for new workflows.
2. Size-Complexity Correlation
The poem-bot workflow (608 KB) is nearly 2x larger than the average, suggesting creative/generative workflows require significantly more configuration than analytical workflows.
3. Multi-Trigger Flexibility
51 workflows (51%) use the triple combination of issues + schedule + workflow_dispatch, providing three independent activation paths and maximizing workflow accessibility.
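Counting trigger combinations amounts to treating each workflow's triggers as an unordered set and tallying identical sets. A minimal sketch, assuming the trigger names per workflow are already available (`workflow_triggers` is hypothetical sample data):

```python
from collections import Counter


def combination_counts(workflow_triggers):
    """Tally workflows by their exact, order-independent set of triggers."""
    return Counter(frozenset(triggers) for triggers in workflow_triggers.values())


# Hypothetical sample, not data from the report.
workflow_triggers = {
    "workflow-a": ["issues", "schedule", "workflow_dispatch"],
    "workflow-b": ["workflow_dispatch", "issues", "schedule"],
    "workflow-c": ["pull_request"],
}
combos = combination_counts(workflow_triggers)
top_combo, top_count = combos.most_common(1)[0]
print(sorted(top_combo), top_count)
```

Using `frozenset` keeps the tally independent of the order in which triggers are declared.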
4. Audit-Centric Architecture
With "audits" as the top discussion category (17 combined occurrences) and high usage of read permissions, gh-aw workflows are primarily designed for monitoring, auditing, and reporting rather than direct code modification.
5. Conservative Timeout Strategy
Of the timeout settings counted across the 100 lock files, 290 use 10 minutes and 117 use 20 minutes, suggesting two distinct classes of work: quick operations and in-depth analysis.
6. Minimal External Dependencies
Despite having access to various MCP servers, 95.3% of tool usage is GitHub-focused, showing workflows stay within the GitHub ecosystem.
7. Discussion > Issue Creation
37 workflows create discussions vs 23 that create issues, indicating a preference for conversational, threaded outputs over actionable issue tracking.
8. Test Workflows Are Minimal
The smallest workflows are in the tests/ and shared/mcp/ directories (81-86 KB), showing test workflows are intentionally simplified for focused validation.
Recommendations
1. Standardize Engine Configuration
Only 17 workflows explicitly specify an engine in the analyzed data. Consider documenting the default engine and when to specify alternatives.
2. Consolidate Discussion Categories
Multiple variations of "audits" (audits, Audits) exist. Standardize on a single category name to improve organization.
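Case variants can be detected and merged by folding category names to a common case before counting. The sketch below is illustrative only; the input list is hypothetical and does not reflect the real split between "audits" and "Audits".

```python
from collections import Counter


def merge_case_variants(categories):
    """Count categories after trimming whitespace and folding case."""
    return Counter(name.strip().casefold() for name in categories)


def case_variants(categories):
    """Return folded names that appear under more than one spelling."""
    spellings = {}
    for name in categories:
        spellings.setdefault(name.strip().casefold(), set()).add(name.strip())
    return {key: sorted(names) for key, names in spellings.items() if len(names) > 1}


# Hypothetical sample input.
categories = ["audits", "Audits", "audits", "General"]
merged = merge_case_variants(categories)
variants = case_variants(categories)
print(dict(merged))
print(variants)
```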
3. Optimize Large Workflows
Investigate poem-bot.lock.yml (608 KB) to identify opportunities for refactoring or template reuse that could reduce size.
4. Document Timeout Guidelines
With clear clustering at 10 and 20 minutes, establish guidelines for when to use each timeout tier.
5. Expand PR Workflows
Only 9% of workflows trigger on pull requests. Consider expanding PR-triggered automation for code review and quality checks.
6. MCP Server Documentation
Document the standard 66-workflow toolset pattern so new workflow authors understand what's included by default.
7. Safe Output Best Practices
With 37 discussion creators vs 24 comment adders, establish guidelines for when to create new discussions vs comment on existing ones.
Methodology
Data Collection
.lock.yml files in .github/workflows/ and subdirectories
/tmp/gh-aw/cache-memory/ for reproducibility
Analysis Approach
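A minimal sketch of the collection and size-summary steps, assuming lock files are found by a recursive glob for `*.lock.yml` under the workflows directory. The function names are hypothetical and this is not the contents of the report's own analysis script; the demo writes throwaway files to a temporary directory.

```python
import statistics
import tempfile
from pathlib import Path


def lockfile_sizes(root):
    """Return {relative path: size in bytes} for every *.lock.yml under root."""
    root = Path(root)
    return {
        str(path.relative_to(root)): path.stat().st_size
        for path in sorted(root.rglob("*.lock.yml"))
    }


def summarize(sizes):
    """Basic size statistics over a non-empty {path: bytes} mapping."""
    values = list(sizes.values())
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


# Demo on temporary files; sizes here are arbitrary, not repository data.
with tempfile.TemporaryDirectory() as tmp:
    workflows = Path(tmp) / "workflows"
    (workflows / "shared").mkdir(parents=True)
    (workflows / "a.lock.yml").write_bytes(b"x" * 100)
    (workflows / "shared" / "b.lock.yml").write_bytes(b"x" * 300)
    (workflows / "c.yml").write_bytes(b"x" * 50)  # ignored: not a lock file
    summary = summarize(lockfile_sizes(workflows))

print(summary)
```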
Limitations
Historical Context
This is the first comprehensive statistical analysis of gh-aw lockfiles. Future analyses can compare against this baseline to track:
Analysis Date: December 3, 2025
Generated by: Lockfile Statistics Analysis Agent
Cache Location: /tmp/gh-aw/cache-memory/
Reproducible: Run python3 /tmp/gh-aw/cache-memory/scripts/detailed_analysis.py from workflows directory