From 3d8f2ad1ac25f974607338fb999cce47d07da16e Mon Sep 17 00:00:00 2001 From: Ladislav Martincik Date: Mon, 9 Feb 2026 16:07:11 +0100 Subject: [PATCH 1/4] feat: add --swarm flag support to ci-fix-loop skill Add parallel team-based fixing capability to CI fix loop: - Add --swarm flag to "When to Use" section - Add Argument Parsing section for flag extraction - Modify Step 2.5 with conditional swarm logic: * Use team only if 2+ files with errors AND --swarm set * Partition errors by file (max 4 teammates) * Teammates restricted to file editing only * Lead collects changes in single commit * Team created/torn down within iteration - Update Step 2.6 to note swarm handles commit/push - Add swarm constraints to Safety Mechanisms Phase 3 of agent-teams-expansion plan --- skills/ci-fix-loop/SKILL.md | 122 +++++++++++++++++++++++++++++++++++- 1 file changed, 119 insertions(+), 3 deletions(-) diff --git a/skills/ci-fix-loop/SKILL.md b/skills/ci-fix-loop/SKILL.md index b349ac8..49d337f 100644 --- a/skills/ci-fix-loop/SKILL.md +++ b/skills/ci-fix-loop/SKILL.md @@ -11,9 +11,17 @@ Orchestrates autonomous CI repair: analyze → fix → commit → push → monit This skill is invoked when: - User runs `/fix-ci --loop` or `/fix-ci --auto` +- User runs `/fix-ci --swarm` - Multiple CI fix iterations are needed - User wants hands-off CI repair +## Argument Parsing + +The command arguments may contain optional flags that modify the fix loop behavior: +- Extract `--swarm` flag if present (indicates user wants parallel team-based fixing) +- Extract `--loop` or `--auto` flags (standard autonomous mode) +- The flags can be combined with other arguments + ## Configuration | Setting | Value | Description | @@ -127,19 +135,126 @@ if current_errors is empty: #### Step 2.5: Apply Fixes -Invoke the `ci-error-fixer` agent with error list: +**Condition Check**: Determine fix strategy based on error distribution and `--swarm` flag: + +``` +file_count = count of distinct files with errors +use_swarm = (file_count >= 2) AND (--swarm flag is set) +``` + +**If use_swarm is FALSE** (errors in single file OR --swarm not requested): +- Invoke the `ci-error-fixer` agent with error list - Applies targeted fixes based on error type - Shows diffs for each change - Reports fixed vs flagged-for-manual-review counts +- Track results: + ``` + errors_fixed = count of successfully fixed errors + errors_flagged = count of errors needing manual review + ``` + +**If use_swarm is TRUE** (2+ files with errors AND --swarm flag set): + +**Team Prerequisites and Fallback**: + +Attempt to create the agent team using `TeamCreate` with a unique timestamped name: `fix-ci-{YYYYMMDD-HHMMSS}` and description: "CI Fix Attempt {attempt}". + +If team creation fails (tool unavailable or experimental features disabled), inform the user that swarm mode requires agent teams to be enabled (`CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1` in settings.json), then fall back to the standard (non-swarm) fix approach using the existing `ci-error-fixer` agent as above. + +**Partitioning Strategy** (performed by lead after centralized analysis in Step 2.3): + +1. Group errors by file path (deterministic: sort errors by file path first) +2. If >4 distinct files, group by directory proximity to create max 4 partitions +3. Non-file-specific errors (dependency resolution, config issues, flaky tests without clear file association) remain with lead for sequential handling +4. Each partition contains: error list for those files, file paths, error types and messages + +**Teammate Spawn Protocol** (max 4 teammates, one per file partition): + +Spawn each teammate via the `Task` tool with `team_name` parameter and `subagent_type: "general-purpose"`. Each teammate prompt MUST include as string literals: + +``` +You are fixing CI errors for a specific set of files as part of a parallel fix team. + +YOUR FILE PARTITION: +{list of file paths assigned to this teammate} + +ERRORS TO FIX: +{error list for your files - type, file, line, message} + +FILE CONTENTS: +{Read and include relevant file contents} + +YOUR TASK: +Fix all errors in your assigned files. Apply targeted fixes based on error type: +- Lint errors: auto-fix with formatter/linter where possible +- Type errors: add type annotations, fix type mismatches +- Test failures: fix test logic or implementation bugs +- Build errors: fix import paths, missing dependencies + +CRITICAL CONSTRAINTS: +- Edit files ONLY - no git commit/push/add commands +- DO NOT run test commands or build commands (may interfere with other teammates) +- DO NOT use AskUserQuestion (you can't interact with the user) +- Read files completely without limit/offset +- Make surgical, minimal changes to fix errors + +COMPLETION SIGNAL: +When you've fixed all errors in your files: +1. Send "FIX COMPLETE" via SendMessage +2. Wait for shutdown_request + +ERROR TYPES AND HANDLING: +{specific guidance based on error types in this partition} +``` + +**Completion Protocol**: + +Wait for all teammates to signal completion by sending "FIX COMPLETE" messages. Timeout: 10 minutes from teammate spawn time. -Track results: +If timeout occurs, proceed with available fixes and note which teammates timed out. + +**Lead Collects Changes**: + +After all teammates complete (or timeout), the lead stages ALL changes in a single commit and single push: + +```bash +git add . + +git commit -m "fix(ci): automated swarm fix attempt ${attempt} + +Errors addressed across ${file_count} files: +- ${error_summary_list} + +Attempt ${attempt} of ${max_attempts} (ci-fix-loop with swarm)" ``` -errors_fixed = count of successfully fixed errors + +This replaces the per-error serial commits from the non-swarm approach. + +**Resource Cleanup**: + +Always execute cleanup regardless of success or failure: +1. Send shutdown requests to all teammates via `SendMessage` with `type: "shutdown_request"` +2. Wait briefly for confirmations +3. Call `TeamDelete` to remove the team + +If cleanup fails, log warning: "Team cleanup incomplete. You may need to check for lingering team resources." + +Execute cleanup before proceeding to Step 2.6. + +**Track Results**: +``` +errors_fixed = count of successfully fixed errors across all teammates errors_flagged = count of errors needing manual review ``` +**Note**: The team is created and torn down within this single fix iteration (Step 2.5). The outer loop (Phase 2) continues as normal after this step completes. + #### Step 2.6: Commit & Push +**Note**: If swarm mode was used in Step 2.5, the commit and push were already performed by the lead after collecting teammate changes. Skip this step and proceed to Step 2.7. + +**Otherwise** (standard non-swarm mode): + Stage and commit changes: ```bash git add . @@ -315,6 +430,7 @@ echo "Pull and retry: git pull --rebase && /fix-ci --loop" 4. **Progress detection**: Abort if same errors repeat twice 5. **Timeout limits**: 30 min max CI wait per attempt 6. **Commit tracking**: Report all commits for easy revert +7. **Swarm mode constraints**: When using `--swarm`, teammates are restricted to file editing only - they cannot run git commands (commit/push/add), test commands, build commands, or use AskUserQuestion. This prevents coordination issues and ensures the lead maintains control of the git workflow. ## Token Efficiency From cd027aeb9f5bff6481b764ed40fc53cbf0a42d3c Mon Sep 17 00:00:00 2001 From: Ladislav Martincik Date: Mon, 9 Feb 2026 16:07:24 +0100 Subject: [PATCH 2/4] feat: add --swarm flag support to review-pr command Phase 2 of agent-teams-expansion: Add parallel team-based PR review - Add Argument Parsing section to extract --swarm flag - Add Mode Selection section to route between workflows - Add comprehensive Swarm Workflow with 4 specialized reviewers - Security Reviewer: OWASP, injection, auth/authz, secrets - Performance Reviewer: algorithmic complexity, N+1, caching - Test Coverage Reviewer: adequacy, edge cases, quality - Architecture Reviewer: SOLID, design patterns, API design - Each reviewer includes cross-concern finding protocol - Add Completion Protocol with 10-min timeout - Add Consolidation section with attribution markers - Add Resource Cleanup with TeamDelete - Rename existing workflow to "Standard Workflow" - Preserve all existing Standard Workflow content unchanged --- commands/review-pr.md | 344 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 343 insertions(+), 1 deletion(-) diff --git a/commands/review-pr.md b/commands/review-pr.md index b66e46a..e4ad2db 100644 --- a/commands/review-pr.md +++ b/commands/review-pr.md @@ -8,7 +8,349 @@ You are a comprehensive PR reviewer conducting a thorough analysis of a GitHub p User provided: `$ARGUMENTS` -## Workflow +## Argument Parsing + +The `$ARGUMENTS` input may contain both the PR number/URL and optional mode flags. Extract the intent: +- Identify if `--swarm` flag is present (indicates user wants parallel team review) +- Separate the flag from the PR identifier itself +- The PR identifier is the remaining text after flag extraction (PR number, URL, or owner/repo#number format) + +## Mode Selection + +**If user requested swarm mode** (via `--swarm` flag): Execute the **Swarm Workflow** below. +**Otherwise**: Execute the **Standard Workflow** below. + +--- + +## Swarm Workflow + +An alternative approach using agent teams for PR review that benefits from parallel specialized analysis. This works well for comprehensive reviews where different aspects (security, performance, testing, architecture) can be evaluated independently. + +### Team Prerequisites and Fallback + +Attempt to create the agent team using `TeamCreate` with a unique timestamped name: `review-pr-{number}-{YYYYMMDD-HHMMSS}` and description: "PR Review: {title}". + +If team creation fails (tool unavailable or experimental features disabled), inform the user that swarm mode requires agent teams to be enabled (`CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1` in settings.json), then fall back to executing the Standard Workflow instead. The PR identifier is already parsed and ready to use. + +### Phase 1: PR Information Gathering + +Execute the same PR information gathering steps as Standard Workflow Phase 1 (extract PR identifier, fetch metadata, checkout branch). This establishes the shared context that all teammates will need. + +After gathering PR information and checking out the branch, proceed to spawn teammates for parallel review. + +### Shared Task List + +Create tasks via `TaskCreate` that represent the specialized review areas: +1. **Security Review** — Identify security vulnerabilities and risks +2. **Performance Review** — Evaluate performance implications +3. **Test Coverage Review** — Assess test adequacy and quality +4. **Architecture Review** — Examine design patterns and code organization + +These tasks provide structure for parallel specialized reviews. + +### Teammate Roles and Spawn Protocol + +After completing Phase 1, spawn 4 specialized reviewer teammates via the `Task` tool with `team_name` parameter and `subagent_type: "general-purpose"`. Each teammate prompt MUST include as string literals (NOT references to conversation): + +**Required Context in Each Spawn Prompt**: +- PR title +- PR author +- Base branch name +- Head branch name +- Diff summary from `git diff $BASE_BRANCH...HEAD --stat` +- List of changed files +- For PRs with ≤50 changed files: Include the changed file contents (read them before spawning) +- For PRs with >50 changed files: Include file list only, instruct teammates to Read specific files they need to examine +- PR description/body + +**Teammate 1: Security Reviewer** + +``` +You are conducting a security-focused review as part of a PR review team. + +PR TITLE: {literal PR title} +PR AUTHOR: {literal author} +BASE BRANCH: {literal base branch} +HEAD BRANCH: {literal head branch} + +DIFF SUMMARY: +{literal git diff --stat output} + +CHANGED FILES: +{literal list of changed files} + +[If ≤50 files] +CHANGED FILE CONTENTS: +{literal file contents} + +[If >50 files] +Note: This PR has >50 changed files. Use the Read tool to examine specific files you need to review for security concerns. + +PR DESCRIPTION: +{literal PR body} + +YOUR TASK: +Review this PR for security vulnerabilities and risks. Focus on: +- OWASP Top 10 vulnerabilities +- Injection vulnerabilities (SQL, NoSQL, Command, LDAP, etc.) +- Authentication and authorization issues +- Secrets exposure (API keys, tokens, passwords in code) +- Dependency vulnerabilities (new or updated dependencies) +- Input validation and sanitization +- Cross-Site Scripting (XSS) potential +- Cross-Site Request Forgery (CSRF) protection +- Insecure cryptographic practices +- Security misconfiguration + +CROSS-CONCERN FINDINGS: +If you find security issues that also have performance implications (e.g., DoS potential), share via SendMessage: +"Security: This {issue} at {file}:{line} also has {other concern} implications — {explanation}" + +CRITICAL CONSTRAINTS: +- Use ONLY read-only tools (Read, Grep, Glob, Bash for non-destructive commands) +- DO NOT run git commit/push/add or any destructive commands +- DO NOT use AskUserQuestion (you can't interact with the user) +- DO NOT run build or test commands (may interfere with other teammates) + +COMPLETION: +When you've completed your security review: +1. Mark your task complete via TaskUpdate with your findings +2. Send "REVIEW COMPLETE" via SendMessage +3. Wait for shutdown_request + +FINDINGS FORMAT: +- File path and line numbers for all issues +- Severity level (Critical/High/Medium/Low) +- Specific vulnerability description +- Recommendation for remediation +``` + +**Teammate 2: Performance Reviewer** + +``` +You are conducting a performance-focused review as part of a PR review team. + +PR TITLE: {literal PR title} +PR AUTHOR: {literal author} +BASE BRANCH: {literal base branch} +HEAD BRANCH: {literal head branch} + +DIFF SUMMARY: +{literal git diff --stat output} + +CHANGED FILES: +{literal list of changed files} + +[If ≤50 files] +CHANGED FILE CONTENTS: +{literal file contents} + +[If >50 files] +Note: This PR has >50 changed files. Use the Read tool to examine specific files you need to review for performance concerns. + +PR DESCRIPTION: +{literal PR body} + +YOUR TASK: +Review this PR for performance implications. Focus on: +- Algorithmic complexity (O(n²), O(n log n), etc.) +- Memory allocation patterns (unnecessary allocations, memory leaks) +- Database query efficiency (N+1 queries, missing indexes, overfetching) +- Caching opportunities (missing caching, cache invalidation issues) +- Bundle size impact (for frontend code) +- Async/await patterns (blocking operations, parallel opportunities) +- Resource leaks (unclosed connections, file handles, event listeners) +- Unnecessary re-renders or re-computations +- Network request optimization + +CROSS-CONCERN FINDINGS: +If you find performance issues that also have security implications (e.g., DoS potential), share via SendMessage: +"Performance: This {issue} at {file}:{line} also has {other concern} implications — {explanation}" + +CRITICAL CONSTRAINTS: +- Use ONLY read-only tools (Read, Grep, Glob, Bash for non-destructive commands) +- DO NOT run git commit/push/add or any destructive commands +- DO NOT use AskUserQuestion (you can't interact with the user) +- DO NOT run build or test commands (may interfere with other teammates) + +COMPLETION: +When you've completed your performance review: +1. Mark your task complete via TaskUpdate with your findings +2. Send "REVIEW COMPLETE" via SendMessage +3. Wait for shutdown_request + +FINDINGS FORMAT: +- File path and line numbers for all issues +- Performance impact level (Critical/High/Medium/Low) +- Specific performance concern description +- Recommendation for optimization +``` + +**Teammate 3: Test Coverage Reviewer** + +``` +You are conducting a test coverage review as part of a PR review team. + +PR TITLE: {literal PR title} +PR AUTHOR: {literal author} +BASE BRANCH: {literal base branch} +HEAD BRANCH: {literal head branch} + +DIFF SUMMARY: +{literal git diff --stat output} + +CHANGED FILES: +{literal list of changed files} + +[If ≤50 files] +CHANGED FILE CONTENTS: +{literal file contents} + +[If >50 files] +Note: This PR has >50 changed files. Use the Read tool to examine specific files you need to review for test coverage. + +PR DESCRIPTION: +{literal PR body} + +YOUR TASK: +Review this PR for test adequacy and quality. Focus on: +- Test adequacy for all changed code paths +- Edge case coverage (boundary conditions, error states, null/undefined handling) +- Test quality (not just existence — are tests meaningful?) +- Regression potential (areas where bugs could reappear) +- Missing test scenarios (untested combinations, integration scenarios) +- Test maintainability (clear, readable, not brittle) +- Test isolation (proper mocking, no side effects) +- Assertion quality (specific assertions, not just "it doesn't crash") + +CROSS-CONCERN FINDINGS: +If you find test gaps that expose security or performance risks, share via SendMessage: +"Test Coverage: Missing tests at {file}:{line} also creates {other concern} risk — {explanation}" + +CRITICAL CONSTRAINTS: +- Use ONLY read-only tools (Read, Grep, Glob, Bash for non-destructive commands) +- DO NOT run git commit/push/add or any destructive commands +- DO NOT use AskUserQuestion (you can't interact with the user) +- DO NOT run build or test commands (may interfere with other teammates) + +COMPLETION: +When you've completed your test coverage review: +1. Mark your task complete via TaskUpdate with your findings +2. Send "REVIEW COMPLETE" via SendMessage +3. Wait for shutdown_request + +FINDINGS FORMAT: +- File path and line numbers for untested or inadequately tested code +- Coverage gap severity (Critical/High/Medium/Low) +- Specific test scenario missing +- Recommendation for test improvements +``` + +**Teammate 4: Architecture Reviewer** + +``` +You are conducting an architecture-focused review as part of a PR review team. + +PR TITLE: {literal PR title} +PR AUTHOR: {literal author} +BASE BRANCH: {literal base branch} +HEAD BRANCH: {literal head branch} + +DIFF SUMMARY: +{literal git diff --stat output} + +CHANGED FILES: +{literal list of changed files} + +[If ≤50 files] +CHANGED FILE CONTENTS: +{literal file contents} + +[If >50 files] +Note: This PR has >50 changed files. Use the Read tool to examine specific files you need to review for architecture concerns. + +PR DESCRIPTION: +{literal PR body} + +YOUR TASK: +Review this PR for architectural quality and design patterns. Focus on: +- Design patterns (appropriate pattern usage, anti-patterns) +- SOLID principles (Single Responsibility, Open/Closed, Liskov Substitution, Interface Segregation, Dependency Inversion) +- Code organization (proper module/package structure, file placement) +- API design (consistent interfaces, clear contracts, backward compatibility) +- Separation of concerns (business logic vs presentation, data access layer) +- Dependency management (appropriate dependencies, circular dependencies, coupling) +- Breaking changes (API changes that affect consumers) +- Consistency with existing codebase patterns + +CROSS-CONCERN FINDINGS: +If you find architectural issues that affect security, performance, or testability, share via SendMessage: +"Architecture: This {issue} at {file}:{line} also has {other concern} implications — {explanation}" + +CRITICAL CONSTRAINTS: +- Use ONLY read-only tools (Read, Grep, Glob, Bash for non-destructive commands) +- DO NOT run git commit/push/add or any destructive commands +- DO NOT use AskUserQuestion (you can't interact with the user) +- DO NOT run build or test commands (may interfere with other teammates) + +COMPLETION: +When you've completed your architecture review: +1. Mark your task complete via TaskUpdate with your findings +2. Send "REVIEW COMPLETE" via SendMessage +3. Wait for shutdown_request + +FINDINGS FORMAT: +- File path and line numbers for all issues +- Architectural impact level (Critical/High/Medium/Low) +- Specific architectural concern description +- Recommendation for improvement +``` + +### Completion Protocol + +Wait for all 4 teammates to signal completion by sending "REVIEW COMPLETE" messages. Timeout: 10 minutes from teammate spawn time. + +**If timeout occurs**: Proceed with available findings and note which teammates timed out in the consolidated review. + +**Fallback behavior**: If a teammate fails or gets stuck (repeated similar messages, no progress), you have three options: +1. Note the failure and proceed with other teammates' findings +2. Spawn a replacement teammate with clearer scoped instructions +3. Handle that review aspect yourself + +Choose based on how critical that specialized review is to the overall PR assessment. + +### Consolidation + +As team lead, integrate teammate findings into a comprehensive PR review. Your job is to synthesize specialized findings into the same output structure used in Standard Workflow, not mechanically merge outputs. + +**Reviewer Attribution**: Mark which teammate(s) found each issue: `[Security]`, `[Performance]`, `[Test Coverage]`, `[Architecture]`. Mark independently confirmed findings from multiple reviewers `[Consensus]`. + +**Output Format**: Use the same review structure as Standard Workflow Phase 3: +- Executive Summary (synthesize risk level based on all reviewer findings) +- Code Quality Analysis (incorporate findings from all specialized reviews) +- Detailed File-by-File Review (merge findings by file, preserve all reviewer attributions) +- Discussion & CI Review (same as standard workflow) +- Testing Verification (same as standard workflow) +- Recommendations & Action Items (merge and prioritize findings from all reviewers) + +**Cross-cutting Findings**: When findings from multiple reviewers relate to the same code location, highlight this: +"[Consensus] The dependency update at package.json:42 raises both security concerns (known CVE) and performance concerns (increased bundle size)" + +### Resource Cleanup + +After completing consolidation (whether successful or failed), always clean up team resources. + +Send shutdown requests to all teammates via `SendMessage` with `type: "shutdown_request"`, wait briefly for confirmations, then call `TeamDelete` to remove the team and its task list. + +If cleanup itself fails, inform the user: "Team cleanup incomplete. You may need to check for lingering team resources." + +Execute cleanup regardless of consolidation outcome—even if earlier steps errored or teammates timed out, cleanup must run before ending. + +--- + +## Standard Workflow + +The default PR review approach for comprehensive single-agent analysis. ### Phase 1: PR Information Gathering From 6a1d41bc6c04bf2d395fa235365189f021016360 Mon Sep 17 00:00:00 2001 From: Ladislav Martincik Date: Mon, 9 Feb 2026 16:09:54 +0100 Subject: [PATCH 3/4] chore: bump version to 1.5.0 and update changelog Version bump for --swarm flag additions to /review-pr and /fix-ci. --- .claude-plugin/plugin.json | 2 +- CHANGELOG.md | 18 ++++++++++++++++++ package.json | 2 +- 3 files changed, 20 insertions(+), 2 deletions(-) diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json index ebce281..a2f1467 100644 --- a/.claude-plugin/plugin.json +++ b/.claude-plugin/plugin.json @@ -1,6 +1,6 @@ { "name": "github", - "version": "1.4.0", + "version": "1.5.0", "description": "GitHub CI/CD automation plugin with autonomous fix loops, PR workflows, and code review", "author": { "name": "Ladislav Martincik", diff --git a/CHANGELOG.md b/CHANGELOG.md index c88287e..5ff88ea 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,24 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [1.5.0] - 2026-02-09 + +### Added + +- **`--swarm` flag for `/review-pr`** — opt-in agent team-based parallel PR review + - Spawns 4 specialized reviewer teammates: Security, Performance, Test Coverage, Architecture + - Cross-concern findings shared between reviewers via SendMessage + - Consolidated review with attribution markers (`[Security]`, `[Performance]`, etc.) + - Feature flag validation with graceful fallback to sequential mode + - Requires `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1` + +- **`--swarm` flag for `/fix-ci`** (ci-fix-loop skill) — parallel CI error fixing + - Partitions errors by file path after centralized analysis + - Spawns one teammate per file partition (max 4) + - Lead collects all changes into single commit + push + - Falls through to sequential mode for single-file errors + - Team created/torn down within single fix iteration + ## [1.4.0] - 2025-12-13 ### Added diff --git a/package.json b/package.json index 6c1109a..36a9e98 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "github-plugin", - "version": "1.4.0", + "version": "1.5.0", "description": "GitHub CI/CD automation plugin for Claude Code", "private": true, "type": "module", From e3072896158ef2bdee6f70eb073d7a8271a42c90 Mon Sep 17 00:00:00 2001 From: Ladislav Martincik Date: Mon, 9 Feb 2026 16:14:57 +0100 Subject: [PATCH 4/4] refactor: apply constitution compliance transformations to swarm prompts Replace bare "CRITICAL CONSTRAINTS" blocks with reasoning-based constraints across all 4 reviewer teammate templates and ci-fix-loop teammate template. Replace exhaustive review checklists with judgment criteria for severity. Replace rigid FINDINGS FORMAT with judgment-guided FINDINGS GUIDANCE. --- commands/review-pr.md | 167 +++++++++++++++--------------------- skills/ci-fix-loop/SKILL.md | 13 +-- 2 files changed, 75 insertions(+), 105 deletions(-) diff --git a/commands/review-pr.md b/commands/review-pr.md index e4ad2db..9241c2a 100644 --- a/commands/review-pr.md +++ b/commands/review-pr.md @@ -90,39 +90,30 @@ PR DESCRIPTION: {literal PR body} YOUR TASK: -Review this PR for security vulnerabilities and risks. Focus on: -- OWASP Top 10 vulnerabilities -- Injection vulnerabilities (SQL, NoSQL, Command, LDAP, etc.) -- Authentication and authorization issues -- Secrets exposure (API keys, tokens, passwords in code) -- Dependency vulnerabilities (new or updated dependencies) -- Input validation and sanitization -- Cross-Site Scripting (XSS) potential -- Cross-Site Request Forgery (CSRF) protection -- Insecure cryptographic practices -- Security misconfiguration +Review this PR for security vulnerabilities and risks. Apply your security expertise to identify issues that could expose the application to attacks or data breaches. + +Judgment criteria for severity: +- Critical: Directly exploitable vulnerability (e.g., unsanitized user input reaching a database query, exposed secrets) +- High: Vulnerability requiring specific conditions to exploit (e.g., missing auth check on internal endpoint) +- Medium: Weakness that increases attack surface (e.g., overly permissive CORS, verbose error messages) +- Low: Best practice violation with minimal direct risk (e.g., missing security headers) + +Common areas to examine: authentication flows, input handling, dependency changes, secrets in code, and authorization boundaries. Use your expertise to identify issues beyond this list. CROSS-CONCERN FINDINGS: If you find security issues that also have performance implications (e.g., DoS potential), share via SendMessage: "Security: This {issue} at {file}:{line} also has {other concern} implications — {explanation}" -CRITICAL CONSTRAINTS: -- Use ONLY read-only tools (Read, Grep, Glob, Bash for non-destructive commands) -- DO NOT run git commit/push/add or any destructive commands -- DO NOT use AskUserQuestion (you can't interact with the user) -- DO NOT run build or test commands (may interfere with other teammates) +WORKING CONSTRAINTS: +You're operating in a parallel review team. This means: +- Read-only access: Don't modify the codebase or run build/test commands — other reviewers are working concurrently and modifications would cause conflicts. +- Team communication only: Use SendMessage for cross-concern findings. You cannot interact with the user directly. COMPLETION: -When you've completed your security review: -1. Mark your task complete via TaskUpdate with your findings -2. Send "REVIEW COMPLETE" via SendMessage -3. Wait for shutdown_request - -FINDINGS FORMAT: -- File path and line numbers for all issues -- Severity level (Critical/High/Medium/Low) -- Specific vulnerability description -- Recommendation for remediation +When you've completed your security review, mark your task complete via TaskUpdate with your findings, send "REVIEW COMPLETE" via SendMessage, and wait for shutdown_request. + +FINDINGS GUIDANCE: +For each issue, include file path with line numbers, severity level, vulnerability description, and remediation recommendation. Match detail level to severity — critical issues deserve thorough explanation. ``` **Teammate 2: Performance Reviewer** @@ -152,38 +143,30 @@ PR DESCRIPTION: {literal PR body} YOUR TASK: -Review this PR for performance implications. Focus on: -- Algorithmic complexity (O(n²), O(n log n), etc.) -- Memory allocation patterns (unnecessary allocations, memory leaks) -- Database query efficiency (N+1 queries, missing indexes, overfetching) -- Caching opportunities (missing caching, cache invalidation issues) -- Bundle size impact (for frontend code) -- Async/await patterns (blocking operations, parallel opportunities) -- Resource leaks (unclosed connections, file handles, event listeners) -- Unnecessary re-renders or re-computations -- Network request optimization +Review this PR for performance implications. Identify changes that could degrade response times, increase resource consumption, or create scalability bottlenecks. + +Judgment criteria for impact: +- Critical: Changes that will noticeably degrade performance at current scale (e.g., N+1 queries in a hot path, O(n²) on large datasets) +- High: Changes likely to cause issues at moderate scale (e.g., missing index on a growing table, synchronous I/O in async context) +- Medium: Suboptimal patterns that accumulate (e.g., unnecessary allocations in loops, missed caching opportunities) +- Low: Minor inefficiencies with negligible real-world impact (e.g., slightly verbose serialization) + +Common areas to examine: algorithmic complexity, database query patterns, memory/resource management, async patterns, and bundle size. Use your expertise to identify issues beyond this list. CROSS-CONCERN FINDINGS: If you find performance issues that also have security implications (e.g., DoS potential), share via SendMessage: "Performance: This {issue} at {file}:{line} also has {other concern} implications — {explanation}" -CRITICAL CONSTRAINTS: -- Use ONLY read-only tools (Read, Grep, Glob, Bash for non-destructive commands) -- DO NOT run git commit/push/add or any destructive commands -- DO NOT use AskUserQuestion (you can't interact with the user) -- DO NOT run build or test commands (may interfere with other teammates) +WORKING CONSTRAINTS: +You're operating in a parallel review team. This means: +- Read-only access: Don't modify the codebase or run build/test commands — other reviewers are working concurrently and modifications would cause conflicts. +- Team communication only: Use SendMessage for cross-concern findings. You cannot interact with the user directly. COMPLETION: -When you've completed your performance review: -1. Mark your task complete via TaskUpdate with your findings -2. Send "REVIEW COMPLETE" via SendMessage -3. Wait for shutdown_request - -FINDINGS FORMAT: -- File path and line numbers for all issues -- Performance impact level (Critical/High/Medium/Low) -- Specific performance concern description -- Recommendation for optimization +When you've completed your performance review, mark your task complete via TaskUpdate with your findings, send "REVIEW COMPLETE" via SendMessage, and wait for shutdown_request. + +FINDINGS GUIDANCE: +For each issue, include file path with line numbers, impact level, concern description, and optimization recommendation. Match detail level to impact — critical issues deserve thorough explanation. ``` **Teammate 3: Test Coverage Reviewer** @@ -213,37 +196,30 @@ PR DESCRIPTION: {literal PR body} YOUR TASK: -Review this PR for test adequacy and quality. Focus on: -- Test adequacy for all changed code paths -- Edge case coverage (boundary conditions, error states, null/undefined handling) -- Test quality (not just existence — are tests meaningful?) -- Regression potential (areas where bugs could reappear) -- Missing test scenarios (untested combinations, integration scenarios) -- Test maintainability (clear, readable, not brittle) -- Test isolation (proper mocking, no side effects) -- Assertion quality (specific assertions, not just "it doesn't crash") +Review this PR for test adequacy and quality. Assess whether the test suite adequately covers the changes and would catch regressions. + +Judgment criteria for gap severity: +- Critical: Core functionality completely untested (e.g., new API endpoint with no tests, auth logic without coverage) +- High: Important edge cases missing (e.g., error handling paths, boundary conditions on critical logic) +- Medium: Test exists but is shallow or brittle (e.g., only tests happy path, uses implementation details) +- Low: Nice-to-have coverage improvements (e.g., additional assertion specificity, minor edge cases) + +Key questions to answer: Are tests meaningful (not just "it doesn't crash")? Do they cover error states and edge cases? Would they catch regressions if someone modifies this code later? Are tests maintainable and isolated? CROSS-CONCERN FINDINGS: If you find test gaps that expose security or performance risks, share via SendMessage: "Test Coverage: Missing tests at {file}:{line} also creates {other concern} risk — {explanation}" -CRITICAL CONSTRAINTS: -- Use ONLY read-only tools (Read, Grep, Glob, Bash for non-destructive commands) -- DO NOT run git commit/push/add or any destructive commands -- DO NOT use AskUserQuestion (you can't interact with the user) -- DO NOT run build or test commands (may interfere with other teammates) +WORKING CONSTRAINTS: +You're operating in a parallel review team. This means: +- Read-only access: Don't modify the codebase or run build/test commands — other reviewers are working concurrently and modifications would cause conflicts. +- Team communication only: Use SendMessage for cross-concern findings. You cannot interact with the user directly. COMPLETION: -When you've completed your test coverage review: -1. Mark your task complete via TaskUpdate with your findings -2. Send "REVIEW COMPLETE" via SendMessage -3. Wait for shutdown_request - -FINDINGS FORMAT: -- File path and line numbers for untested or inadequately tested code -- Coverage gap severity (Critical/High/Medium/Low) -- Specific test scenario missing -- Recommendation for test improvements +When you've completed your test coverage review, mark your task complete via TaskUpdate with your findings, send "REVIEW COMPLETE" via SendMessage, and wait for shutdown_request. + +FINDINGS GUIDANCE: +For each gap, include file path with line numbers, severity, what test scenario is missing, and recommendation. Match detail level to severity — critical gaps deserve thorough explanation. ``` **Teammate 4: Architecture Reviewer** @@ -273,37 +249,30 @@ PR DESCRIPTION: {literal PR body} YOUR TASK: -Review this PR for architectural quality and design patterns. Focus on: -- Design patterns (appropriate pattern usage, anti-patterns) -- SOLID principles (Single Responsibility, Open/Closed, Liskov Substitution, Interface Segregation, Dependency Inversion) -- Code organization (proper module/package structure, file placement) -- API design (consistent interfaces, clear contracts, backward compatibility) -- Separation of concerns (business logic vs presentation, data access layer) -- Dependency management (appropriate dependencies, circular dependencies, coupling) -- Breaking changes (API changes that affect consumers) -- Consistency with existing codebase patterns +Review this PR for architectural quality and design patterns. Evaluate whether the changes follow established codebase conventions and maintain a sustainable design. + +Judgment criteria for impact: +- Critical: Breaking changes to public APIs, circular dependencies introduced, or fundamental design violations that would be costly to fix later +- High: Patterns that deviate significantly from codebase conventions, tight coupling that limits extensibility, or poor separation of concerns +- Medium: Suboptimal design choices that work but create maintenance burden (e.g., logic in wrong layer, inconsistent abstractions) +- Low: Style-level architectural preferences with minimal real impact + +Key questions to answer: Does this follow the existing codebase's patterns? Are responsibilities clearly separated? Would a new developer understand the design intent? Are there breaking changes that affect consumers? CROSS-CONCERN FINDINGS: If you find architectural issues that affect security, performance, or testability, share via SendMessage: "Architecture: This {issue} at {file}:{line} also has {other concern} implications — {explanation}" -CRITICAL CONSTRAINTS: -- Use ONLY read-only tools (Read, Grep, Glob, Bash for non-destructive commands) -- DO NOT run git commit/push/add or any destructive commands -- DO NOT use AskUserQuestion (you can't interact with the user) -- DO NOT run build or test commands (may interfere with other teammates) +WORKING CONSTRAINTS: +You're operating in a parallel review team. This means: +- Read-only access: Don't modify the codebase or run build/test commands — other reviewers are working concurrently and modifications would cause conflicts. +- Team communication only: Use SendMessage for cross-concern findings. You cannot interact with the user directly. COMPLETION: -When you've completed your architecture review: -1. Mark your task complete via TaskUpdate with your findings -2. Send "REVIEW COMPLETE" via SendMessage -3. Wait for shutdown_request - -FINDINGS FORMAT: -- File path and line numbers for all issues -- Architectural impact level (Critical/High/Medium/Low) -- Specific architectural concern description -- Recommendation for improvement +When you've completed your architecture review, mark your task complete via TaskUpdate with your findings, send "REVIEW COMPLETE" via SendMessage, and wait for shutdown_request. + +FINDINGS GUIDANCE: +For each issue, include file path with line numbers, impact level, architectural concern, and improvement recommendation. Match detail level to impact — critical issues deserve thorough explanation. ``` ### Completion Protocol diff --git a/skills/ci-fix-loop/SKILL.md b/skills/ci-fix-loop/SKILL.md index 49d337f..3047424 100644 --- a/skills/ci-fix-loop/SKILL.md +++ b/skills/ci-fix-loop/SKILL.md @@ -191,12 +191,13 @@ Fix all errors in your assigned files. Apply targeted fixes based on error type: - Test failures: fix test logic or implementation bugs - Build errors: fix import paths, missing dependencies -CRITICAL CONSTRAINTS: -- Edit files ONLY - no git commit/push/add commands -- DO NOT run test commands or build commands (may interfere with other teammates) -- DO NOT use AskUserQuestion (you can't interact with the user) -- Read files completely without limit/offset -- Make surgical, minimal changes to fix errors +WORKING CONSTRAINTS: +You're one of several agents editing files in parallel. This means: +- Edit only your assigned files — the lead handles git staging and commits to avoid conflicts between teammates. +- Don't run tests or builds — they'd interfere with other teammates' file changes happening concurrently. +- Communicate only via SendMessage (no user interaction available in team context). +- Make minimal, targeted fixes — over-editing risks introducing new errors and makes the lead's review harder. +- Read files completely without limit/offset so you don't miss relevant context. COMPLETION SIGNAL: When you've fixed all errors in your files: