
feat: E2E cloud testing infrastructure for autonomous agents #2

Open
johnsonfamily1234 wants to merge 11 commits into main from feature/e2e-cloud-testing

Conversation

@johnsonfamily1234

Summary

  • Complete E2E testing infrastructure — GitHub Actions workflow, Playwright + Browserbase fixtures, Vercel preview deployment integration
  • User-pattern discovery process — structured interview skill so agents write tests based on real user journeys, not page structure
  • Enforcement at every level — PRD schema requires e2eTests field, Ralph loop blocks task completion without green CI, branch protection docs included
  • 4 test templates — Next.js webapp, API endpoints, CLI browser OAuth, plus reusable fixtures
  • Worker skills — e2e-testing skills added to frontend-dev, backend-dev, and qa-tester workers
  • Observability/metrics — tests for coverage tracking, agent-friendly result parser with jq queries

Architecture

Test Planning (user-pattern discovery interview)
    → Test Writing (templates + worker skills)
    → Cloud Execution (Vercel Preview + Playwright + Browserbase)
    → Result Parsing (agent-results.json)
    → Enforcement (PR blocked without green E2E)

13 user stories, 31 files, all knowledge/worker/workflow additions — no existing code modified.
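
The exact schema of agent-results.json is not shown on this page. A minimal sketch of the kind of flat, agent-friendly shape the result parser might emit (all field names here are assumptions, not the schema shipped in this PR):

```typescript
// Hypothetical shape of agent-results.json; field names are illustrative,
// not the actual output of the result parser in this PR.
interface AgentTestResult {
  title: string;                               // full test title, e.g. "checkout > guest can pay"
  status: "passed" | "failed" | "skipped";
  durationMs: number;
  error?: string;                              // first error message, present only on failures
}

interface AgentResults {
  total: number;
  passed: number;
  failed: number;
  skipped: number;
  results: AgentTestResult[];
}
```

A flat structure like this is what makes the jq queries mentioned in the Summary practical, for example something along the lines of `jq '[.results[] | select(.status == "failed")]' agent-results.json` to pull out only the failures.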

Test plan

  • Verify the GitHub Actions workflow triggers on PRs whose changed paths match app code
  • Run /run qa-tester test-plan against an existing project to validate the discovery flow
  • Run /run frontend-dev e2e-testing write to generate a test from a test plan
  • Confirm validate-prd.ps1 catches PRDs missing the e2eTests field (a hypothetical shape is sketched after this list)
  • Verify the Browserbase fixture falls back to local Playwright when no API key is set
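
The PRD schema itself is not reproduced here. A minimal sketch, as a TypeScript type, of what a required e2eTests entry could look like (names and fields are assumptions for illustration, not what validate-prd.ps1 actually checks):

```typescript
// Hypothetical PRD e2eTests entry; illustrative only, not the real schema.
interface PrdE2ETest {
  userStory: string;                                         // the user journey this test covers
  template: "nextjs-webapp" | "api-endpoint" | "cli-oauth";  // which test template it maps to
  specFile: string;                                          // e.g. "tests/e2e/checkout.spec.ts"
  critical: boolean;                                         // critical-path vs coverage classification
}

interface Prd {
  title: string;
  userStories: string[];
  e2eTests: PrdE2ETest[];  // a PRD without this field would fail validation
}
```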

🤖 Generated with Claude Code

johnsonfamily1234 and others added 11 commits February 7, 2026 22:30
Create a reusable Playwright E2E testing workflow that deploys to Vercel
preview, waits for readiness, and runs tests with Browserbase cloud
execution support and local fallback.

Key design decisions:
- Vercel project/team IDs use repository variables (vars.*) instead of
  hardcoded values, so any project can configure its own
- Test directory is configurable via E2E_TEST_DIR env var with
  tests/e2e default, overridable via workflow_dispatch input
- 3-job structure: deploy-preview -> wait-for-deployment -> e2e-tests
- Artifacts: JSON results, HTML report, failure screenshots/traces
- PR comment with pass/fail summary and failure details
- Concurrency group prevents duplicate runs per branch
- Validation step fails fast if required vars are missing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
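
The workflow YAML is not reproduced on this page. The wait-for-deployment job it describes could be as small as polling the preview URL until it serves traffic; a minimal sketch in TypeScript (Node 18+, global fetch), assuming readiness simply means an HTTP 2xx from the preview deployment:

```typescript
// Minimal readiness poll; an assumption about what wait-for-deployment does,
// not the actual job shipped in the workflow.
async function waitForDeployment(previewUrl: string, timeoutMs = 5 * 60_000): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      const res = await fetch(previewUrl);
      if (res.ok) return;                            // deployment is serving traffic
    } catch {
      // DNS not propagated yet or connection refused, keep polling
    }
    await new Promise((r) => setTimeout(r, 10_000)); // retry every 10 seconds
  }
  throw new Error(`Preview ${previewUrl} not ready after ${timeoutMs / 1000}s`);
}
```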
Create knowledge/testing/templates/ with portable fixtures for any project:
- browserbase.ts: CDP fixture with session recording and local fallback
- playwright.config.ts: Browserbase auto-detection, CI settings, reporters
- package.json: minimal deps with test/agent/debug npm scripts
- process-results.js: transforms Playwright JSON into agent-results.json
- README.md: usage guide with quick start and customization points

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
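
browserbase.ts itself is not shown in this conversation. A minimal sketch of the fallback pattern the commit describes: connect over CDP when a Browserbase key is present, otherwise launch Chromium locally. The env var name and connect URL are assumptions, and the session recording the real fixture handles is omitted here:

```typescript
import { test as base, chromium, type Browser } from "@playwright/test";

// Sketch of a CDP fixture with local fallback; the shipped browserbase.ts
// may differ, and the endpoint and env var names are assumptions.
export const test = base.extend<{}, { e2eBrowser: Browser }>({
  e2eBrowser: [
    async ({}, use) => {
      const apiKey = process.env.BROWSERBASE_API_KEY;
      const browser = apiKey
        ? await chromium.connectOverCDP(               // remote cloud browser over CDP
            `wss://connect.browserbase.com?apiKey=${apiKey}`
          )
        : await chromium.launch();                     // no key: local Chromium fallback
      await use(browser);
      await browser.close();
    },
    { scope: "worker" },
  ],
});
```

Tests would then import `test` from this file instead of `@playwright/test` and open contexts from `e2eBrowser`, so the same spec runs in the cloud on CI and locally during development.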
Port three testing knowledge docs from my-hq, replacing hardcoded
project IDs, team IDs, and repo references with placeholder variables.
All C:/my-hq paths replaced with relative paths, installer/-prefixed
paths updated to tests/e2e, and hq-installer-specific content
generalized for any Vercel-deployed project.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add ready-to-use Playwright testing templates covering the three most
common application types: Next.js web apps, REST API endpoints, and
CLI tools with browser-based OAuth flows. Each template includes setup,
playwright.config.ts example, common patterns, assertions, and cleanup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
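
The template contents are not reproduced on this page. A minimal sketch of the kind of journey-oriented test the Next.js webapp template describes; the route, link text, and labels below are hypothetical:

```typescript
import { test, expect } from "@playwright/test";

// Hypothetical user-journey test in the style the webapp template describes;
// selectors and copy are illustrative only.
test("visitor can reach the signup form from the landing page", async ({ page, baseURL }) => {
  await page.goto(baseURL ?? "/");

  // Navigate the way a user would: by visible text, not page structure.
  await page.getByRole("link", { name: "Get started" }).click();
  await expect(page).toHaveURL(/\/signup/);

  // Assert the form is actually usable, not just that the route rendered.
  await expect(page.getByLabel("Email")).toBeVisible();
  await expect(page.getByRole("button", { name: "Create account" })).toBeEnabled();
});
```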
Introduces the test-plan skill for the qa-tester worker - a structured
interview and analysis process that ensures E2E tests are grounded in
real user behavior rather than arbitrary UI coverage.

Key capabilities:
- Structured interview protocol covering critical journeys, revenue
  paths, fragile flows, edge cases, and minimum viable user journeys
- Automated analysis fallback when humans are unavailable (crawl UI
  structure, analyze source code, infer priorities)
- Hybrid mode combining automated analysis with validation interviews
- Machine-parseable JSON output consumed by write-test and run-tests
- Coverage matrix revealing blind spots across auth, navigation, forms,
  payments, API, mobile, and accessibility
- Clear critical-path vs coverage test classification
- Template mapping to knowledge/testing/templates/ patterns

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
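
The machine-parseable output this skill emits is not shown here. A minimal TypeScript sketch of a shape that write-test and run-tests could consume, built from the bullets above (all field names are assumptions):

```typescript
// Hypothetical test-plan output; illustrative only, not the skill's actual schema.
interface PlannedJourney {
  id: string;                                                // e.g. "checkout-guest"
  description: string;                                       // the user journey in plain language
  classification: "critical-path" | "coverage";
  template: "nextjs-webapp" | "api-endpoint" | "cli-oauth";  // mapping into knowledge/testing/templates/
  source: "interview" | "automated-analysis" | "hybrid";
}

interface TestPlan {
  project: string;
  journeys: PlannedJourney[];
  coverageMatrix: Record<
    "auth" | "navigation" | "forms" | "payments" | "api" | "mobile" | "accessibility",
    "covered" | "gap"
  >;
}
```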
Add a comprehensive CI E2E Verification section to ralph-loop-pattern.md
that requires the E2E workflow to pass before task completion. Includes:
- Full push/trigger/poll/parse workflow with gh CLI commands
- agent-results.json download and failure parsing
- 15-minute timeout with BLOCKED status handling
- Failure handling with checkpoint logging
- Emergency skip process with audit requirements
- Quick reference commands for all CI operations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
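
The gh CLI commands in ralph-loop-pattern.md are not reproduced on this page. A minimal TypeScript (Node) sketch of the poll step the commit describes; the workflow file name is an assumption, and the real commands in the doc may differ:

```typescript
import { execSync } from "node:child_process";

// Sketch of the poll step from the Ralph loop's CI E2E Verification;
// the workflow name (e2e.yml) is an assumption, not necessarily the real file.
async function waitForE2ERun(branch: string, timeoutMs = 15 * 60_000): Promise<"passed" | "failed" | "BLOCKED"> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const raw = execSync(
      `gh run list --branch ${branch} --workflow e2e.yml --limit 1 --json status,conclusion`,
      { encoding: "utf8" }
    );
    const [run] = JSON.parse(raw) as { status: string; conclusion: string | null }[];
    if (run?.status === "completed") {
      return run.conclusion === "success" ? "passed" : "failed";
    }
    await new Promise((r) => setTimeout(r, 30_000)); // poll every 30 seconds
  }
  return "BLOCKED"; // 15-minute timeout: report BLOCKED rather than guessing a result
}
```

On failure, the agent-results.json download mentioned above would map to the `gh run download` family of commands; the artifact name used by the workflow is not shown here.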
