
feat: E2E cloud testing infrastructure for autonomous agents #2

Open
johnsonfamily1234 wants to merge 11 commits into main from feature/e2e-cloud-testing

Conversation

@johnsonfamily1234

Summary

  • Complete E2E testing infrastructure — GitHub Actions workflow, Playwright + Browserbase fixtures, Vercel preview deployment integration
  • User-pattern discovery process — structured interview skill so agents write tests based on real user journeys, not page structure
  • Enforcement at every level — PRD schema requires e2eTests field, Ralph loop blocks task completion without green CI, branch protection docs included
  • 4 test templates — Next.js webapp, API endpoints, CLI browser OAuth, plus reusable fixtures
  • Worker skills — e2e-testing skills added to frontend-dev, backend-dev, and qa-tester workers
  • Observability/metrics — tests for coverage tracking, agent-friendly result parser with jq queries

Architecture

Test Planning (user-pattern discovery interview)
    → Test Writing (templates + worker skills)
    → Cloud Execution (Vercel Preview + Playwright + Browserbase)
    → Result Parsing (agent-results.json)
    → Enforcement (PR blocked without green E2E)

13 user stories, 31 files, all knowledge/worker/workflow additions — no existing code modified.
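
The exact schema of agent-results.json is not shown on this page. A minimal sketch of the kind of flat, agent-friendly shape the result parser might emit (all field names here are assumptions, not the schema shipped in this PR):

```typescript
// Hypothetical shape of agent-results.json; field names are illustrative,
// not the actual output of the result parser in this PR.
interface AgentTestResult {
  title: string;                               // full test title, e.g. "checkout > guest can pay"
  status: "passed" | "failed" | "skipped";
  durationMs: number;
  error?: string;                              // first error message, present only on failures
}

interface AgentResults {
  total: number;
  passed: number;
  failed: number;
  skipped: number;
  results: AgentTestResult[];
}
```

A flat structure like this is what makes the jq queries mentioned in the Summary practical, for example something along the lines of `jq '[.results[] | select(.status == "failed")]' agent-results.json` to pull out only the failures.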

Test plan

  • Verify the GitHub Actions workflow triggers on PRs whose changed paths match app code
  • Run /run qa-tester test-plan against an existing project to validate the discovery flow
  • Run /run frontend-dev e2e-testing write to generate a test from a test plan
  • Confirm validate-prd.ps1 catches PRDs missing the e2eTests field (a hypothetical shape is sketched after this list)
  • Verify the Browserbase fixture falls back to local Playwright when no API key is set
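
The PRD schema itself is not reproduced here. A minimal sketch, as a TypeScript type, of what a required e2eTests entry could look like (names and fields are assumptions for illustration, not what validate-prd.ps1 actually checks):

```typescript
// Hypothetical PRD e2eTests entry; illustrative only, not the real schema.
interface PrdE2ETest {
  userStory: string;                                         // the user journey this test covers
  template: "nextjs-webapp" | "api-endpoint" | "cli-oauth";  // which test template it maps to
  specFile: string;                                          // e.g. "tests/e2e/checkout.spec.ts"
  critical: boolean;                                         // critical-path vs coverage classification
}

interface Prd {
  title: string;
  userStories: string[];
  e2eTests: PrdE2ETest[];  // a PRD without this field would fail validation
}
```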

🤖 Generated with Claude Code

johnsonfamily1234 and others added 11 commits February 7, 2026 22:30
Create a reusable Playwright E2E testing workflow that deploys to Vercel
preview, waits for readiness, and runs tests with Browserbase cloud
execution support and local fallback.

Key design decisions:
- Vercel project/team IDs use repository variables (vars.*) instead of
  hardcoded values, so any project can configure its own
- Test directory is configurable via E2E_TEST_DIR env var with
  tests/e2e default, overridable via workflow_dispatch input
- 3-job structure: deploy-preview -> wait-for-deployment -> e2e-tests
- Artifacts: JSON results, HTML report, failure screenshots/traces
- PR comment with pass/fail summary and failure details
- Concurrency group prevents duplicate runs per branch
- Validation step fails fast if required vars are missing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
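
The workflow YAML is not reproduced on this page. The wait-for-deployment job it describes could be as small as polling the preview URL until it serves traffic; a minimal sketch in TypeScript (Node 18+, global fetch), assuming readiness simply means an HTTP 2xx from the preview deployment:

```typescript
// Minimal readiness poll; an assumption about what wait-for-deployment does,
// not the actual job shipped in the workflow.
async function waitForDeployment(previewUrl: string, timeoutMs = 5 * 60_000): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      const res = await fetch(previewUrl);
      if (res.ok) return;                            // deployment is serving traffic
    } catch {
      // DNS not propagated yet or connection refused, keep polling
    }
    await new Promise((r) => setTimeout(r, 10_000)); // retry every 10 seconds
  }
  throw new Error(`Preview ${previewUrl} not ready after ${timeoutMs / 1000}s`);
}
```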
Create knowledge/testing/templates/ with portable fixtures for any project:
- browserbase.ts: CDP fixture with session recording and local fallback
- playwright.config.ts: Browserbase auto-detection, CI settings, reporters
- package.json: minimal deps with test/agent/debug npm scripts
- process-results.js: transforms Playwright JSON into agent-results.json
- README.md: usage guide with quick start and customization points

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
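
browserbase.ts itself is not shown in this conversation. A minimal sketch of the fallback pattern the commit describes: connect over CDP when a Browserbase key is present, otherwise launch Chromium locally. The env var name and connect URL are assumptions, and the session recording the real fixture handles is omitted here:

```typescript
import { test as base, chromium, type Browser } from "@playwright/test";

// Sketch of a CDP fixture with local fallback; the shipped browserbase.ts
// may differ, and the endpoint and env var names are assumptions.
export const test = base.extend<{}, { e2eBrowser: Browser }>({
  e2eBrowser: [
    async ({}, use) => {
      const apiKey = process.env.BROWSERBASE_API_KEY;
      const browser = apiKey
        ? await chromium.connectOverCDP(               // remote cloud browser over CDP
            `wss://connect.browserbase.com?apiKey=${apiKey}`
          )
        : await chromium.launch();                     // no key: local Chromium fallback
      await use(browser);
      await browser.close();
    },
    { scope: "worker" },
  ],
});
```

Tests would then import `test` from this file instead of `@playwright/test` and open contexts from `e2eBrowser`, so the same spec runs in the cloud on CI and locally during development.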
Port three testing knowledge docs from my-hq, replacing hardcoded
project IDs, team IDs, and repo references with placeholder variables.
All C:/my-hq paths replaced with relative paths, installer/-prefixed
paths updated to tests/e2e, and hq-installer-specific content
generalized for any Vercel-deployed project.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add ready-to-use Playwright testing templates covering the three most
common application types: Next.js web apps, REST API endpoints, and
CLI tools with browser-based OAuth flows. Each template includes setup,
playwright.config.ts example, common patterns, assertions, and cleanup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
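
The template contents are not reproduced on this page. A minimal sketch of the kind of journey-oriented test the Next.js webapp template describes; the route, link text, and labels below are hypothetical:

```typescript
import { test, expect } from "@playwright/test";

// Hypothetical user-journey test in the style the webapp template describes;
// selectors and copy are illustrative only.
test("visitor can reach the signup form from the landing page", async ({ page, baseURL }) => {
  await page.goto(baseURL ?? "/");

  // Navigate the way a user would: by visible text, not page structure.
  await page.getByRole("link", { name: "Get started" }).click();
  await expect(page).toHaveURL(/\/signup/);

  // Assert the form is actually usable, not just that the route rendered.
  await expect(page.getByLabel("Email")).toBeVisible();
  await expect(page.getByRole("button", { name: "Create account" })).toBeEnabled();
});
```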
Introduces the test-plan skill for the qa-tester worker - a structured
interview and analysis process that ensures E2E tests are grounded in
real user behavior rather than arbitrary UI coverage.

Key capabilities:
- Structured interview protocol covering critical journeys, revenue
  paths, fragile flows, edge cases, and minimum viable user journeys
- Automated analysis fallback when humans are unavailable (crawl UI
  structure, analyze source code, infer priorities)
- Hybrid mode combining automated analysis with validation interviews
- Machine-parseable JSON output consumed by write-test and run-tests
- Coverage matrix revealing blind spots across auth, navigation, forms,
  payments, API, mobile, and accessibility
- Clear critical-path vs coverage test classification
- Template mapping to knowledge/testing/templates/ patterns

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
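
The machine-parseable output this skill emits is not shown here. A minimal TypeScript sketch of a shape that write-test and run-tests could consume, built from the bullets above (all field names are assumptions):

```typescript
// Hypothetical test-plan output; illustrative only, not the skill's actual schema.
interface PlannedJourney {
  id: string;                                                // e.g. "checkout-guest"
  description: string;                                       // the user journey in plain language
  classification: "critical-path" | "coverage";
  template: "nextjs-webapp" | "api-endpoint" | "cli-oauth";  // mapping into knowledge/testing/templates/
  source: "interview" | "automated-analysis" | "hybrid";
}

interface TestPlan {
  project: string;
  journeys: PlannedJourney[];
  coverageMatrix: Record<
    "auth" | "navigation" | "forms" | "payments" | "api" | "mobile" | "accessibility",
    "covered" | "gap"
  >;
}
```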
Add a comprehensive CI E2E Verification section to ralph-loop-pattern.md
that requires the E2E workflow to pass before task completion. Includes:
- Full push/trigger/poll/parse workflow with gh CLI commands
- agent-results.json download and failure parsing
- 15-minute timeout with BLOCKED status handling
- Failure handling with checkpoint logging
- Emergency skip process with audit requirements
- Quick reference commands for all CI operations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
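
The gh CLI commands in ralph-loop-pattern.md are not reproduced on this page. A minimal TypeScript (Node) sketch of the poll step the commit describes; the workflow file name is an assumption, and the real commands in the doc may differ:

```typescript
import { execSync } from "node:child_process";

// Sketch of the poll step from the Ralph loop's CI E2E Verification;
// the workflow name (e2e.yml) is an assumption, not necessarily the real file.
async function waitForE2ERun(branch: string, timeoutMs = 15 * 60_000): Promise<"passed" | "failed" | "BLOCKED"> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const raw = execSync(
      `gh run list --branch ${branch} --workflow e2e.yml --limit 1 --json status,conclusion`,
      { encoding: "utf8" }
    );
    const [run] = JSON.parse(raw) as { status: string; conclusion: string | null }[];
    if (run?.status === "completed") {
      return run.conclusion === "success" ? "passed" : "failed";
    }
    await new Promise((r) => setTimeout(r, 30_000)); // poll every 30 seconds
  }
  return "BLOCKED"; // 15-minute timeout: report BLOCKED rather than guessing a result
}
```

On failure, the agent-results.json download mentioned above would map to the `gh run download` family of commands; the artifact name used by the workflow is not shown here.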
