test(smoke): setup playwright test suite #684

Draft
vinzenzLIFI wants to merge 7 commits into main from test/emb-331-create-playwright-test-suite

vinzenzLIFI commented Apr 2, 2026

Which Linear task is linked to this PR?

EMB-330 — CI: Smoke test examples on dependency and code changes

Why was it implemented this way?

What this does

Adds a Playwright E2E smoke test suite and a GitHub Actions workflow that verifies widget examples still render correctly when dependencies or example code change.

The smoke tests verify 3 things against each example app:

  1. The widget container renders with the Exchange heading, From/To buttons, and Send input
  2. The Settings view opens via the cog icon, all setting rows are present, and back-navigation works
  3. Token selection works end-to-end: the test opens the From selector, picks a token, opens the To selector, picks a second token, and checks that both are reflected in the Exchange view

The tests use a Component Object Model (not traditional POM — the widget is a single-page component with internal navigation, no URL-based page transitions). Selectors use ARIA roles and accessible names scoped to the widget root ([id^="widget-app-expanded-container"]), making them framework-agnostic. No data-testid attributes or source code changes to examples are required.
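A minimal sketch of what such a component object could look like, not the code in this PR. The root selector and the From button selector are quoted from this description; the class name `ExchangeWidget`, the heading role for "Exchange", and the `/^To/` name pattern are assumptions made for illustration.

```typescript
import type { Locator, Page } from '@playwright/test';

// Root selector quoted in the PR description.
const WIDGET_ROOT = '[id^="widget-app-expanded-container"]';

// Component object: wraps the widget root and exposes role-based locators,
// so specs never touch framework-specific markup.
class ExchangeWidget {
  readonly root: Locator;

  constructor(page: Page) {
    this.root = page.locator(WIDGET_ROOT);
  }

  // Assumption: the "Exchange" title is exposed with the ARIA heading role.
  heading(): Locator {
    return this.root.getByRole('heading', { name: 'Exchange' });
  }

  // Selector quoted in the PR description.
  fromButton(): Locator {
    return this.root.getByRole('button', { name: /^From/ });
  }

  // Assumption: the To button follows the same naming pattern as From.
  toButton(): Locator {
    return this.root.getByRole('button', { name: /^To/ });
  }
}
```

In a spec, a check along the lines of `await expect(new ExchangeWidget(page).heading()).toBeVisible()` after `page.goto('/')` would cover the first smoke assertion. Because every locator is chained from `root`, text elsewhere in the host example app cannot produce a false match.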

The CI workflow (examples-smoke-test.yml) runs on PRs with the check-examples label:

  1. Detects what changed — shared dependency change (any packages/* file, lockfile, root config, or e2e/ change) triggers all 10 examples; an isolated examples/<name>/ change triggers only that example
  2. Builds and tests in parallel — each affected example gets its own matrix shard (build → serve → run Playwright tests)
  3. Uploads Playwright reports on failure as artifacts for debugging
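The detection step can be sketched as a pure function over the list of changed file paths. This is an illustration of the rule described above, not the logic in examples-smoke-test.yml. The function names are hypothetical, and treating every root-level file as "lockfile or root config" is a simplifying assumption.

```typescript
// The ten covered examples, as listed in this PR description.
const COVERED_EXAMPLES: readonly string[] = [
  'vite', 'connectkit', 'nextjs', 'nextjs15', 'privy',
  'privy-ethers', 'rainbowkit', 'reown', 'svelte', 'zustand-widget-config',
];

// Assumption: any file under packages/ or e2e/, or any file at the repository
// root (lockfile, root config), counts as a shared change.
function isSharedChange(file: string): boolean {
  return file.startsWith('packages/') || file.startsWith('e2e/') || !file.includes('/');
}

// Returns the examples that should get a matrix shard for this change set.
function selectExamples(changedFiles: string[]): string[] {
  const touched = new Set<string>();
  for (const file of changedFiles) {
    const name = /^examples\/([^/]+)\//.exec(file)?.[1];
    if (name !== undefined) {
      // Isolated change: only that example, and only if it is covered.
      if (COVERED_EXAMPLES.includes(name)) touched.add(name);
    } else if (isSharedChange(file)) {
      // Shared change: every covered example is affected.
      return [...COVERED_EXAMPLES];
    }
  }
  return COVERED_EXAMPLES.filter((example) => touched.has(example));
}
```

For instance, `selectExamples(['examples/vite/src/App.tsx'])` yields only `vite`, while any path under `packages/` yields all ten. The returned array is the kind of value a workflow could serialize to JSON and feed into a job matrix.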

Why only 10 examples?

All 10 that are compatible have been verified:

  • vite, connectkit, nextjs, nextjs15, privy, privy-ethers, rainbowkit, reown, svelte, zustand-widget-config

The remaining examples are excluded for documented reasons:

| Category | Examples | Reason |
| --- | --- | --- |
| Build/serve failures | nextjs14, nextjs14-page-router, nuxt, react-router-7, remix | Pre-existing dependency issues (missing @metamask/connect-evm, Rollup polyfill error, ESM shim rejection) |
| Different widget mode | deposit-flow, nft-checkout | subvariant: 'custom' renders "Deposit"/"NFT Checkout" UI instead of "Exchange"; needs its own test assertions |
| Iframe isolation | vite-iframe, vite-iframe-wagmi | Widget inside <iframe>; requires page.frameLocator() variant |
| Non-root route | tanstack-router | Widget at /widget, not / |
| Auth-gated | dynamic | Requires wallet auth init before widget renders |
| Vue wrapper | vue | React-in-Vue (veaury) bridge doesn't expose widget root ID |
| Empty | nextjs-page-router | No package.json |

Each can be addressed incrementally. The 10 covered examples already catch the most important signal: "did a package change break the widget in a real framework integration?"
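As one illustration of an incremental follow-up, the iframe examples could be covered by resolving the widget root through `page.frameLocator()`. The helper below is a hypothetical sketch: `widgetRoot` and its `iframeSelector` parameter are not part of this PR, and it assumes the widget keeps the same root ID inside the iframe.

```typescript
import type { Locator, Page } from '@playwright/test';

// Root selector quoted in the PR description.
const WIDGET_ROOT = '[id^="widget-app-expanded-container"]';

// Hypothetical helper: resolve the widget root directly on the page, or
// inside an iframe when an iframe selector is supplied.
function widgetRoot(page: Page, iframeSelector?: string): Locator {
  if (iframeSelector === undefined) {
    return page.locator(WIDGET_ROOT);
  }
  // frameLocator() scopes all further lookups to the iframe's document.
  return page.frameLocator(iframeSelector).locator(WIDGET_ROOT);
}
```

With a helper like this, the existing role-based assertions could stay unchanged and only the root lookup would vary per example.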

Why no auto-approve?

The workflow is gated behind the check-examples label intentionally. Auto-approve was considered (EMB-332) but deferred because of an unresolved concern:

If the workflow triggers automatically on every PR (no label), it would only test examples — but the PR might contain broader changes (backend logic, build config, other packages). Auto-approving based solely on example smoke tests passing could give a false sense of security for PRs that touch more than just examples.

The label-scoped approach is safer: a human explicitly says "this PR needs example verification", the workflow runs, and the results are visible. But even with the label, someone could add it to any PR and get a green check that only reflects example health — not the full PR quality.

Whether to auto-approve, and under what conditions, is an open discussion point. Options include: auto-approve only when the PR exclusively touches examples/ paths, require a separate approval for non-example changes, or keep it manual.
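To make the first option concrete, the "exclusively touches examples/ paths" condition could be expressed as a small predicate. This is a sketch for discussion only; `touchesOnlyExamples` is a hypothetical name, and treating an empty change set as not eligible is an assumption.

```typescript
// Returns true only when the PR changes at least one file and every changed
// file lives under examples/. Anything else needs a human review.
function touchesOnlyExamples(changedFiles: string[]): boolean {
  return (
    changedFiles.length > 0 &&
    changedFiles.every((file) => file.startsWith('examples/'))
  );
}
```

A PR that also edits a file under `packages/` would fail this check, which addresses the false-sense-of-security concern above: a green smoke test would never stand in for review of non-example changes.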

Note on selector robustness

The selectors use ARIA roles and visible text (e.g. getByRole('button', { name: /^From/ })) rather than data-testid attributes. This is intentional. We want to add data-testid attributes to the widget source in the future, but because the widget is a library consumed by third parties, they would need to be stripped from production builds. The smoke tests run against built example apps (not dev mode), so data-testid attributes would not be available at test time unless we first solve the build-stripping problem.

The current ARIA-based selectors are stable enough for smoke-level verification but may need updates if i18n strings or accessible names change. This is a known tradeoff documented in the README.

Visual showcase (Screenshots or Videos)

N/A — CI workflow, no UI changes.

Checklist before requesting a review

  • I have performed a self-review and testing of my code.
  • This pull request is focused and addresses a single problem.
  • If this PR modifies the Widget API or adds new features that require documentation, I have updated the documentation in the public-docs repository.
