Which Linear task is linked to this PR?
EMB-330 — CI: Smoke test examples on dependency and code changes
Why was it implemented this way?
What this does
Adds a Playwright E2E smoke test suite and a GitHub Actions workflow that verifies the widget examples still render correctly when dependencies or example code change.
The smoke tests verify 3 things against each example app:
The tests use a Component Object Model rather than a traditional Page Object Model (POM): the widget is a single-page component with internal navigation and no URL-based page transitions. Selectors use ARIA roles and accessible names scoped to the widget root ([id^="widget-app-expanded-container"]), making them framework-agnostic. No data-testid attributes or source code changes to examples are required.
The CI workflow (examples-smoke-test.yml) runs on PRs with the check-examples label:
- A shared change (a packages/* file, lockfile, root config, or e2e/ change) triggers all 10 examples.
- An isolated examples/<name>/ change triggers only that example.
Why only 10 examples?
All 10 that are compatible have been verified:
The remaining examples are excluded for documented reasons:
- @metamask/connect-evm, Rollup polyfill error, ESM shim rejection)
- subvariant: 'custom' renders "Deposit"/"NFT Checkout" UI instead of "Exchange"; needs its own test assertions
- <iframe>: requires page.frameLocator()
- variant/widget, not/package.json
Each can be addressed incrementally. The 10 covered examples already catch the most important signal: "did a package change break the widget in a real framework integration?"
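The path-based selection described for the CI workflow (shared changes run every example, isolated changes run one) could be sketched roughly as follows. This is an illustrative sketch, not the workflow's actual implementation: the function name `selectExamples`, the example names in the usage note, and the rule that any root-level file counts as "lockfile or root config" are all assumptions.

```typescript
// Hypothetical helper: decide which example apps to smoke-test for a set of changed files.
function selectExamples(
  changedFiles: readonly string[],
  allExamples: readonly string[],
): string[] {
  const selected = new Set<string>();

  for (const file of changedFiles) {
    // Shared changes affect every example.
    // Assumption: any file at the repository root stands in for "lockfile or root config".
    const isShared =
      file.startsWith('packages/') ||
      file.startsWith('e2e/') ||
      !file.includes('/');
    if (isShared) {
      return [...allExamples];
    }

    // An isolated examples/<name>/ change selects only that example,
    // and only if it is one of the covered examples.
    const name = /^examples\/([^/]+)\//.exec(file)?.[1];
    if (name !== undefined && allExamples.includes(name)) {
      selected.add(name);
    }
  }

  // Preserve the order of the covered-example list.
  return allExamples.filter((example) => selected.has(example));
}
```

With placeholder example names `vite` and `nextjs`, a change to `examples/vite/src/App.tsx` would select only `vite`, while a change under `packages/` would select both.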
Why no auto-approve?
The workflow is intentionally gated behind the check-examples label. Auto-approve was considered (EMB-332) but deferred because of an unresolved concern:
If the workflow triggers automatically on every PR (no label), it would only test examples, but the PR might contain broader changes (backend logic, build config, other packages). Auto-approving based solely on passing example smoke tests could give a false sense of security for PRs that touch more than just examples.
The label-scoped approach is safer: a human explicitly says "this PR needs example verification", the workflow runs, and the results are visible. But even with the label, someone could add it to any PR and get a green check that only reflects example health — not the full PR quality.
Whether to auto-approve, and under what conditions, is an open discussion point. Options include: auto-approve only when the PR exclusively touches examples/ paths, require a separate approval for non-example changes, or keep it manual.
Note on selector robustness
The selectors use ARIA roles and visible text (e.g. getByRole('button', { name: /^From/ })) rather than data-testid attributes. This is intentional: we want to add data-testid attributes to the widget source in the future, but they would need to be stripped from production builds, since the widget is a library consumed by third parties. Because the smoke tests run against built example apps (not dev mode), data-testid attributes would not be available at test time unless we solve the build-stripping problem first.
The current ARIA-based selectors are stable enough for smoke-level verification but may need updates if i18n strings or accessible names change. This is a known tradeoff documented in the README.
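To make the Component Object Model idea concrete, here is a minimal sketch of a component object whose queries are all scoped to the widget root. The class name `WidgetComponent`, the method `fromButton`, and the `Scope` interface are hypothetical. `Scope` is a structural stand-in for the small subset of Playwright's `Page`/`Locator` API used here (`locator` and `getByRole`), so the sketch stays self-contained. The actual suite may be organized differently.

```typescript
// Assumption: a structural stand-in for the parts of Playwright's Page/Locator API used below.
interface Scope {
  locator(selector: string): Scope;
  getByRole(role: string, options?: { name?: string | RegExp }): Scope;
}

// Widget root selector, as given in the description.
const WIDGET_ROOT = '[id^="widget-app-expanded-container"]';

// Hypothetical component object: every query starts from the widget root,
// so markup belonging to the host framework cannot match by accident.
class WidgetComponent {
  private readonly root: Scope;

  constructor(page: Scope) {
    this.root = page.locator(WIDGET_ROOT);
  }

  // The button whose accessible name starts with "From".
  fromButton(): Scope {
    return this.root.getByRole('button', { name: /^From/ });
  }
}
```

In a real test, a Playwright `page` would presumably be passed to the constructor and the returned locator asserted on (for example, that it is visible). Because the selectors rely on accessible names, a change to the "From" label would require updating the regular expression.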
Visual showcase (Screenshots or Videos)
N/A — CI workflow, no UI changes.
Checklist before requesting a review