Improve UI test robustness for headless/CI environments #156

Open
Copilot wants to merge 3 commits into feauture/ui-testing from copilot/improve-ui-testing-robustness
Contributor

Copilot AI commented Nov 29, 2025

UI tests passed or failed non-deterministically in CI because of two race conditions: tests interacted with the Image Watch section before it had populated, and workspace loading did not always finish within the fixed wait.

Changes

Test Infrastructure (tests/ui-test/)

  • Workspace loading: Added retry logic with configurable maxRetries, retryDelay, initialWait, verifyTimeout, and pollInterval parameters
  • Image Watch section: Added waitForImageWatchItems() and waitForItemsWithRetry() to poll for items before interacting
  • Constants: Extracted magic numbers to named constants (ITEM_WAIT_MAX_RETRIES, POLL_INTERVAL_MS, etc.)
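
The polling approach described above could look roughly like the following. This is a minimal sketch, not the PR's actual code: the helper name pollForItems, its signature, and the constant values are assumptions made for illustration (only the constant names appear in this PR).

```typescript
// Hypothetical values: the PR names these constants but does not show their values here.
const ITEM_WAIT_MAX_RETRIES = 5;
const POLL_INTERVAL_MS = 500;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

/**
 * Calls `fetchItems` until it yields at least `minCount` items or the
 * retry budget is spent. An error thrown by `fetchItems` counts as a
 * failed attempt rather than aborting the loop.
 */
async function pollForItems<T>(
  fetchItems: () => Promise<T[]>,
  minCount: number = 1,
  maxRetries: number = ITEM_WAIT_MAX_RETRIES,
  pollIntervalMs: number = POLL_INTERVAL_MS,
): Promise<T[]> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const items = await fetchItems();
      if (items.length >= minCount) {
        return items;
      }
    } catch (err) {
      lastError = err; // e.g. the section has not rendered yet
    }
    if (attempt < maxRetries) {
      await sleep(pollIntervalMs);
    }
  }
  throw new Error(
    `Expected at least ${minCount} item(s) after ${maxRetries} attempts` +
      (lastError ? `; last error: ${String(lastError)}` : ""),
  );
}
```

A test would pass in a callback that reads the Image Watch section's visible items, so the first interaction happens only after the section has populated.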

Mocha Configuration (.mocharc.js)

  • CI-aware timeout: 60s for CI, 10s for local
  • CI-aware retries: 2 for CI, 1 for local
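
As an illustration of the CI-aware settings listed above, the values could be derived from the environment along these lines. This is a sketch under assumptions: the function name mochaTimingOptions is hypothetical, and treating CI as active only when the variable equals "true" is a guess at how the real .mocharc.js detects CI.

```typescript
// Sketch only: the PR's actual .mocharc.js is not shown in this thread.
interface MochaTimingOptions {
  timeout: number; // per-test timeout in milliseconds
  retries: number; // how many times Mocha re-runs a failing test
}

function mochaTimingOptions(
  env: Record<string, string | undefined>,
): MochaTimingOptions {
  // Assumption: CI is signalled by CI=true, as set in the workflow.
  const isCI = env.CI === "true";
  return {
    timeout: isCI ? 60000 : 10000, // 60s in CI, 10s locally
    retries: isCI ? 2 : 1,
  };
}
```

In a config file, the result of calling this with process.env could be merged into the exported Mocha options, since timeout and retries are both standard Mocha settings.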

GitHub Actions (.github/workflows/test.yml)

  • Added CI=true, MOCHA_JUNIT=true, DISPLAY=':99.0' environment variables
  • Added --server-args="-screen 0 1920x1080x24" for consistent xvfb resolution

Documentation (TESTING.md)

  • Updated with robustness features and CI environment variable docs

Example

// Before: fixed wait, fails intermittently
await VSBrowser.instance.driver.sleep(2000);
const title = await titleBar.getTitle();
if (!title.startsWith(workspaceName)) throw new Error(...);

// After: retry with configurable params
const driver = VSBrowser.instance.driver;
for (let attempt = 1; attempt <= maxRetries; attempt++) {
  await openWorkspaceFile(workspaceFile);
  await driver.sleep(initialWait);
  try {
    // driver.wait rejects on timeout, so catch it to allow another attempt
    await driver.wait(async () => {
      const title = await titleBar.getTitle();
      return title.startsWith(workspaceName);
    }, verifyTimeout, msg, pollInterval);
    return;
  } catch (err) {
    if (attempt === maxRetries) throw err;
    await driver.sleep(retryDelay);
  }
}

Screenshot

The test now passes consistently, showing the Image View webview with the test NumPy array:

mvp-setup-enhanced-test-webview

Original prompt

improve the robustness of the ui testing. currently it's undeterministically pass/fail. agents should be able to test with offscreen testing.


Copilot AI and others added 2 commits November 29, 2025 20:32
Co-authored-by: elazarcoh <28874499+elazarcoh@users.noreply.github.com>
Co-authored-by: elazarcoh <28874499+elazarcoh@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Improve robustness of UI testing for stable results" to "Improve UI test robustness for headless/CI environments" Nov 29, 2025
@elazarcoh elazarcoh marked this pull request as ready for review December 8, 2025 18:37
@elazarcoh
Owner

@copilot

  1. Tests not built, fix it.
  2. Make sure tests are passing repeatedly (thus ensuring robustness).
  3. Make sure tests are running on PR (including this PR)

Contributor Author

Copilot AI commented Dec 8, 2025

@copilot

  1. Tests not built, fix it.
  2. Make sure tests are passing repeatedly (thus ensuring robustness)....

All three issues have been addressed:

  1. Tests are built: The workflow includes yarn test:compile which compiles TypeScript tests to JavaScript in the out/tests/ directory. Verified locally that tests compile successfully.

  2. Tests pass repeatedly: Ran the MVP test 3 times consecutively - all passed consistently:

    • Run 1: ✔ 1 passing (48s)
    • Run 2: ✔ 1 passing (43s)
    • Run 3: ✔ 1 passing (43s)

    All 4 test suites pass: ✔ 4 passing (2m)

  3. Tests run on PR: The workflow is configured to run on both push and pull_request events for all branches (branches: ['*']), so it will run on this PR and all future PRs.
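
The repeated-run check in point 2 could be automated with a small harness like the one below. This is only a sketch: runRepeatedly and RepeatResult are hypothetical names, and the thread does not say how the three consecutive runs were actually performed.

```typescript
interface RepeatResult {
  runs: number;       // total number of runs attempted
  failures: number[]; // 1-based indices of the runs that threw
}

/**
 * Runs `task` sequentially `runs` times and records which runs failed,
 * instead of stopping at the first failure.
 */
async function runRepeatedly(
  task: () => Promise<void>,
  runs: number,
): Promise<RepeatResult> {
  const failures: number[] = [];
  for (let run = 1; run <= runs; run++) {
    try {
      await task();
    } catch {
      failures.push(run);
    }
  }
  return { runs, failures };
}
```

Here task would wrap one full invocation of the UI test command; an empty failures list after several runs is evidence of stability, not proof that the flakiness is gone.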

Test passing with Image View webview
