Create new test type `aamtest` for accessibility API testing by spectranaut · Pull Request #57696 · web-platform-tests/wpt

spectranaut · 2026-02-11T00:17:09Z

This PR adds a new test type to test accessibility APIs exposed by browsers, as defined by the ARIA, Core-AAM and HTML-AAM specifications. The RFC can be found here.

This is a potential replacement for: #53733

Instead of extending testharness, I added a new test type (aamtest) that is similar to wdspec tests and uses a lot of the same infrastructure. This idea came from @foolip in this comment on the RFC, and I think it looks good!

To run tests on Linux:

On Debian distros:
apt install libatspi2.0-dev libcairo2-dev libgirepository1.0-dev

# Chrome
./wpt run chrome wai-aria-aam/role/blockquote_tentative.py

# Chromium
./wpt run --binary <chromiumbinary> chromium wai-aria-aam/role/blockquote_tentative.py

# Firefox (needs --no-headless explicitly set)
./wpt run --no-headless firefox wai-aria-aam/role/blockquote_tentative.py

On Mac:

Run chrome tests with --no-headless. Safari does not yet support this test type.

community-tc-integration · 2026-02-11T20:24:29Z

Uh oh! Looks like an error!

Client ID static/taskcluster/github does not have sufficient scopes and is missing the following scopes:

{
  "AnyOf": [
    "queue:rerun-task:taskcluster-github/RBJqfU0pQ82PBMqZMXxGYw/RqnO5x7vQu-LPaJ5z8Ue5w",
    "queue:rerun-task-in-project:none",
    {
      "AllOf": [
        "queue:rerun-task",
        "assume:scheduler-id:taskcluster-github/RBJqfU0pQ82PBMqZMXxGYw"
      ]
    }
  ]
}

This request requires the client to satisfy the following scope expression:

{
  "AnyOf": [
    "queue:rerun-task:taskcluster-github/RBJqfU0pQ82PBMqZMXxGYw/RqnO5x7vQu-LPaJ5z8Ue5w",
    "queue:rerun-task-in-project:none",
    {
      "AllOf": [
        "queue:rerun-task",
        "assume:scheduler-id:taskcluster-github/RBJqfU0pQ82PBMqZMXxGYw"
      ]
    }
  ]
}

method: rerunTask
errorCode: InsufficientScopes
statusCode: 403
time: 2026-02-11T20:24:29.184Z

spectranaut · 2026-02-11T22:08:35Z

@jcsteh -- I'd love your early feedback on this completely new direction for adding AAM tests. The tests are like wpt's webdriver spec (wdspec) tests, all in Python!

Look at the blockquote test.

The APIs are passed to the test as arguments ("fixtures" in pytest speak -- defined in wai-aria-aam/support/fixtures_a11y_api.py). The atspi argument is an AtspiWrapper object, the axapi argument is an AxapiWrapper object, and the ia2 argument is an Ia2Wrapper object.

You can see these tests already in the wpt.fyi for this PR: https://wpt.fyi/results/?label=pr_head&max-count=1&pr=57696

jcsteh · 2026-02-12T07:47:48Z

@spectranaut Thanks for the early ping and for your work on this. This looks really neat!

I haven't looked at this in-depth yet, but here are some early thoughts:

1. I notice that this moves away from the declarative approach and is more imperative. On one hand, that's what I was advocating for, so that's nice for me. :) On the other hand, I recall you feeling strongly about declarative tests, so I'd love to understand why you feel the imperative approach works with this Python framework, but didn't fit for the TestDriver framework. I totally understand if you just had a change of heart and incorporated that here, but if there's a different and/or another reason you think it makes more sense here, that understanding might guide other thinking and future possibilities.
2. This more or less flips the flow. Instead of writing web stuff and calling out to Python to test it, we now call out to the browser from Python to load web stuff and then test it in Python. At the risk of stating the obvious, while there were challenges with the former approach for complex cases (e.g. needing to potentially send Python code to be evaluated), there are also challenges with this latter approach for complex cases (e.g. testing mutations will require sending JS to be evaluated). It's probably fair to say that we need to run more Python than we do JS for these tests, so driving them with Python will reduce the amount of ugly cross-language shenanigans, but I just want to flag that we're not going to escape this altogether; we will absolutely need to test many kinds of mutations going forward. I reckon most of the obscurest browser engine bugs I have to fix end up being related to mutations in some way or another. :)
3. I'm not super familiar with this framework, so just to double check, is it definitely possible for us to evaluate whatever JS we need to run via the session object?
4. Can we await results from JS too; e.g. await some DOM event before executing something in Python? We'll mostly want to wait for accessibility events, not DOM events, but there are complex cases where being able to wait for some DOM event can be useful.
5. Speaking of accessibility events, I do think this will make supporting those simpler. We could have done that by sending Python from JS, having the Python block and then return the result to JS, which is what the Gecko IA2/ATK/UIA tests do. However, that might have been a bit tricky/ugly with TestDriver, whereas it's cleaner and more straightforward if we can keep it all in Python. FWIW, I wrote Python helpers to wait for specific IA2, ATK and UIA events for Gecko tests, so that should hopefully be helpful when we get to that point.
6. It'd be nice if we could avoid the if not atspi: return style boilerplate at the top of every test, but we probably can't. I thought about using a decorator that could handle this for us, but I suspect the "magic" used by fixtures wouldn't like that much and it only really reduces the boilerplate by 1 or 2 lines anyway (a decorator still means a line of code).

spectranaut · 2026-02-17T20:41:38Z

@jcsteh thanks as always for the thoughts! :)

On 1, imperative vs declarative -- tbh I never had a strong preference either way, maybe a slight one :) I think the declarative approach aligns with the way the Core-AAM mappings are presented: they are somewhat simplified and have their own language for describing the APIs. Plus we could reuse all the manual tests Joanie maintained. But you've convinced me that imperative tests, closer to the API, will get us better results and make a better and more flexible test suite in the long run.

On 2, on the python vs html+js flip -- yeah I see the tradeoffs! The tests in this PR all have inline HTML, but for more complicated tests, we can create a separate HTML file to open. If we are going to write imperative tests (and I've been convinced we should), I think we should write the tests in Python and accept those tradeoffs.

On 3, on session objects/executing javascript -- the session object is an implementation of webdriver maintained in wpt here: https://github.com/web-platform-tests/wpt/tree/master/tools/webdriver. These tests have all of webdriver available to them, including the ability to send in javascript to execute, or to send clicks, keys, etc.

On 4, on DOM events -- in webdriver classic, you can't wait on DOM events, you can only poll for changes, which is probably good enough? There is a way to wait for things with webdriver bidi, but Safari doesn't have support for bidi yet.

On 5, accessibility events -- awesome, yes, that will be helpful, and I think accessibility event testing will be easier here too.

On 6, the if not atspi: return -- it's not great and I'll keep an eye out for options. I'm not sure that fixtures can help, but maybe some other pytest feature can. I really want there to be a "not applicable" concept which can be applied to subtests, but I haven't dug into that.

spectranaut · 2026-02-19T19:55:33Z

Hi @jcsteh -- I'm noticing that these tests are flaky on Firefox on Linux, and I wonder if you know why or can think of an easy fix. The flakes were caught by the Community-TC Integration / wpt-firefox-nightly-stability check and are easy to reproduce locally.

Basically, the nodes all appear in the tree, but not all the correct attributes are set by the time we query for them.

In the code, before we run the test, we (1) load the webpage, then (2) find the correct tab (role: document web), then (3) wait until "busy" is not set.

But when you run the test immediately after that, finding the node by DOM ID fails sometimes -- the blockquote node does not always have a DOM ID attribute. I added a poll to try to work around this, but it doesn't seem like a great solution, and I'm now getting flaky failures while looking for another attribute in another test, as you can see in this CI report.

Am I waiting for busy on the wrong thing? Or is this a bug in Firefox?

jcsteh · 2026-02-19T21:29:16Z

Ah, this is due to caching granularity. By default, we only enable a small set of cached attributes to improve memory usage and performance, since a lot of clients don't need everything. When a client first requests something that isn't in the cache, we asynchronously enable it from that point forward. You can work around this by setting the pref accessibility.enable_all_cache_domains to true, the same way you set the accessibility.force_disabled pref.

spectranaut · 2026-02-20T19:13:22Z

Bad news, @jcsteh 😢
I turned on caching and I still see the flake. I confirmed the setting was on in about:config. See the flake report when I remove polling for the dom id and the flake report with polling enabled -- it's essentially the same as if this setting was not set.

jcsteh · 2026-02-23T03:34:47Z

Very odd. I'll need to get this running locally so I can shove some logging into Gecko and see what's going on. What's really strange is that we have a whole bunch of Gecko tests which cover exactly this behaviour.

jcsteh · 2026-03-02T07:34:07Z

@spectranaut, are you far enough along with Windows or Mac testing to know whether this flake shows up for Firefox on either of those platforms? That is, is this just a Linux flake at this stage or is that not conclusive yet?

spectranaut · 2026-03-03T16:28:16Z

hi @jonathan-j-lee ! Could you take a look at this alternative test format for the same AAM tests? Some of the code is the same and that code includes your review feedback on the other PR (#53733)

wai-aria-aam/attribute/aria-autocomplete_tentative.py

jonathan-j-lee

tools/ mostly LG, but will wait for the RFC to land. I think this is a nice improvement from #53733 overall, since the accessibility-specific infrastructure sits closer to the tests where it's used.

tools/manifest/sourcefile.py

jonathan-j-lee · 2026-03-04T00:53:36Z

tools/wptrunner/wptrunner/browsers/chrome.py


+    if test_type == "aamtest":
+        # Necessary to force chrome to register in AT-SPI2.
+        os.environ["ACCESSIBILITY_ENABLED"] = "1"


It seems this will set ACCESSIBILITY_ENABLED=1 in the main wptrunner process permanently. If so, chrome instances for later test types could inadvertently inherit this environment variable, inducing flakiness.

I think it would be better to add {"env": {"ACCESSIBILITY_ENABLED": "1"}} to browser_kwargs for test_type=aamtest:

wpt/tools/wptrunner/wptrunner/browsers/chrome.py

Lines 51 to 55 in 9bd5c2f

def browser_kwargs(logger, test_type, run_info_data, config, **kwargs):
    return {"binary": kwargs["binary"],
            "webdriver_binary": kwargs["webdriver_binary"],
            "webdriver_args": kwargs.get("webdriver_args"),
            "leak_check": kwargs.get("leak_check", False)}

... which will be plumbed to the chrome(driver) process tree here:

wpt/tools/wptrunner/wptrunner/browsers/base.py

Line 360 in 9bd5c2f

env=self.env,

Going to/from the aamtest type will always induce a browser restart, which will clear ACCESSIBILITY_ENABLED=1.
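
To illustrate the point about scoping the variable to one browser launch, here is a small standalone sketch. It is not wptrunner code: `launch` and the throwaway child process are stand-ins for starting a browser, and `runner_env` stands in for the runner's own environment.

```python
import os
import subprocess
import sys

VAR = "ACCESSIBILITY_ENABLED"
PRINT_VAR = f"import os; print(os.environ.get({VAR!r}, 'unset'))"


def launch(env):
    """Start a child process with exactly `env` and report what it sees for VAR."""
    out = subprocess.run([sys.executable, "-c", PRINT_VAR], env=env,
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()


# Baseline environment without the variable, standing in for the runner's own env.
runner_env = {k: v for k, v in os.environ.items() if k != VAR}

# Per-launch copy: only this process tree gets the variable.
aamtest_env = {**runner_env, VAR: "1"}

print(launch(aamtest_env))  # the "aamtest" child sees 1
print(launch(runner_env))   # a later child for another test type sees unset
```

Because the variable only lives in the dictionary handed to that one launch, nothing needs to be cleaned up afterwards.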

Awesome, thanks for catching this and the pointers!

I immediately ran into a problem, though: to get it to work I need to update this line:

wpt/tools/wptrunner/wptrunner/browsers/base.py

Line 324 in 9bd5c2f

self.env = os.environ.copy() if env is None else env

To:

self.env = {**os.environ, **env} if env else os.environ.copy()

It doesn't look like any other product uses the env key in their return from browser_kwargs, so I think this change might be fine to do?

As another possibility, firefox_android.py returns a "env_extras" key:

wpt/tools/wptrunner/wptrunner/browsers/firefox_android.py

Line 83 in 9bd5c2f

browser_kwargs["env_extras"] = dict([x.split('=') for x in kwargs.get("env", [])])

Which seems to be used the way I want to use self.env (as additional env variables, not the complete set)... but it is only used within firefox_android.py, and has a really confusing name collision with the "def env_extra()", which seems to be something very different (defined in all products but only used in sauce.py).

Do you have an opinion about either direction? Treat "env" as additional environment variables with the one-line fix above, or return "env_extras" instead and add code to handle it as additional env variables in either chrome.py or maybe base.py?

tools/wptrunner/wptrunner/browsers/chrome.py

jonathan-j-lee · 2026-03-04T01:13:15Z

wai-aria-aam/support/api_wrapper.py

+                f"Couldn't find browser {self.product_name} in accessibility API {self.ApiName}."
+            )
+
+    def _find_browser(self) -> Any:


To be more specific than Any, we can parameterize the class over root's type.

ApiWrapper looks like an abstract class, so we can make it so and mark _find_browser as an abstract method that subclasses must implement.

Putting this together:

import abc
from typing import Generic, TypeVar

Browser = TypeVar('Browser')

class ApiWrapper(Generic[Browser], abc.ABC):
    ...
    @abc.abstractmethod
    def _find_browser(self) -> Browser:
        ...

... and then Ia2Wrapper would look like:

class Ia2Wrapper(ApiWrapper[IAccessible2Ptr]):
    def _find_browser(self) -> IAccessible2Ptr:
        ...
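
A runnable variant of the same pattern, for anyone who wants to try it outside the PR. `FakeWrapper` and the constructor shown here are hypothetical (the real ApiWrapper constructor is not reproduced), and a plain string stands in for a platform API object.

```python
import abc
from typing import Generic, TypeVar

Browser = TypeVar("Browser")


class ApiWrapper(Generic[Browser], abc.ABC):
    def __init__(self, product_name: str) -> None:
        self.product_name = product_name
        self.root = self._find_browser()

    @abc.abstractmethod
    def _find_browser(self) -> Browser:
        """Return the platform-specific root object for the browser."""


class FakeWrapper(ApiWrapper[str]):
    # Hypothetical subclass: a plain string stands in for a real API object.
    def _find_browser(self) -> str:
        return f"root-of-{self.product_name}"


print(FakeWrapper("chrome").root)  # root-of-chrome
```

Instantiating ApiWrapper directly raises TypeError because `_find_browser` is abstract, which is the guarantee the abc.ABC base adds.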

Great, thanks for the explanation about the abstract class definitions, applied!

I applied the parameterization as well, but it's a bit odd on Windows and Mac. For example, on Windows, in ia2_wrapper.py, I use a type alias for readability (IAccessiblePtr = Any) because I couldn't find a way to get a type for those COM interface pointer objects. In the CI, mypy doesn't run on these files, but when I run it locally I get the mypy error: ia2_wrapper.py:73: error: Class cannot subclass value of type "Any" [misc]. So it's still Any for Windows. It's the same scenario for macOS.

I do have a type for Linux (atspi_wrapper.py) though :)

jonathan-j-lee · 2026-03-04T01:17:52Z

wai-aria-aam/support/api_wrapper.py

+    def _find_browser(self) -> Any:
+        pass
+
+    def _poll_for(self, find: Callable[[], Any], error: str) -> Any:


Similarly, instead of Any, parameterization can document that _poll_for() and find() have the same return type:

PollResult = TypeVar('PollResult')

def _poll_for(self, find: Callable[[], PollResult]) -> PollResult:
    ...

(The error argument seems unused?)

cool! And actually the error is used, see the raise TimeoutError
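
A sketch of how the typed signature and the error argument can coexist. This is written as a free function rather than the method in api_wrapper.py, it assumes `find` returns None until the target appears, and the `timeout` and `interval` parameters are hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

PollResult = TypeVar("PollResult")


def poll_for(find: Callable[[], Optional[PollResult]], error: str,
             timeout: float = 2.0, interval: float = 0.01) -> PollResult:
    """Call `find` until it returns a non-None value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = find()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(error)  # this is where the error message is used
        time.sleep(interval)


# Usage: the "node" only shows up on the third attempt.
attempts = iter([None, None, "blockquote-node"])
print(poll_for(lambda: next(attempts), "node never appeared"))  # blockquote-node
```

The return type follows whatever `find` produces, so callers get a typed result without casting.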

wai-aria-aam/support/ia2/constants.py

jcsteh · 2026-03-05T05:46:48Z

it's essentially the same as if this setting was not set.

Well, uh, I found a bug that basically means this pref doesn't work. Ahem. I filed bug 2021210 and submitted a patch. Once that gets into Nightly, this problem should hopefully be resolved.

jcsteh · 2026-03-10T07:10:48Z

@spectranaut The accessibility.enable_all_cache_domains fix has landed in Firefox now. I'm hoping that should fix the flake issues you're having with Firefox.

jcsteh · 2026-03-12T06:13:22Z

It just occurred to me that we're creating a new directory here: wai-aria-aam. However, we already have various -aam directories: core-aam, html-aam, etc. Many of the non-core tests will probably remain TestDriver tests using get_computed_role and eventually get_accessibility_properties. But is there any reason we're not doing most of this in the existing core-aam directory? My concern is that we now have two places to find core-aam tests, which is kinda confusing.

spectranaut · 2026-03-12T15:21:17Z

It just occurred to me that we're creating a new directory here: wai-aria-aam.

Right, this was on my list to discuss with people before landing, thanks for bringing it up. I put it all in one directory because all the AAM tests/API tests will be using the same Python infrastructure -- everything currently in the wai-aria-aam/support/ directory. Also, wptrunner knows the test type based on the directory of the tests. All the wdspec tests are in a single directory (webdriver), so I copied that plan for consistency when I was making the proof of concept, and I just hadn't revisited it yet.

It wouldn't be a problem to use the existing directories, however. Each -aam directory could have an aamtests folder that contains the Python tests, html-aam could point to the core-aam/aamtests/support/ directory, and we could detect the aamtest test type for any test contained in an aamtests folder.

So I'll make that switch in the next couple days!
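
A rough sketch of what folder-based detection could look like. This is not the actual logic in tools/manifest/sourcefile.py; `is_aamtest`, the example paths, and the rule that files under support/ are excluded are all assumptions made for illustration.

```python
from pathlib import PurePosixPath


def is_aamtest(rel_path: str) -> bool:
    """Guess whether a repo-relative path is an aamtest, by directory name only."""
    path = PurePosixPath(rel_path)
    parent_dirs = path.parts[:-1]
    return (path.suffix == ".py"
            and "aamtests" in parent_dirs
            and "support" not in parent_dirs)  # helpers are not tests


print(is_aamtest("core-aam/aamtests/role/blockquote_tentative.py"))  # True
print(is_aamtest("core-aam/aamtests/support/api_wrapper.py"))        # False
print(is_aamtest("core-aam/role/blockquote.html"))                   # False
```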

moz-wptsync-bot added the mozilla:gecko-blocked label Feb 11, 2026

spectranaut force-pushed the acacia-wdspec-style-tests branch 4 times, most recently from 5aacc60 to 2016a87 Compare February 11, 2026 19:47

spectranaut force-pushed the acacia-wdspec-style-tests branch 2 times, most recently from 3101e53 to 7bda3f7 Compare February 11, 2026 21:13

spectranaut requested a review from jcsteh February 11, 2026 21:35

This was referenced Feb 17, 2026

Meeting: March 3, 2026, @ 9 AM PST web-platform-tests/interop-accessibility#219

Closed

RFC 204: WPT testing for AAMs (platform accessibility APIs) web-platform-tests/rfcs#204

Open

spectranaut force-pushed the acacia-wdspec-style-tests branch 2 times, most recently from a9fac34 to de8bf82 Compare February 19, 2026 00:58

spectranaut added 5 commits February 19, 2026 11:28

Create new test type for accessibility API testing

80f1246

Fix mac tests

a45d66b

Fix windows

389751b

Refactor atspi_wrapper and fix flake

94cf02c

Refactor wrappers, formatting

8765a3a

spectranaut force-pushed the acacia-wdspec-style-tests branch from a09c749 to 8765a3a Compare February 19, 2026 19:29

spectranaut added 2 commits February 20, 2026 09:37

Enable firefox caching and don't poll for id

4cf35c7

Add back polling for firefox

81691a3

Use test result 'PRECONDITION_FAILED' for tests on wrong platform

9637525

spectranaut requested a review from gsnedders March 3, 2026 16:28

gsnedders reviewed Mar 3, 2026

spectranaut changed the title ~~Create new test type for accessibility API testing~~ Create new test type aamtest for accessibility API testing Mar 3, 2026

spectranaut marked this pull request as ready for review March 3, 2026 21:53

spectranaut requested review from a team as code owners March 3, 2026 21:53

wpt-pr-bot added the ci, docker, infra, manifest, mypy.ini, Taskcluster, wai-aria-aam, wpt, and wptrunner labels Mar 3, 2026

wpt-pr-bot assigned gsnedders Mar 3, 2026

wpt-pr-bot requested review from DanielRyanSmith, foolip and jgraham March 3, 2026 21:53

spectranaut added 2 commits March 3, 2026 14:54

Use proper pytest parameterizing norms

832ef1f

In AXAPI: use the PID if it exists

f1ebab5

jonathan-j-lee reviewed Mar 4, 2026

Code review comments

904605c

spectranaut mentioned this pull request Mar 11, 2026

Interop 2026 Accessibility Investigation web-platform-tests/interop-accessibility#202

Open

cookiecrook requested a review from twilco March 11, 2026 22:32

6 participants
