[build-tools] Add eas/report_maestro_test_results build function#3388

Open
hSATAC wants to merge 1 commit into main from ash/eng-19032-report-workflowtestcaseresult-in-reuse_device-true-route

Conversation

@hSATAC hSATAC commented Feb 9, 2026

Why

Maestro test jobs produce per-flow results on the build worker (pass/fail, duration, error messages, retries), but this data is lost after the job finishes.

We want to persist these as WorkflowDeviceTestCaseResult records to surface test outcomes in the dashboard.

How

Add a new eas/report_maestro_test_results build function that:

  1. Parses JUnit XML reports from the junit_report_directory (primary data source)
  2. Enriches with flow file paths from Maestro's ai-*.json debug output in tests_directory
  3. Calls the createWorkflowDeviceTestCaseResults GraphQL mutation via gql.tada + graphqlClient

Key details:

  • JUnit XML as primary data source — ai-*.json only used for flow_file_path mapping and retry count
  • junit_report_directory added as output on eas/__maestro_test (set when output_format=junit) and as optional input on eas/report_maestro_test_results
  • Retry detection by counting occurrences of the same flow across timestamp directories (works for both reuse_devices: true and false patterns)
  • Duplicate testcase name detection — skips report when Maestro config-level name collisions make flow-to-file mapping ambiguous
  • Path relativization uses fs.realpath to handle symlinks (/tmp → /private/tmp on macOS)
  • Best-effort reporting — errors are logged but never fail the build step
  • graphqlClient exposed on CustomBuildContext for step functions to use
  • 39 unit tests covering JUnit parsing, flow metadata, retry detection, properties/tags, duplicate detection, and error resilience
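
As a rough illustration of the retry and duplicate-name checks listed above, here is a minimal TypeScript sketch. It is not the code from this PR: the FlowOccurrence shape, the function names countRetries and findDuplicateNames, and the rule "retries = distinct timestamp directories containing the flow, minus one" are all assumptions made for the example.

```typescript
// Hypothetical shape: one entry per ai-*.json file found under tests_directory.
interface FlowOccurrence {
  flowName: string; // assumed to match the JUnit <testcase name="...">
  timestampDir: string; // timestamp directory the debug file was found in
}

// Assumed rule: a flow seen in N distinct timestamp directories was retried N - 1 times.
function countRetries(occurrences: FlowOccurrence[]): Map<string, number> {
  const dirsByFlow = new Map<string, Set<string>>();
  for (const { flowName, timestampDir } of occurrences) {
    const dirs = dirsByFlow.get(flowName) ?? new Set<string>();
    dirs.add(timestampDir);
    dirsByFlow.set(flowName, dirs);
  }

  const retries = new Map<string, number>();
  dirsByFlow.forEach((dirs, flowName) => {
    retries.set(flowName, dirs.size - 1);
  });
  return retries;
}

// Returns each testcase name that appears more than once, in first-seen order.
// A non-empty result means flow-to-file mapping is ambiguous and the report
// would be skipped.
function findDuplicateNames(names: string[]): string[] {
  const seen = new Set<string>();
  const duplicates: string[] = [];
  for (const name of names) {
    if (seen.has(name) && duplicates.indexOf(name) === -1) {
      duplicates.push(name);
    }
    seen.add(name);
  }
  return duplicates;
}
```

Counting distinct directories rather than raw file occurrences is a design choice of this sketch: it keeps the count stable if a single run writes more than one debug file for the same flow.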

Test Plan

github-actions bot commented Feb 9, 2026

Size Change: +3.48 kB (0%)

Total Size: 72.7 MB

Filename Size Change
./packages/eas-cli/dist/eas-linux-x64.tar.gz 72.7 MB +3.48 kB (0%)

compressed-size-action

@hSATAC hSATAC force-pushed the ash/eng-19032-report-workflowtestcaseresult-in-reuse_device-true-route branch from 2e32515 to 3201ab1 Compare February 9, 2026 16:52
@hSATAC hSATAC added the "no changelog" label (PR that doesn't require a changelog entry) Feb 9, 2026
@hSATAC hSATAC force-pushed the ash/eng-19032-report-workflowtestcaseresult-in-reuse_device-true-route branch 2 times, most recently from 51c142a to cc45900 Compare February 10, 2026 06:46
codecov bot commented Feb 10, 2026

Codecov Report

❌ Patch coverage is 93.29268% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.81%. Comparing base (0c32aa3) to head (1b474a6).

Files with missing lines Patch % Lines
...d-tools/src/steps/functions/maestroResultParser.ts 91.67% 10 Missing ⚠️
...ls/src/steps/functions/reportMaestroTestResults.ts 97.50% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3388      +/-   ##
==========================================
+ Coverage   52.32%   52.81%   +0.49%     
==========================================
  Files         807      809       +2     
  Lines       33868    34032     +164     
  Branches     7033     7071      +38     
==========================================
+ Hits        17719    17970     +251     
+ Misses      14750    14669      -81     
+ Partials     1399     1393       -6     


@hSATAC hSATAC force-pushed the ash/eng-19032-report-workflowtestcaseresult-in-reuse_device-true-route branch from cc45900 to f171e9e Compare February 10, 2026 07:12
@hSATAC hSATAC marked this pull request as ready for review February 10, 2026 07:19
@hSATAC hSATAC requested a review from sjchmiela February 10, 2026 07:19
Subscribed to pull request

File Patterns Mentions
**/* @douglowder

Generated by CodeMention

@hSATAC hSATAC force-pushed the ash/eng-19032-report-workflowtestcaseresult-in-reuse_device-true-route branch 3 times, most recently from 275434c to 7fb1035 Compare February 11, 2026 14:35
hSATAC commented Feb 11, 2026

The Size / compare check is currently failing because of the yarn gql prebuild script. The GraphQL schema change has not been deployed yet, so the script can't fetch the new schema it requires.

@hSATAC hSATAC requested a review from sjchmiela February 11, 2026 14:42
@hSATAC hSATAC force-pushed the ash/eng-19032-report-workflowtestcaseresult-in-reuse_device-true-route branch 3 times, most recently from 198a91d to 1476a49 Compare February 13, 2026 14:14
@hSATAC hSATAC requested a review from sjchmiela February 13, 2026 14:32
flowFilePath: string;
}

export interface MaestroFlowResult {
Contributor

should this be a zod schema we would validate against? same for FlowMetadata?

Contributor Author

FlowMetadata will be switched to z.output as part of the next comment.

For MaestroFlowResult — this is an internal type we construct ourselves in parseMaestroResults(), not something parsed from external input. TypeScript already guarantees correctness at compile time here, so adding Zod runtime validation wouldn't provide additional safety. Happy to change if you feel strongly though.

* Reads tags from a Maestro flow YAML file's config section.
* Flow files are structured as: config (object) + `---` + commands (array).
*/
export async function parseFlowTags(flowFilePath: string): Promise<string[]> {
Contributor

Maestro 2.2.0 supposedly added tags and properties to XML. Maybe we can rely on that? I think that might simplify code a lot?

Contributor

Ah, I see you're leaning towards parsing JSON… hmm

One peculiarity about commands is that it may contain multiple FAILED steps. I have a commands.json like that, where a subflow is marked as FAILED and a single command is also marked as FAILED. Not sure we want to have to remember and handle these cases… But maybe we do, and we just pick the last (by timestamp) FAILED step as the one that triggered the failure…

Contributor

I'll let you decide what to do. Maybe we can submit a PR to Maestro to get millisecond precision on durations… https://github.com/mobile-dev-inc/Maestro/blob/e47ba45faf8318d411717ff7a03cc1b5d764fb72/maestro-cli/src/main/java/maestro/cli/report/JUnitTestSuiteReporter.kt#L55

Contributor Author

@hSATAC hSATAC Feb 24, 2026

Let's stick with JUnit as the primary source for now. Maestro seems to be investing more in JUnit support, so aligning with that direction should be a good bet.

I checked the Maestro source — as of 2.2.0, tags are included in JUnit XML as <property name="tags" value="smoke, critical, auth"/> (comma-separated). Custom properties were added in 2.1.0. I'll update parseJUnitTestCases() to extract tags from properties directly, and drop parseFlowTags().

For older Maestro versions that don't emit tags/properties, these will gracefully degrade to empty values ([] / {}), so no breakage.

We still need ai-*.json for flow_file_path (to compute relative paths) and retry counting (via timestamp directory occurrences).
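
To make the tags/properties plan above concrete, here is a minimal sketch of how already-parsed JUnit properties could be split into tags and custom properties. It is an illustration only: the JUnitProperty shape and the extractTagsAndProperties name are hypothetical, and it assumes the XML has already been parsed into name/value pairs. The only detail taken from the comment above is that tags arrive as a single comma-separated "tags" property.

```typescript
// Hypothetical shape for one <property name="..." value="..."/> element
// after XML parsing.
interface JUnitProperty {
  name: string;
  value: string;
}

interface TagsAndProperties {
  tags: string[];
  properties: Record<string, string>;
}

// Splits the "tags" property into a trimmed list and collects every other
// property as a custom key/value pair. Missing properties degrade to [] / {}.
function extractTagsAndProperties(
  properties: JUnitProperty[] | undefined
): TagsAndProperties {
  const tags: string[] = [];
  const custom: Record<string, string> = {};

  for (const property of properties ?? []) {
    if (property.name === "tags") {
      for (const raw of property.value.split(",")) {
        const tag = raw.trim();
        if (tag.length > 0) {
          tags.push(tag);
        }
      }
    } else {
      custom[property.name] = property.value;
    }
  }

  return { tags, properties: custom };
}
```

With input `[{ name: "tags", value: "smoke, critical, auth" }]` the sketch yields the tags `smoke`, `critical`, and `auth`; with `undefined` (as for a Maestro version that emits no properties) it yields an empty tag list and an empty properties object.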

@hSATAC hSATAC force-pushed the ash/eng-19032-report-workflowtestcaseresult-in-reuse_device-true-route branch 2 times, most recently from 238fd51 to 7a2c0cc Compare February 24, 2026 15:11
@hSATAC hSATAC force-pushed the ash/eng-19032-report-workflowtestcaseresult-in-reuse_device-true-route branch from 7a2c0cc to 1b474a6 Compare February 24, 2026 15:19
⏩ The changelog entry check has been skipped since the "no changelog" label is present.
