Skip to content

Conversation

@JamesonRGrieve
Copy link
Owner

No description provided.

Remove `exec 1> >(tee ...)` patterns that create background subprocesses
which can hang indefinitely, causing the workflow to timeout at 0.4% CPU.

The process substitution with tee was only present in test-target-branch
and prepare-notification jobs, not in test-source-branch, which explains
why PRs passed (~15min) while main timed out (hours).

Root cause: When bash uses `>(tee ...)`, it creates a background process.
If the tee doesn't finish writing buffered data before the fd restore,
the shell waits indefinitely for the subprocess to exit.
- Simplify test-py-pytest.yml to single-branch test runner (~270 lines vs ~1200)
- Create run-branch-test.yml orchestrator with:
  - Framework detection (extensible for jest, xunit, etc.)
  - Target branch caching (avoids re-running tests on same SHA)
  - Concurrency groups (queues concurrent jobs testing same target)
  - Parallel test execution via pytest-xdist (-n auto)
- Fix timeout bug: removed problematic `exec 1> >(tee ...)` process
  substitution that caused jobs to hang at 0.4% CPU

Architecture:
- test-py-pytest.yml: Simple, stateless "test this branch" workflow
- run-branch-test.yml: Smart orchestrator that calls test workflow
  twice and compares via regression-test.yml

Caching flow:
1. Check cache for target branch + SHA
2. Cache hit: skip test workflow, use cached results
3. Cache miss: run tests, save results to cache
4. Concurrency group ensures only one job populates cache
@JamesonRGrieve JamesonRGrieve merged commit 42a39e7 into main Dec 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants