Thread individual GPU targets through test fetch pipeline #3452
Open
stellaraccident wants to merge 1 commit into main
When THEROCK_KPACK_SPLIT_ARTIFACTS=ON, the build produces per-target
archives (e.g. blas_lib_gfx942.tar.zst) but the test fetch logic only
looked for family-named archives (e.g. blas_lib_gfx94X-dcgpu.tar.zst),
causing test jobs to get only generic (host-only) binaries with no
device code.
Add `fetch-gfx-targets` list field to amdgpu_family_matrix.py mapping
each test runner to the GPU architecture(s) available on it. Thread
this as `amdgpu_targets` through the workflow chain:
configure_ci.py → multi_arch_ci_linux.yml → test_artifacts.yml →
test_{sanity_check,component}.yml → setup_test_environment action →
install_rocm_from_artifacts.py → fetch_artifacts.py
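
As a rough illustration of the first hop in that chain, a matrix entry could carry the new field and be flattened into the `amdgpu_targets` value along these lines. Only the `fetch-gfx-targets` and `amdgpu_targets` names come from this change; the entry layout, the runner label, the helper `amdgpu_targets_for`, and the comma-separated encoding are assumptions made for the sketch.

```python
# Hypothetical matrix entry. Only the "fetch-gfx-targets" key is taken from
# this PR; the other keys and values are placeholders for illustration.
EXAMPLE_MATRIX = {
    "gfx94x": {
        "family": "gfx94X-dcgpu",
        "test-runs-on": "example-gfx942-runner",  # placeholder runner label
        "fetch-gfx-targets": ["gfx942"],
    },
}


def amdgpu_targets_for(entry: dict) -> str:
    """Flatten the per-runner target list into one string.

    A comma-separated string is assumed here because workflow inputs are
    plain strings; the real encoding may differ.
    """
    return ",".join(entry.get("fetch-gfx-targets", []))


def matrix_row(entry: dict) -> dict:
    """Build the (hypothetical) CI matrix row passed down the workflow chain."""
    return {
        "amdgpu_family": entry["family"],
        "amdgpu_targets": amdgpu_targets_for(entry),
    }


if __name__ == "__main__":
    print(matrix_row(EXAMPLE_MATRIX["gfx94x"]))
```

An entry without the field yields an empty string, which downstream steps could treat as "family-named archives only".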
The fetch logic now uses inclusive ArtifactName-based matching: it
accepts both old family-named archives (mono-arch pipeline) and new
individual-target archives (split/kpack pipeline), so the same code
works against either bucket layout.
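
The inclusive matching described above can be sketched as follows. This is not the real `ArtifactName` class: `ParsedArchive`, `parse_archive`, and `select_archives` are hypothetical stand-ins, the `<name>_<component>_<target>.tar.<ext>` layout is inferred from the example filenames in this message, and the `generic` suffix for host-only archives is an assumption.

```python
import re
from typing import Iterable, NamedTuple, Optional


class ParsedArchive(NamedTuple):
    """Hypothetical stand-in for the structured name parsed from a filename."""
    name: str
    component: str
    target_family: str


# Assumed layout: <name>_<component>_<target>.tar.<zst|xz>
_ARCHIVE_RE = re.compile(
    r"^(?P<name>.+)_(?P<component>[^_]+)_(?P<target>[^_]+)\.tar\.(?:zst|xz)$"
)


def parse_archive(filename: str) -> Optional[ParsedArchive]:
    """Parse an archive filename, or return None if it does not fit the layout."""
    m = _ARCHIVE_RE.match(filename)
    if m is None:
        return None
    return ParsedArchive(m["name"], m["component"], m["target"])


def select_archives(
    filenames: Iterable[str], family: str, targets: Iterable[str] = ()
) -> list:
    """Keep archives whose target field is generic, the family, or a listed target."""
    accepted = {"generic", family, *targets}  # "generic" suffix is an assumption
    selected = []
    for filename in filenames:
        parsed = parse_archive(filename)
        if parsed is not None and parsed.target_family in accepted:
            selected.append(filename)
    return selected
```

Comparing the parsed target field against a set of accepted values, rather than searching for a substring, is what lets one code path serve both layouts: `gfx942` is not a substring of `gfx94X-dcgpu`, so a family-only substring test would miss the split archives.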
Changes:
- Add `fetch-gfx-targets` to all matrix entries in amdgpu_family_matrix.py
- Thread `amdgpu_targets` through configure_ci.py into CI matrix JSON
- Add `amdgpu_targets` input to test workflow YAML chain
- Accept `--amdgpu-targets` in fetch_artifacts.py, artifact_manager.py,
and install_rocm_from_artifacts.py
- Rewrite list_artifacts_for_group() to use ArtifactName.from_filename()
for structured parsing instead of substring matching
- Add `--amdgpu-targets` to artifact_manager.py fetch subcommand
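
For readers unfamiliar with the flag, one plausible way to accept `--amdgpu-targets` with `argparse` is sketched below. The comma-separated value format, the `parse_targets` helper, and the parser description are assumptions; the scripts in this change may define the option differently.

```python
import argparse


def parse_targets(value: str) -> list:
    """Split an assumed comma-separated target string, dropping empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser() -> argparse.ArgumentParser:
    """Build a minimal parser that only knows the flag named in this change."""
    parser = argparse.ArgumentParser(description="Illustrative fetch options")
    parser.add_argument(
        "--amdgpu-targets",
        type=parse_targets,
        default=[],  # no targets given: fall back to family-named archives
        help="Individual GPU targets to fetch, e.g. gfx942 (format assumed)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--amdgpu-targets=gfx942"])
    print(args.amdgpu_targets)
```

Defaulting to an empty list keeps the flag optional, which matches the "with and without `--amdgpu-targets`" behavior reported under Testing.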
Testing:
- 51 unit tests pass (configure_ci, fetch_artifacts, artifact_manager)
- Dry-run validated against real CI runs:
  - Multi-arch run 21854651990: correctly finds split archives like
    blas_lib_gfx942.tar.zst when --amdgpu-targets=gfx942
  - Mono-arch run 22080712092: correctly finds family-named archives
    like rocblas_lib_gfx94X-dcgpu.tar.xz (backwards compatible)
  - Both with and without --amdgpu-targets flag verified
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Summary
Fixes #3444 (sub-issue of #3336)

When `THEROCK_KPACK_SPLIT_ARTIFACTS=ON`, the build produces per-target archives (e.g., `blas_lib_gfx942.tar.zst`) but the test fetch logic only looked for family-named archives (e.g., `blas_lib_gfx94X-dcgpu.tar.zst`), causing test jobs to get only generic (host-only) binaries with no device code.
- Add `fetch-gfx-targets` list field to `amdgpu_family_matrix.py` mapping each test runner to the GPU architecture(s) on it
- Thread `amdgpu_targets` through the workflow chain: `configure_ci.py` → `multi_arch_ci_linux.yml` → `test_artifacts.yml` → `test_{sanity_check,component}.yml` → `setup_test_environment` → `install_rocm_from_artifacts.py` → `fetch_artifacts.py`
- Rewrite `list_artifacts_for_group()` to use `ArtifactName.from_filename()` for structured parsing with inclusive matching (accepts both old family-named and new target-named archives)
- Add `--amdgpu-targets` to `artifact_manager.py` fetch subcommand

Test plan
- … `blas_lib_gfx942.tar.zst`)
- … `multi_arch/integration-kpack` branch (pushed, running)