Bump rocm-systems from 0decb2c to 6276d4d #3307
geomin12
left a comment
lgtm assuming CI passes
FYI, we need to retry failed jobs, as the GitHub issues earlier caused failures
Will also take a look, since I'm on rotation this week!
Looked at https://github.com/ROCm/TheRock/actions/runs/21823463045 and saw some failed tests. Re-running the failed ones.
A newer version of rocm-systems exists, but since this PR has been edited by someone other than Dependabot I haven't updated it. You'll get a PR for the updated version as normal once this PR is merged.
Found the problematic commit for the rocrtst failures: ROCm/rocm-systems@de8012a. Tested locally: with that commit reverted, rocrtst builds.
Bumps [rocm-systems](https://github.com/ROCm/rocm-systems) from `0decb2c` to `6276d4d`.
- [Release notes](https://github.com/ROCm/rocm-systems/releases)
- [Commits](ROCm/rocm-systems@0decb2c...6276d4d)
---
updated-dependencies:
- dependency-name: rocm-systems
  dependency-version: 6276d4d7ab8350531e84a24d3db65b9f98d85eb6
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
84a3b89 to 1251645
Quick notes on 11 failures so far:
- lib+devel wheels for windows on gfx1151 rocm_sdk test failed
https://github.com/ROCm/TheRock/actions/runs/21923355781/job/63423253017?pr=3307#step:6:37
testCLIPathBin (rocm_sdk.tests.devel_test.ROCmDevelTest.testCLIPathBin) ... ++ Exec [B:\actions-runner\_work\TheRock\TheRock]$ 'B:\actions-runner\_work\TheRock\TheRock\.venv\Scripts\python.exe' -P -m rocm_sdk path --bin
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "B:\actions-runner\_work\TheRock\TheRock\.venv\Lib\site-packages\rocm_sdk\__main__.py", line 154, in <module>
main()
File "B:\actions-runner\_work\TheRock\TheRock\.venv\Lib\site-packages\rocm_sdk\__main__.py", line 150, in main
args.func(args)
File "B:\actions-runner\_work\TheRock\TheRock\.venv\Lib\site-packages\rocm_sdk\__main__.py", line 16, in _do_path
root_path = _devel.get_devel_root()
^^^^^^^^^^^^^^^^^^^^^^^
File "B:\actions-runner\_work\TheRock\TheRock\.venv\Lib\site-packages\rocm_sdk\_devel.py", line 63, in get_devel_root
_expand_devel_contents(rocm_sdk_devel_path, site_lib_path)
File "B:\actions-runner\_work\TheRock\TheRock\.venv\Lib\site-packages\rocm_sdk\_devel.py", line 154, in _expand_devel_contents
_lock_and_expand(
File "B:\actions-runner\_work\TheRock\TheRock\.venv\Lib\site-packages\rocm_sdk\_devel.py", line 208, in _lock_and_expand
dest_path.hardlink_to(hardlink_target)
File "B:\actions-runner\_work\_tool\Python\3.12.10\x64\Lib\pathlib.py", line 1396, in hardlink_to
os.link(target, self)
FileNotFoundError: [WinError 3] The system cannot find the path specified: 'B:\\actions-runner\\_work\\TheRock\\TheRock\\.venv\\Lib\\site-packages\\_rocm_sdk_devel\\bin\\hipblaslt\\library\\..\\..\\..\\..\\_rocm_sdk_libraries_gfx1151\\bin\\hipblaslt\\library\\TensileLibrary_BB_BB_HA_Bias_Aux_SAV_UA_Type_BB_HPA_Contraction_l_Ailk_Bjlk_Cijk_Dijk_gfx1151.co' -> 'B:\\actions-runner\\_work\\TheRock\\TheRock\\.venv\\Lib\\site-packages\\_rocm_sdk_devel\\bin\\hipblaslt\\library\\TensileLibrary_BB_BB_HA_Bias_Aux_SAV_UA_Type_BB_HPA_Contraction_l_Ailk_Bjlk_Cijk_Dijk_gfx1151.co'
ERROR
testCLIPathCMake (rocm_sdk.tests.devel_test.ROCmDevelTest.testCLIPathCMake) ... ++ Exec [B:\actions-runner\_work\TheRock\TheRock]$ 'B:\actions-runner\_work\TheRock\TheRock\.venv\Scripts\python.exe' -P -m rocm_sdk path --cmake
FAIL
This looks like a file path length issue? FileNotFoundError: [WinError 3] The system cannot find the path specified: 'B:\\actions-runner\\_work\\TheRock\\TheRock\\.venv\\Lib\\site-packages\\_rocm_sdk_devel\\bin\\hipblaslt\\library\\..\\..\\..\\..\\_rocm_sdk_libraries_gfx1151\\bin\\hipblaslt\\library\\TensileLibrary_BB_BB_HA_Bias_Aux_SAV_UA_Type_BB_HPA_Contraction_l_Ailk_Bjlk_Cijk_Dijk_gfx1151.co' -> 'B:\\actions-runner\\_work\\TheRock\\TheRock\\.venv\\Lib\\site-packages\\_rocm_sdk_devel\\bin\\hipblaslt\\library\\TensileLibrary_BB_BB_HA_Bias_Aux_SAV_UA_Type_BB_HPA_Contraction_l_Ailk_Bjlk_Cijk_Dijk_gfx1151.co'
I wouldn't expect a change to rocm-systems to affect those files from rocm-libraries... I wonder if the tests are flaky or dependent on the runner? Is long path support enabled on that test runner?
Good call, will check for that on:
Runner name: 'windows-strix-halo-gpu-rocm-2'
Runner group name: 'default'
Machine name: 'CHRN-SI-112'
'B:\\actions-runner\\_work\\TheRock\\TheRock\\.venv\\Lib\\site-packages\\_rocm_sdk_devel\\bin\\hipblaslt\\library\\..\\..\\..\\..\\_rocm_sdk_libraries_gfx1151\\bin\\hipblaslt\\library\\TensileLibrary_BB_BB_HA_Bias_Aux_SAV_UA_Type_BB_HPA_Contraction_l_Ailk_Bjlk_Cijk_Dijk_gfx1151.co' seems to be exactly 260 characters, though it gets resolved to the latter path, which is 198.
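The character counts above can be double-checked with a short script. This is only a sketch: it rebuilds the two paths from the error message with `ntpath` (so it runs on any OS) and compares them against the classic Windows `MAX_PATH` of 260, which includes the terminating NUL. The helper name `fits_in_max_path` is made up for illustration, and whether the un-normalized form is what the underlying Win32 call actually checks is not established in this thread.

```python
import ntpath

# Classic Windows limit: 260 characters, including the terminating NUL.
MAX_PATH = 260

# Paths copied from the FileNotFoundError in the failing job.
SITE_PACKAGES = r"B:\actions-runner\_work\TheRock\TheRock\.venv\Lib\site-packages"
FILENAME = (
    "TensileLibrary_BB_BB_HA_Bias_Aux_SAV_UA_Type_BB_HPA_"
    "Contraction_l_Ailk_Bjlk_Cijk_Dijk_gfx1151.co"
)
DEVEL_LIB_DIR = ntpath.join(
    SITE_PACKAGES, "_rocm_sdk_devel", "bin", "hipblaslt", "library"
)

# Link target exactly as it appears in the error (with the ".." segments).
link_target = ntpath.join(
    DEVEL_LIB_DIR, "..", "..", "..", "..",
    "_rocm_sdk_libraries_gfx1151", "bin", "hipblaslt", "library", FILENAME,
)
# Link destination inside the devel package.
link_dest = ntpath.join(DEVEL_LIB_DIR, FILENAME)


def fits_in_max_path(path: str) -> bool:
    """Hypothetical helper: True if path plus a NUL fits in MAX_PATH."""
    return len(path) + 1 <= MAX_PATH


for label, path in [
    ("target as written", link_target),
    ("target normalized", ntpath.normpath(link_target)),
    ("destination", link_dest),
]:
    print(f"{label}: {len(path)} chars, fits: {fits_in_max_path(path)}")
```

Under these assumptions only the target as written (with the `..` segments) fails to fit, which is consistent with a path-length problem rather than a missing file.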
HKLM\SYSTEM\CurrentControlSet\Control\FileSystem\LongPathsEnabled was set to 0, so I will now set it to 1
(May be toggled via Windows Search > MAX_PATH > Enable long paths > Toggle: On)
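For checking other runners, the same registry value can be read from Python with the standard-library `winreg` module. This is a minimal sketch, not something the runner setup is known to use: the function name `long_paths_enabled` is hypothetical, and it returns `None` on non-Windows hosts or when the value cannot be read.

```python
import sys
from typing import Optional


def long_paths_enabled() -> Optional[bool]:
    """Return the LongPathsEnabled setting, or None if it cannot be read."""
    if sys.platform != "win32":
        return None  # The registry only exists on Windows.
    import winreg

    key_path = r"SYSTEM\CurrentControlSet\Control\FileSystem"
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _value_type = winreg.QueryValueEx(key, "LongPathsEnabled")
    except OSError:
        return None  # Key or value missing, or access denied.
    return value == 1


print("LongPathsEnabled:", long_paths_enabled())
```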
Noticed a force push, so Dependabot is no longer managing this PR, but it did create a new bump PR at #3411 (for later commits?)
Will go ahead and merge this after some standup discussion and after speaking with @geomin12 about whether these failures block merging.
For tries 4 and 5, the failures could be related to MIOpen test sizes: ROCm/rocm-libraries#3956 (comment)
That is #3438. There's a real code issue there, but it isn't a regression.
Bumps rocm-systems from `0decb2c` to `6276d4d`.
Commits
- 6276d4d Fix FindTBB version lookup logic (#3113)
- 8cc3468 rocrtst: Enable rocrtstPerf.Memory_Async_Copy emu (#3080)
- f162895 rocr-runtime: fix segfault when queue allocation fails (#2850)
- 8d84709 fix shutdown ordering race condition, prevent use-after-free crash (#3004)
- 9c1de5d Disable Direct Reduce Scatter if PXN is disabled (#3077)
- 330dec2 Fix merging perfetto files from cached data when multiple mpi rangs are avail...
- e4c0801 [rocprofiler-systems] Fix for Perfetto flow events (#3111)
- 291173f Add workload specific presets for easier and faster profiling (#2592)
- 85db4e4 [ROCM] Enhance amd-smi node to display baseboard temp (#2943)
- 67a57d9 [SWDEV-552020/SWDEV-563971] Fail memory partition tests if ASIC supports memo...
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)