Feature/mi300 a support #270

benielT · 2025-09-24T20:10:45Z

No description provided.

reguly

This looks good to me. However, please try to run the full test suite on an MI300A to make sure all the code paths are covered well.

gihanmudalige · 2025-09-25T08:24:55Z

@Ashutosh-Londhe do you have access to an MI300A to run our usual test scripts for this?

Ashutosh-Londhe · 2025-09-25T11:16:20Z

> @Ashutosh-Londhe do you have access to an MI300A to run our usual test scripts for this?

Sorry, I don't have access to an MI300A. But it seems Benny is testing on the COSMA machine, so I can request access to that project, which has an MI300A. I will check with Beniel about the details.

benielT · 2025-09-25T12:16:12Z

> This looks good to me. However, please try and run the full test suite on an MI300A to make sure all the code paths are covered well

I have access to MI300A and am testing it right now before finalizing the pull request. There are some minor bugs in the current development related to ROCM-SMI power profiling. I will add that change with this pull request as well.

benielT · 2025-09-25T19:57:25Z

Added all the changes. With these changes, we see a 10% performance improvement on the gnu_ompi combination. The MI300A platform we tested on Archer2 has the ROCm toolchain + OpenMPI, and that is where we saw the significant improvement. Unfortunately, I couldn't replicate similar results on COSMA.

benielT · 2025-09-25T19:58:56Z

An additional fix commit has been added to resolve the linking error related to power profiling with the rocmSMI APIs.

Ashutosh-Londhe · 2025-10-03T14:17:59Z

Hi @gihanmudalige, I now have access to an MI300A to perform the testing. For Laplace2d and Poisson, the HIP version fails to produce correct results. I have informed Beniel, and he is looking into the issue.

@reguly will similar changes also be required for the SYCL backend if the underlying GPU supports unified memory? Or does SYCL not support this feature?

reguly · 2025-10-07T20:13:19Z

Yes, ideally all "offload"-style backends should support this.

benielT added 11 commits June 30, 2025 11:54

MI300A_archer2 environment setup scripts added

faba10b

(cherry picked from commit 81b5ef6)

Added GPU affinity script for AMD APUs like MI300A

a918ec9

(cherry picked from commit a8f4dcc)

Added GNU specific AMD linking flags

a7eb9a3

OPS_uvm_device metadata added to OPS_instance. If UVM is used, data_d and data of a dat point to the same pointer. If the malloc is handled by the target runtime, the allocated memory will be visible to both sides.

572fc38

COSMA setup script for AMD GPU MI300A added.

5215200

HIP related changes added to support integrated GPUs.

b977fd7

Fixing singlenode UVM device. Instead of malloc, used ops_device_mallochost.

a7a9f19

UVM related changes for ops mpi partitions.

02fd240

Minor change related to uvm device

60c047c

Removing legacy app make from poisson app.

bff211b

Minor correction on COSMA icx source script

bd6575a
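
The UVM commits above describe a dat whose data and data_d point to the same allocation when the device shares memory with the host. The following is a minimal, host-only sketch of that idea, not the OPS implementation: `Dat`, `device_runtime_malloc`, `alloc_dat`, `free_dat`, and `upload` are hypothetical names introduced for illustration, and `std::malloc` stands in for whatever allocator the target runtime actually provides.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a dataset; the field names mirror the commit message.
struct Dat {
  char *data = nullptr;    // host-side pointer
  char *data_d = nullptr;  // device-side pointer
  std::size_t bytes = 0;
};

// ASSUMPTION: placeholder for the target runtime's allocator. On a real
// unified-memory device this would be a runtime call, not std::malloc.
inline void *device_runtime_malloc(std::size_t bytes) {
  return std::malloc(bytes);
}

// Allocate storage for a dat. With a UVM device, one allocation is made
// and both pointers alias it; otherwise host and device get separate buffers.
inline void alloc_dat(Dat &d, std::size_t bytes, bool uvm_device) {
  d.bytes = bytes;
  if (uvm_device) {
    d.data_d = static_cast<char *>(device_runtime_malloc(bytes));
    d.data = d.data_d;
  } else {
    d.data = static_cast<char *>(std::malloc(bytes));
    d.data_d = static_cast<char *>(device_runtime_malloc(bytes));
  }
}

// Release storage, taking care not to free an aliased buffer twice.
inline void free_dat(Dat &d) {
  if (d.data != d.data_d) std::free(d.data);
  std::free(d.data_d);
  d.data = nullptr;
  d.data_d = nullptr;
  d.bytes = 0;
}

// Host-to-device transfer. Returns false when the pointers alias,
// because there is nothing to copy in that case.
inline bool upload(Dat &d) {
  if (d.data == d.data_d) return false;
  std::memcpy(d.data_d, d.data, d.bytes);
  return true;
}
```

In this sketch the aliasing check in `upload` is what removes the explicit host-to-device copy on a unified-memory device. The non-UVM branch keeps the usual two-buffer behaviour.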

reguly approved these changes Sep 25, 2025

benielT added 4 commits September 25, 2025 16:34

Added minor changes to COSMA MI300A source script

98bb0b4

GNU OMPI source file added for COSMA MI300A env. Minor filename change

1105fba

Merge branch 'develop' of https://github.com/OP-DSL/OPS into feature/MI300A_support

e35d743

ops power profile link for HIP targets broken linking fix added to support rsmi API

6702637

benielT marked this pull request as ready for review September 25, 2025 19:51

5 participants
