NTHU-SC/AMD_SeisSol

SeisSol Performance Evaluation

This repository provides the compilation scripts for SeisSol (v1.3.1) and compares its execution efficiency and performance on AMD MI210 and NVIDIA V100 GPUs.

Performance Comparison

The experiment focuses on two test cases: tpv33 and Turkey.
The AMD platform used MI210 GPUs, while the NV platform (Taiwania 2) used V100 GPUs.

Results

| Test Case | AMD MI210 | NV V100 (Taiwania 2) |
|-----------|-----------|----------------------|
| tpv33 | 4 min 41.23 s (w/ HIP Graph); 10 min 29.85 s (w/o HIP Graph) | 23 min 24.52 s (w/ CUDA Graph) |
| Turkey | OOM (Plasticity = 1); 1 h 41 min 40 s (Plasticity = 0) | > 2 hrs (Unified Memory, Plasticity = 1); Segmentation Fault (Separated Memory, Plasticity = 0) |
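
The tpv33 timings above are easier to compare once converted to seconds. The following is a minimal sketch, not part of the repository's scripts: the helper `to_seconds` and the dictionary labels are hypothetical names introduced here, and the only data used are the three tpv33 wall-clock times reported in the results.

```python
import re

# Seconds per unit for the time strings used in the results table.
UNIT_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0}


def to_seconds(text):
    """Convert a string such as '4 min 41.23 s' or '1 h 41 min 40 s' to seconds."""
    total = 0.0
    for value, unit in re.findall(r"([\d.]+)\s*(h|min|s)\b", text):
        total += float(value) * UNIT_SECONDS[unit]
    return total


# tpv33 wall-clock times copied from the results table (labels are illustrative).
tpv33 = {
    "MI210, HIP Graph": "4 min 41.23 s",
    "MI210, no HIP Graph": "10 min 29.85 s",
    "V100, CUDA Graph": "23 min 24.52 s",
}

seconds = {label: to_seconds(text) for label, text in tpv33.items()}
baseline = seconds["MI210, HIP Graph"]

for label, value in seconds.items():
    # Ratio > 1 means the run took longer than the MI210 HIP Graph run.
    print(f"{label}: {value:.2f} s ({value / baseline:.2f}x the MI210 HIP Graph time)")
```

Under these numbers, disabling HIP Graphs on the MI210 costs roughly a factor of 2.2, and the V100 run with CUDA Graphs takes roughly 5 times as long as the MI210 run with HIP Graphs.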

Profile Results

The execution efficiency is heavily dictated by synchronization and kernel launch overhead.

  • cudaStreamSynchronize accounted for 86.1% of total CUDA API time.
  • MPI communication was negligible, totaling less than 2 seconds.
  • Using CUDA/HIP Graphs effectively reduced the CPU-to-GPU submission overhead.
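
To see why graph submission can help when launch overhead dominates, consider a toy cost model. This is an illustrative sketch only: the function names and every numeric value below are hypothetical assumptions chosen for the example, not measurements from SeisSol or from the profile above, and the model ignores overlap between CPU submission and GPU execution.

```python
def stream_launch_time(n_kernels, kernel_s, launch_s):
    """Toy model: every kernel pays its own CPU-side submission cost."""
    return n_kernels * (kernel_s + launch_s)


def graph_launch_time(n_kernels, kernel_s, graph_launch_s):
    """Toy model: the whole batch of kernels is submitted once as a graph."""
    return graph_launch_s + n_kernels * kernel_s


# Hypothetical values, NOT measured: many short kernels per time step.
n_kernels = 1000          # kernels per time step (assumed)
kernel_s = 5e-6           # GPU execution time per kernel in seconds (assumed)
launch_s = 10e-6          # per-kernel submission overhead in seconds (assumed)
graph_launch_s = 20e-6    # one-off graph submission overhead in seconds (assumed)

per_kernel = stream_launch_time(n_kernels, kernel_s, launch_s)
as_graph = graph_launch_time(n_kernels, kernel_s, graph_launch_s)

print(f"per-kernel launches: {per_kernel * 1e3:.3f} ms")
print(f"single graph launch: {as_graph * 1e3:.3f} ms")
```

In this model the benefit of graphs grows with the number of kernels and with the ratio of submission overhead to kernel run time, which matches the qualitative picture of a workload dominated by synchronization and launch overhead.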

Discussion and Conclusion

  • The AMD MI210 outperformed the NV V100 on tpv33: 4 min 41.23 s with HIP Graphs versus 23 min 24.52 s with CUDA Graphs, roughly a 5x difference.
  • The Turkey test case highlighted significant memory constraints. The V100 suffered from OOM errors when plasticity was enabled. On AMD platforms, Unified Memory was required to prevent illegal memory access.
  • Since scientific computing relies heavily on GEMM, tuning cuBLAS parameters or converting more operations to GEMM is critical for future performance gains.
