sporadic segfault when not using MPI_THREAD_MULTIPLE #16

@ye-luo

Reproducer
https://github.com/ye-luo/gpu_shmem/tree/main/ishmem

In app.cpp, if the MPI initialization is changed from MPI_THREAD_MULTIPLE to either of

MPI_Init_thread(MPI_THREAD_FUNNELED)
MPI_Init()

I get a sporadic segfault.

On Aurora/Sunspot with oneAPI SDK 2025.3.1:

$ mkdir build && cd build
$ cmake -DCMAKE_CXX_COMPILER=icpx -DCMAKE_CXX_FLAGS="-fiopenmp -fopenmp-targets=spir64" ..
$ make -j32
$ mpiexec -np 2 gpu_tile_compact.sh ./my_ishmem_app
x1921c1s0b0n0-hsn0.hsn.cm.sunspot.alcf.anl.gov: rank 9 died from signal 11
x1921c1s0b0n0-hsn0.hsn.cm.sunspot.alcf.anl.gov: rank 5 died from signal 15

Most applications don't actually use MPI_THREAD_MULTIPLE, and requiring it significantly restricts the adoption of Intel SHMEM.
