Signal 11 error on multiple machines #276

@TimWin

Hello,
I get this error when trying to run any Grappa program on multiple machines:

mpirun -hostfile my_hosts applications/demos/hello_world.exe
. . .
I0328 12:19:22.515194 101851 Grappa.cpp:647]
Shared memory breakdown:
node total: 125.524 GB
locale shared heap total: 62.7622 GB
locale shared heap per core: 62.7622 GB
communicator per core: 0.125 GB
tasks per core: 0.0156631 GB
global heap per core: 15.6905 GB
aggregator per core: 0.0650177 GB
shared_pool current per core: 4.76837e-07 GB
shared_pool max per core: 15.6905 GB
free per locale: 46.8659 GB
free per core: 46.8659 GB

Exiting due to signal 11 with siginfo 0x4003f5326870 and payload 0x4003f5326740
I0328 12:19:22.534696 101851 hello_world.cpp:45] Hello world from locale 0 core 0

Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.

mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[29000,1],1]
Exit code: 1

I can successfully run the programs, e.g. hello_world, on a single machine, but they always crash with this signal 11 error when I try to run them on multiple machines.

What can I do to solve this problem?
Please let me know if you need any further information.

Thanks in advance
