
PETSc overrides Julia signal handler #228

@vchuravy


Multithreaded Julia uses signals for stop-the-world purposes. Loading PETSc.jl in a multithreaded Julia session leads to failures like:

Thread 1 "julia" received signal SIGSEGV, Segmentation fault.
ijl_gc_safepoint () at /cache/build/builder-amdci5-3/julialang/julia-release-1-dot-12/src/jlapi.c:786
⚠ warning: 786 /cache/build/builder-amdci5-3/julialang/julia-release-1-dot-12/src/jlapi.c: No such file or directory
(gdb) bt
#0  ijl_gc_safepoint () at /cache/build/builder-amdci5-3/julialang/julia-release-1-dot-12/src/jlapi.c:786
#1  0x00007ffff72b387c in ijl_task_get_next (trypoptask=<optimized out>, q=<optimized out>, checkempty=<optimized out>)
    at /cache/build/builder-amdci5-3/julialang/julia-release-1-dot-12/src/scheduler.c:459
#2  0x00007fffd8717e1f in julia_poptask_5618 () at task.jl:1216
#3  0x00007fffd7e95200 in julia_wait_5555 () at task.jl:1228
#4  0x00007fffd6fd23c0 in julia_#wait#398_6619 () at condition.jl:141
#5  0x00007fffb9567929 in julia_wait_readnb_14431 (x=..., nb=1) at stream.jl:392
#6  0x00007fffb96f21c9 in julia_eof_14434 (s=...) at stream.jl:106
#7  0x00007fffb907c89f in jfptr_eof_14435 () from /home/fferrazzini/.julia/juliaup/julia-1.12.5+0.x64.linux.gnu/share/julia/compiled/v1.12/REPL/u0gqU_ZarBC.so
#8  0x00007fffb9563f2c in julia_eof_14767 (io=...) at io.jl:472
#9  0x00007fffb93499ff in julia_#prompt!##2_18239 ()
    at /cache/build/builder-amdci5-3/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/REPL/src/LineEdit.jl:2948
#10 0x00007fffb906d5bc in jfptr_YY.promptNOT.YY.YY.2_18240.1 ()
   from /home/fferrazzini/.julia/juliaup/julia-1.12.5+0.x64.linux.gnu/share/julia/compiled/v1.12/REPL/u0gqU_ZarBC.so
#11 0x00007ffff727bb8a in jl_apply (nargs=1, args=0x7fffd0e5f010) at /cache/build/builder-amdci5-3/julialang/julia-release-1-dot-12/src/julia.h:2391
#12 start_task () at /cache/build/builder-amdci5-3/julialang/julia-release-1-dot-12/src/task.c:1252

Running under gdb and breaking on sigaction, we can see that PETSc installs its own signal handlers:

Thread 1 "julia" hit Breakpoint 1, __GI___sigaction (sig=7,
    act=act@entry=0x7fffffff6a20, oact=oact@entry=0x7fffffff6ac0)
    at sigaction.c:27
27    {
#0  __GI___sigaction (sig=7, act=act@entry=0x7fffffff6a20,
    oact=oact@entry=0x7fffffff6ac0) at sigaction.c:27
#1  0x00007ffff7dcc141 in __bsd_signal (sig=<optimized out>,
    handler=<optimized out>) at ../sysdeps/posix/signal.c:45
#2  0x00007fff01b063b2 in PetscPushSignalHandler ()
   from /home/vchuravy/.julia/artifacts/fd874acf28ad612d0af6811612bf7e52c25cc815/lib/petsc/double_real_Int64/lib/libpetsc_double_real_Int64.so.3.22
#3  0x00007fff01b4d49a in PetscOptionsCheckInitial_Private ()
   from /home/vchuravy/.julia/artifacts/fd874acf28ad612d0af6811612bf7e52c25cc815/lib/petsc/double_real_Int64/lib/libpetsc_double_real_Int64.so.3.22
#4  0x00007fff01b6c8d1 in PetscInitialize_Common ()
   from /home/vchuravy/.julia/artifacts/fd874acf28ad612d0af6811612bf7e52c25cc815/lib/petsc/double_real_Int64/lib/libpetsc_double_real_Int64.so.3.22
#5  0x00007fff01b6ddfc in PetscInitialize.part.0 ()
   from /home/vchuravy/.julia/artifacts/fd874acf28ad612d0af6811612bf7e52c25cc815/lib/petsc/double_real_Int64/lib/libpetsc_double_real_Int64.so.3.22
#6  0x00007fff01b6df98 in PetscInitializeNoArguments ()
   from /home/vchuravy/.julia/artifacts/fd874acf28ad612d0af6811612bf7e52c25cc815/lib/petsc/double_real_Int64/lib/libpetsc_double_real_Int64.so.3.22
#7  0x00007fff4a510bb9 in ?? ()
#8  0x0000000000000000 in ?? ()

https://petsc.org/release/manualpages/Sys/PetscPushSignalHandler/

There is no way to return to a signal handler that was set directly by the user with the UNIX signal handler API or by the loader. That information is lost with the first call to [PetscPushSignalHandler](https://petsc.org/release/manualpages/Sys/PetscPushSignalHandler/)()

Looking at the source code, there seems to be an option that skips installing the handler:

  PetscCall(PetscOptionsGetBool(NULL, NULL, "-no_signal_handler", &flg1, NULL));
  if (!flg1) PetscCall(PetscPushSignalHandler(PetscSignalHandlerDefault, NULL));

So perhaps we can simply set -no_signal_handler before PETSc is initialized.
