bug/jenkins: cuda memory error in nightly tests #170

@hzhou

Description

GPU testing on Jenkins intermittently shows CUDA memory errors. For example, from one of the nightly GPU tests (https://jenkins-pmrs.cels.anl.gov/view/yaksa/job/yaksa-nightly-gpu/lastCompletedBuild/testReport/):

test/pack/pack -datatype int -count 17 -seed 73 -iters 32768 -segments 1 -ordering normal -overlap none -num-threads 4
 Stack Trace

CUDA Error (yaksuri_cudai_event_query:src/backend/cuda/pup/yaksuri_cudai_event.c,65): an illegal memory access was encountered
lt-pack: test/pack/pack.c:135: runtest: Assertion `dbuf_h' failed.
CUDA Error (yaksuri_cudai_type_free_hook:src/backend/cuda/hooks/yaksuri_cudai_type_hooks.c,92): an illegal memory access was encountered
lt-pack: test/pack/pack.c:249: runtest: Assertion `rc == (0)' failed.
lt-pack: test/pack/pack.c:108: runtest: Assertion `sbuf_h' failed.
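
Since the failure is intermittent, one way to chase it locally is to rerun the same test command in a loop until it fails. The sketch below is only an illustration: `PACK_BIN`, `MAX_RUNS`, `run_pack`, and `repro_loop` are hypothetical names that are not part of yaksa, and the binary path assumes the command is run from the build directory. It sets `CUDA_LAUNCH_BLOCKING=1`, a standard CUDA environment variable that makes kernel launches synchronous, so an illegal access should be reported closer to the call that triggered it rather than at a later call such as the event query seen in the trace above.

```shell
# Hypothetical helper, not part of yaksa: rerun the failing test until it fails.
# PACK_BIN and MAX_RUNS are assumed names; adjust the path to your build tree.
PACK_BIN="${PACK_BIN:-test/pack/pack}"
MAX_RUNS="${MAX_RUNS:-100}"

run_pack() {
    # Same arguments as the failing nightly test. CUDA_LAUNCH_BLOCKING=1
    # disables asynchronous kernel launches for this one invocation.
    CUDA_LAUNCH_BLOCKING=1 "$PACK_BIN" -datatype int -count 17 -seed 73 \
        -iters 32768 -segments 1 -ordering normal -overlap none -num-threads 4
}

repro_loop() {
    i=1
    while [ "$i" -le "$MAX_RUNS" ]; do
        # Stop at the first non-zero exit status (assertion abort or error).
        if ! run_pack; then
            echo "failed on run $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure in $MAX_RUNS runs"
    return 0
}
```

After sourcing the snippet, call `repro_loop` to start the runs. Running the same test command under `compute-sanitizer`, the memory checker shipped with the CUDA toolkit, is another option, though it will be much slower with this iteration count.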
      
