Skip to content

donglinz/nvtx-cuda-graph

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

nvtx-cuda-graph

An enhanced NVTX extension that lets you insert NVTX push/pop ranges not only normally but also into a CUDA Graph capture. The range push/pop will automatically decorate your nsys report during graph replay.

Usage:

from nvtx_cuda_graph import graph_captured_nvtx_range

@graph_captured_nvtx_range("dummy_kernel")
def dummy_kernel(x):
    return x * 2.0

graph_captured_nvtx_range will insert nvtx range node if under graph capturing or mark nvtx range normally if not under capturing.

Graph capture awared nvtx will significantly increase your runtime overhead so this repo is for debugging purpose only.

About

CUDA Graph aware nvtx

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors