rlogwood/gpu_tests

GPU Tests

Note: switched to CachyOS where newer NVIDIA drivers are available.

GPU Tests for CachyOS

Using the latest NVIDIA drivers, the GPU is working for PyTorch and TensorFlow with production packages.

CachyOS setup

sh-5.3$ uname -a
Linux ****** 6.19.3-2-cachyos #1 SMP PREEMPT_DYNAMIC Thu, 19 Feb 2026 21:03:04 +0000 x86_64 GNU/Linux

sh-5.3$ nvidia-smi
Sun Mar  1 22:48:11 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 590.48.01              Driver Version: 590.48.01      CUDA Version: 13.1     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5080        On  |   00000000:01:00.0  On |                  N/A |
|  0%   38C    P5             25W /  360W |    3756MiB /  16303MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

sh-5.3$ cat /etc/os-release 
NAME="CachyOS Linux"
PRETTY_NAME="CachyOS"
ID=cachyos
ID_LIKE=arch
BUILD_ID=rolling
ANSI_COLOR="38;2;23;147;209"
HOME_URL="https://cachyos.org/"
DOCUMENTATION_URL="https://wiki.cachyos.org/"
SUPPORT_URL="https://discuss.cachyos.org/"
BUG_REPORT_URL="https://github.com/cachyos"
PRIVACY_POLICY_URL="https://terms.archlinux.org/docs/privacy-policy/"
LOGO=cachyos
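
To capture the same driver and GPU details from a script instead of reading the `nvidia-smi` table by eye, something like the following might work. It is only a sketch: the helper names `parse_gpu_csv` and `query_gpus` are hypothetical and not part of this repo, and it assumes `nvidia-smi` is on `PATH` and accepts the `--query-gpu` / `--format=csv,noheader,nounits` options.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi, in the order they appear in each CSV row.
QUERY_FIELDS = ["name", "driver_version", "memory.total"]


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    gpus = []
    for line in text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        # Skip rows that do not have exactly one value per requested field.
        if len(values) == len(QUERY_FIELDS):
            gpus.append(dict(zip(QUERY_FIELDS, values)))
    return gpus


def query_gpus():
    """Run nvidia-smi if it is on PATH; return [] when it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    cmd = [
        "nvidia-smi",
        f"--query-gpu={','.join(QUERY_FIELDS)}",
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return parse_gpu_csv(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    for gpu in query_gpus():
        print(gpu)
```

On a machine without an NVIDIA driver, `query_gpus()` simply returns an empty list.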

  • Abandoned: GPU Tests for Ubuntu 24.04.3 LTS and NVIDIA 5080

    NOTE: When this test repo was created, Ubuntu support for the 5080 was available only in the nightly build, tf-nightly[cuda]. Since then I've moved to CachyOS, which works well with the latest NVIDIA drivers (which support the 5080) and with the latest TensorFlow production packages. On CachyOS the only blank screens I get occur occasionally when logging in after a long sleep. A fix for this is being tested, and there is also an open issue with NVIDIA for the problem.

Summary

This repo has a couple of simple tests to confirm the GPU is being used for PyTorch and TensorFlow. The tests were developed with claude.ai.
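
As a rough idea of what "confirming the GPU is being used" can look like, here is a minimal sketch. It is not the code in `pytorch_cuda_test.py` or `tensorflow_cuda_test.py`; the function name `gpu_report` is hypothetical. It relies only on `torch.cuda.is_available()` and `tf.config.list_physical_devices("GPU")`, and reports a framework that is not installed as `None` instead of failing.

```python
def gpu_report():
    """Return a dict saying whether each framework can see a GPU.

    Values are True, False, or None (framework not installed).
    """
    report = {"pytorch": None, "tensorflow": None}

    try:
        import torch
        report["pytorch"] = bool(torch.cuda.is_available())
    except ImportError:
        pass  # PyTorch is not installed in this environment.

    try:
        import tensorflow as tf
        report["tensorflow"] = len(tf.config.list_physical_devices("GPU")) > 0
    except ImportError:
        pass  # TensorFlow is not installed in this environment.

    return report


if __name__ == "__main__":
    for name, visible in gpu_report().items():
        if visible is None:
            status = "not installed"
        else:
            status = "GPU visible" if visible else "no GPU visible"
        print(f"{name}: {status}")
```

A visible device only shows that the framework found the GPU. The test scripts in this repo go further and run actual work on it.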

For more seamless support of RTX 50-series GPUs, try CachyOS.

NOTE: As of 12/19/25, TensorFlow support for the 5080 on Ubuntu is found only in the nightly build and requires the setup in config_tf_env.sh.

Running the tests

Set up the virtual environment

python -m venv .venv
. .venv/bin/activate
pip install -r requirements

In a new shell

# pytorch test
python pytorch_cuda_test.py

# tensorflow tests env setup
. config_tf_env.sh

# simple GPU test for TensorFlow
python tensorflow_cuda_test.py

# more intensive GPU test for TensorFlow
python tf_benchmark_mnist_cnn.py

Note: JIT compilation and warning configuration in tensorflow_cuda_test.py

import os

# Disable XLA JIT compilation BEFORE importing TensorFlow.
# The RTX 5080 (Compute Capability 12.0) is not yet fully supported.
# XLA's fused kernel compilation consumes excessive memory and crashes with OOM.
os.environ['TF_XLA_FLAGS'] = '--tf_xla_auto_jit=-1'

# Suppress C++ level warnings (like "GPU interconnect information not available")
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

# Disable oneDNN custom operations to avoid floating-point round-off warnings
# and ensure consistent CPU results for comparison.
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
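
The ordering requirement (set the variables first, import TensorFlow afterwards) can be made explicit with a small helper. This is a sketch under assumptions, not what `tensorflow_cuda_test.py` does: the name `configure_tf_env` is hypothetical, and unlike the excerpt above it leaves any value the user has already exported untouched.

```python
import os
import sys
import warnings

# Values copied from the excerpt above.
TF_ENV_DEFAULTS = {
    "TF_XLA_FLAGS": "--tf_xla_auto_jit=-1",
    "TF_CPP_MIN_LOG_LEVEL": "2",
    "TF_ENABLE_ONEDNN_OPTS": "0",
}


def configure_tf_env(environ=None):
    """Apply the defaults without overwriting variables that are already set.

    Returns a dict of the variables this call actually added.
    """
    if environ is None:
        environ = os.environ
    if "tensorflow" in sys.modules:
        # The variables are meant to be set before TensorFlow is imported.
        warnings.warn("TensorFlow is already imported; settings may be ignored")
    applied = {}
    for key, value in TF_ENV_DEFAULTS.items():
        if key not in environ:
            environ[key] = value
            applied[key] = value
    return applied


if __name__ == "__main__":
    print(configure_tf_env())
    # import tensorflow as tf  # import only after the environment is configured
```

Not overwriting existing values is a design choice for this sketch: it lets you re-enable XLA or raise the log level from the shell for a single run without editing the script.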

Background

Using the nightly build of TensorFlow as of 11/24/25 resolves the following problem, as summarized by Claude.ai:

Unfortunately, TensorFlow doesn't properly support RTX 5080 (Blackwell/sm_120) yet. The exact error you're seeing (CUDA_ERROR_INVALID_PTX) is a known issue affecting all RTX 50-series cards.

The problem: your RTX 5080 has compute capability 12.0 (Blackwell architecture), which is so new that TensorFlow wasn't built with CUDA kernel binaries compatible with compute capability 12.0, and attempts to JIT-compile from PTX fail with CUDA_ERROR_INVALID_PTX.

Error:

Performing matrix multiplication on GPU...
W0000 00:00:1764044088.195750   20660 gpu_device.cc:2431] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
I0000 00:00:1764044088.197440   20660 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 11463 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 5080, pci bus id: 0000:01:00.0, compute capability: 12.0
2025-11-24 23:14:48.524495: W tensorflow/compiler/mlir/tools/kernel_gen/tf_gpu_runtime_wrappers.cc:40] 'cuModuleLoadData(&module, data)' failed with 'CUDA_ERROR_INVALID_PTX'

see: tensorflow/tensorflow#90291
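
The mismatch described above can be checked before running a workload. The sketch below is an illustration, not code from this repo: the helper names are hypothetical, and it assumes the installed build lists its capabilities as strings such as `sm_90` or `compute_90`, which is what I would expect from `tf.sysconfig.get_build_info()["cuda_compute_capabilities"]` on a CUDA build.

```python
import re


def parse_compute_capability(log_line):
    """Extract (major, minor) from a 'compute capability: X.Y' log fragment."""
    match = re.search(r"compute capability:\s*(\d+)\.(\d+)", log_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def has_native_kernels(device_cc, built_capabilities):
    """True if the build ships binaries (sm_XY) for this exact capability."""
    major, minor = device_cc
    return f"sm_{major}{minor}" in built_capabilities


if __name__ == "__main__":
    line = "name: NVIDIA GeForce RTX 5080, pci bus id: 0000:01:00.0, compute capability: 12.0"
    cc = parse_compute_capability(line)
    try:
        import tensorflow as tf
        # Assumption: this key is present on CUDA builds of TensorFlow.
        built = tf.sysconfig.get_build_info().get("cuda_compute_capabilities", [])
        print(cc, "native kernels:", has_native_kernels(cc, built))
    except ImportError:
        print(cc, "TensorFlow not installed; cannot inspect the build")
```

If the check is false, TensorFlow falls back to JIT-compiling from PTX, which is the path that produced `CUDA_ERROR_INVALID_PTX` in the log above.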

About

Test GPU for AI/ML tasks
