CUDAExecutionProvider may silently fall back to CPU - recommend using tuple format #18

@kelunyang

Description

Summary

When using --device cuda, ONNX Runtime may silently fall back to CPU because the provider is specified as a simple string list instead of tuple format with device configuration.

Background

I'm a high school teacher from Taiwan. I was using Claude (AI assistant) to write a Google Colab notebook for OCR-ing pre-war Japanese official documents (戰前日文公文). While processing a 1400-page PDF, I noticed that GPU VRAM usage was suspiciously low (0 MB) despite specifying --device cuda. After investigation, Claude found the issue in the provider configuration.

Environment

  • Google Colab (Tesla T4)
  • onnxruntime-gpu 1.24.2
  • CUDA 12.8
  • Python 3.12

Problem

Current code in deim.py (line 38) and parseq.py (line 32):

providers = ['CUDAExecutionProvider','CPUExecutionProvider']

This format can cause ONNX Runtime to silently fall back to CPU without any warning.
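One way to see whether a fallback is even possible is to ask the installed build which providers it exposes before creating a session. This is a diagnostic sketch, not project code; the helper name `available_providers` is mine, and the snippet degrades gracefully when onnxruntime is not installed:

```python
import importlib.util


def available_providers():
    """Providers the installed onnxruntime build offers, or [] if not installed."""
    if importlib.util.find_spec("onnxruntime") is None:
        return []
    import onnxruntime as ort
    return ort.get_available_providers()


providers = available_providers()
if "CUDAExecutionProvider" not in providers:
    # e.g. onnxruntime (CPU-only wheel) installed instead of onnxruntime-gpu
    print("CUDA provider unavailable; inference will run on CPU:", providers)
```

If `CUDAExecutionProvider` is missing here, no provider-list format will help: the CPU-only wheel is installed.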

Suggested Fix

Use the tuple format with an explicit device configuration:

providers = [('CUDAExecutionProvider', {'device_id': 0}), 'CPUExecutionProvider']
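A minimal sketch of the fix in context, assuming a placeholder model file "model.onnx" and device 0 (the single-GPU Colab case). The session-creation part is guarded so the sketch also runs where onnxruntime is absent:

```python
import importlib.util

# Tuple form: CUDA pinned to an explicit device, CPU kept as the fallback.
providers = [
    ("CUDAExecutionProvider", {"device_id": 0}),
    "CPUExecutionProvider",
]

if importlib.util.find_spec("onnxruntime") is not None:
    import onnxruntime as ort
    session = ort.InferenceSession("model.onnx", providers=providers)
    # If CUDA initialization failed, ONNX Runtime drops it from the active list:
    assert "CUDAExecutionProvider" in session.get_providers()
```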

How We Discovered This

1. Ran OCR with --device cuda
2. Checked GPU memory with nvidia-smi: it showed 0 MB used
3. Applied the fix above: GPU memory usage increased to 183 MB
4. Confirmed with session.get_providers() that CUDAExecutionProvider was actually active

References

- https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html

Thank you for developing this excellent tool for historical document recognition!
