Summary
When using --device cuda, ONNX Runtime may silently fall back to CPU because the providers are specified as a plain string list rather than the tuple format that carries an explicit device configuration.
Background
I'm a high school teacher from Taiwan. I was using Claude (AI assistant) to write a Google Colab notebook for OCR-ing pre-war Japanese official documents (戰前日文公文). While processing a 1400-page PDF, I noticed that GPU VRAM usage was suspiciously low (0 MB) despite specifying --device cuda. After investigation, Claude found the issue in the provider configuration.
Environment
- Google Colab (Tesla T4)
- onnxruntime-gpu 1.24.2
- CUDA 12.8
- Python 3.12
Problem
Current code in deim.py (line 38) and parseq.py (line 32):

```python
providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']
```
This format can cause ONNX Runtime to silently fall back to CPU without any warning.
Suggested Fix
Use the tuple format with an explicit device configuration:

```python
providers = [('CUDAExecutionProvider', {'device_id': 0}), 'CPUExecutionProvider']
```
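For illustration, the fix could be wrapped in a small helper so that both deim.py and parseq.py build the same provider list; `build_providers` and its `device` parameter are hypothetical names for this sketch, not part of the project:

```python
def build_providers(device: str, device_id: int = 0):
    """Build an ONNX Runtime provider list, using the tuple format for CUDA.

    Passing ('CUDAExecutionProvider', {...}) instead of a bare string hands
    ONNX Runtime an explicit device configuration; CPUExecutionProvider
    remains the fallback in both cases.
    """
    if device == "cuda":
        return [
            ("CUDAExecutionProvider", {"device_id": device_id}),
            "CPUExecutionProvider",
        ]
    return ["CPUExecutionProvider"]

# The resulting list is passed straight through, e.g.:
#   session = onnxruntime.InferenceSession(model_path,
#                                          providers=build_providers("cuda"))
```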
How We Discovered This
1. Ran OCR with --device cuda
2. Checked GPU memory with nvidia-smi → showed 0 MB used
3. Applied the fix above → GPU memory increased to 183 MB
4. Confirmed with session.get_providers() that CUDAExecutionProvider was actually active
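Step 4 could also be turned into a programmatic guard so the fallback fails loudly instead of staying silent; `check_cuda_active` is a hypothetical helper that only inspects the list returned by `session.get_providers()`:

```python
def check_cuda_active(active_providers):
    """Raise if CUDAExecutionProvider is missing from the active providers.

    ONNX Runtime drops providers it cannot initialize, so the list returned
    by session.get_providers() reflects what is actually in use at runtime.
    """
    if "CUDAExecutionProvider" not in active_providers:
        raise RuntimeError(
            "Requested --device cuda, but ONNX Runtime is using only: "
            + ", ".join(active_providers)
        )

# Usage, where `session` is an onnxruntime.InferenceSession:
#   check_cuda_active(session.get_providers())
```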
References
- https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html
Thank you for developing this excellent tool for historical document recognition!