LLava-Med CUDA Setup failed despite GPU being available. #124

@myothiha

Description

What I am trying to do is load the model and use it for medical VQA. I have a dataset with images, questions, and answers, and I would like to run the model programmatically.

I followed the instructions in this repo and created a conda env as follows:

conda create -n llava-med python=3.10 -y
conda activate llava-med
pip install --upgrade pip  # enable PEP 660 support
pip install -e .
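As a quick sanity check before loading anything heavy, this stdlib-only snippet (no torch required, so it runs even in a broken env) asks the dynamic loader whether it can find a CUDA runtime library at all; `find_library` returns None when nothing is on the search path:

```python
# Quick sanity check: can the dynamic loader find a CUDA runtime library?
# Uses only the standard library, so it runs even in a broken env.
import os
from ctypes.util import find_library

for name in ("cudart", "cuda"):
    path = find_library(name)  # None if the library is not on the loader's search path
    print(f"lib{name}: {path or 'NOT FOUND'}")

# bitsandbytes also inspects LD_LIBRARY_PATH, so print it for comparison.
print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH", "<unset>"))
```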

Then I ran the following code, which loads the model weights I downloaded from Hugging Face:

from PIL import Image
import torch
from libs.LLaVA_Med.llava.model.builder import load_pretrained_model
from libs.LLaVA_Med.llava.constants import DEFAULT_IMAGE_TOKEN, DEFAULT_IM_START_TOKEN, DEFAULT_IM_END_TOKEN

# Set model parameters
model_path = '/mnt/synology/myothiha/models/llava-med-v1.5-mistral-7b'
model_base = None
model_name = 'microsoft/llava-med-v1.5-mistral-7b'

# Load model and tokenizer
# (load_8bit and load_4bit can be set to True for lower memory usage)
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path, model_base, model_name, load_8bit=False, load_4bit=False, device="cuda"
)

# Example: Prepare an image and a prompt
image_path = 'example.jpg'  # Replace with your image path
image = Image.open(image_path).convert('RGB')

# Use multimodal prompt format for LLaVA
prompt = "What abnormality do you see in this X-ray?"

# Compose prompt with image tokens if needed
if hasattr(model, 'config') and getattr(model.config, 'mm_use_im_start_end', False):
    prompt = f"{DEFAULT_IM_START_TOKEN}{DEFAULT_IMAGE_TOKEN}{DEFAULT_IM_END_TOKEN}\n{prompt}"
else:
    prompt = f"{DEFAULT_IMAGE_TOKEN}\n{prompt}"

# Tokenize the prompt (simulate tokenizer_image_token)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()

# Preprocess the image (simulate process_images)
inputs = image_processor(images=image, return_tensors="pt")
image_tensor = inputs['pixel_values'].half().cuda()

# Generate model output (forward pass)
with torch.inference_mode():
    output = model.generate(
        input_ids=input_ids,
        images=image_tensor,
        max_new_tokens=128,
        do_sample=False
    )

# Decode the output
answer = tokenizer.decode(output[0], skip_special_tokens=True)
print("Model answer:", answer)
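As an aside, the plain `tokenizer(prompt)` call above treats `<image>` as ordinary text; LLaVA's `tokenizer_image_token` helper instead splits the prompt on the image token and splices in a sentinel index (`IMAGE_TOKEN_INDEX`, -200 in the LLaVA codebase) that the model later replaces with image features. A self-contained sketch of that splicing logic, using a dummy word-level tokenizer (hypothetical names; the real helper takes a HF tokenizer):

```python
# Sketch of LLaVA's tokenizer_image_token splicing logic, with a dummy
# word-level tokenizer standing in for the real Hugging Face tokenizer.
IMAGE_TOKEN_INDEX = -200  # sentinel the model swaps for image features

def dummy_tokenize(text):
    # Toy stand-in: one integer id per whitespace-separated word.
    return [hash(w) % 1000 for w in text.split()]

def tokenize_with_image(prompt, image_token="<image>"):
    # Tokenize the text chunks around <image> separately, then splice
    # the sentinel index between them (which a plain tokenizer(prompt)
    # call does NOT do).
    chunks = [dummy_tokenize(c) for c in prompt.split(image_token)]
    ids = []
    for i, chunk in enumerate(chunks):
        if i > 0:
            ids.append(IMAGE_TOKEN_INDEX)
        ids.extend(chunk)
    return ids

ids = tokenize_with_image("<image>\nWhat abnormality do you see in this X-ray?")
print(IMAGE_TOKEN_INDEX in ids)  # → True: the sentinel was spliced in
```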

But it fails with a CUDA problem. Here is the full error report:

================================================ERROR=====================================
CUDA SETUP: CUDA detection failed! Possible reasons:
1. You need to manually override the PyTorch CUDA version. Please see: "https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
2. CUDA driver not installed
3. CUDA not installed
4. You have multiple conflicting CUDA libraries
5. Required library not pre-compiled for this bitsandbytes release!
CUDA SETUP: If you compiled from source, try again with `make CUDA_VERSION=DETECTED_CUDA_VERSION` for example, `make CUDA_VERSION=113`.
CUDA SETUP: The CUDA version for the compile might depend on your conda install. Inspect CUDA version via `conda list | grep cuda`.
================================================================================

CUDA SETUP: Something unexpected happened. Please compile from source:
git clone https://github.com/TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=126
python setup.py install
CUDA SETUP: Setup Failed!
Traceback (most recent call last):
  File "/home/dice/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1382, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/home/anaconda3/envs/llava_med/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/dice/.local/lib/python3.10/site-packages/transformers/generation/utils.py", line 85, in <module>
    from accelerate.hooks import AlignDevicesHook, add_hook_to_module
  File "/home/dice/.local/lib/python3.10/site-packages/accelerate/__init__.py", line 3, in <module>
    from .accelerator import Accelerator
  File "/home/dice/.local/lib/python3.10/site-packages/accelerate/accelerator.py", line 35, in <module>
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
  File "/home/dice/.local/lib/python3.10/site-packages/accelerate/checkpointing.py", line 24, in <module>
    from .utils import (
  File "/home/dice/.local/lib/python3.10/site-packages/accelerate/utils/__init__.py", line 131, in <module>
    from .bnb import has_4bit_bnb_layers, load_and_quantize_model
  File "/home/dice/.local/lib/python3.10/site-packages/accelerate/utils/bnb.py", line 42, in <module>
    import bitsandbytes as bnb
  File "/home/dice/.local/lib/python3.10/site-packages/bitsandbytes/__init__.py", line 6, in <module>
    from . import cuda_setup, utils, research
  File "/home/dice/.local/lib/python3.10/site-packages/bitsandbytes/research/__init__.py", line 1, in <module>
    from . import nn
  File "/home/dice/.local/lib/python3.10/site-packages/bitsandbytes/research/nn/__init__.py", line 1, in <module>
    from .modules import LinearFP8Mixed, LinearFP8Global
  File "/home/dice/.local/lib/python3.10/site-packages/bitsandbytes/research/nn/modules.py", line 8, in <module>
    from bitsandbytes.optim import GlobalOptimManager
  File "/home/dice/.local/lib/python3.10/site-packages/bitsandbytes/optim/__init__.py", line 6, in <module>
    from bitsandbytes.cextension import COMPILED_WITH_CUDA
  File "/home/dice/.local/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 20, in <module>
    raise RuntimeError('''
RuntimeError:
        CUDA Setup failed despite GPU being available. Please run the following command to get more information:

        python -m bitsandbytes

        Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
        to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
        and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/dice/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1382, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/home/anaconda3/envs/llava_med/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/dice/.local/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 36, in <module>
    from ...modeling_utils import PreTrainedModel
  File "/home/dice/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 42, in <module>
    from .generation import GenerationConfig, GenerationMixin
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "/home/dice/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1372, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/home/dice/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1384, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.generation.utils because of the following error (look up to see its traceback):

        CUDA Setup failed despite GPU being available. Please run the following command to get more information:

        python -m bitsandbytes

        Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
        to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
        and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/dice/myothiha/thesis/test.py", line 3, in <module>
    from libs.LLaVA_Med.llava.model.builder import load_pretrained_model
  File "/home/dice/myothiha/thesis/libs/LLaVA_Med/llava/model/__init__.py", line 1, in <module>
    from .language_model.llava_mistral import LlavaMistralForCausalLM, LlavaMistralConfig
  File "/home/dice/myothiha/thesis/libs/LLaVA_Med/llava/model/language_model/llava_mistral.py", line 6, in <module>
    from transformers import AutoConfig, AutoModelForCausalLM, \
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "/home/dice/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1373, in __getattr__
    value = getattr(module, name)
  File "/home/dice/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1372, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/home/dice/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1384, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.mistral.modeling_mistral because of the following error (look up to see its traceback):
Failed to import transformers.generation.utils because of the following error (look up to see its traceback):

        CUDA Setup failed despite GPU being available. Please run the following command to get more information:

        python -m bitsandbytes

        Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
        to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
        and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
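The error report suggests locating the CUDA libraries and adding them to LD_LIBRARY_PATH. A small stdlib-only helper I used for diagnosis (hypothetical helper name, just a sketch) that scans a colon-separated path list for libcuda*/libcudart* files:

```python
# Hypothetical diagnostic helper: scan a colon-separated path list
# (e.g. the value of LD_LIBRARY_PATH) for CUDA shared libraries.
import glob
import os

def find_cuda_libs(search_path):
    """Return all libcuda*/libcudart* .so files found in search_path."""
    hits = []
    for d in filter(None, search_path.split(os.pathsep)):
        for pattern in ("libcuda.so*", "libcudart.so*"):
            hits.extend(glob.glob(os.path.join(d, pattern)))
    return sorted(set(hits))

# Typical usage on the affected machine:
libs = find_cuda_libs(os.environ.get("LD_LIBRARY_PATH", ""))
print(libs or "no CUDA libraries on LD_LIBRARY_PATH")
```

If this prints nothing, adding the directory that holds libcudart (often under /usr/local/cuda/lib64) to LD_LIBRARY_PATH is the first thing to try.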

If I don't install this repo's requirements, I have no CUDA issues, but I run into other errors. So I tried to recreate the same env as the original model.

If someone knows the solution, please let me know.
