I think a fix may be needed for the DLL loading issue, as problems can arise in different Python versions on Windows OS. #230

Han-sok · 2025-08-23T12:25:24Z

Hello,
While running your work on my laptop with a Ryzen AI processor, I came across a problem that you might be interested in.

This PR makes small modifications to improve environment compatibility:

Adjusted DLL path handling to work with Python >=3.8
Ensured the script runs consistently across Windows and Linux setups

Tested locally on Python 3.11 with Windows OS. Without these changes, the script fails with a missing DLL error.
Below is the copy of the error message:
(ryzen-ai-1.5.1) PS C:\Projects\RyzenAI\RyzenAI-SW\example\llm\RAG-OGA> python .\rag.py --direct_llm
Using ONNX embedding model on VitisAI NPU...
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20250823 08:23:35.164912 11704 vitisai_compile_model.cpp:1157] Vitis AI EP Load ONNX Model Success
I20250823 08:23:35.165664 11704 vitisai_compile_model.cpp:1158] Graph Input Node Name/Shape (2)
I20250823 08:23:35.166667 11704 vitisai_compile_model.cpp:1162]          input_ids : [1x512]
I20250823 08:23:35.167423 11704 vitisai_compile_model.cpp:1162]          attention_mask : [1x512]
I20250823 08:23:35.167423 11704 vitisai_compile_model.cpp:1168] Graph Output Node Name/Shape (2)
I20250823 08:23:35.168171 11704 vitisai_compile_model.cpp:1172]          last_hidden_state : [1x512x1024]
I20250823 08:23:35.168171 11704 vitisai_compile_model.cpp:1172]          pooler_output : [1x1024]
2025-08-23 08:23:37.5524076 [W:onnxruntime:, session_state.cc:1280 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2025-08-23 08:23:37.5741899 [W:onnxruntime:, session_state.cc:1282 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
NPU session created successfully.
Number of vectors: 551
Traceback (most recent call last):
  File "C:\Projects\RyzenAI\RyzenAI-SW\example\llm\RAG-OGA\rag.py", line 73, in <module>
    llm = custom_llm(model_path=r"C:\huggingface") #update this path
  File "C:\Projects\RyzenAI\RyzenAI-SW\example\llm\RAG-OGA\custom_llm\custom_llm.py", line 29, in __init__
    self._model = og.Model(model_path)
RuntimeError: Error loading "C:/Program Files/RyzenAI/1.5.1/deployment/onnx_custom_ops.dll" which depends on "libutf8_validity.dll" which is missing. (Error 126: "The specified module could not be found.")

This should make the repo easier to run out-of-the-box for people with varying environments.

Best regards,
Han

abhishek-amd · 2025-09-04T17:27:25Z

Thanks for raising the PR; we are currently evaluating it.
The MSI installer automatically creates a conda environment and installs the Ryzen AI package inside it; the environment comes with Python 3.10.
Changing the Python version may cause compatibility issues.

Currently, we are unable to reproduce the issue you are experiencing. Could you please share the steps you followed to change the Python version, along with any other dependency changes you made?

Han-sok · 2025-09-09T18:01:39Z

Hi Abhishek,

Thank you for your reply.
The environment I am using is the conda setup included with the Ryzen AI package (Python 3.10). I did not alter or create a separate environment. The workloads were run with the RyzenAI software stack and NPU driver as provided, without modifications, other than specifying the Hugging Face model path in the rag.py file.

I even performed a fresh installation of RyzenAI-SW, the RyzenAI NPU driver, and the conda environment, but the result was the same. This might be specific to the customer laptop, since DLL files from external sources are treated as non-system DLLs.

[ctypes](https://docs.python.org/3.8/whatsnew/3.8.html#ctypes)
On Windows, CDLL and subclasses now accept a winmode parameter to specify flags for the underlying LoadLibraryEx call. The default flags are set to only load DLL dependencies from trusted locations, including the path where the DLL is stored (if a full or partial path is used to load the initial DLL) and paths added by [add_dll_directory()](https://docs.python.org/3.8/library/os.html#os.add_dll_directory). (Contributed by Steve Dower in [bpo-36085](https://bugs.python.org/issue?@action=redirect&bpo=36085).)

In short, on Python 3.8+, the Windows loader ignores PATH when resolving the dependent DLLs of Python extensions. Only these locations are searched:

the app/extension's own folder,
system directories,
directories a user adds at runtime via os.add_dll_directory
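
To illustrate the search rule above, here is a minimal sketch of registering extra DLL directories before the native library is loaded. The helper name `register_dll_dirs` is hypothetical, and the directories in the usage comment are assumptions: the deployment folder is taken from the error message, while the conda `Library\bin` folder is only a guess at where `libutf8_validity.dll` might live. Whether this resolves the error also depends on how the native library loads its dependencies, so treat it as a starting point rather than the actual change in this PR.

```python
import os
import sys
from pathlib import Path

# Keep the returned handles referenced for the lifetime of the process.
_DLL_DIR_HANDLES = []


def register_dll_dirs(dirs):
    """Register existing directories for DLL dependency resolution.

    Only does anything on Windows with Python 3.8+, where
    os.add_dll_directory exists. Elsewhere it is a no-op.
    Returns the list of directories that were actually registered.
    """
    registered = []
    if sys.platform != "win32" or not hasattr(os, "add_dll_directory"):
        return registered
    for d in dirs:
        path = Path(d).resolve()
        # add_dll_directory raises if the directory does not exist,
        # so skip anything that is not a real directory.
        if path.is_dir():
            _DLL_DIR_HANDLES.append(os.add_dll_directory(str(path)))
            registered.append(str(path))
    return registered


# Example usage (paths are assumptions, adjust to the local install):
#
#   register_dll_dirs([
#       r"C:\Program Files\RyzenAI\1.5.1\deployment",
#       os.path.join(sys.prefix, "Library", "bin"),
#   ])
#   import onnxruntime_genai as og  # import only after registering
```

Calling the helper before the import keeps the script unchanged on Linux, where the function returns an empty list.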

For your reference, here's my routine to replicate the problem, using Ryzen-AI software out-of-the-box:

add path\to\miniforge3\condabin or path\to\miniforge3\Scripts\, and path\to\miniforge3\, to the System PATH variable
install NPU driver
install Ryzen AI Software (ryzen-ai-1.5.1.msi)
git lfs install
git clone https://github.com/amd/RyzenAI-SW.git
git lfs pull
git clone https://huggingface.co/amd/Llama-3.2-3B-Instruct-awq-g128-int4-asym-fp16-onnx-hybrid C:\huggingface\Llama-3.2-3B-Instruct_jit
conda activate ryzen-ai-1.5.1
cd RyzenAI-SW/example/llm/RAG-OGA
pip install -r requirements.txt
<update genai_config.json>"custom_ops_library": "C:\Program Files\RyzenAI\1.5.1\deployment\onnx_custom_ops.dll"
python custom_embedding/export_bge_onnx.py
<modify rag.py file for huggingface model loading> llm = custom_llm(model_path=r"C:\huggingface\Llama-3.2-3B-Instruct_jit")
python .\rag.py

and here's the output I got:

Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.

Install the latest PowerShell for new features and improvements! https://aka.ms/PSWindows

Loading personal and system profiles took 1050ms.
(base) PS C:\Users\HS S> conda activate ryzen-ai-1.5.1
(ryzen-ai-1.5.1) PS C:\Users\HS S> cd C:\Projects\RyzenAI\test\RyzenAI-SW\example\llm\RAG-OGA\
(ryzen-ai-1.5.1) PS C:\Projects\RyzenAI\test\RyzenAI-SW\example\llm\RAG-OGA> python .\rag.py
Using ONNX embedding model on VitisAI NPU...
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20250909 13:22:50.246279 71084 vitisai_compile_model.cpp:1157] Vitis AI EP Load ONNX Model Success
I20250909 13:22:50.246279 71084 vitisai_compile_model.cpp:1158] Graph Input Node Name/Shape (2)
I20250909 13:22:50.246279 71084 vitisai_compile_model.cpp:1162]          input_ids : [1x512]
I20250909 13:22:50.246279 71084 vitisai_compile_model.cpp:1162]          attention_mask : [1x512]
I20250909 13:22:50.246279 71084 vitisai_compile_model.cpp:1168] Graph Output Node Name/Shape (2)
I20250909 13:22:50.246279 71084 vitisai_compile_model.cpp:1172]          last_hidden_state : [1x512x1024]
I20250909 13:22:50.246279 71084 vitisai_compile_model.cpp:1172]          pooler_output : [1x1024]
2025-09-09 13:22:54.0964435 [W:onnxruntime:, session_state.cc:1280 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2025-09-09 13:22:54.1081511 [W:onnxruntime:, session_state.cc:1282 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
NPU session created successfully.
Number of vectors: 551
Traceback (most recent call last):
  File "C:\Projects\RyzenAI\test\RyzenAI-SW\example\llm\RAG-OGA\rag.py", line 73, in <module>
    llm = custom_llm(model_path=r"C:\huggingface\Llama-3.2-3B-Instruct_jit")   #update this path
  File "C:\Projects\RyzenAI\test\RyzenAI-SW\example\llm\RAG-OGA\custom_llm\custom_llm.py", line 20, in __init__
    self._model = og.Model(model_path)
RuntimeError: Error loading "C:/Program Files/RyzenAI/1.5.1/deployment/onnx_custom_ops.dll" which depends on "libutf8_validity.dll" which is missing. (Error 126: "The specified module could not be found.")
(ryzen-ai-1.5.1) PS C:\Projects\RyzenAI\test\RyzenAI-SW\example\llm\RAG-OGA> python --version
Python 3.10.0
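
One way to check whether the missing dependency is present on PATH at all (and is simply not being searched) is a small lookup like the sketch below. The function name `find_on_path` is hypothetical; it only inspects directories listed in a PATH-style string and says nothing about which of them the Windows loader will actually use.

```python
import os
from pathlib import Path


def find_on_path(filename, path_value=None):
    """Return the directories from a PATH-style string that contain `filename`."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for entry in path_value.split(os.pathsep):
        if not entry:
            continue  # skip empty segments such as a trailing separator
        if (Path(entry) / filename).is_file() and entry not in hits:
            hits.append(entry)
    return hits


# Example usage:
#   print(find_on_path("libutf8_validity.dll"))
```

If this prints a directory while the loader still reports the DLL as missing, that would be consistent with PATH being ignored for dependent DLLs.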

Here are the system and user paths currently set on my machine:

Hope that helps.

Best regards,
Han

Han-sok added 2 commits August 23, 2025 07:35

Fix script compatibility with Python >=3.8

472f5cd

Added conditional DLL loading based on Python version.

8f405bf
