I think a fix may be needed for the DLL loading issue, as problems can arise in different Python versions on Windows OS. #230

Hello,
While running your project on my laptop with a RyzenAI processor, I came across a problem you might be interested in.
This PR makes small modifications to improve environment compatibility.
Tested locally with Python 3.11 on Windows. Without these changes, the script fails with a missing DLL error.
Below is a copy of the error message:
(ryzen-ai-1.5.1) PS C:\Projects\RyzenAI\RyzenAI-SW\example\llm\RAG-OGA> python .\rag.py --direct_llm
Using ONNX embedding model on VitisAI NPU...
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20250823 08:23:35.164912 11704 vitisai_compile_model.cpp:1157] Vitis AI EP Load ONNX Model Success
I20250823 08:23:35.165664 11704 vitisai_compile_model.cpp:1158] Graph Input Node Name/Shape (2)
I20250823 08:23:35.166667 11704 vitisai_compile_model.cpp:1162] input_ids : [1x512]
I20250823 08:23:35.167423 11704 vitisai_compile_model.cpp:1162] attention_mask : [1x512]
I20250823 08:23:35.167423 11704 vitisai_compile_model.cpp:1168] Graph Output Node Name/Shape (2)
I20250823 08:23:35.168171 11704 vitisai_compile_model.cpp:1172] last_hidden_state : [1x512x1024]
I20250823 08:23:35.168171 11704 vitisai_compile_model.cpp:1172] pooler_output : [1x1024]
2025-08-23 08:23:37.5524076 [W:onnxruntime:, session_state.cc:1280 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2025-08-23 08:23:37.5741899 [W:onnxruntime:, session_state.cc:1282 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
NPU session created successfully.
Number of vectors: 551
Traceback (most recent call last):
  File "C:\Projects\RyzenAI\RyzenAI-SW\example\llm\RAG-OGA\rag.py", line 73, in <module>
    llm = custom_llm(model_path=r"C:\huggingface") #update this path
  File "C:\Projects\RyzenAI\RyzenAI-SW\example\llm\RAG-OGA\custom_llm\custom_llm.py", line 29, in __init__
    self._model = og.Model(model_path)
RuntimeError: Error loading "C:/Program Files/RyzenAI/1.5.1/deployment/onnx_custom_ops.dll" which depends on "libutf8_validity.dll" which is missing. (Error 126: "The specified module could not be found.")

This should make the repo easier to run out-of-the-box for people with varying environments.
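For readers hitting the same Error 126, here is a minimal sketch of one common way to work around a missing dependent DLL on Windows. It is not necessarily what this PR changes. The helper name `register_dll_dirs` is hypothetical, and the sketch assumes the folder you pass in actually contains `libutf8_validity.dll`; the traceback alone does not confirm where that file lives. Since Python 3.8, extension modules no longer resolve their dependencies through `PATH`, so the sketch registers the folder with `os.add_dll_directory` and also prepends it to `PATH` for native code that still loads libraries with the default search order.

```python
import os
import sys
from pathlib import Path


def register_dll_dirs(dirs):
    """Make each existing directory visible to the DLL loader.

    Hypothetical helper: returns the directories that were registered.
    """
    registered = []
    for d in dirs:
        path = Path(d).resolve()  # add_dll_directory needs an absolute path
        if not path.is_dir():
            continue  # skip missing folders instead of raising
        if sys.platform == "win32" and hasattr(os, "add_dll_directory"):
            # Python 3.8+ on Windows: extends the search path used for
            # extension modules and ctypes loads.
            os.add_dll_directory(str(path))
        # Assumption: prepending to PATH helps native LoadLibrary calls
        # that use the default search order.
        os.environ["PATH"] = str(path) + os.pathsep + os.environ.get("PATH", "")
        registered.append(str(path))
    return registered
```

A call such as `register_dll_dirs([r"C:\Program Files\RyzenAI\1.5.1\deployment"])` would have to run before `og.Model(model_path)` is constructed; the directory shown is the one from the traceback and may need adjusting on other installs.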
Best regards,
Han