Releases: amd/RyzenAI-SW
Release v1.6.0
Release Note: Version 1.6.0
- BF16 Compiler (CNN, Transformer, ASR)
  - BF16 CNN perf improvements average 80% across the release
  - BF16 perf improvements: 1.3X faster on CNNs than the iGPU and 2.6X faster on transformers than the iGPU
  - Improved coverage and improved performance for ASR models
  - Average 3x compile time improvement
  - Smaller installation size
  - Reduction in CPU overhead by pushing data layout transformation to the NPU
  - Dynamic batch size support for compilation
- New Integer Compiler (CNN)
  - Support for general asymmetric quantization, enabling third-party quantized models to run on the NPU
  - Support for XINT8, A8W8, A16W8
- LLM
  - Broad set of NPU-only models with optimized performance
  - New set of hybrid models with BFP16 activations
  - New architecture support in the hybrid flow (Phi-4, Qwen-3)
  - Context length improvement from 2K to 4K for all models
- Stable Diffusion Demo
  - 8x dynamic resolution for SD3.0/3.5 (text2image and image2image ControlNet)
  - Performance boost for SD 1.5/2.1-base/turbo/XL-turbo
  - Support for batch size 1 for SD-turbo/SDXL-turbo
  - New model support (SD2.1-v 768x768 text2image, SDXL-base 1024x1024 text2image)
- Breaking Changes
  - For running INT8 models on STX/KRK or newer devices, the xclbin provider option is no longer supported and should no longer be used. See Using INT8 Models for full details; the first sketch after this list illustrates both INT8 cases.
  - For running INT8 models on PHX/HPT devices, the target option should be set to X1. The NPU binary should still be specified using the xclbin provider option. See Using INT8 Models for full details.
  - For BF16 models, the default configuration file requires a new target section. See Config File Options for full details.
  - LLM:
    - The OGA version has been updated to v0.9.2 (Ryzen AI 1.6) from v0.7.0 (Ryzen AI 1.5). Any obsolete APIs must be updated to the supported equivalents as described in the Microsoft ONNX Runtime GenAI v0.9.2 documentation; the second sketch after this list gives an illustrative example.
    - Hybrid models published with earlier releases are not compatible with Ryzen AI 1.6. Please use the hybrid models published with the 1.6 release.
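As a rough illustration of the INT8 provider-option changes above, the following Python sketch creates ONNX Runtime sessions with the Vitis AI EP. The model path, config file name, and xclbin path are placeholders, and the exact set of supported provider options should be confirmed against the Using INT8 Models documentation; only the target and xclbin options are taken directly from the notes above.

```python
# Minimal sketch of creating ONNX Runtime sessions with the Vitis AI EP
# under Ryzen AI 1.6. Paths and the config file name are placeholders;
# see "Using INT8 Models" for the authoritative option list.
import onnxruntime as ort

model_path = "resnet50_int8.onnx"  # placeholder INT8 model

# STX/KRK or newer: the 'xclbin' provider option is no longer passed.
session_stx = ort.InferenceSession(
    model_path,
    providers=["VitisAIExecutionProvider"],
    provider_options=[{
        "config_file": "vaip_config.json",  # placeholder config file
    }],
)

# PHX/HPT: set 'target' to X1 and keep specifying the NPU binary via 'xclbin'.
session_phx = ort.InferenceSession(
    model_path,
    providers=["VitisAIExecutionProvider"],
    provider_options=[{
        "config_file": "vaip_config.json",          # placeholder config file
        "target": "X1",
        "xclbin": r"C:\path\to\npu_binary.xclbin",  # placeholder path
    }],
)
```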
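For the OGA update, the sketch below shows a common onnxruntime-genai (OGA) generation loop in the style used by recent releases such as v0.9.x, where prompt tokens are appended to the generator rather than set through obsolete GeneratorParams fields. The model folder, prompt, and search options are placeholders; the Microsoft ONNX Runtime GenAI v0.9.2 documentation is the authoritative reference.

```python
# Sketch of a text-generation loop with onnxruntime-genai (OGA) in the
# v0.9.x style referenced above. The model folder and search options are
# placeholders; see the Microsoft ONNX Runtime GenAI v0.9.2 docs.
import onnxruntime_genai as og

model = og.Model("path/to/hybrid-llm-model")  # placeholder model folder
tokenizer = og.Tokenizer(model)

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)     # placeholder search options

generator = og.Generator(model, params)
# Newer OGA releases feed the prompt via append_tokens() instead of the
# obsolete params.input_ids field.
generator.append_tokens(tokenizer.encode("What is the NPU used for?"))

while not generator.is_done():
    generator.generate_next_token()

print(tokenizer.decode(generator.get_sequence(0)))
```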
Release v1.5.0
Release Note: Version 1.5.0
- EoU Improvements
  - Application concurrency: improves the resource distribution across applications
  - Model compilation time: 2x – 8x faster
  - Installation size: 80% smaller
- New getting-started tutorial with a fine-tuned ResNet BF16 model, using Python / C++ for deployment on the NPU
- New object detection tutorial with the Yolov8m model, with BF16 / XINT8 quantization using AMD Quark
- Multi-model demo has been removed
- Support for new LLMs released:
  - Qwen/Qwen2.5-1.5B-Instruct
  - Qwen/Qwen2.5-3B-Instruct
  - Qwen/Qwen2.5-7B-Instruct
- Bug fixes
- Breaking Changes
  - The %RYZEN_AI_INSTALLATION_PATH%\deployment folder has been reorganized and flattened. Deployment DLLs are no longer organized in subfolders. If you use application build scripts that pull DLLs from the deployment folder, you need to update them based on the new paths. Refer to the Application Packaging Requirements section for further details.
  - The 1x4.xclbin (PHX/HPT) and AMD_AIE2P_Nx4_Overlay.xclbin (STX/KRK) NPU binaries are no longer supported and should not be used. Use the 4x4.xclbin (PHX/HPT) and AMD_AIE2P_4x4_Overlay.xclbin (STX/KRK) NPU binaries instead.
  - The XLNX_ENABLE_CACHE, XLNX_VART_FIRMWARE, and XLNX_TARGET_NAME environment variables are no longer supported and should not be relied upon.
  - Support for VitisAI EP cache encryption is no longer available. To encrypt the compiled models, use the ONNX Runtime EP Context Cache feature instead.
  - For INT8 models, the VitisAI EP does not save the compiled model to disk by default. To save the compiled model, use the ONNX Runtime EP Context Cache feature or set the enable_cache_file_io_in_mem provider option to 0. A sketch of the EP context cache approach follows this list.
  - Generation of the vitisai_ep_report.json file is no longer automatic and should be manually enabled. See the Operator Assignment Report section for details.
  - Changes to the OGA flow for LLMs:
    - The OGA version is updated to v0.7.0 (Ryzen AI 1.5) from v0.6.0 (Ryzen AI 1.4).
    - The hybrid_llm and npu_llm folders are consolidated into a new folder named LLM, which contains the model_benchmark.exe and run_model.py scripts, along with the necessary C++ headers and .lib files to support both the Hybrid LLM and NPU LLM workflows in C++ and Python.
    - For NPU LLM models, the vaip_llm.json file is no longer required. As a result, the vaip_llm.json path is removed from genai_config.json for all NPU models. Ensure that you re-download the NPU models from Hugging Face (https://huggingface.co/collections/amd/ryzenai-15-llm-npu-models-6859846d7c13f81298990db0) when using the Ryzen AI 1.5 installer.
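As a rough sketch of the EP context cache approach mentioned above, the following shows how ONNX Runtime session options can be used to dump a precompiled EP context model that can be reloaded later. The file names and provider options are placeholders; see the EP Context Cache section of the documentation for the exact workflow supported by the Vitis AI EP.

```python
# Sketch: saving a compiled model via the ONNX Runtime EP context cache,
# as an alternative to the removed VitisAI EP cache options. File names
# and provider options are placeholders.
import onnxruntime as ort

so = ort.SessionOptions()
# Ask ONNX Runtime to dump an EP context model alongside the compiled graph.
so.add_session_config_entry("ep.context_enable", "1")
so.add_session_config_entry("ep.context_file_path", "model_ctx.onnx")

session = ort.InferenceSession(
    "model_int8.onnx",                      # placeholder INT8 model
    sess_options=so,
    providers=["VitisAIExecutionProvider"],
    provider_options=[{"config_file": "vaip_config.json"}],  # placeholder
)

# Subsequent runs can load the generated model_ctx.onnx directly to skip
# recompilation.
```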
Release v1.4.0
Merge pull request #173 from cyndwith/resnet_config: removing the optional configuration option
Release v1.3.1
Merge pull request #161 from jeremyfowers/main: Add DeepSeek-R1-Distill CPU examples
Release v1.2.0
Merge pull request #119 from cyndwith/main: Updates to the RyzenAI-SW demos, examples and tutorials