
Add the support of AMD MI300X/MI325X/MI355X of Ernie 4.5-VL recipe #228

Open
ChangLiu0709 wants to merge 4 commits into vllm-project:main from ChangLiu0709:Ernie4.5-VL

Conversation

@ChangLiu0709
Contributor

  • Update Ernie 4.5-VL docs for AMD GPUs
  • Add VLLM_ROCM_USE_AITER=1 to enable the AITER backend
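As a rough sketch of how that environment variable would be applied (the model id and flag values below are illustrative assumptions, not quoted from the recipe):

```shell
# Hypothetical sketch: enable the AITER backend on ROCm when starting the
# vLLM server. Model id and flags are assumptions for illustration only.
VLLM_ROCM_USE_AITER=1 vllm serve baidu/ERNIE-4.5-VL-28B-A3B-PT \
  --tensor-parallel-size 4 \
  --gpu-memory-utilization 0.9
```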

@gemini-code-assist
Contributor

Summary of Changes

Hello @ChangLiu0709, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the ERNIE-4.5-VL recipe documentation by integrating detailed instructions for deploying and benchmarking the model on AMD's latest MI300X, MI325X, and MI355X GPUs. The changes provide users with a clear guide to leverage AMD hardware, including specific Docker commands and vLLM server configurations, thereby broadening the accessibility and utility of the ERNIE-4.5-VL model across different hardware platforms.

Highlights

  • AMD GPU Support: Added comprehensive documentation for running the ERNIE-4.5-VL model on AMD MI300X, MI325X, and MI355X GPUs, including Docker setup, Hugging Face login, vLLM server configuration, and benchmarking instructions.
  • AITER Backend Activation: Included the VLLM_ROCM_USE_AITER=1 environment variable in the vLLM server command to enable the AITER backend for improved performance on ROCm-enabled systems.
  • Documentation Refinement: Corrected the capitalization of 'ERNIE-4.5-VL' in the vLLM installation section for consistency.
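The setup flow the highlights describe can be outlined roughly as follows; the image tag and token handling are assumptions, not commands quoted from the recipe:

```shell
# Rough outline of the documented flow (assumed commands, not the recipe verbatim):
docker pull vllm/vllm-openai-rocm:latest      # 1. pull the ROCm vLLM image
huggingface-cli login --token "$HF_TOKEN"     # 2. authenticate with Hugging Face
# 3. launch the vLLM server inside the container, then
# 4. run `vllm bench serve` against it to benchmark
```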



@gemini-code-assist (bot) left a comment


Code Review

The pull request successfully adds comprehensive documentation for running ERNIE-4.5-VL on AMD MI300X/MI325X/MI355X GPUs. This includes detailed steps for Docker environment setup, Hugging Face login, vLLM server startup, and benchmarking. The capitalization of 'Ernie' to 'ERNIE' has also been corrected for consistency.

```shell
--dataset-name random \
--random-input-len 8000 \
--random-output-len 1000 \
--request-rate 10000 \
```


Severity: high

The --request-rate for the AMD benchmark is set to 10000, which is significantly higher than the 10 used in the general benchmarking section (line 62). This large discrepancy could lead to the benchmark failing or producing unrealistic results if the hardware cannot sustain such a high rate. Please verify if 10000 is the intended value or if it should be adjusted to a more realistic number, possibly aligning with the other benchmark examples or providing context for this high rate.

Suggested change:

```diff
---request-rate 10000 \
+--request-rate 10 \
```
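Applied to the quoted flags, the suggested invocation would look like this sketch (the surrounding flags are assumed to match the recipe's benchmark command):

```shell
# Sketch: benchmark command with the reviewer's suggested request rate.
vllm bench serve \
  --dataset-name random \
  --random-input-len 8000 \
  --random-output-len 1000 \
  --request-rate 10
```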

Launch the ROCm vLLM docker:

```shell
docker run -it --ipc=host --network=host --privileged --cap-add=CAP_SYS_ADMIN --device=/dev/kfd --device=/dev/dri --device=/dev/mem --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $(pwd):/work -e SHELL=/bin/bash --name Ernie-4.5-VL vllm/vllm-openai-rocm:v0.14.1
```

Severity: medium

The docker run command uses --privileged and --cap-add=CAP_SYS_ADMIN. While these might be necessary for ROCm environments, they grant extensive permissions to the container. It's generally recommended to use the most restrictive permissions possible. Consider adding a note about the security implications or exploring if a more granular set of capabilities can achieve the same functionality.
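One possible narrowing, as a sketch only (whether it suffices depends on the host and ROCm version, and is not verified here):

```shell
# Assumed minimal-permission variant: drop --privileged and CAP_SYS_ADMIN,
# keeping only the device mounts and group membership ROCm typically needs.
docker run -it --ipc=host --network=host \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video \
  --security-opt seccomp=unconfined \
  -v "$(pwd)":/work \
  --name Ernie-4.5-VL vllm/vllm-openai-rocm:latest
```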

```shell
--tensor-parallel-size 4 \
--gpu-memory-utilization 0.9 \
--disable-log-requests \
--no-enable-prefix-caching \
```

Severity: medium

The --no-enable-prefix-caching flag is used in the AMD GPU server startup command, but it's not present in the general benchmarking section's server command (lines 19-27). This inconsistency might lead to different performance characteristics between the two setups. If this flag is crucial for AMD GPUs or specific to this benchmark, it should be explained, or its absence in other sections should be justified for clarity.

@ChangLiu0709
Contributor Author

Hi @tjtanaa, I have updated the recipe with the uv pip installation of vLLM. Please take a look : )

### Step 4: Run Benchmark
Open a new terminal and run the following command to execute the benchmark script inside the container.
```shell
docker exec -it Ernie-4.5-VL vllm bench serve \
```

We prioritize the pip install approach, so we should also include the command without the `docker exec -it Ernie-4.5-VL` prefix.
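A host-side variant might look like the following sketch (assuming vLLM is installed on the host via uv, as mentioned earlier in the thread; the flags are placeholders, not the recipe's exact command):

```shell
# Assumed host-side invocation (vLLM installed with `uv pip install vllm`),
# without the `docker exec -it Ernie-4.5-VL` prefix:
uv pip install vllm
vllm bench serve --dataset-name random --request-rate 10
```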

Pull the latest vllm docker:

```shell
docker pull vllm/vllm-openai-rocm:v0.15.1
```

Let's just use `docker pull vllm/vllm-openai-rocm:latest` so that we don't need to keep updating the doc.

Launch the ROCm vLLM docker:

```shell
docker run -it --ipc=host --network=host --privileged --cap-add=CAP_SYS_ADMIN --device=/dev/kfd --device=/dev/dri --device=/dev/mem --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $(pwd):/work -e SHELL=/bin/bash --name Ernie-4.5-VL vllm/vllm-openai-rocm:v0.15.1
```
@tjtanaa Feb 13, 2026


let's just use vllm/vllm-openai-rocm:latest so that we don't need to keep on updating the doc.

Launch the ROCm vLLM docker:

```shell
docker run -it --ipc=host --network=host --privileged --cap-add=CAP_SYS_ADMIN --device=/dev/kfd --device=/dev/dri --device=/dev/mem --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $(pwd):/work -e SHELL=/bin/bash --name Ernie-4.5-VL vllm/vllm-openai-rocm:v0.15.1
```

Let's make it a multiline command, consistent with the existing format.
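One possible multiline layout for the quoted one-liner (the content is unchanged; only the line breaks are added):

```shell
docker run -it --ipc=host --network=host --privileged \
  --cap-add=CAP_SYS_ADMIN --cap-add=SYS_PTRACE \
  --device=/dev/kfd --device=/dev/dri --device=/dev/mem \
  --group-add video \
  --security-opt seccomp=unconfined \
  -v $(pwd):/work -e SHELL=/bin/bash \
  --name Ernie-4.5-VL \
  vllm/vllm-openai-rocm:v0.15.1
```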

@ChangLiu0709 force-pushed the Ernie4.5-VL branch 7 times, most recently from 4082bc9 to 48c4650 on February 24, 2026 at 17:41.
@ChangLiu0709
Contributor Author

Hi @tjtanaa, thanks for all the feedback! I have just modified the content accordingly. Please take another look : ))

@ChangLiu0709
Contributor Author

Hi @tjtanaa, I just removed the unneeded docker command; please take a look.

@ChangLiu0709
Contributor Author

Wondering if this can be merged, @tjtanaa?


## Installing vLLM

```diff
-Ernie4.5-VL support was recently added to vLLM main branch and is not yet available in any official release:
+ERNIE-4.5-VL support was recently added to vLLM main branch and is not yet available in any official release:
```

Let's add subheaders called `### CUDA` and `### AMD ROCm: MI300x/MI325x/MI355x`.

Contributor Author


Updated : )

Signed-off-by: seungrokj <seungrok.jung@amd.com>
Signed-off-by: ChangLiu0709 <ChangLiu0709@users.noreply.github.com>
@ChangLiu0709
Contributor Author

Hi @tjtanaa, please take a look at this PR. I have updated the content according to your previous feedback.

