Update Qwen3-Coder-480B-A35B.md for AMD #222
@@ -132,3 +132,59 @@ ERROR [multiproc_executor.py:511] ValueError: The output_size of gate's and up's
- [EvalPlus](https://github.com/evalplus/evalplus)
- [Qwen3-Coder](https://github.com/QwenLM/Qwen3-Coder)
- [vLLM Documentation](https://docs.vllm.ai/)
## AMD GPU Support

Recommended approaches by hardware type:

- MI300X/MI325X/MI355X
Please follow the steps below to install and run Qwen3-Coder models on AMD MI300X/MI325X/MI355X GPUs.
### Step 1: Installing vLLM (AMD ROCm Backend: MI300X, MI325X, MI355X)

> Note: The vLLM wheel for ROCm requires Python 3.12, ROCm 7.0, and glibc >= 2.35. If your environment does not meet these requirements, please use the Docker-based setup as described in the [documentation](https://docs.vllm.ai/en/latest/getting_started/installation/gpu/#pre-built-images).
```bash
uv venv
source .venv/bin/activate
uv pip install vllm --extra-index-url https://wheels.vllm.ai/rocm/
```
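Before installing, you can sanity-check the prerequisites from the note above. This is a minimal sketch, not part of the official instructions; the version-comparison trick relies on `sort -V` being available:

```shell
# Check Python is 3.12 and glibc >= 2.35, as required by the ROCm wheel.
python3 -c 'import sys; print("python OK" if sys.version_info[:2] == (3, 12) else "need python 3.12")'

glibc_ver=$(ldd --version | head -n1 | grep -oE '[0-9]+\.[0-9]+$')
min_ver=2.35
# sort -V orders versions numerically; if min_ver sorts first, glibc_ver >= min_ver.
if [ "$(printf '%s\n' "$min_ver" "$glibc_ver" | sort -V | head -n1)" = "$min_ver" ]; then
    echo "glibc $glibc_ver OK"
else
    echo "glibc $glibc_ver too old; use the Docker-based setup instead"
fi
```

If either check fails, fall back to the pre-built Docker images linked in the note.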
### Step 2: Start the vLLM server

Start vLLM online serving with one of the commands below.

### BF16
```shell
VLLM_ROCM_USE_AITER=1 vllm serve Qwen/Qwen3-Coder-480B-A35B-Instruct \
  --trust-remote-code \
  --max-model-len 131072 \
  --enable-expert-parallel \
  --data-parallel-size 8 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder
```

> Reviewer: Let's make it a multiline command and be consistent with the existing format.
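Once the server reports it is ready, a quick smoke test of the OpenAI-compatible endpoint can confirm it is serving. This is a sketch; `localhost` and the default vLLM port 8000 are assumptions about your setup:

```python
import json
import urllib.request

# Build a minimal /v1/completions request (default vLLM port 8000 assumed).
payload = {
    "model": "Qwen/Qwen3-Coder-480B-A35B-Instruct",
    "prompt": "Write a Python function that reverses a string.",
    "max_tokens": 128,
    "temperature": 0.2,
}
req = urllib.request.Request(
    "http://localhost:8000/v1/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Uncomment once the server from Step 2 is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["text"])
print(req.get_full_url())
```

Any OpenAI-compatible client (for example the `openai` Python package pointed at the same base URL) would work equally well here.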
### FP8

```shell
VLLM_ROCM_USE_AITER=1 vllm serve Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8 \
  --trust-remote-code \
  --max-model-len 131072 \
  --enable-expert-parallel \
  --data-parallel-size 8 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder
```

> Reviewer: Let's make it a multiline command and be consistent with the existing format. @haic0, please also set up the commands as multiline in all your PRs for better readability.
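Since both serve commands enable tool calling (`--enable-auto-tool-choice --tool-call-parser qwen3_coder`), the chat endpoint also accepts OpenAI-style `tools`. A sketch of such a request body follows; the `get_weather` tool is hypothetical, purely for illustration:

```python
import json

# OpenAI-style chat request with a (hypothetical) tool definition,
# suitable for POSTing to /v1/chat/completions on the running server.
payload = {
    "model": "Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical example tool
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}
body = json.dumps(payload)
```

When the model decides to call the tool, the response carries a `tool_calls` entry that the `qwen3_coder` parser extracts from the model output.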
### Step 4: Run Benchmark

Open a new terminal and run the following command to execute the benchmark script inside the container.

> Reviewer: @haic0, must we use this separate instruction? Is it fine to use the benchmark instruction specified for NVIDIA? We should try to group the commands together under the command section, as in PR #202.
```shell
vllm bench serve \
  --backend vllm \
  --model Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8 \
  --endpoint /v1/completions \
  --dataset-name random \
  --random-input-len 2048 \
  --random-output-len 1024 \
  --max-concurrency 10 \
  --num-prompts 100
```
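As a back-of-envelope check on the numbers the benchmark reports, output-token throughput is roughly total generated tokens over wall-clock time. The duration below is a made-up placeholder; the real value comes from the tool's summary output:

```python
# Rough throughput estimate for the benchmark configuration above.
num_prompts = 100        # number of prompts sent
output_len = 1024        # random output length per prompt
duration_s = 120.0       # placeholder: wall-clock time reported by the tool

total_output_tokens = num_prompts * output_len
output_tok_per_s = total_output_tokens / duration_s
print(f"{output_tok_per_s:.1f} output tok/s")
```

The benchmark's own summary also reports latency percentiles (TTFT, TPOT), which this simple average does not capture.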
> Reviewer (on lines +177 to +190): The purpose of Step 4 is unclear in relation to Step 3. After starting a server in Step 3, this step describes running what appears to be an offline benchmark, which would not use the running server. To avoid confusion, please clarify whether the intention is to benchmark the running server (in which case a client benchmark tool should be used) or to run a separate offline benchmark.
> Reviewer: @haic0 Let's merge all the commands together under the same header as CUDA; we shouldn't create a whole new section. You can refer to this PR: https://github.com/vllm-project/recipes/pull/219/changes