Update Ministral-3-Reasoning.md for AMD GPU #208

Open
jiacao-amd wants to merge 2 commits into vllm-project:main from
haic0:Ministral-3-Reasoning-recipe-update

Conversation

@jiacao-amd

No description provided.

@gemini-code-assist (Contributor)

Summary of Changes

Hello @jiacao-amd, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the documentation for the Ministral-3 Reasoning models by introducing a dedicated section for AMD GPU support. It provides a complete guide for users to set up their environment, deploy the models, and conduct benchmarks on AMD MI300X and MI355 hardware, thereby enhancing the model's compatibility and ease of use across different GPU architectures.

Highlights

  • New AMD GPU Support Section: A new section titled 'AMD GPU Support' has been added to the Ministral-3-Reasoning.md document.
  • Docker Environment Setup: Detailed instructions are provided for preparing the Docker environment, including pulling and running the vllm/vllm-openai-rocm:v0.14.1 image for AMD GPUs.
  • vLLM Server Configuration: Commands to start the vLLM server are included, with specific environment variables and parameters tailored for AMD MI300X and MI355 GPUs.
  • Benchmark Execution Guide: Steps are outlined for running a benchmark script inside the Docker container to test performance.
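The workflow outlined in these highlights ends with a vLLM server exposing an OpenAI-compatible API. A quick sanity check of that endpoint might look like the sketch below; the port is vLLM's default (8000) and is an assumption here, so adjust it if the server was started with a different `--port`.

```shell
# List the models served by the running vLLM instance.
# Assumes the server is up on the default port 8000.
curl -s http://localhost:8000/v1/models
```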



@gemini-code-assist bot left a comment
Code Review

This pull request adds valuable documentation for running the model on AMD GPUs. The instructions are generally clear, but I've identified a few areas for improvement regarding clarity, correctness, and security. My main points are:

  • A potential typo in the model name ('Ministral' instead of 'Mistral') throughout the document.
  • A contradiction in the Docker instructions regarding the image version.
  • A significant security concern with mounting the host's root directory into the Docker container.
  • A lack of clarity regarding the hardware requirements for the provided commands.

Additionally, the new section only covers the 14B model. It would be beneficial to include instructions for the 3B and 8B models as well, to make the guide more comprehensive. Please see my detailed comments for specific suggestions.


## AMD GPU Support

Please follow the steps here to install and run Ministral-3 Reasoning models on AMD MI300X and MI355 GPU.
medium

There appears to be a typo in the model name. It's written as 'Ministral', but it should likely be 'Mistral', especially since the directory is named Mistral. This typo appears throughout the document. For consistency and to avoid confusion, it's recommended to correct this.

Suggested change
Please follow the steps here to install and run Ministral-3 Reasoning models on AMD MI300X and MI355 GPU.
Please follow the steps here to install and run Mistral-3 Reasoning models on AMD MI300X and MI355 GPU.


Please follow the steps here to install and run Ministral-3 Reasoning models on AMD MI300X and MI355 GPU.
### Step 1: Prepare Docker Environment
Pull the latest vllm docker:
medium

The instruction says to pull the 'latest' vllm docker image, but the command on line 418 specifies a fixed version (v0.14.1). This is contradictory. If this specific version is required, please update the text to avoid confusion. For example: 'Pull the required vllm docker image:'.

Suggested change
Pull the latest vllm docker:
Pull the required vllm docker image:
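For context on the pinned image under discussion, the pull-and-run sequence might look like the following sketch. The image tag comes from the PR summary above; the device flags are the standard ones for ROCm containers, and the model-cache mount path is an illustrative assumption (a dedicated directory rather than the host root, in line with the security concern raised in this review).

```shell
# Pull the pinned ROCm build of vLLM named in the PR summary.
docker pull vllm/vllm-openai-rocm:v0.14.1

# Run the container with AMD GPU access. The $HOME/models mount
# path is an assumption for illustration; avoid mounting the
# host's root directory into the container.
docker run -it --rm \
  --network=host \
  --ipc=host \
  --device=/dev/kfd \
  --device=/dev/dri \
  --group-add video \
  --security-opt seccomp=unconfined \
  -v "$HOME/models:/models" \
  vllm/vllm-openai-rocm:v0.14.1 \
  bash
```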


### Step 3: Start the vLLM server

Run the vllm online serving with this sample command:
medium

The sample command uses --tensor-parallel-size 8, which implies an 8-GPU setup. To prevent user confusion and errors, it would be helpful to explicitly state the hardware configuration this command is intended for (e.g., 'Run the vllm online serving with this sample command on an 8-GPU server:').

Suggested change
Run the vllm online serving with this sample command:
Run the vllm online serving with this sample command on an 8-GPU server:
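A serving command matching this comment might be sketched as follows. The `<model-id>` placeholder stands in for the 14B reasoning checkpoint (the exact model ID is not reproduced in this thread), and `--tensor-parallel-size 8` is the setting the comment asks to tie explicitly to an 8-GPU server; the host/port flags are assumptions showing vLLM's common defaults.

```shell
# Illustrative sketch only: <model-id> is a placeholder.
# --tensor-parallel-size 8 shards the model across 8 GPUs,
# matching the 8-GPU setup the review asks to call out.
vllm serve <model-id> \
  --tensor-parallel-size 8 \
  --host 0.0.0.0 \
  --port 8000
```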


### Step 4: Run Benchmark
Open a new terminal and run the following command to execute the benchmark script inside the container.
medium

The phrase 'Open a new terminal' can be ambiguous. To improve clarity, please specify that this new terminal should be on the host machine, not another session inside the container.

Suggested change
Open a new terminal and run the following command to execute the benchmark script inside the container.
Open a new terminal on the host machine and run the following command to execute the benchmark script inside the container.
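Concretely, the clarified instruction might translate to something like the sketch below. Both `<container-name>` and `<model-id>` are placeholders (neither appears in this thread), and `vllm bench serve` is the built-in benchmarking CLI available in recent vLLM releases, used here as a stand-in for the recipe's benchmark script.

```shell
# From a NEW terminal on the HOST machine (not a session inside
# the container). Find the real container name with `docker ps`.
docker exec -it <container-name> \
  vllm bench serve \
    --model <model-id> \
    --dataset-name random \
    --num-prompts 100
```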

@jiacao-amd force-pushed the Ministral-3-Reasoning-recipe-update branch from 1b3769b to b895163 on January 27, 2026 at 21:43
Signed-off-by: jiacao-amd <jiahui.cao@amd.com>

add uv launch support

Signed-off-by: jiacao-amd <jiahui.cao@amd.com>
@jiacao-amd force-pushed the Ministral-3-Reasoning-recipe-update branch from 75e0d71 to dc6026d on February 6, 2026 at 07:04
@tjtanaa commented Feb 13, 2026

@jiacao-amd Please revise all of your commands to be up to date with the latest vLLM version. Some of the environment variables were deprecated many versions ago.

For this documentation, we don't need to attach the benchmark script.

Please merge the ROCm installation step and server command into the main content. Add subheaders to split between the two hardware targets, similar to the suggestion mentioned here: https://github.com/vllm-project/recipes/pull/219/changes#r2804656457

Signed-off-by: jiacao-amd <jiahui.cao@amd.com>
@jiacao-amd force-pushed the Ministral-3-Reasoning-recipe-update branch from 694943b to c5a0e67 on February 26, 2026 at 01:11
@jiacao-amd (Author)

> @jiacao-amd Please revise all of your commands to be up to date with the latest vLLM version. Some of the environment variables were deprecated many versions ago.
>
> For this documentation, we don't need to attach the benchmark script.
>
> Please merge the ROCm installation step and server command into the main content. Add subheaders to split between the two hardware targets, similar to the suggestion mentioned here: https://github.com/vllm-project/recipes/pull/219/changes#r2804656457

done!
