Update Ministral-3-Reasoning.md for AMD GPU #208

jiacao-amd wants to merge 2 commits into vllm-project:main
Conversation
Summary of Changes (Gemini Code Assist): This pull request expands the documentation for the Ministral-3 Reasoning models by introducing a dedicated section for AMD GPU support. It provides a complete guide for users to set up their environment, deploy the models, and run benchmarks on AMD MI300X and MI355 hardware, improving the models' compatibility and ease of use across different GPU architectures.
Code Review
This pull request adds valuable documentation for running the model on AMD GPUs. The instructions are generally clear, but I've identified a few areas for improvement regarding clarity, correctness, and security. My main points are:
- A potential typo in the model name ('Ministral' instead of 'Mistral') throughout the document.
- A contradiction in the Docker instructions regarding the image version.
- A significant security concern with mounting the host's root directory into the Docker container.
- A lack of clarity regarding the hardware requirements for the provided commands.
Additionally, the new section only covers the 14B model. It would be beneficial to include instructions for the 3B and 8B models as well, to make the guide more comprehensive. Please see my detailed comments for specific suggestions.
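On the security point about mounting the host's root directory: one common mitigation is to bind-mount only the directory the container actually needs, read-only. The sketch below is illustrative only; the image name, device flags, and paths are placeholders and do not come from this thread.

```shell
# Hypothetical safer alternative to mounting the host root (-v /:/...).
# All names here are placeholders, not taken from the PR.
MODEL_DIR="$HOME/models"     # host directory holding model weights
IMAGE="rocm/vllm:latest"     # placeholder image name

# Mount only the model directory, read-only, instead of the whole filesystem.
CMD="docker run -it --rm \
  --device=/dev/kfd --device=/dev/dri \
  -v ${MODEL_DIR}:/models:ro \
  ${IMAGE}"

echo "$CMD"
```

The `:ro` suffix keeps the mount read-only, so even a compromised container process cannot modify the host files it can see.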
Mistral/Ministral-3-Reasoning.md (outdated)

> ## AMD GPU Support
>
> Please follow the steps here to install and run Ministral-3 Reasoning models on AMD MI300X and MI355 GPU.
There appears to be a typo in the model name. It's written as 'Ministral', but it should likely be 'Mistral', especially since the directory is named Mistral. This typo appears throughout the document. For consistency and to avoid confusion, it's recommended to correct this.
```diff
- Please follow the steps here to install and run Ministral-3 Reasoning models on AMD MI300X and MI355 GPU.
+ Please follow the steps here to install and run Mistral-3 Reasoning models on AMD MI300X and MI355 GPU.
```
Mistral/Ministral-3-Reasoning.md (outdated)

> Please follow the steps here to install and run Ministral-3 Reasoning models on AMD MI300X and MI355 GPU.
>
> ### Step 1: Prepare Docker Environment
> Pull the latest vllm docker:
The instruction says to pull the 'latest' vllm docker image, but the command on line 418 specifies a fixed version (v0.14.1). This is contradictory. If this specific version is required, please update the text to avoid confusion. For example: 'Pull the required vllm docker image:'.
```diff
- Pull the latest vllm docker:
+ Pull the required vllm docker image:
```
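For reference, a pinned-version pull could look like the sketch below. The image repository name is a placeholder (the actual repository is not shown in this thread); only the `v0.14.1` tag comes from the review comment about line 418.

```shell
# Placeholder repository; the v0.14.1 tag is the version the review
# says the doc's command pins.
VLLM_IMAGE="rocm/vllm:v0.14.1"

CMD="docker pull ${VLLM_IMAGE}"
echo "$CMD"
```

Pinning a tag this way keeps the text ("pull the required image") and the command consistent, which is what the review asks for.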
Mistral/Ministral-3-Reasoning.md (outdated)

> ### Step 3: Start the vLLM server
>
> Run the vllm online serving with this sample command:
The sample command uses --tensor-parallel-size 8, which implies an 8-GPU setup. To prevent user confusion and errors, it would be helpful to explicitly state the hardware configuration this command is intended for (e.g., 'Run the vllm online serving with this sample command on an 8-GPU server:').
```diff
- Run the vllm online serving with this sample command:
+ Run the vllm online serving with this sample command on an 8-GPU server:
```
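A minimal sketch of what such an 8-GPU invocation could look like; the model identifier and port are hypothetical placeholders, and only `--tensor-parallel-size 8` comes from the review comment.

```shell
# Placeholder model ID and port; substitute the real ones from the doc.
MODEL="mistralai/Ministral-3-14B-Reasoning"  # hypothetical identifier
TP=8                                         # matches --tensor-parallel-size 8

CMD="vllm serve ${MODEL} --tensor-parallel-size ${TP} --port 8000"
echo "$CMD"
```

Because `--tensor-parallel-size 8` shards the model across 8 GPUs, the command fails on machines with fewer devices, which is why the review asks the doc to state the hardware assumption explicitly.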
Mistral/Ministral-3-Reasoning.md (outdated)

> ### Step 4: Run Benchmark
> Open a new terminal and run the following command to execute the benchmark script inside the container.
The phrase 'Open a new terminal' can be ambiguous. To improve clarity, please specify that this new terminal should be on the host machine, not another session inside the container.
```diff
- Open a new terminal and run the following command to execute the benchmark script inside the container.
+ Open a new terminal on the host machine and run the following command to execute the benchmark script inside the container.
```
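To illustrate the host-vs-container distinction the review is drawing: from a terminal on the host, `docker exec` runs a command inside an already-running container. The container name and script path below are hypothetical placeholders, not taken from the doc.

```shell
# Run from a terminal on the HOST machine, not inside the container.
CONTAINER="vllm-server"                              # placeholder name
BENCH="python /app/benchmarks/benchmark_serving.py"  # hypothetical path

CMD="docker exec -it ${CONTAINER} ${BENCH}"
echo "$CMD"
```

Being explicit that the terminal is on the host avoids the failure mode where a reader opens a second shell inside the container and `docker exec` is not available there.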
Signed-off-by: jiacao-amd <jiahui.cao@amd.com>

add uv launch support

Signed-off-by: jiacao-amd <jiahui.cao@amd.com>
@jiacao-amd Please revise all of your commands to be up to date with the latest vLLM version. There are environment variables that were deprecated many versions ago. For this documentation, we don't need to attach the benchmark script. Please merge the ROCm installation step and server command with the main context. Open subheaders to split between the two hardware targets, similar to the suggestion mentioned here: https://github.com/vllm-project/recipes/pull/219/changes#r2804656457
Signed-off-by: jiacao-amd <jiahui.cao@amd.com>
done!