Support qwen2.5-VL in sft.py and solve GRPO deepspeed training issue #110
Open
LiuRicky wants to merge 3 commits into StarsfieldAI:main
Conversation
Hi, thank you for debugging. Can you specify the commit version of transformers you are using? For me, the current main is at 92c5ca9dd70de3ade2af2eb835c96215cc50e815. Is it the same as your version?
I also found that the newest version of transformers ("92c5ca") has bugs when used with Qwen2.5-VL.
Contributor
Author
I guess it is the version from 5 days ago, maybe 8ee50537fe7613b87881cd043a85971c85e99519 or e3d99ec2f58e0e2a4df6b2b41152fdfb3f92a52f.
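For anyone trying to reproduce with one of the commits mentioned above, here is a minimal sketch of how the install could be pinned. It only builds a standard pip VCS requirement string; the helper names `pip_spec` and `install_command` are hypothetical, and which of the two commits actually works is not confirmed in this thread.

```python
import sys

# Commit hashes quoted in this thread. Which one matches the working
# environment is an open question here, so both are listed as candidates.
CANDIDATE_COMMITS = [
    "8ee50537fe7613b87881cd043a85971c85e99519",
    "e3d99ec2f58e0e2a4df6b2b41152fdfb3f92a52f",
]


def pip_spec(commit: str) -> str:
    """Build a pip VCS requirement pinning transformers to a single commit."""
    return f"git+https://github.com/huggingface/transformers.git@{commit}"


def install_command(commit: str) -> list[str]:
    """Return the pip invocation as an argument list (nothing is executed)."""
    return [sys.executable, "-m", "pip", "install", pip_spec(commit)]


if __name__ == "__main__":
    # Print the command for the first candidate so it can be reviewed before running.
    print(" ".join(install_command(CANDIDATE_COMMITS[0])))
```

Printing the command rather than running it keeps the sketch side-effect free; the argument list could be passed to `subprocess.run` once the right commit is known.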
I found the bug, which comes from Qwen 2.5 VL. Its processor config was updated 8 days ago, so the old version of Qwen 2.5 VL is not compatible.
---Original---
Date: Sun, Feb 23, 2025 20:14 PM
Subject: Re: [Deep-Agent/R1-V] Support qwen2.5-VL in sft.py and solve GRPO deepspeed training issue (PR #110)
#109
This also solves the problem that occurs when training with DeepSpeed ZeRO-3, as shown in huggingface/transformers@8ee5053.
After applying these changes, one can add "--deepspeed r1-v/local_scripts/zero3.json" to the training script when using DeepSpeed.
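As a rough illustration of the flag described above, the sketch below builds a minimal ZeRO-3 config and appends the `--deepspeed` argument to a launch argument list. The config contents are an assumption based on commonly used DeepSpeed keys with the Hugging Face "auto" placeholders, not the actual contents of `r1-v/local_scripts/zero3.json`, and the helper names and the `train.py` entry point are hypothetical.

```python
import json


def minimal_zero3_config() -> dict:
    """Return a small ZeRO stage 3 config (assumed, not the repo's file)."""
    return {
        "bf16": {"enabled": "auto"},
        "zero_optimization": {
            "stage": 3,
            "overlap_comm": True,
            "contiguous_gradients": True,
            "stage3_gather_16bit_weights_on_model_save": True,
        },
        # "auto" lets the Hugging Face Trainer fill these from its own arguments.
        "gradient_accumulation_steps": "auto",
        "gradient_clipping": "auto",
        "train_batch_size": "auto",
        "train_micro_batch_size_per_gpu": "auto",
    }


def with_deepspeed_flag(args: list[str], config_path: str) -> list[str]:
    """Append `--deepspeed <config_path>` unless the flag is already present."""
    if "--deepspeed" in args:
        return list(args)
    return [*args, "--deepspeed", config_path]


if __name__ == "__main__":
    # "train.py" stands in for the real training entry point (hypothetical).
    launch_args = with_deepspeed_flag(["train.py"], "r1-v/local_scripts/zero3.json")
    print(" ".join(launch_args))
    print(json.dumps(minimal_zero3_config(), indent=2))
```

The flag helper is idempotent, so calling it on a script that already passes `--deepspeed` leaves the arguments unchanged.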