Commit c75e43f

wip
1 parent 91cad80 commit c75e43f

5 files changed: +6 -5 lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -101,6 +101,7 @@ Or use ModelScope's [official image](https://www.modelscope.cn/docs/intro/enviro
 
 ## Changelog
 
+- 🎉2026-03-28 Support DPO training with both Transformers and Megatron backends. See [dpo_full.py](cookbook/rl/dpo_full.py) and [dpo_lora.py](cookbook/rl/dpo_lora.py).
 - 🎉2026-03-24 Twinkle Web site is now live at https://modelscope.github.io/twinkle-web/
 - 🎉2026-03-19 Support GKD training ,please refer to this [cookbook](cookbook/rl/gkd_on_policy.py).
 - 🎉2026-02-13 Initial version of Twinkle✨ released, including SFT/PT/RL support for text models.

README_ZH.md

Lines changed: 1 addition & 0 deletions
@@ -91,6 +91,7 @@ Twinkle✨ supports running the same algorithm interface on a single GPU, multi-node torchrun, Ray, Cl
 
 ## Changelog
 
+🎉2026-03-28 Support DPO training with both Transformers and Megatron backends. See [dpo_full.py](cookbook/rl/dpo_full.py) and [dpo_lora.py](cookbook/rl/dpo_lora.py)
 🎉2026-03-24 The Twinkle website is now live at https://modelscope.github.io/twinkle-web/
 🎉2026-03-19 Support GKD distillation; see the [cookbook](cookbook/rl/gkd_on_policy.py)
 🎉2026-02-13 Initial version of Twinkle✨ released, supporting SFT/PT/RL training for text models. We also provide serverless training on the ModelScope community through a Tinker-compatible API.

cookbook/rl/dpo_lora.py

Lines changed: 1 addition & 1 deletion
@@ -149,7 +149,7 @@ def main():
     else:
         # Transformers: fsdp=4, dp=2
         from twinkle.model import TransformersModel
-        policy_mesh = DeviceMesh.from_sizes(world_size=MODEL_GPUS, fsdp_size=4, dp_size=2)
+        policy_mesh = DeviceMesh.from_sizes(world_size=MODEL_GPUS, dp_size=4, fsdp_size=2)
         ModelClass = TransformersModel
 
     twinkle.initialize(mode='ray', nproc_per_node=MODEL_GPUS, groups=device_groups)
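
The hunk above swaps the two mesh dimensions while leaving `world_size=MODEL_GPUS` untouched. A minimal sketch of a sanity check for such a change, assuming the named sizes are expected to multiply out to the world size and assuming a world size of 8 (neither is stated in this diff; `check_mesh_sizes` is a hypothetical helper, not part of Twinkle):

```python
from math import prod


def check_mesh_sizes(world_size: int, **sizes: int) -> bool:
    """Return True when the product of all mesh dimension sizes equals world_size."""
    return prod(sizes.values()) == world_size


# Assumed world size of 8: both the old and the new layout cover it exactly.
old_ok = check_mesh_sizes(8, fsdp_size=4, dp_size=2)
new_ok = check_mesh_sizes(8, dp_size=4, fsdp_size=2)
# A mismatched layout (4 * 4 = 16 processes) is rejected.
bad_ok = check_mesh_sizes(8, dp_size=4, fsdp_size=4)
print(old_ok, new_ok, bad_ok)  # True True False
```

Under these assumptions the change does not alter the number of processes, only how they are split between data-parallel replicas and FSDP shards.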

cookbook/transformers/fsdp2.py

Lines changed: 2 additions & 2 deletions
@@ -9,8 +9,8 @@
 from twinkle.model import TransformersModel
 from twinkle.preprocessor import SelfCognitionProcessor
 
-# Construct a device_mesh, dp=2
-device_mesh = DeviceMesh.from_sizes(dp_size=2)
+# Construct a device_mesh, fsdp_size=2, dp=4
+device_mesh = DeviceMesh.from_sizes(fsdp_size=2, dp_size=4)
 # use torchrun mode
 twinkle.initialize(mode='local', global_device_mesh=device_mesh)
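
The new mesh requests `fsdp_size=2` and `dp_size=4`, which is 8 processes in total where the old one requested 2. One way to picture such a layout is a row-major grid with `dp` as the outer axis. This ordering is a common convention assumed here purely for illustration; Twinkle's actual rank placement is not shown in this diff, and `rank_grid` is a hypothetical helper:

```python
def rank_grid(dp_size: int, fsdp_size: int) -> list[list[int]]:
    """Lay out ranks 0..dp_size*fsdp_size-1 row-major: one row per dp index."""
    return [
        [dp * fsdp_size + shard for shard in range(fsdp_size)]
        for dp in range(dp_size)
    ]


grid = rank_grid(dp_size=4, fsdp_size=2)
for row in grid:
    print(row)
# [0, 1]
# [2, 3]
# [4, 5]
# [6, 7]
```

In this picture each row would hold the ranks that shard one model copy, and the four rows would be the data-parallel replicas.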

Lines changed: 1 addition & 2 deletions
@@ -1,6 +1,5 @@
 # Copyright (c) ModelScope Contributors. All rights reserved.
 from .base import DataFilter, Preprocessor
-from .dpo import (DPOProcessor, EmojiDPOProcessor, HHRLHFProcessor, IntelOrcaDPOProcessor, ShareGPTDPOProcessor,
-                  UltraFeedbackKTOProcessor, UltraFeedbackProcessor)
+from .dpo import EmojiDPOProcessor
 from .llm import (AlpacaProcessor, CompetitionMathGRPOProcessor, CompetitionMathProcessor, CountdownProcessor,
                   GSM8KProcessor, SelfCognitionProcessor)
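
After this hunk, the module imports only `EmojiDPOProcessor` from `.dpo`. A small sketch that lists the names dropped by the change, using the standard-library `ast` module on the two import statements copied from the diff (`imported_names` is a hypothetical helper; whether the dropped names remain importable from the `.dpo` submodule itself is not visible here):

```python
import ast


def imported_names(stmt: str) -> set[str]:
    """Collect the names bound by a single `from ... import ...` statement."""
    node = ast.parse(stmt).body[0]
    assert isinstance(node, ast.ImportFrom)
    return {alias.asname or alias.name for alias in node.names}


before = imported_names(
    "from .dpo import (DPOProcessor, EmojiDPOProcessor, HHRLHFProcessor, "
    "IntelOrcaDPOProcessor, ShareGPTDPOProcessor, "
    "UltraFeedbackKTOProcessor, UltraFeedbackProcessor)"
)
after = imported_names("from .dpo import EmojiDPOProcessor")

# Names that callers could previously import from this module but no longer can.
removed = sorted(before - after)
print(removed)
```

Any caller that imports one of the listed names from this module would need to be checked against the change.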
