Commit 850803e

Author: Yingda Chen
Message: update
Parent: 537baa3

File tree: 4 files changed (+49, −53 lines)


README.md

Lines changed: 38 additions & 35 deletions
@@ -51,7 +51,7 @@ be reused in [ms-swift](https://github.com/modelscope/ms-swift).
 pip install 'twinkle-kit'
 ```
 
-### Installation from Source:
+### Install from Source:
 
 ```shell
 git clone https://github.com/modelscope/twinkle.git
@@ -75,13 +75,14 @@ pip install -e .
 
 ## Changelog
 
-- 🎉2026-02-10 Initial version of Twinkle✨ released, including SFT/PT/RL for text models and serverless training capabilities on [ModelScope](https://modelscope.cn).
+- 🎉2026-02-13 Initial version of Twinkle✨ released, including SFT/PT/RL support for text models and serverless training capabilities on [ModelScope](https://modelscope.cn).
 
-# ModelScope Community
+## Training as a Service on ModelScope
 
-## ModelScope Official Environment
-
-The ModelScope community provides an official environment for running Twinkle. The API endpoint is: [base_url](https://www.modelscope.cn/twinkle). Developers can refer to our [documentation](docs/source_en/Usage%20Guide/ModelScope-Official-Resources.md) for usage instructions.
+We are rolling out a training service built atop Twinkle✨ on ModelScope. It is currently in _Beta_. You may
+sign up for free access by joining the [Twinkle-Explorers](https://modelscope.cn/organization/twinkle-explorers) organization, and
+train via the API endpoint `base_url=https://www.modelscope.cn/twinkle`. For more details, please refer to
+our [documentation](docs/source_en/Usage%20Guide/ModelScope-Official-Resources.md).
 
 ## Supported Hardware
 
@@ -95,29 +96,33 @@ The ModelScope community provides an official environment for running Twinkle. T
 ## Supported Models
 
 We will be adding support for more models as new models are released. The following table lists current models
-supported on Twinkle✨ framework. However, the models supported on our serverless training backend may be a
-much smaller subset. Please refer to the [doc](link) section for more information.
-
-| Model Type | Model ID on[ModelScope](https://modelscope.cn) | Requires | Megatron Support | HF Model ID |
-| ------------------- | --------------------------------------------------------------------------------------------------------------------- | -------------------- | ---------------- | ---------------------------------------------------------------------------------------------------------- |
-| qwen3 series | [Qwen/Qwen3-0.6B-Base](https://modelscope.cn/models/Qwen/Qwen3-0.6B-Base)~32B | transformers>=4.51 || [Qwen/Qwen3-0.6B-Base](https://huggingface.co/Qwen/Qwen3-0.6B-Base) |
-| qwen3_moe series | [Qwen/Qwen3-30B-A3B-Base](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Base) | transformers>=4.51 || [Qwen/Qwen3-30B-A3B-Base](https://huggingface.co/Qwen/Qwen3-30B-A3B-Base) |
-| | [Qwen/Qwen3-30B-A3B](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B)~235B | transformers>=4.51 || [Qwen/Qwen3-30B-A3B](https://huggingface.co/Qwen/Qwen3-30B-A3B) |
-| qwen2 series | [Qwen/Qwen2-0.5B-Instruct](https://modelscope.cn/models/Qwen/Qwen2-0.5B-Instruct) ~72B | transformers>=4.37 || [Qwen/Qwen2-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2-0.5B-Instruct) |
-| | [Qwen/Qwen2.5-0.5B-Instruct](https://modelscope.cn/models/Qwen/Qwen2.5-0.5B-Instruct)~72B | transformers>=4.37 || [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) |
-| | [Qwen/Qwen2.5-0.5B](https://modelscope.cn/models/Qwen/Qwen2.5-0.5B)~72B | transformers>=4.37 || [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) |
-| qwen2_moe series | [Qwen/Qwen1.5-MoE-A2.7B-Chat](https://modelscope.cn/models/Qwen/Qwen1.5-MoE-A2.7B-Chat) | transformers>=4.40 || [Qwen/Qwen1.5-MoE-A2.7B-Chat](https://huggingface.co/Qwen/Qwen1.5-MoE-A2.7B-Chat) |
+supported on Twinkle✨ framework.
+
+> [!NOTE]
+> For the serverless training service accessed via `base_url=https://www.modelscope.cn/twinkle`, it currently supports
+> one training base at a time, which at the moment is [Qwen3-30B-A3B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507).
+
+
+| Model Type | Model ID on [ModelScope](https://modelscope.cn) | Requires | Megatron Support | HF Model ID |
+| ------------------- |--------------------------------------------------------------------------------------------------------------------------| -------------------- | ---------------- | ---------------------------------------------------------------------------------------------------------- |
+| qwen3 series | [Qwen/Qwen3-0.6B-Base](https://modelscope.cn/models/Qwen/Qwen3-0.6B-Base)~32B | transformers>=4.51 || [Qwen/Qwen3-0.6B-Base](https://huggingface.co/Qwen/Qwen3-0.6B-Base) |
+| qwen3_moe series | [Qwen/Qwen3-30B-A3B-Base](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Base) | transformers>=4.51 || [Qwen/Qwen3-30B-A3B-Base](https://huggingface.co/Qwen/Qwen3-30B-A3B-Base) |
+| | [Qwen/Qwen3-30B-A3B](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B)~235B | transformers>=4.51 || [Qwen/Qwen3-30B-A3B](https://huggingface.co/Qwen/Qwen3-30B-A3B) |
+| qwen2 series | [Qwen/Qwen2-0.5B-Instruct](https://modelscope.cn/models/Qwen/Qwen2-0.5B-Instruct) ~72B | transformers>=4.37 || [Qwen/Qwen2-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2-0.5B-Instruct) |
+| | [Qwen/Qwen2.5-0.5B-Instruct](https://modelscope.cn/models/Qwen/Qwen2.5-0.5B-Instruct)~72B | transformers>=4.37 || [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) |
+| | [Qwen/Qwen2.5-0.5B](https://modelscope.cn/models/Qwen/Qwen2.5-0.5B)~72B | transformers>=4.37 || [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) |
+| qwen2_moe series | [Qwen/Qwen1.5-MoE-A2.7B-Chat](https://modelscope.cn/models/Qwen/Qwen1.5-MoE-A2.7B-Chat) | transformers>=4.40 || [Qwen/Qwen1.5-MoE-A2.7B-Chat](https://huggingface.co/Qwen/Qwen1.5-MoE-A2.7B-Chat) |
 | chatglm4 series | [ZhipuAI/glm-4-9b-chat](https://modelscope.cn/models/ZhipuAI/glm-4-9b-chat) | transformers>=4.42 || [zai-org/glm-4-9b-chat](https://huggingface.co/zai-org/glm-4-9b-chat) |
 | | [ZhipuAI/LongWriter-glm4-9b](https://modelscope.cn/models/ZhipuAI/LongWriter-glm4-9b) | transformers>=4.42 || [zai-org/LongWriter-glm4-9b](https://huggingface.co/zai-org/LongWriter-glm4-9b) |
 | glm_edge series | [ZhipuAI/glm-edge-1.5b-chat](https://modelscope.cn/models/ZhipuAI/glm-edge-1.5b-chat) | transformers>=4.46 || [zai-org/glm-edge-1.5b-chat](https://huggingface.co/zai-org/glm-edge-1.5b-chat) |
 | | [ZhipuAI/glm-edge-4b-chat](https://modelscope.cn/models/ZhipuAI/glm-edge-4b-chat) | transformers>=4.46 || [zai-org/glm-edge-4b-chat](https://huggingface.co/zai-org/glm-edge-4b-chat) |
 | internlm2 series | [Shanghai_AI_Laboratory/internlm2-1_8b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-1_8b) | transformers>=4.38 || [internlm/internlm2-1_8b](https://huggingface.co/internlm/internlm2-1_8b) |
 | | [Shanghai_AI_Laboratory/internlm2-chat-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-7b) | transformers>=4.38 || [internlm/internlm2-chat-7b](https://huggingface.co/internlm/internlm2-chat-7b) |
-| deepseek_v1 | [deepseek-ai/deepseek-vl-7b-chat](https://modelscope.cn/models/deepseek-ai/deepseek-vl-7b-chat) | transformers>=4.39.4 | | —— |
-| | [deepseek-ai/DeepSeek-V2-Lite](https://modelscope.cn/models/deepseek-ai/DeepSeek-V2-Lite) | transformers>=4.39.3 | | [deepseek-ai/DeepSeek-V2-Lite](https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite) |
-| | [deepseek-ai/DeepSeek-V2.5](https://modelscope.cn/models/deepseek-ai/DeepSeek-V2.5) | transformers>=4.39.3 | | [deepseek-ai/DeepSeek-V2.5](https://huggingface.co/deepseek-ai/DeepSeek-V2.5) |
-| | [deepseek-ai/DeepSeek-R1](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1) | transformers>=4.39.3 | | [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1) |
-| deepSeek-r1-distill | [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) ~32B | transformers>=4.37 | | [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) |
+| deepseek_v1 | [deepseek-ai/deepseek-vl-7b-chat](https://modelscope.cn/models/deepseek-ai/deepseek-vl-7b-chat) | transformers>=4.39.4 | | —— |
+| | [deepseek-ai/DeepSeek-V2-Lite](https://modelscope.cn/models/deepseek-ai/DeepSeek-V2-Lite) | transformers>=4.39.3 | | [deepseek-ai/DeepSeek-V2-Lite](https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite) |
+| | [deepseek-ai/DeepSeek-V2.5](https://modelscope.cn/models/deepseek-ai/DeepSeek-V2.5) | transformers>=4.39.3 | | [deepseek-ai/DeepSeek-V2.5](https://huggingface.co/deepseek-ai/DeepSeek-V2.5) |
+| | [deepseek-ai/DeepSeek-R1](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1) | transformers>=4.39.3 | | [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1) |
+| deepSeek-r1-distill | [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://modelscope.cn/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) ~32B | transformers>=4.37 | | [deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) |
 
 For a more detailed model support list 👉 [Quick Start.md](https://github.com/modelscope/twinkle/blob/dev/docs/source/%E4%BD%BF%E7%94%A8%E6%8C%87%E5%BC%95/%E5%BF%AB%E9%80%9F%E5%BC%80%E5%A7%8B.md)
 
@@ -141,18 +146,20 @@ twinkle.initialize(mode='ray', groups=device_group, global_device_mesh=device_me
 
 
 def train():
+    # to load a model from Hugging Face, use 'hf://...'
+    base_model = 'ms://Qwen/Qwen2.5-7B-Instruct'
     # 1000 samples
     dataset = Dataset(dataset_meta=DatasetMeta('ms://swift/self-cognition', data_slice=range(1000)))
     # Set template to prepare encoding
-    dataset.set_template('Template', model_id='ms://Qwen/Qwen2.5-7B-Instruct')
+    dataset.set_template('Template', model_id=base_model)
     # Preprocess the dataset to standard format
     dataset.map(SelfCognitionProcessor('twinkle LLM', 'ModelScope Community'))
     # Encode dataset
     dataset.encode()
     # Global batch size = 8, for 8 GPUs, so 1 sample per GPU
     dataloader = DataLoader(dataset=dataset, batch_size=8, min_batch_size=8)
     # Use a TransformersModel
-    model = TransformersModel(model_id='ms://Qwen/Qwen2.5-7B-Instruct', remote_group='default')
+    model = TransformersModel(model_id=base_model, remote_group='default')
 
     lora_config = LoraConfig(
         r=8,
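The `ms://` / `hf://` prefixes in the snippet above select which hub the model id refers to (ModelScope or Hugging Face). A small illustrative parser for that convention — the no-scheme fallback to ModelScope here is an assumption for the sketch, not necessarily Twinkle's rule:

```python
def split_model_uri(model_id: str) -> tuple[str, str]:
    """Split an id like 'ms://Qwen/Qwen2.5-7B-Instruct' into (hub, repo).

    'ms' selects ModelScope and 'hf' selects Hugging Face; ids without a
    scheme default to ModelScope in this sketch.
    """
    if "://" in model_id:
        hub, repo = model_id.split("://", 1)
        return hub, repo
    return "ms", model_id
```

For example, `split_model_uri('hf://Qwen/Qwen3-0.6B-Base')` yields `('hf', 'Qwen/Qwen3-0.6B-Base')`.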
@@ -184,7 +191,7 @@ if __name__ == '__main__':
     train()
 ```
 
-### Tinker-Like Remote API
+### Using the Tinker-Like API
 
 ```python
 import os
@@ -196,17 +203,19 @@ from twinkle.dataset import Dataset, DatasetMeta
 from twinkle.preprocessor import SelfCognitionProcessor
 from twinkle.server.tinker.common import input_feature_to_datum
 
-base_model = "Qwen/Qwen2.5-0.5B-Instruct"
+base_model = 'ms://Qwen/Qwen2.5-0.5B-Instruct'
+base_url = 'https://www.modelscope.cn/twinkle'
+api_key = os.environ.get('MODELSCOPE_TOKEN')
 
 # Use twinkle dataset to load the data
 dataset = Dataset(dataset_meta=DatasetMeta('ms://swift/self-cognition', data_slice=range(500)))
-dataset.set_template('Template', model_id=f'ms://{base_model}', max_length=256)
+dataset.set_template('Template', model_id=base_model, max_length=256)
 dataset.map(SelfCognitionProcessor('twinkle Model', 'twinkle Team'), load_from_cache_file=False)
 dataset.encode(batched=True, load_from_cache_file=False)
 dataloader = DataLoader(dataset=dataset, batch_size=8)
 
 # Initialize tinker client
-service_client = init_tinker_compat_client(base_url='http://www.modelscope.cn/twinkle', api_key=os.environ.get('MODELSCOPE_SDK_TOKEN'))
+service_client = init_tinker_compat_client(base_url=base_url, api_key=api_key)
 training_client = service_client.create_lora_training_client(base_model=base_model, rank=16)
 
 # Training loop: use input_feature_to_datum to convert the input format
@@ -223,12 +232,6 @@ for epoch in range(3):
     training_client.save_state(f"twinkle-lora-{epoch}").result()
 ```
 
-Launch training:
-
-```shell
-python3 train.py
-```
-
 ## Architecture Design
 
 <img src="assets/framework.jpg" style="max-width: 500px; width: 100%;" />
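Calls like `training_client.save_state(...).result()` in the training loop above follow a futures pattern: the client returns a handle immediately, and `.result()` blocks until the server-side operation finishes. A stdlib-only sketch of that shape, using a stand-in worker rather than Twinkle's real client:

```python
from concurrent.futures import ThreadPoolExecutor


def fake_save_state(name: str) -> str:
    # stand-in for a server-side checkpoint save
    return f"saved:{name}"


with ThreadPoolExecutor(max_workers=1) as pool:
    # submit() returns a Future right away; work happens in the background
    future = pool.submit(fake_save_state, "twinkle-lora-0")
    # .result() blocks until the work completes, then yields its value
    outcome = future.result()  # outcome == "saved:twinkle-lora-0"
```

This is why the README's loop can fire off several requests and only synchronize where `.result()` is called.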

README_ZH.md

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ pip install -e .
 
 ## ModelScope Community Official Environment
 
-The ModelScope community provides an official environment for running Twinkle. The API endpoint is [base_url](https://www.modelscope.cn/twinkle). Developers can refer to our [documentation](docs/source_zh/使用指引/魔搭官方环境.md) for usage instructions.
+The ModelScope community provides an official environment for running Twinkle. The API endpoint is [base_url](https://www.modelscope.cn/twinkle). Developers can refer to our [documentation](docs/source_zh/使用指引/训练服务.md) for usage instructions.
 
 ## Supported Hardware
 
docs/source_zh/使用指引/快速开始.md

Lines changed: 0 additions & 6 deletions
@@ -1,17 +1,12 @@
-<div align="center">
 
 ## ✨ What is Twinkle?
 
 A training component library for large models. Built on PyTorch: simpler, more flexible, and production-ready.
 
-<p align="center">
 🧩 <b>Loosely coupled architecture</b> · standardized interfaces<br>
 🚀 <b>Multiple run modes</b> · torchrun / Ray / HTTP<br>
 🔌 <b>Multi-framework compatibility</b> · Transformers / Megatron<br>
 👥 <b>Multi-tenant support</b> · single base-model deployment
-</p>
-
-</div>
 
 ## Twinkle Suitability
 
@@ -23,7 +18,6 @@ Twinkle and [ms-swift](https://github.com/modelscope/ms-swift) are both model-tr
 - If you are an LLM researcher who wants to customize models or training methods
 - If you are good at writing training loops and want to customize the training process
 - If you want to offer an enterprise-grade or commercial training platform
-- If you lack training hardware and want to use community resources
 
 ### When to choose ms-swift
 
Lines changed: 10 additions & 11 deletions
@@ -1,24 +1,23 @@
-# ModelScope Official Environment
+# Twinkle Training Service on ModelScope
 
-Alongside the open-sourcing of the Twinkle framework, we provide sample training resources on the [ModelScope website](https://www.modelscope.cn). Developers can train with a ModelScope token.
+Alongside the open-sourcing of the Twinkle framework, we also provide a hosted model-training service (Training as a Service) backed by ModelScope's infrastructure. Through this service, developers can
+try Twinkle's training API for free.
 
 The model currently running on the cluster is [Qwen/Qwen3-30B-A3B-Instruct-2507](https://www.modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507). The usage steps are described below:
 
-## Step 1. Register as a ModelScope user
+## Step 1. Register as a ModelScope user and apply to join the twinkle-explorers organization
 
-Developers first need to register as ModelScope users and call the service with a ModelScope community token.
+Developers first need to register as ModelScope users and apply to join the [Twinkle-Explorers](https://modelscope.cn/organization/twinkle-explorers) organization
+to obtain access. The free serverless training experience is still in beta testing and is open only to users in the organization for now. You can also use Twinkle✨ by deploying the service locally.
 
 Registration: https://www.modelscope.cn/
 
-API endpoint: https://www.modelscope.cn/twinkle
+After registering and being approved to join the [Twinkle-Explorers](https://modelscope.cn/organization/twinkle-explorers) organization, obtain
+your API key (i.e. your ModelScope access token) on this page: https://www.modelscope.cn/my/access/token
 
-Get your token here: https://www.modelscope.cn/my/access/token and copy the access token.
+API endpoint: `base_url="https://www.modelscope.cn/twinkle"`
 
-## Step 2. Join the twinkle-explorers organization
-
-The remote training capability of twinkle-kit is currently in beta testing; developers need to join the [twinkle-explorers](https://www.modelscope.cn/models/twinkle-explorers) organization, and users in the organization can use and test it.
-
-## Step 3. Check the Cookbook and customize further
+## Step 2. Check the Cookbook and customize further
 
 We strongly recommend that developers read our [cookbook](https://github.com/modelscope/twinkle/tree/main/cookbook/client/) and build their own training code on top of it.
 
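The steps in this doc amount to obtaining a token and pointing the client at the endpoint. A minimal shell sketch — the token value is a placeholder, and the `TWINKLE_BASE_URL` variable name is illustrative rather than anything Twinkle reads:

```shell
# Paste the token copied from https://www.modelscope.cn/my/access/token
export MODELSCOPE_TOKEN="ms-xxxxxxxx"  # placeholder, not a real token

# The endpoint every client call targets
export TWINKLE_BASE_URL="https://www.modelscope.cn/twinkle"
```

The cookbook's client scripts can then read `MODELSCOPE_TOKEN` from the environment instead of hard-coding it.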