Commit 993c049

Merge remote-tracking branch 'origin' into fix_request
2 parents f7a3d0e + 8ada7d3

30 files changed: +299 −127 lines

README.md

Lines changed: 2 additions & 2 deletions

```diff
@@ -129,7 +129,7 @@ supported on Twinkle✨ framework.
 > For serverless training service accessed via `base_url=https://www.modelscope.cn/twinkle`, it
 > is currently provided via the Tinker-compatible APIs. We will be rolling out services that support
 > both Tinker APIs, as well as the full-fledged Twinkle✨ native APIs. The serverless endpoint is backed
-> by one training base at a time, and currently it is [Qwen3-30B-A3B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507).
+> by one training base at a time, and currently it is [Qwen3.5-4B](https://modelscope.cn/models/Qwen/Qwen3.5-4B).

 | Model Type | Model ID on [ModelScope](https://modelscope.cn) | Model Size | Requires | Support Megatron | HF Model ID |
 |---------------------|-----------------------------------------------------------------------------------------------------------------|:---------------------------------------:|----------------------|:----------------:|:---------------------------------------------------------------------------------------------------------:|
@@ -234,7 +234,7 @@ from twinkle.dataset import Dataset, DatasetMeta
 from twinkle.preprocessor import SelfCognitionProcessor
 from twinkle.server.common import input_feature_to_datum

-base_model = 'ms://Qwen/Qwen3-30B-A3B-Instruct-2507'
+base_model = 'ms://Qwen/Qwen3.5-4B'
 base_url='your-base-url'
 api_key='your-api-key'
```
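The README snippet prefixes the ModelScope model ID with an `ms://` scheme, while the Tinker-style clients elsewhere in this commit use the bare `Qwen/Qwen3.5-4B` form. A minimal sketch of the relationship between the two forms; the helper name is mine and is not part of the Twinkle API:

```python
# Hypothetical helper (naming is mine, not part of the Twinkle API):
# split the 'ms://' scheme used in the README off a model reference,
# returning the scheme and the bare ModelScope model ID.
def split_model_ref(ref: str) -> tuple[str, str]:
    scheme, sep, model_id = ref.partition('://')
    if not sep:
        # No scheme present: treat the whole string as a bare model ID.
        return '', ref
    return scheme, model_id

print(split_model_ref('ms://Qwen/Qwen3.5-4B'))  # ('ms', 'Qwen/Qwen3.5-4B')
print(split_model_ref('Qwen/Qwen3.5-4B'))       # ('', 'Qwen/Qwen3.5-4B')
```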

README_ZH.md

Lines changed: 2 additions & 2 deletions

```diff
@@ -112,7 +112,7 @@ Twinkle✨ supports running the same algorithm interfaces on a single GPU, torchrun multi-node, Ray, and Cl
 As new models are released, we will add support for more of them. The table below lists the models currently supported by the Twinkle✨ framework.

 >[!Note]
-> The serverless training service accessed via `base_url=https://www.modelscope.cn/twinkle` is currently provided through Tinker-compatible APIs. We will gradually roll out services that support both the Tinker APIs and the full Twinkle✨ native APIs. The serverless endpoint is backed by one training base at a time, currently [Qwen3-30B-A3B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-30B-A3B-Instruct-2507)
+> The serverless training service accessed via `base_url=https://www.modelscope.cn/twinkle` is currently provided through Tinker-compatible APIs. We will gradually roll out services that support both the Tinker APIs and the full Twinkle✨ native APIs. The serverless endpoint is backed by one training base at a time, currently [Qwen3.5-4B](https://modelscope.cn/models/Qwen/Qwen3.5-4B)

 | Model Type | Model ID (examples) | Model Size | Requires | Support Megatron | HF Model ID |
 |---------------------|-----------------------------------------------------------------------------------------------------------------|:---------------------------------------:|----------------------|:----------------:|:---------------------------------------------------------------------------------------------------------:|
@@ -216,7 +216,7 @@ from twinkle.dataset import Dataset, DatasetMeta
 from twinkle.preprocessor import SelfCognitionProcessor
 from twinkle.server.common import input_feature_to_datum

-base_model = 'ms://Qwen/Qwen3-30B-A3B-Instruct-2507'
+base_model = 'ms://Qwen/Qwen3.5-4B'
 base_url='your-base-url'
 api_key='your-api-key'
```

cookbook/client/server/megatron/server_config.yaml

Lines changed: 6 additions & 6 deletions

```diff
@@ -36,11 +36,11 @@ applications:

   # 3. Sampler Service - Runs inference / sampling using vLLM engine
   # Used for generating text from the model (e.g., evaluating LoRA results).
-  - name: sampler-Qwen3-30B-A3B-Instruct-2507
-    route_prefix: /api/v1/sampler/Qwen/Qwen3-30B-A3B-Instruct-2507
+  - name: sampler-Qwen3.5-4B
+    route_prefix: /api/v1/sampler/Qwen/Qwen3.5-4B
     import_path: sampler
     args:
-      model_id: "ms://Qwen/Qwen3-30B-A3B-Instruct-2507"  # ModelScope model identifier
+      model_id: "ms://Qwen/Qwen3.5-4B"  # ModelScope model identifier
       nproc_per_node: 4                 # Number of GPU processes per node
       sampler_type: vllm                # Inference engine: 'vllm' (fast) or 'torch' (TorchSampler)
       engine_args:                      # vLLM engine-specific settings
@@ -73,12 +73,12 @@ applications:

   # 2. Model Service (commented out) - Would host the base model for training.
   # Uncomment and configure if you need a training model worker.
-  - name: models-Qwen3-30B-A3B-Instruct-2507
-    route_prefix: /api/v1/model/Qwen/Qwen3-30B-A3B-Instruct-2507
+  - name: models-Qwen3.5-4B
+    route_prefix: /api/v1/model/Qwen/Qwen3.5-4B
     import_path: model
     args:
       use_megatron: true                # Use the Megatron backend
-      model_id: "ms://Qwen/Qwen3-30B-A3B-Instruct-2507"  # ModelScope model identifier
+      model_id: "ms://Qwen/Qwen3.5-4B"  # ModelScope model identifier
       max_length: 16000                 # Model max length
       max_loras: 5                      # Model max LoRAs
       nproc_per_node: 4                 # Number of GPU processes per node
```
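In the config above, both the service `name` and its `route_prefix` are derived from the ModelScope model ID, which is why all three fields change together in this diff. A sketch of that mapping; the helper is mine, not part of the server:

```python
# Hypothetical helper (naming is mine): derive the sampler service name and
# route prefix used in server_config.yaml from a ModelScope model ID of the
# form '<org>/<model-name>'.
def sampler_routes(model_id: str) -> dict:
    org, _, model_name = model_id.partition('/')
    return {
        'name': f'sampler-{model_name}',
        'route_prefix': f'/api/v1/sampler/{org}/{model_name}',
    }

print(sampler_routes('Qwen/Qwen3.5-4B'))
# {'name': 'sampler-Qwen3.5-4B', 'route_prefix': '/api/v1/sampler/Qwen/Qwen3.5-4B'}
```

Keeping the three fields in sync by hand is exactly the kind of edit this commit makes; a generator like this is one way to avoid drift.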

cookbook/client/server/megatron/server_config_4b.yaml

Lines changed: 1 addition & 0 deletions

```diff
@@ -39,6 +39,7 @@ applications:
     import_path: model
     args:
       use_megatron: true
+      model_cls: Qwen3_5ForConditionalGeneration
       model_id: "ms://Qwen/Qwen3.5-4B"  # ModelScope model identifier
       max_length: 10240
       nproc_per_node: 2                 # Number of GPU processes per node
```

cookbook/client/tinker/modelscope/sample.py

Lines changed: 2 additions & 2 deletions

```diff
@@ -16,7 +16,7 @@

 from tinker import ServiceClient

-base_model = 'Qwen/Qwen3-30B-A3B-Instruct-2507'
+base_model = 'Qwen/Qwen3.5-4B'
 base_url = 'http://www.modelscope.cn/twinkle'

 # Step 2: Define the base model and connect to the server
@@ -29,7 +29,7 @@
 # The model_path is a twinkle:// URI pointing to a previously saved LoRA checkpoint.
 # The server will load the base model and apply the LoRA adapter weights.
 sampling_client = service_client.create_sampling_client(
-    model_path='twinkle://xxx-Qwen_Qwen3-30B-A3B-Instruct-2507-xxx/weights/twinkle-lora-1',
+    model_path='twinkle://xxx-Qwen_Qwen3.5-4B-xxx/weights/twinkle-lora-1',
     base_model=base_model
 )
```
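The `model_path` in the diff follows the shape `twinkle://<training-run>/weights/<checkpoint-name>` (the `xxx-...-xxx` run identifier is a placeholder in the repo's example). A small parser sketch, assuming only that three-part shape; the function name and the `my-run` value are mine:

```python
# Hypothetical parser (naming is mine): split a twinkle:// checkpoint URI
# of the shape twinkle://<training-run>/weights/<checkpoint-name> into its
# three components.
def parse_twinkle_path(path: str) -> dict:
    assert path.startswith('twinkle://'), 'expected a twinkle:// URI'
    run_id, kind, name = path[len('twinkle://'):].split('/')
    return {'run': run_id, 'kind': kind, 'checkpoint': name}

print(parse_twinkle_path('twinkle://my-run/weights/twinkle-lora-1'))
# {'run': 'my-run', 'kind': 'weights', 'checkpoint': 'twinkle-lora-1'}
```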

cookbook/client/tinker/modelscope/self_cognition.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -23,7 +23,7 @@
 from tinker import ServiceClient

 # The base model to fine-tune / evaluate
-base_model = 'Qwen/Qwen3-30B-A3B-Instruct-2507'
+base_model = 'Qwen/Qwen3.5-4B'
 base_url = 'http://www.modelscope.cn/twinkle'
```

cookbook/client/tinker/modelscope/short_math_grpo.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -38,7 +38,7 @@
 logger = get_logger()

 # ========== Configuration ==========
-BASE_MODEL = 'Qwen/Qwen3-30B-A3B-Instruct-2507'
+BASE_MODEL = 'Qwen/Qwen3.5-4B'
 NUM_GENERATIONS = 8
 MAX_NEW_TOKENS = 4096
 LEARNING_RATE = 1e-4
```
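The script's `NUM_GENERATIONS = 8` is the group size GRPO normalizes over: each completion's reward is compared against the other completions sampled from the same prompt. A sketch of that group-relative advantage computation; this illustrates the math only and is not twinkle's implementation:

```python
# Sketch of GRPO's group-relative advantage (not twinkle's implementation):
# each completion's reward is standardized against the mean and std of its
# own group of NUM_GENERATIONS samples drawn from the same prompt.
def group_relative_advantages(rewards, eps=1e-6):
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5
    # eps guards against a zero std when all rewards in the group are equal.
    return [(r - mean) / (std + eps) for r in rewards]

advs = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
# Advantages are centered: completions above the group mean get positive
# advantage, those below get negative, and they sum to zero.
```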

cookbook/client/tinker/self_host/sample.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -27,7 +27,7 @@
 # The model_path is a twinkle:// URI pointing to a previously saved LoRA checkpoint.
 # The server will load the base model and apply the LoRA adapter weights.
 sampling_client = service_client.create_sampling_client(
-    model_path='twinkle://xxx-Qwen_Qwen3-30B-A3B-Instruct-2507-xxx/weights/twinkle-lora-1',
+    model_path='twinkle://xxx-Qwen_Qwen3.5-4B-xxx/weights/twinkle-lora-1',
     base_model=base_model
 )
```

Lines changed: 168 additions & 0 deletions

```python
# Twinkle Client - Transformers LoRA Training Example
#
# This script demonstrates how to fine-tune a language model using LoRA
# (Low-Rank Adaptation) through the Twinkle client-server architecture.
# The server must be running first (see server.py and server_config.yaml).

# Step 1: Load environment variables from a .env file (e.g., API tokens)
import dotenv
import os
from twinkle.data_format import Trajectory, Message
from twinkle.preprocessor import Preprocessor

dotenv.load_dotenv('.env')
import numpy as np
import torch
from peft import LoraConfig

from twinkle import get_logger
from twinkle.dataset import DatasetMeta
from twinkle_client import init_twinkle_client
from twinkle.dataloader import DataLoader
from twinkle.dataset import LazyDataset
from twinkle_client.model import MultiLoraTransformersModel

logger = get_logger()

base_model = 'Qwen/Qwen3.5-4B'
base_url = 'http://www.modelscope.cn/twinkle'

# Step 2: Initialize the Twinkle client to communicate with the remote server.
# - base_url: the address of the running Twinkle server
# - api_key: authentication token (loaded from environment variable)
client = init_twinkle_client(base_url=base_url, api_key=os.environ.get('MODELSCOPE_TOKEN'))

# Step 3: Query the server for existing training runs and their checkpoints.
# This is useful for resuming a previous training session.
runs = client.list_training_runs()

resume_path = None
for run in runs:
    logger.info(run.model_dump_json(indent=2))
    # List all saved checkpoints for this training run
    checkpoints = client.list_checkpoints(run.training_run_id)

    for checkpoint in checkpoints:
        logger.info(checkpoint.model_dump_json(indent=2))
        # Uncomment the line below to resume from a specific checkpoint:
        # resume_path = checkpoint.twinkle_path


class LatexOCRProcessor(Preprocessor):

    def __call__(self, rows):
        rows = self.map_col_to_row(rows)
        rows = [self.preprocess(row) for row in rows]
        rows = self.map_row_to_col(rows)
        return rows

    def preprocess(self, row) -> Trajectory:
        return Trajectory(
            messages=[
                Message(role='user', content='<image>Using LaTeX to perform OCR on the image.', images=[row['image']]),
                Message(role='assistant', content=row['text']),
            ]
        )


def train():
    # Step 4: Prepare the dataset

    # Load the LaTeX_OCR dataset from ModelScope
    dataset = LazyDataset(dataset_meta=DatasetMeta('ms://AI-ModelScope/LaTeX_OCR', data_slice=range(500)))

    # Apply a chat template so the data matches the model's expected input format
    dataset.set_template('Qwen3_5Template', model_id=f'ms://{base_model}', max_length=512)

    # Convert each raw row into a user/assistant Trajectory
    dataset.map(LatexOCRProcessor)

    # Tokenize and encode the dataset into model-ready input features
    dataset.encode(batched=True)

    # Wrap the dataset into a DataLoader that yields batches of size 4
    dataloader = DataLoader(dataset=dataset, batch_size=4)

    # Step 5: Configure the model

    # Create a multi-LoRA Transformers model pointing to the base model on ModelScope
    model = MultiLoraTransformersModel(model_id=f'ms://{base_model}')

    # Define LoRA configuration: apply low-rank adapters to all linear layers
    lora_config = LoraConfig(target_modules='all-linear')

    # Attach the LoRA adapter named 'default' to the model.
    # gradient_accumulation_steps=2 means gradients are accumulated over 2 micro-batches
    # before an optimizer step, effectively doubling the batch size.
    model.add_adapter_to_model('default', lora_config, gradient_accumulation_steps=2)

    # Set the same chat template used during data preprocessing
    model.set_template('Qwen3_5Template')

    # Set the input processor (pads sequences on the right side)
    model.set_processor('InputProcessor', padding_side='right')

    # Use cross-entropy loss for language modeling
    model.set_loss('CrossEntropyLoss')

    # Use the Adam optimizer with a learning rate of 1e-4
    # (only the Adam optimizer is supported when the server uses Megatron)
    model.set_optimizer('Adam', lr=1e-4)

    # Use a linear learning rate scheduler
    # (LR schedulers are not supported when the server uses Megatron)
    # model.set_lr_scheduler('LinearLR')

    # Step 6: Optionally resume from a previous checkpoint
    if resume_path:
        logger.info(f'Resuming training from {resume_path}')
        model.load(resume_path, load_optimizer=True)

    # Step 7: Run the training loop
    logger.info(model.get_train_configs().model_dump())

    for epoch in range(3):
        logger.info(f'Starting epoch {epoch}')
        for step, batch in enumerate(dataloader):
            for sample in batch:
                for key in sample:
                    if isinstance(sample[key], np.ndarray):
                        sample[key] = sample[key].tolist()
                    elif isinstance(sample[key], torch.Tensor):
                        sample[key] = sample[key].cpu().numpy().tolist()
            # Forward pass + backward pass (computes gradients)
            model.forward_backward(inputs=batch)

            # Clip gradients and take one optimizer step
            model.clip_grad_and_step()
            # Equivalent to the following steps:
            # # Clip gradients to prevent exploding gradients (max norm = 1.0)
            # model.clip_grad_norm(1.0)
            # # Perform one optimizer step (update model weights)
            # model.step()
            # # Reset gradients to zero for the next iteration
            # model.zero_grad()
            # # Advance the learning rate scheduler by one step
            # model.lr_step()

            # Log the loss every 2 steps (aligned with gradient accumulation)
            if step % 2 == 0:
                metric = model.calculate_metric(is_training=True)
                logger.info(f'Current step {step} of {len(dataloader)}, metric: {metric.result}')

        # Step 8: Save the trained checkpoint
        twinkle_path = model.save(name=f'twinkle-epoch-{epoch}', save_optimizer=True)
        logger.info(f'Saved checkpoint: {twinkle_path}')

    # Step 9: Upload the checkpoint to ModelScope Hub
    # YOUR_USER_NAME = "your_username"
    # hub_model_id = f'{YOUR_USER_NAME}/twinkle-multi-modal'
    # model.upload_to_hub(
    #     checkpoint_dir=twinkle_path,
    #     hub_model_id=hub_model_id,
    #     async_upload=False
    # )
    # logger.info(f"Uploaded checkpoint to hub: {hub_model_id}")


if __name__ == '__main__':
    train()
```
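The script above relies on `gradient_accumulation_steps=2`: gradients from two micro-batches are combined before one optimizer step. The equivalence this exploits can be shown in plain Python, independent of twinkle or PyTorch (the toy MSE model and numbers are mine):

```python
# Plain-Python sketch of gradient accumulation: for a loss that is a mean
# over samples, averaging the mean gradients of equal-sized micro-batches
# equals the mean gradient of the full batch. Toy 1-D linear model with
# MSE loss; all values here are illustrative.
def grad_mse(w, xs, ys):
    # d/dw mean((w*x - y)^2) = mean(2 * (w*x - y) * x)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 0.5
xs, ys = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]

# One full batch of 4 samples:
full = grad_mse(w, xs, ys)

# Two micro-batches of 2, with the per-micro-batch gradients averaged
# (this is what accumulating over 2 steps before one optimizer step does;
# the equality holds because the micro-batches are equal-sized):
accum = (grad_mse(w, xs[:2], ys[:2]) + grad_mse(w, xs[2:], ys[2:])) / 2

assert abs(full - accum) < 1e-12
```

This is also why the script logs metrics every 2 steps: each optimizer update corresponds to two forward/backward passes.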

cookbook/client/twinkle/modelscope/self_congnition.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -21,7 +21,7 @@

 logger = get_logger()

-base_model = 'Qwen/Qwen3-30B-A3B-Instruct-2507'
+base_model = 'Qwen/Qwen3.5-4B'
 base_url = 'http://www.modelscope.cn/twinkle'

 # Step 2: Initialize the Twinkle client to communicate with the remote server.
```
