Thank you for sharing your research work. I have a question about the supervised fine-tuning (SFT) step, which, according to the paper, is used to initialize the base model before running SimPO. The SFT configuration file is provided at training_configs/llama-3-8b-base-sft.yaml, but could you also share the SFT training script itself?
Relatedly, a comment in issue #27 asks how HuggingFaceH4/ultrachat_200k is processed for SFT, and I would like to know this as well. Since the samples in HuggingFaceH4/ultrachat_200k are multi-turn dialogues, I am curious which tokens are used as labels during SFT, e.g., whether the loss is computed on all turns or only on the assistant turns.
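For reference, my current guess (purely a sketch, not taken from your repo; `build_labels` and the tokenizer below are placeholders) is the common scheme of concatenating all turns but masking user tokens out of the loss with -100:

```python
# Hypothetical labeling scheme for multi-turn SFT: concatenate every turn
# into one sequence, but set labels to IGNORE_INDEX (-100, which the usual
# cross-entropy loss skips) for tokens belonging to non-assistant turns,
# so the model is only trained to predict assistant responses.

IGNORE_INDEX = -100

def build_labels(turns, tokenize):
    """turns: list of (role, text) pairs; tokenize: text -> list[int].
    Returns (input_ids, labels) for causal-LM SFT."""
    input_ids, labels = [], []
    for role, text in turns:
        ids = tokenize(text)
        input_ids.extend(ids)
        if role == "assistant":
            labels.extend(ids)                        # train on these tokens
        else:
            labels.extend([IGNORE_INDEX] * len(ids))  # masked from the loss
    return input_ids, labels

# Toy usage with a stand-in character-level "tokenizer":
toy_tokenize = lambda t: [ord(c) for c in t]
ids, labels = build_labels(
    [("user", "hi"), ("assistant", "ok")], toy_tokenize
)
```

Is this roughly what the SFT preprocessing does, or are all turns (including user turns) kept in the loss?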