Data collator for DPO and for SFT #3

When running run_script.sh for DPO or SFT, which data collator should I use?

The train_dpo_lora.py file uses [DataCollatorForSupervisedDataset](https://github.com/YiyangZhou/CSR/blob/353de42cfffed932d7b9559b0af33c4d3c696744/train_csr/train_dpo_lora.py#L731), but dpo_trainer.py emits a warning suggesting that I should use DPODataCollatorWithPadding instead: https://github.com/YiyangZhou/CSR/blob/353de42cfffed932d7b9559b0af33c4d3c696744/train_csr/dpo_trainer.py#L683

Could you help me reproduce your work?

Additionally, I noticed that the DPODataCollatorWithPadding implementation does not accept arguments like max_length or max_prompt_length as input, but your code attempts to pass these arguments.
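
To make the mismatch concrete, here is a minimal sketch of one way to check which keyword arguments a collator class accepts before constructing it. `StandInCollator` is a hypothetical stand-in, not the repository's DPODataCollatorWithPadding; its fields, the helper `split_supported_kwargs`, and the numeric values are all assumptions for illustration only.

```python
import inspect
from dataclasses import dataclass


@dataclass
class StandInCollator:
    # Hypothetical stand-in class; NOT the repository's DPODataCollatorWithPadding.
    pad_token_id: int = 0
    label_pad_token_id: int = -100


def split_supported_kwargs(cls, kwargs):
    """Split kwargs into those cls.__init__ accepts and those it would reject."""
    params = inspect.signature(cls.__init__).parameters
    # If __init__ declares **kwargs, every keyword is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs), {}
    accepted = {k: v for k, v in kwargs.items() if k in params and k != "self"}
    rejected = {k: v for k, v in kwargs.items() if k not in accepted}
    return accepted, rejected


# Hypothetical arguments, mirroring the names mentioned above.
requested = {
    "pad_token_id": 0,
    "label_pad_token_id": -100,
    "max_length": 1024,
    "max_prompt_length": 512,
}
accepted, rejected = split_supported_kwargs(StandInCollator, requested)
collator = StandInCollator(**accepted)  # constructing with only the accepted keys
print("rejected:", sorted(rejected))
```

Running the same helper against the real collator class should show whether max_length and max_prompt_length are actually part of its constructor.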

I would appreciate it if you could assist me with these issues.
