
Release training dataset and loss log to help reproduce results #26

@srzhu97

I am trying to reproduce the Mistral-7B-SPPO Iter1 model. However, after my first iteration, the model I trained differs significantly from the published Mistral-7B-SPPO Iter1 model when I compare results on a benchmark dataset.
To help diagnose the issue and improve my training, could you kindly provide the training dataset and the loss log so that I can compare my run against them?

I was using this prompt dataset, but it doesn't contain the columns `chosen`, `rejected`, `chosen_probs`, `chosen_probs_win`, and `chosen_probs_lose`. Although running the `generate.sh` script can produce these columns, it would be incredibly helpful if the actual training dataset could be released. That would make it much easier for me to debug and identify where my training process went wrong.
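
For anyone checking their own data, here is a minimal sketch of the column check described above. It only assumes the five column names listed in this issue; the helper name `missing_columns` and the example column list are hypothetical and are not taken from the repository.

```python
# Columns named in this issue as missing from the prompt dataset.
REQUIRED_COLUMNS = [
    "chosen",
    "rejected",
    "chosen_probs",
    "chosen_probs_win",
    "chosen_probs_lose",
]


def missing_columns(column_names):
    """Return the required columns absent from column_names, in listed order."""
    present = set(column_names)
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Hypothetical column list for a prompt-only dataset (illustrative, not the real schema).
prompt_only_columns = ["prompt"]
print(missing_columns(prompt_only_columns))
```

The input can be any list of column names, for example the `column_names` attribute of a Hugging Face `datasets.Dataset`. An empty result means all five columns are present.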
