About the training time #55

@zeyuanyin

Thanks for your great work.

I am impressed by the outstanding performance of such a lightweight design.

According to the paper, the model is trained for 3000 steps on NVIDIA H20 GPUs with a batch size of 48, using only ∼1% additional trainable parameters and just 2000 training pairs.

Could you please share the approximate wall-clock training time? Thanks.
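
For context, here is a rough back-of-the-envelope sketch of what the reported numbers imply. It assumes that 48 is the global batch size (the paper may mean per-GPU, in which case the totals scale with the GPU count). The function names and the `seconds_per_step` parameter are hypothetical placeholders of mine, not anything from the paper or the repository.

```python
def training_budget(steps: int, batch_size: int, num_pairs: int) -> tuple[int, float]:
    """Return (total samples processed, equivalent passes over the dataset).

    Assumes `batch_size` is the global batch size per optimizer step.
    """
    samples_seen = steps * batch_size
    epochs = samples_seen / num_pairs
    return samples_seen, epochs


def wall_clock_hours(steps: int, seconds_per_step: float) -> float:
    """Convert a per-step time into total hours.

    `seconds_per_step` is a value that would have to be measured; it is
    not reported in the paper excerpt quoted above.
    """
    return steps * seconds_per_step / 3600.0


# Figures quoted from the paper: 3000 steps, batch size 48, 2000 training pairs.
samples, epochs = training_budget(steps=3000, batch_size=48, num_pairs=2000)
print(f"samples processed: {samples}, passes over the data: {epochs}")
```

Under that assumption, training processes 144,000 samples, or about 72 passes over the 2000 pairs. With the average time per step and the number of GPUs, `wall_clock_hours` would give the total directly.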
