Open
Description
Thanks for your great work.
I am surprised by the outstanding performance of such a lightweight design.
According to the paper, the model is trained for 3000 steps on Nvidia H20 GPUs with a batch size of 48, using only ∼1% additional trainable parameters and just 2000 training pairs.
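For context, here is a rough sketch of what those numbers imply. It assumes that 48 is the global batch size and that each step consumes 48 training pairs, neither of which I could confirm from the paper. The `seconds_per_step` argument is a hypothetical placeholder, not a measured value; it is exactly the number I am missing.

```python
def training_summary(steps: int, batch_size: int, num_pairs: int) -> dict:
    """Samples processed and passes over the dataset.

    Assumes batch_size is the global batch size (an assumption,
    not something stated explicitly in the paper).
    """
    samples_seen = steps * batch_size
    epochs = samples_seen / num_pairs
    return {"samples_seen": samples_seen, "epochs": epochs}


def estimate_hours(steps: int, seconds_per_step: float) -> float:
    """Wall-clock estimate in hours.

    seconds_per_step is a hypothetical input that would have to be
    measured on the actual hardware.
    """
    return steps * seconds_per_step / 3600


# Figures quoted from the paper: 3000 steps, batch size 48, 2000 pairs.
summary = training_summary(steps=3000, batch_size=48, num_pairs=2000)
print(summary)
```

Under these assumptions the model sees 144,000 samples in total, or about 72 passes over the 2000 pairs, so the wall-clock time depends almost entirely on the per-step time and the number of GPUs.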
Could you please share the approximate training time? Thanks.