Training R4 #17
Hi,
I have had good success so far reproducing the results for training R1/R2 and for using the KL-Top loss instead of cross-entropy (I still need to reproduce training the additional diagonal matrices).
I am wondering whether you attempted to train per-layer R4 rotations, successfully or not?
Thank you!