[FEATURE] - EfficientQAT? Supposedly allows a 123B model to be quantized down to 35 GB, with about 4% accuracy loss.

Apparently it is a new quantization method. Here are the Reddit thread and the GitHub repo, so that you can judge whether it is worth rolling into AutoGGUF.

[Quantize 123B to 35 GB](https://www.reddit.com/r/LocalLLaMA/comments/1elbn3q/quantize_123b_mistrallargeinstruct2407_to_35_gb/)

[EfficientQAT GitHub](https://github.com/OpenGVLab/EfficientQAT)
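
For a rough sanity check on the headline number, here is a back-of-the-envelope sketch. The Reddit link's URL says the model was quantized to 35 GB. The sketch assumes exactly 123e9 parameters, a 16-bit (2 bytes per weight) baseline, and 1 GB = 1e9 bytes. Those are my assumptions, not figures taken from the EfficientQAT repo, and the helper name `weight_size_gb` is made up for illustration.

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), ignoring any overhead."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 123e9       # assumed parameter count for a "123B" model
QUANTIZED_GB = 35.0    # size stated in the Reddit link's URL

# Assumed baseline: 16-bit weights.
fp16_gb = weight_size_gb(N_PARAMS, 16)

# Fraction of the 16-bit size that 35 GB represents.
ratio = QUANTIZED_GB / fp16_gb

# Average bits per weight implied by a 35 GB file (includes any metadata overhead).
implied_bits = QUANTIZED_GB * 1e9 * 8 / N_PARAMS

print(f"16-bit baseline: {fp16_gb:.0f} GB")
print(f"35 GB is {ratio:.1%} of that")
print(f"implied average: {implied_bits:.2f} bits per weight")
```

Under those assumptions, 35 GB works out to well under 35% of the 16-bit size, so the reported result looks like an absolute size in GB rather than a percentage.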


Thank you for AutoGGUF. I am looking forward to handling quantizations without being an acolyte of the command line. :)
