
posts/modelquantization/quantization-impact-on-model-accuracy/ #1

@utterances-bot


Quantization Impact on Model Accuracy | Slightwind

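Mistral-7B’s performance on 5-shot MMLU

If you are not interested in the testing details, you can just look at the summary table below.

Overview

Performance of the quantized and non-quantized versions of the Mistral-7B-v0.1 model on 5-shot MMLU:

| Quant Type | Compute Dtype | Double Quant | Group Size | Avg (%) | Total Time (s) |
| --- | --- | --- | --- | --- | --- |
| BF16 | BF16 | N/A | N/A | $61.00$ | $312.79$ |
| INT8 | BF16&INT8 | N/A | N/A | $60.87$ | $614.43$ |
| FP4 | FP16 | False | 64 | $59.37$ | $347.00$ |
| FP4 | FP16 | True | 64 | $59.17$ | $353.22$ |
| FP4 | FP32 | False | 64 | $59.50$ | $1061.27$ |
| NF4 | FP16 | False | 64 | $59.04$ | $361.19$ |
| NF4 | FP16 | False | 128 | $58.78$ | $352.65$ |

A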

https://slightwind.cn/posts/modelquantization/quantization-impact-on-model-accuracy/
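To make the overview table easier to read at a glance, the following minimal sketch computes each quantized configuration's accuracy drop (in percentage points) and run-time ratio relative to the BF16 row. The numbers are copied from the table; the names `ROWS` and `compare_to_baseline` are hypothetical helpers introduced only for this illustration and are not part of the original post.

```python
# Rows copied from the overview table:
# (quant type, compute dtype, double quant, group size, avg accuracy %, total time s)
ROWS = [
    ("BF16", "BF16", None, None, 61.00, 312.79),
    ("INT8", "BF16&INT8", None, None, 60.87, 614.43),
    ("FP4", "FP16", False, 64, 59.37, 347.00),
    ("FP4", "FP16", True, 64, 59.17, 353.22),
    ("FP4", "FP32", False, 64, 59.50, 1061.27),
    ("NF4", "FP16", False, 64, 59.04, 361.19),
    ("NF4", "FP16", False, 128, 58.78, 352.65),
]


def compare_to_baseline(rows):
    """Compare every row after the first against the first (BF16) row.

    Returns one dict per quantized configuration with the accuracy drop in
    percentage points and the total-time ratio, both rounded to 2 decimals.
    """
    base_acc, base_time = rows[0][4], rows[0][5]
    results = []
    for quant, dtype, double_quant, group_size, acc, seconds in rows[1:]:
        results.append({
            "config": (quant, dtype, double_quant, group_size),
            "acc_drop": round(base_acc - acc, 2),
            "slowdown": round(seconds / base_time, 2),
        })
    return results


if __name__ == "__main__":
    for item in compare_to_baseline(ROWS):
        print(item["config"], item["acc_drop"], item["slowdown"])
```

Taking the first row as the baseline is an assumption of this sketch; the same function works with any other reference row placed first in the list.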
