I've been trying to quantize and run the Meta-Llama-3.1-8B-Instruct-2.3bit model with the group number set to 4, and the model runs successfully when k1 (centroids) is 4096, as in the paper. However, any k1 setting above that (8192, 16384, or 65536) leads to a successful quantization but a failure when running the model. The error logs suggest the cause is an illegal memory access inside the dequant function.
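In case it's relevant: 4096 is exactly the largest centroid count whose indices fit in 12 bits, so if the dequant kernel packs indices into fixed-width bit fields sized for the paper's configuration (just a guess on my part, I haven't read the kernel closely), every k1 above 4096 would need an extra index bit. A quick sanity check of the index widths involved:

```python
import math

# Bits needed to address k1 centroids; 4096 is the last value
# that fits in 12 bits, every larger setting needs 13 or more.
for k1 in (4096, 8192, 16384, 65536):
    print(f"k1={k1:>5}: {math.ceil(math.log2(k1))} index bits")
```

If that's the issue, the illegal access would come from truncated indices pointing past the centroid table, which would match the symptom of quantization succeeding while dequant crashes.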
So here's my question: does the code support running a model with the group number option enabled and centroids set to 8192 or greater? Or do I need to make some adjustments to get it working?
Looking forward to your reply.