Dear developers, have you tried applying AQLM with vector quantization along the output dimension, i.e., grouping weights along the output dimension rather than the input dimension? I noticed that your code allows configuring out_group_size, so I tried the configuration below. However, this config leads to poor PPL for Llama-2-7b-hf: WikiText-2 40.2, C4 70.0. Do you have any idea why?
--num_codebooks=2
--nbits_per_codebook=8
--out_group_size=8
--in_group_size=1
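For context, this config should match the usual 2-bit AQLM budget, since the per-weight cost only depends on the product of the group sizes, not on which dimension the group spans. A minimal sketch of that accounting (codebook storage overhead ignored; variable names mirror the CLI flags):

```python
# Bits per weight for the AQLM config above: each group of
# out_group_size * in_group_size weights is encoded by one code
# per codebook, each code costing nbits_per_codebook bits.
num_codebooks = 2
nbits_per_codebook = 8
out_group_size = 8
in_group_size = 1

group_size = out_group_size * in_group_size  # weights per code tuple
bits_per_weight = num_codebooks * nbits_per_codebook / group_size
print(bits_per_weight)  # 2.0
```

So the bit budget is the same as the standard in_group_size=8, out_group_size=1 setup, and the PPL gap would come purely from grouping along the output dimension.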