
How to get vector quantized in output_dim? #184

@shanhx2000

Dear developers, have you tried applying AQLM with vector quantization along the output dimension, i.e., grouping weights along a different dimension rather than along the input dimension? I noticed that your code allows configuring out_group_size, so I tried the configuration below. However, this config leads to bad PPL for Llama-2-7b-hf: 40.2 on WikiText and 70.0 on C4. Do you have any idea why?

--num_codebooks=2 \
--nbits_per_codebook=8 \
--out_group_size=8 \
--in_group_size=1
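
For reference, here is how I understand the two grouping modes (a minimal numpy sketch of my own, not taken from the AQLM code): with out_group_size=8, in_group_size=1 each quantized vector spans 8 output channels of the same input column, whereas the usual in-dimension setting spans 8 input features of the same output row.

```python
# Minimal sketch (my own illustration, not the AQLM implementation) of how I
# understand the grouping: the weight matrix is tiled into blocks of shape
# (out_group_size, in_group_size), and each block is one vector that the
# codebooks must represent.
import numpy as np

def split_into_groups(weight: np.ndarray, out_group_size: int, in_group_size: int) -> np.ndarray:
    """Reshape (out_features, in_features) into
    (out_features // out_group_size, in_features // in_group_size,
     out_group_size * in_group_size) group vectors."""
    out_features, in_features = weight.shape
    assert out_features % out_group_size == 0 and in_features % in_group_size == 0
    groups = weight.reshape(
        out_features // out_group_size, out_group_size,
        in_features // in_group_size, in_group_size,
    ).transpose(0, 2, 1, 3)
    return groups.reshape(groups.shape[0], groups.shape[1], -1)

W = np.random.randn(4096, 4096).astype(np.float32)

# The setting I tried: groups of 8 output channels sharing one input column.
out_dim_groups = split_into_groups(W, out_group_size=8, in_group_size=1)  # (512, 4096, 8)

# The usual setting: groups of 8 input features within one output row.
in_dim_groups = split_into_groups(W, out_group_size=1, in_group_size=8)   # (4096, 512, 8)

print(out_dim_groups.shape, in_dim_groups.shape)
```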
