Hi, I'm encountering significant fluctuations in perplexity (PPL) when reproducing the VPTQ 4.05-bit quantization results using the following configuration:
"--vector_lens", "-1", "6",
"--group_num", "1",
"--num_centroids", "-1", "4096",
"--num_res_centroids", "-1", "4096",
"--npercent", "0",
"--blocksize", "128",
"--new_eval",
"--seq_len", "2048",
"--kmeans_mode", "hessian",
"--num_gpus", "8",
# "--enable_perm",
"--enable_norm",
"--save_model",
"--save_packed_model",
"--hessian_path", "/workshop/Hessians/H",
"--inv_hessian_path", "/workshop/Hessians/INVH",
"--ktol", "1e-5",
"--kiter", "100"Setup details:
- Model: Llama3-8B
- Dataset: wikitext-2
- Hardware: 8× A100 GPUs
- Random seed: default (0)
- Hessian files are precomputed and reused across runs
Observed behavior:
Across multiple independent runs with the exact same command and environment, I obtained widely varying PPL scores: 29.83, 15.52, and 50.56.
To debug, I verified that:
- The inference code itself is deterministic: when I load a saved quantized model and run evaluation, the PPL is consistent across repeated evaluations of the same quantized checkpoint.
- However, different quantization runs (even with identical seeds and inputs) produce quantized models with drastically different PPLs (a minimal checkpoint-comparison sketch follows this list).
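For reference, a minimal sketch of how the checkpoints from two runs can be compared directly, to confirm the divergence really happens at quantization time rather than at evaluation. It assumes the quantized model is saved as a flat PyTorch state dict of tensors; the two run paths are placeholders:

```python
import torch

# Load checkpoints produced by two independent runs of the same command
# (paths are placeholders). map_location="cpu" avoids touching the GPUs.
a = torch.load("run1/quantized_model.pt", map_location="cpu")
b = torch.load("run2/quantized_model.pt", map_location="cpu")

# Any tensor that differs bit-for-bit means the quantization step itself,
# not the evaluation, introduced the divergence.
for name in sorted(a):
    if torch.is_tensor(a[name]) and not torch.equal(a[name], b[name]):
        print(f"mismatch in {name}")
```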
This suggests that non-determinism is introduced during the quantization process, possibly in the k-means clustering step (`--kmeans_mode hessian`). Could this be due to:
- Non-deterministic behavior in PyTorch/CUDA operations despite a fixed seed?
- Initialization sensitivity in k-means when using Hessian-weighted distances?
- Race conditions or non-determinism across multi-GPU execution?
Could you please help clarify why such large fluctuations occur and how to achieve reproducible quantization results? Any guidance on ensuring determinism (e.g., additional seeding, disabling certain optimizations, or adjusting k-means parameters) would be greatly appreciated.
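For reference, this is the standard PyTorch determinism checklist I plan to try before the next quantization run. It is a generic sketch, not VPTQ-specific, and it would need to run in every worker process before any CUDA work is done:

```python
import os
import random

import numpy as np
import torch

# Must be set before CUDA is initialized for deterministic cuBLAS GEMMs.
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
os.environ["PYTHONHASHSEED"] = "0"

random.seed(0)
np.random.seed(0)
torch.manual_seed(0)            # seeds CPU and all CUDA devices
torch.cuda.manual_seed_all(0)

# Raise an error whenever a non-deterministic kernel would be used,
# which helps locate the exact op responsible for run-to-run drift.
torch.use_deterministic_algorithms(True)
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True
```

Note that `torch.use_deterministic_algorithms(True)` can slow things down or raise errors on ops that have no deterministic implementation, so it is mainly useful for diagnosing which op is responsible.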