
# The Isotropic Tradeoff: Quantization and Outlier Signal Pollution in LLM Weights

DOI · License: MIT · Open In Colab

**Author:** Shubham Dev
**Paper:** The Isotropic Tradeoff (Zenodo)

This repository contains the empirical evaluation suite accompanying the paper "The Isotropic Tradeoff: How Rotation-Based Quantization Exchanges Structural Weight Integrity for Euclidean Fidelity." We explore the critical boundary in post-training quantization (PTQ), identifying where global rotation is an engineering success and where it induces measurable signal degradation.

## 📝 Research Scope & Clarification Update

Following community feedback, this repository has been updated to explicitly delineate the boundaries of rotation-based quantization.

### 1. The KV Cache Distinction (Rotation as a Feature)

Modern SOTA methods (like TurboQuant) utilize global rotation (e.g., Hadamard transforms) to isotropize the KV Cache. For the dense activations of the KV cache, where attention is computed in the rotated space, spreading quantization error as generalized noise across all channels is a highly effective trade-off for memory efficiency. We fully acknowledge the success of rotational quantization in this domain.
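
As a quick illustration (not code from this repository), the sketch below checks the property that makes KV-cache rotation safe: an orthogonal transform preserves dot products, so attention scores computed in the rotated space match the originals, while quantization error is spread across the isotropized channels. The dimension and the QR-based random rotation are illustrative stand-ins for a Hadamard transform.

```python
import torch

torch.manual_seed(0)
d = 64
q = torch.randn(d)  # toy query vector
k = torch.randn(d)  # toy key vector

# Random orthogonal matrix via QR decomposition (stand-in for a Hadamard transform).
rot, _ = torch.linalg.qr(torch.randn(d, d))

score_original = q @ k
score_rotated = (rot @ q) @ (rot @ k)  # attention score computed in the rotated space

print(f"original score: {score_original:.6f}")
print(f"rotated score:  {score_rotated:.6f}")  # equal up to floating-point error
```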

### 2. The Focus: Model Weight Quantization (Rotation as a Tradeoff)

This research focuses on the application of global rotation to Model Weights (Linear layers). While spreading error is acceptable for transient cache values, applying global rotation to the static knowledge structures of an LLM introduces a structural cost: high-magnitude weight outliers are smeared into the broader network noise floor.

## 🧠 Outlier Smearing vs. Structural Integrity

Recent research (e.g., LLM.int8()) has established that LLMs are governed by highly sparse, high-magnitude outliers that are critical for model reasoning and performance. When global orthogonal rotation is applied to these weights, it acts as a distributional mixer:

- It takes these localized, high-magnitude signals and smears them across all dimensions to minimize global Mean Squared Error (MSE).
- While this optimizes for a "flat" Euclidean reconstruction, it replaces the precise localized geometry of the weights with a uniform noise floor.
- This results in Signal Pollution (measured here as "Induced Neuronal Firings"), where previously silent neurons are forced to fire due to the redistribution of outlier energy. The toy sketch below illustrates this mixing.
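
A minimal sketch of the mixer effect (illustrative data only, not the paper's pipeline): a single high-magnitude outlier in an otherwise quiet weight vector is rotated by a random orthogonal matrix, and its energy collapses to roughly `|w_outlier| / sqrt(d)` per dimension, raising the floor everywhere.

```python
import torch

torch.manual_seed(0)
d = 1024
w = 0.01 * torch.randn(d)  # quiet noise floor
w[42] = 8.0                # one localized, high-magnitude outlier

# Random orthogonal "mixer" (stand-in for a global Hadamard rotation).
rot, _ = torch.linalg.qr(torch.randn(d, d))
w_rot = rot @ w

print(f"original: max |w| = {w.abs().max():.3f}, "
      f"dims above 0.1 = {(w.abs() > 0.1).sum().item()}")
print(f"rotated:  max |w| = {w_rot.abs().max():.3f}, "
      f"dims above 0.1 = {(w_rot.abs() > 0.1).sum().item()}")
# The single 8.0 spike spreads to roughly 8/sqrt(d) ~ 0.25 per dimension:
# the outlier's precise geometry is gone and the noise floor rises everywhere.
```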

## 📊 Clarification on Community Evidence (AIME25)

In our manuscript, we reference AIME25 benchmark results published by Georgi Gerganov in the llama.cpp repository. To be clear, these results are not original experimental data from this paper; they are cited as external, supportive community evidence that at aggressive quantization levels (Q4), rotation provides only partial recovery. We hypothesize that this residual degradation is a symptom of the weight-saliency degradation our code measures. This distinction separates our core theoretical and mathematical contribution from contextual third-party benchmarks.

## 🔬 Empirical Results

The evaluation notebook (`evaluate_isotropic_fallacy.ipynb`) tests `Qwen/Qwen2.5-1.5B` over a 2048-token WikiText sequence.

### 1. The Euclidean Illusion

Rotation successfully optimizes standard geometric metrics, cutting outlier reconstruction error by over 98%, but at the cost of degrading the baseline noise floor (a toy sketch of both metrics follows the table).

| Metric | Naive 3-bit | Global Rotation (3-bit) | Delta |
|---|---|---|---|
| Outlier MSE (Top 1%) | 157.8024 | 2.6782 | -98.3% |
| Noise Floor MSE (Bottom 90%) | 2.2721 | 2.5339 | +11.5% |
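
The sketch below shows how these two metrics can be computed on toy Gaussian-plus-outlier weights (not `Qwen/Qwen2.5-1.5B`, and the notebook's exact partitioning may differ). The absolute numbers will not match the table, but the direction of both deltas reproduces.

```python
import torch

torch.manual_seed(0)
d = 1024
w = torch.randn(d)
w[torch.randperm(d)[:10]] *= 25.0  # inject sparse high-magnitude outliers

def quant_3bit(x):
    """Symmetric 3-bit round-to-nearest over the tensor's full range."""
    scale = x.abs().max() / 3.0              # signed 3-bit levels -4..3; we use +/-3
    return torch.round(x / scale).clamp(-4, 3) * scale

# Naive: quantize in the original basis.
w_naive = quant_3bit(w)

# Rotation-based: quantize in a rotated basis, then rotate back.
rot, _ = torch.linalg.qr(torch.randn(d, d))  # random orthogonal rotation
w_rotq = rot.T @ quant_3bit(rot @ w)

err_naive = (w - w_naive) ** 2
err_rot = (w - w_rotq) ** 2

mag = w.abs()
top1 = mag >= mag.quantile(0.99)   # top 1% by magnitude (outliers)
bot90 = mag <= mag.quantile(0.90)  # bottom 90% by magnitude (noise floor)

print(f"Outlier MSE      naive={err_naive[top1].mean():.4f}  "
      f"rotated={err_rot[top1].mean():.4f}")
print(f"Noise-floor MSE  naive={err_naive[bot90].mean():.4f}  "
      f"rotated={err_rot[bot90].mean():.4f}")
```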

### 2. Signal Pollution (Induced Firings)

Despite improved Euclidean metrics, rotation generates a surge in false firings (magnitude > 1.0) in previously silent neuronal dimensions due to the "Blender Effect" of global mixing (a toy sketch of this count follows the table).

| Metric | Naive 3-bit | Global Rotation (3-bit) |
|---|---|---|
| Induced False Firings (>1.0) | 0 | 367,539 |
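
A self-contained toy version of this count (same illustrative setup as the earlier sketches, not the notebook's exact definition): a dimension counts as an induced firing if its magnitude was at most 1.0 before quantization and exceeds 1.0 afterwards.

```python
import torch

torch.manual_seed(0)
d = 1024
w = torch.randn(d)
w[torch.randperm(d)[:10]] *= 25.0  # sparse high-magnitude outliers

def quant_3bit(x):
    scale = x.abs().max() / 3.0
    return torch.round(x / scale).clamp(-4, 3) * scale

rot, _ = torch.linalg.qr(torch.randn(d, d))
w_naive = quant_3bit(w)                 # quantize in the original basis
w_rotq = rot.T @ quant_3bit(rot @ w)    # quantize in the rotated basis

def induced_firings(before, after, threshold=1.0):
    """Count previously-silent dimensions pushed above the firing threshold."""
    return ((before.abs() <= threshold) & (after.abs() > threshold)).sum().item()

print("naive:  ", induced_firings(w, w_naive))  # ~0: quiet dims round toward 0
print("rotated:", induced_firings(w, w_rotq))   # many: smeared outlier energy
```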

## 🚀 Usage

### Local Execution

```bash
git clone https://github.com/pheonix-delta/llm-isotropic-tradeoff.git
cd llm-isotropic-tradeoff
pip install torch transformers datasets scipy tqdm
jupyter notebook evaluate_isotropic_fallacy.ipynb
```

## 📚 Citation

```bibtex
@article{dev2026isotropic,
  title={The Isotropic Tradeoff: Quantization and Outlier Signal Pollution in LLM Weights},
  author={Dev, Shubham},
  journal={arXiv preprint},
  year={2026},
  doi={10.5281/zenodo.19338651},
  url={https://doi.org/10.5281/zenodo.19338651}
}
```
