Releases: FujitsuResearch/OneCompression
v1.0.2
v1.0.1
Packaging
- Moved `matplotlib` from the `dev` extra to a new `visualize` extra in `pyproject.toml`
- Made the `visualize_bit_assignment` import lazy in `onecomp/quantizer/autobit/__init__.py` to avoid requiring matplotlib at import time
- Updated installation instructions in `README.md` and `docs/getting-started/installation.md` to reflect the new `visualize` extra
- Updated `uv.lock`
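The lazy import described in these notes is commonly implemented with PEP 562's module-level `__getattr__`. The sketch below shows that pattern in a self-contained way; the demo module name `autobit_demo` and the stub function body are invented, and the actual `onecomp` code may differ.

```python
# Minimal PEP 562 lazy-import sketch: the package __init__ defines a
# module-level __getattr__ so a matplotlib-backed symbol is only resolved
# on first access, keeping the package importable without matplotlib.
import sys
import types

# Stand-in for onecomp/quantizer/autobit/__init__.py (names illustrative).
pkg = types.ModuleType("autobit_demo")

def _lazy_getattr(name):
    if name == "visualize_bit_assignment":
        # In the real package, the heavy matplotlib import would happen
        # here, at first access, rather than at package import time.
        def visualize_bit_assignment(bits):
            return f"would plot {len(bits)} layer bitwidths"
        return visualize_bit_assignment
    raise AttributeError(f"module 'autobit_demo' has no attribute {name!r}")

pkg.__getattr__ = _lazy_getattr  # PEP 562 module __getattr__ hook
sys.modules["autobit_demo"] = pkg

import autobit_demo
print(autobit_demo.visualize_bit_assignment([4, 8, 4]))
```

Importing the package stays cheap; the expensive dependency is only pulled in when the plotting helper is actually used.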
v1.0.0
Fujitsu One Compression (OneComp)
A Python package for LLM compression.
Full documentation: https://FujitsuResearch.github.io/OneCompression/
Features
- Quantization Error Propagation (QEP): Post-training quantization with error propagation to subsequent layers (Arai & Ichikawa, NeurIPS 2025)
- vLLM Plugin Integration: Serve quantized models with vLLM via built-in DBF and Mixed-GPTQ plugins
- AutoBit: Mixed-precision quantization with ILP-based bitwidth assignment and automatic VRAM estimation
- JointQ: Joint quantization optimizing weight assignments and scale parameters simultaneously
- LoRA SFT Post-Process: Fine-tune quantized models with LoRA for accuracy recovery or knowledge injection
- Rotation Preprocessing: SpinQuant/OstQuant-based rotation preprocessing (Llama, Qwen3)
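To illustrate the AutoBit idea above (bitwidth assignment under a memory budget), here is a toy version that brute-forces the choice a real ILP solver would make. All layer names, parameter counts, and error values below are invented for the example and are not OneComp's API.

```python
# Toy mixed-precision assignment: choose 4 or 8 bits per layer to minimize
# total (made-up) quantization error subject to a weight-memory budget.
# A real ILP solver replaces this exhaustive search.
from itertools import product

layers = ["q_proj", "k_proj", "mlp"]
params = {"q_proj": 4.0, "k_proj": 4.0, "mlp": 11.0}  # million params (fake)
# Hypothetical per-layer error if quantized to a given bitwidth.
error = {("q_proj", 4): 0.30, ("q_proj", 8): 0.05,
         ("k_proj", 4): 0.10, ("k_proj", 8): 0.02,
         ("mlp", 4): 0.50, ("mlp", 8): 0.08}
budget_mbits = 100.0  # memory budget for weights, in megabits

best = None
for bits in product([4, 8], repeat=len(layers)):
    assign = dict(zip(layers, bits))
    mem = sum(params[l] * b for l, b in assign.items())
    if mem > budget_mbits:
        continue  # violates the VRAM-style constraint
    err = sum(error[(l, b)] for l, b in assign.items())
    if best is None or err < best[0]:
        best = (err, assign)

print(best)
```

With these numbers the search spends the budget where it cuts error most, giving 8 bits to `q_proj` and 4 bits elsewhere; the same objective/constraint structure maps directly onto an integer linear program.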
Supported Models
| Architecture | Verified Models | Status |
|---|---|---|
| Llama | TinyLlama, Llama-2, Llama-3 | Verified |
| Qwen3 | Qwen3-0.6B to 32B | Verified |
Installation
pip install onecomp
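Per the v1.0.1 packaging notes, plotting support lives in a separate `visualize` extra, so the base install does not pull in matplotlib (a sketch; check the installation docs for the authoritative commands):

```shell
pip install onecomp                # core package, no matplotlib
pip install "onecomp[visualize]"   # adds the visualize extra (matplotlib)
```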