v1.0.0

Released by @FKKimura on 31 Mar 14:13 · commit 4f982df

Fujitsu One Compression (OneComp)

A Python package for LLM compression.
Full documentation: https://FujitsuResearch.github.io/OneCompression/

Features

  • Quantization Error Propagation (QEP): Post-training quantization with error propagation to subsequent layers (Arai & Ichikawa, NeurIPS 2025)
  • vLLM Plugin Integration: Serve quantized models with vLLM via built-in DBF and Mixed-GPTQ plugins
  • AutoBit: Mixed-precision quantization with ILP-based bitwidth assignment and automatic VRAM estimation
  • JointQ: Joint quantization optimizing weight assignments and scale parameters simultaneously
  • LoRA SFT Post-Process: Fine-tune quantized models with LoRA for accuracy recovery or knowledge injection
  • Rotation Preprocessing: SpinQuant/OstQuant-based rotation applied before quantization (supported for Llama and Qwen3)
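
To make the idea behind mixed-precision bitwidth assignment concrete, the following is a minimal sketch of choosing one bitwidth per layer so that total estimated error is minimized under a size budget. It is not OneComp's API and not the AutoBit implementation: the function name `assign_bitwidths`, the layer sizes, and the per-layer error estimates are all hypothetical, and an exhaustive search stands in for the ILP solver that the feature list mentions.

```python
from itertools import product


def assign_bitwidths(layer_params, layer_error, candidate_bits, budget_bits):
    """Pick one bitwidth per layer, minimizing total error under a size budget.

    layer_params: parameter count of each layer.
    layer_error:  one dict per layer mapping bitwidth -> estimated error.
    Exhaustive search is used here as a stand-in for an ILP solver; it is
    only practical for a handful of layers.
    """
    best_choice, best_error = None, float("inf")
    for choice in product(candidate_bits, repeat=len(layer_params)):
        # Total storage in bits for this assignment.
        size = sum(n * b for n, b in zip(layer_params, choice))
        if size > budget_bits:
            continue  # violates the memory budget
        error = sum(err[b] for err, b in zip(layer_error, choice))
        if error < best_error:
            best_choice, best_error = choice, error
    return best_choice, best_error


# Hypothetical toy inputs (not measured on any real model).
layer_params = [1000, 4000, 1000]
layer_error = [
    {2: 9.0, 4: 2.0, 8: 0.25},
    {2: 4.0, 4: 1.0, 8: 0.25},
    {2: 12.0, 4: 3.0, 8: 0.5},
]
budget_bits = 4 * sum(layer_params)  # same budget as uniform 4-bit

choice, error = assign_bitwidths(layer_params, layer_error, (2, 4, 8), budget_bits)
print(choice, error)  # (8, 2, 8) 4.75
```

In this toy setup, uniform 4-bit quantization would give a total error of 6.0, while the mixed assignment reaches 4.75 within the same budget by spending more bits on the small, sensitive layers and fewer on the large, tolerant one.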

Supported Models

Architecture | Verified Models | Status
Llama | TinyLlama, Llama-2, Llama-3 | Verified
Qwen3 | Qwen3-0.6B to 32B | Verified

Installation

pip install onecomp
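
After installing, one way to confirm that the package is visible to the current Python environment is to query its distribution metadata. This is a small sketch using only the standard library; the helper name `installed_version` is hypothetical, and the only assumption about OneComp is that the distribution name matches the one used in the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the installed version, or None if onecomp is not installed here.
print(installed_version("onecomp"))
```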