
[fnuz] transform ocp e4m3 to e4m3_fnuz during loading #1

Open
ZhiweiYan-96 wants to merge 1 commit into main from zhiwei/extension

Conversation


ZhiweiYan-96 (Owner) commented Feb 3, 2026

Motivation: reusing public float8_e4m3 models

TorchAO officially releases FP8 checkpoints in float8_e4m3, and popular models such as DeepSeek-R1 also publish float8_e4m3 weights, since they are trained directly in FP8.

Reusing these existing models by transforming the weights to float8_e4m3fnuz is valuable, especially for models trained in FP8.

Design

There are two candidate methods for reusing such checkpoints.

Method 1: Hook the fp8 subtensor initialization.

This builds on the fact that TorchAO checkpoints bind kernel dispatch behavior to the weight tensor subclass. We hook the subclass initialization and modify the raw data in place, without re-quantizing the model from scratch. The change is intrusive, but the conversion happens transparently during loading with only a small memory overhead.
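Method 1 relies on a bit-level fact: reinterpreting an OCP e4m3 bit pattern as e4m3_fnuz exactly halves its value (the fnuz exponent bias is 8 rather than 7), so the raw bytes can be kept as-is and the quantization scale doubled, with only the NaN bit patterns needing remapping. Below is a minimal pure-Python sketch of that relationship; the decoder names are illustrative and not part of this PR.

```python
def decode_e4m3fn(bits: int) -> float:
    """Decode an OCP float8_e4m3 (a.k.a. e4m3fn) bit pattern.
    Bias 7, no inf; NaN when exponent and mantissa are all ones."""
    s = -1.0 if bits & 0x80 else 1.0
    e = (bits >> 3) & 0xF
    m = bits & 0x7
    if e == 0xF and m == 0x7:
        return float("nan")
    if e == 0:                                   # subnormal
        return s * (m / 8.0) * 2.0 ** -6
    return s * (1.0 + m / 8.0) * 2.0 ** (e - 7)

def decode_e4m3fnuz(bits: int) -> float:
    """Decode a float8_e4m3fnuz bit pattern.
    Bias 8, no inf; the single NaN is 0x80 (sign bit set, all else zero)."""
    if bits == 0x80:
        return float("nan")
    s = -1.0 if bits & 0x80 else 1.0
    e = (bits >> 3) & 0xF
    m = bits & 0x7
    if e == 0:                                   # subnormal
        return s * (m / 8.0) * 2.0 ** -7
    return s * (1.0 + m / 8.0) * 2.0 ** (e - 8)

# Every non-special bit pattern, reinterpreted as fnuz, is exactly half the
# e4m3 value, so doubling the quantization scale preserves the dequantized
# weights: q_bits * (2 * scale) under fnuz == q_bits * scale under e4m3.
for b in range(256):
    fn = decode_e4m3fn(b)
    if fn != fn or b == 0x80:   # skip e4m3 NaN patterns and the fnuz NaN slot
        continue
    assert decode_e4m3fnuz(b) == fn / 2.0
```

In the hooked initialization this would amount to viewing the stored data as int8, zeroing the bit patterns that are NaN in either format, viewing the result as torch.float8_e4m3fnuz, and multiplying the scale by 2.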


I have verified this with the checkpoint released at https://huggingface.co/pytorch/Qwen3-32B-FP8, and inference produces correct output.

Method 2: Dequantize the fp8_e4m3 linear weight, then re-quantize it using float8_e4m3fnuz.

With this method we introduce no intrusive changes in TorchAO, and I have verified that inference works correctly. However, users must write their own scripts for dequantization and re-quantization, which is tricky. What's worse, float8_e4m3fnuz serialization is not supported by safetensors, so even with such scripts, users cannot save a model with float8_e4m3fnuz weights.
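Numerically, Method 2 can be sketched without TorchAO: dequantize with the stored scale, then pick a new per-tensor scale against the e4m3_fnuz maximum (240, rather than 448 for OCP e4m3) and round each value to the nearest representable fnuz value. The sketch below uses made-up numbers and a hypothetical `quantize_e4m3fnuz` helper; a real conversion script would operate on torch tensors and the checkpoint's actual scales.

```python
import math

def _decode_e4m3fnuz(bits: int) -> float:
    # float8_e4m3fnuz: bias 8, no inf, single NaN at 0x80
    if bits == 0x80:
        return float("nan")
    s = -1.0 if bits & 0x80 else 1.0
    e = (bits >> 3) & 0xF
    m = bits & 0x7
    frac = m / 8.0 if e == 0 else 1.0 + m / 8.0
    return s * frac * (2.0 ** -7 if e == 0 else 2.0 ** (e - 8))

# All finite e4m3_fnuz values, for round-to-nearest encoding.
_FNUZ_VALUES = sorted(
    {_decode_e4m3fnuz(b) for b in range(256) if not math.isnan(_decode_e4m3fnuz(b))}
)

def quantize_e4m3fnuz(weights):
    """Per-tensor re-quantization (illustrative): map amax to the fnuz
    maximum 240, then round each scaled value to the nearest representable."""
    amax = max(abs(w) for w in weights)
    scale = amax / 240.0 if amax > 0 else 1.0
    q = [min(_FNUZ_VALUES, key=lambda v: abs(v - w / scale)) for w in weights]
    return q, scale

# Method 2 end to end: dequantize the e4m3 weight, re-quantize as e4m3_fnuz.
old_scale = 0.03
e4m3_vals = [1.5, -2.0, 448.0, 0.25]          # values representable in e4m3
w_fp = [c * old_scale for c in e4m3_vals]      # dequantize with old scale
q_new, new_scale = quantize_e4m3fnuz(w_fp)     # re-quantize to fnuz
w_round_trip = [q * new_scale for q in q_new]  # close to w_fp within fp8 error
```

The round trip is exact for the amax element and stays within e4m3's relative quantization error for the rest, which is why this path works numerically; its drawbacks are purely practical (user-written scripts, no safetensors support for the result).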


@ZhiweiYan-96

@wuhuikx @zejunchen-zejun @

@ZhiweiYan-96 ZhiweiYan-96 changed the title [fnuz] transform ocp format to fnuz during loading [fnuz] transform ocp e4m3 to e4m3_fnuz during loading Feb 3, 2026
@ZhiweiYan-96

@xytpai

@ZhiweiYan-96

@XiaobingSuper

