
@n1ck-guo
Contributor
@n1ck-guo n1ck-guo commented Jan 8, 2026

This pull request refactors the _quant_data method in auto_round/export/export_to_gguf/convert.py to improve support for MOE models, streamline attribute handling, and clean up the quantization logic. The changes mainly focus on making the code more robust for different model architectures and removing legacy or redundant quantization branches.

Support for MOE models and quantization logic cleanup:

  • Improved handling for MoE models by updating the attribute check to recognize modules with "exps" in their names and 3D tensor shapes, making the code more flexible for non-linear expert layers.
  • Refactored the quantization logic to remove legacy branches and commented-out code, simplifying the decision flow for quantization type selection; known FP16 issues are now documented rather than handled by unused branches.
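The MoE attribute check described above can be sketched roughly as follows. This is a minimal illustration, not the actual code from convert.py: the helper name and the exact shape convention are assumptions based on how llama.cpp-style GGUF exports stack per-expert weights.

```python
def is_moe_expert_weight(name: str, shape: tuple) -> bool:
    """Hypothetical check mirroring the PR's described logic.

    In GGUF exports, MoE expert layers are typically stacked into a single
    3D tensor (num_experts, out_features, in_features) and carry "exps" in
    their tensor name, e.g. "blk.0.ffn_gate_exps.weight". Ordinary linear
    layers remain 2D and lack the "exps" marker.
    """
    return "exps" in name and len(shape) == 3


# A stacked expert tensor matches; a plain linear weight does not.
print(is_moe_expert_weight("blk.0.ffn_gate_exps.weight", (8, 4096, 14336)))
print(is_moe_expert_weight("blk.0.attn_q.weight", (4096, 4096)))
```

Checking both the name and the tensor rank (rather than assuming every quantizable module is a 2D linear layer) is what lets the exporter handle non-linear expert layers without a separate code path per architecture.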

General code cleanup:

  • Removed an unnecessary suffix check from the beginning of the function, streamlining the code for extracting layer names.

@n1ck-guo n1ck-guo requested review from wenhuach21 and xin3he January 8, 2026 06:54
Signed-off-by: n1ck-guo <heng.guo@intel.com>
@n1ck-guo n1ck-guo changed the title add support for moe model with non-linear exports layer for gguf GGUF format add support for MoE models with non-linear expert layers. Jan 8, 2026


3 participants