GroupedGemm: NVFP4 via cuBLAS #2455

@ptrendx

Description

Implement GroupedGemm for the NVFP4 format using cuBLAS kernels. Ensure it integrates with the GroupedTensor utilities and the grouped quantization pathways used for Sync-Free MoE training.
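
For reviewers who want a concrete picture of the operation, here is a minimal pure-Python sketch of what a grouped GEMM computes. It only illustrates the math: it is not the cuBLAS or GroupedTensor API, and it leaves out NVFP4 quantization entirely. The function name `grouped_gemm_reference` and the arguments `a_rows`, `b_per_group`, and `m_splits` are hypothetical names chosen for this example.

```python
def grouped_gemm_reference(a_rows, b_per_group, m_splits):
    """Multiply each contiguous row group of `a_rows` by its own weight matrix.

    a_rows:      list of rows (all tokens, already sorted by group/expert)
    b_per_group: one K x N weight matrix (list of rows) per group
    m_splits:    number of rows of `a_rows` that belong to each group
    """
    if len(m_splits) != len(b_per_group):
        raise ValueError("need exactly one weight matrix per group")
    if sum(m_splits) != len(a_rows):
        raise ValueError("m_splits must sum to the number of rows in a_rows")

    out = []
    start = 0
    for m, b in zip(m_splits, b_per_group):
        k, n = len(b), len(b[0])
        # A group may be empty (m == 0), e.g. an expert that received no tokens.
        for row in a_rows[start:start + m]:
            if len(row) != k:
                raise ValueError("inner dimensions do not match")
            out.append([sum(row[p] * b[p][j] for p in range(k)) for j in range(n)])
        start += m
    return out


# Three tokens, two groups: the first two rows use b0, the last row uses b1.
a = [[1, 2], [3, 4], [5, 6]]
b0 = [[1, 0], [0, 1]]  # identity
b1 = [[2, 0], [0, 2]]  # scale by 2
result = grouped_gemm_reference(a, [b0, b1], [2, 1])
```

A fused implementation would launch the per-group products together rather than looping on the host. In a sync-free setting the split sizes would presumably stay on the device, although this issue does not specify that design.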
