perf(MoE): Use TE quant/dequant for SwiGLU fp8 input store to improve performance and stability #1753
Draft
xiaoxi-wangfj wants to merge 3 commits into NVIDIA:main from