Is simulating the MX data format in fp32 equivalent to the real MX format?

This is nice work, but I have some questions.
The code uses the template parameter T=float (see the following code), so fp32 is used to:
(1) represent values in the MX data format, and
(2) simulate MX-format arithmetic.
So I think this may not be equivalent to a real MX data format representation and its operations. Have you used an FPGA to evaluate how large the error is between the fp32 simulation and a real MX data format?
```cpp
template<typename T>
__global__ void quantize_mx_cuda_kernel(
    const T* __restrict__ input,
    const int scale_bits,
    const int elem_ebits,
    const int elem_mbits,
    const float elem_max_norm,
    const float* __restrict__ max_values,
    const long total_size,
    const int axis_size,
    const int post_axis_size,
    const bool flush_fp32_subnorms,
    const RoundingMode rounding_mode,
    T* __restrict__ output
) {
```
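To make point (1) concrete, here is a minimal host-side C++ sketch of what "representing a low-precision value in fp32" can mean. It is not the repository's kernel: `round_to_mbits` is a hypothetical helper, and it assumes round-to-nearest-even while ignoring exponent-range clamping, subnormals, and the shared block scale. It rounds a float to a given number of explicit mantissa bits and returns the result in an ordinary fp32 variable, so the value sits on the coarser grid even though the container is fp32. It says nothing about point (2), that is, whether the arithmetic matches a particular hardware implementation.

```cpp
#include <cmath>

// Hypothetical helper, not taken from the repository.
// Rounds x to a grid with `mbits` explicit mantissa bits (plus one implicit bit).
// Assumptions: round-to-nearest-even (the default floating-point rounding mode),
// no exponent-range clamping, no subnormal handling, no shared block scale.
float round_to_mbits(float x, int mbits) {
    if (x == 0.0f || !std::isfinite(x)) {
        return x;  // zero, infinities and NaN pass through unchanged
    }
    int exp = 0;
    // frexp gives x = m * 2^exp with |m| in [0.5, 1).
    float m = std::frexp(x, &exp);
    // Shift the bits we want to keep into the integer part:
    // |scaled| lies in [2^mbits, 2^(mbits + 1)).
    float scaled = std::ldexp(m, mbits + 1);
    // Drop the remaining fraction using the current rounding mode.
    float rounded = std::nearbyint(scaled);
    // Undo the shift; scaling by a power of two is exact in fp32 here.
    return std::ldexp(rounded, exp - (mbits + 1));
}
```

For example, with `mbits = 3` the grid spacing in [1, 2) is 0.125, so `round_to_mbits(1.3f, 3)` lands on 1.25, and applying the function a second time leaves the value unchanged.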
