fma_poly_scale_slice_hexl run time does not increase linearly with mod_size #2

@Janmajayamall

The run time of fma_poly_scale_slice_hexl does not increase linearly with mod_size (i.e. the number of moduli in the poly). Instead, the time grows faster than linearly as mod_size increases.

Poly stores its coefficients in row-major form, and all fma_poly_scale_slice_hexl does is call hexl_rs::elwise_fma_mod once for each row in the poly. There is a single row per modulus, so the row count equals mod_size and fma_poly_scale_slice_hexl calls elwise_fma_mod mod_size times. Every row has the same length n, so each call should cost about the same, and the time taken when mod_size = 15 should be roughly 15x the time taken when mod_size = 1. But this isn't the case.
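To make the expected cost model concrete, here is a minimal sketch of the per-row loop described above. It does not use hexl_rs: `fma_row_ref` is a hypothetical plain-Rust stand-in for the per-row call, the flat `&mut [u64]` layout is only an assumption about how the row-major poly is stored, and using a single shared `scalar` is likewise an assumption.

```rust
/// Hypothetical stand-in for the per-row call (NOT the hexl_rs API):
/// computes a[i] = (a[i] + b[i] * scalar) mod q for one row.
fn fma_row_ref(a: &mut [u64], b: &[u64], scalar: u64, q: u64) {
    for (x, &y) in a.iter_mut().zip(b) {
        // Widen to u128 so the product plus the addend cannot overflow.
        let prod = (y as u128) * (scalar as u128);
        *x = ((*x as u128 + prod) % (q as u128)) as u64;
    }
}

/// Assumed layout: `poly` and `other` are flat row-major buffers holding
/// `moduli.len()` rows of `n` coefficients each, one row per modulus.
fn fma_poly_scale_ref(poly: &mut [u64], other: &[u64], scalar: u64, moduli: &[u64], n: usize) {
    assert_eq!(poly.len(), moduli.len() * n);
    assert_eq!(other.len(), poly.len());
    // One per-row call per modulus, so the loop body runs mod_size times.
    for ((row, other_row), &q) in poly
        .chunks_exact_mut(n)
        .zip(other.chunks_exact(n))
        .zip(moduli)
    {
        fma_row_ref(row, other_row, scalar, q);
    }
}
```

Under this model every iteration touches exactly n coefficients of each operand, which is why a linear relationship between mod_size and run time is the natural expectation.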

For example, on an r6i.8xlarge instance, the benchmarks for fma_poly_scale_slice_hexl are as follows:

range_fn/fma_poly_scale_slice_hexl/n=32768/mod_size=1
                        time:   [8.5271 µs 8.5275 µs 8.5278 µs]

range_fn/fma_poly_scale_slice_hexl/n=32768/mod_size=3
                        time:   [36.679 µs 36.695 µs 36.709 µs]

range_fn/fma_poly_scale_slice_hexl/n=32768/mod_size=7
                        time:   [107.51 µs 107.51 µs 107.52 µs]

range_fn/fma_poly_scale_slice_hexl/n=32768/mod_size=15
                        time:   [254.32 µs 254.35 µs 254.37 µs]

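Time taken clearly does not increase linearly with mod_size. Relative to mod_size = 1, the measured times are about 4.3x, 12.6x and 29.8x for mod_size = 3, 7 and 15, where linear scaling would give 3x, 7x and 15x.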
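The gap from linear scaling can be checked directly from the numbers above. The following sketch hard-codes the middle value of each reported interval and derives the slowdown relative to mod_size = 1 and the time per row; the names `ESTIMATES_US` and `scaling_table` are made up for this illustration.

```rust
/// (mod_size, time in µs): middle value of each interval in the benchmark output above.
const ESTIMATES_US: [(u32, f64); 4] = [(1, 8.5275), (3, 36.695), (7, 107.51), (15, 254.35)];

/// Returns (mod_size, slowdown relative to mod_size = 1, time per row in µs).
fn scaling_table() -> Vec<(u32, f64, f64)> {
    let base = ESTIMATES_US[0].1;
    ESTIMATES_US
        .iter()
        .map(|&(m, t)| (m, t / base, t / (m as f64)))
        .collect()
}
```

If the cost were linear, the third field would stay near 8.5 µs for every mod_size. With these figures it instead rises to roughly 12.2, 15.4 and 17.0 µs per row at mod_size = 3, 7 and 15.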
