Estimation of instruction latency #1

@marshallward

Instruction latency is currently hardcoded in the scalar tests (avx_add, etc.). This can vary across architectures, e.g. 3 cycles on Sandy Bridge, 7 cycles on KNL. The latency is used to determine the loop unroll factor in these tests. Values that are too small can cause pipeline stalls and significantly reduce peak performance.

Additionally, choosing a value that is too large can cause a slowdown. The slowdown is usually small, but it is nonetheless suboptimal. Very large latency values can also consume registers and indirectly cause large slowdowns.
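As a rough illustration of how latency feeds into the unroll factor, the sketch below computes the number of independent accumulator chains as latency times issue rate, capped by the number of registers available. The function name `unroll_factor` and its parameters are hypothetical and are not taken from the existing tests; the issue rate and register budget are inputs the caller would have to supply.

```c
/* Hypothetical helper, not part of the existing tests.
 *
 * latency_cycles:  cycles before an instruction's result can be reused
 * issue_per_cycle: how many such instructions can start each cycle
 * max_registers:   registers available for independent accumulators
 *
 * Returns the number of independent dependency chains to unroll. */
unsigned unroll_factor(unsigned latency_cycles,
                       unsigned issue_per_cycle,
                       unsigned max_registers)
{
    /* Enough chains to hide the latency of each in-flight instruction. */
    unsigned chains = latency_cycles * issue_per_cycle;

    /* More chains than registers would spill and cause slowdowns. */
    if (chains > max_registers)
        chains = max_registers;

    /* Always keep at least one chain. */
    if (chains < 1)
        chains = 1;

    return chains;
}
```

For example, `unroll_factor(3, 1, 16)` gives 3, while a 7-cycle latency with two instructions issued per cycle and only 8 free registers is capped at 8.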

Agner Fog's tool dynamically computes the latency of every instruction, albeit with the help of a kernel module. It may be worth looking into this and seeing if we can use a similar method to estimate latency.
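One possible user-space approach, which is not Agner Fog's method and needs no kernel module, is to time a long chain of dependent operations, so that each one has to wait for the previous result. The sketch below does this for a scalar double-precision add rather than an AVX instruction. The names `dependent_add_ns` and `estimate_latency_cycles` are hypothetical, and the clock frequency `ghz` is an assumption supplied by the caller, so frequency scaling will skew the result. It uses C11 `timespec_get`, a wall-clock timer; a monotonic clock would be preferable where available. It should be compiled with optimization but without `-ffast-math`, so that the accumulator stays in a register and the chain is not reassociated.

```c
#include <stdint.h>
#include <time.h>

/* Time n dependent double-precision adds and return the average
 * time per add in nanoseconds. */
double dependent_add_ns(uint64_t n)
{
    /* Volatile reads stop the compiler from constant-folding the loop. */
    volatile double seed = 1.0, step = 1e-9;
    double x = seed, c = step;
    struct timespec t0, t1;

    timespec_get(&t0, TIME_UTC);
    for (uint64_t i = 0; i < n; i++)
        x += c;                 /* each add depends on the previous one */
    timespec_get(&t1, TIME_UTC);

    /* Store the result so the loop is not removed as dead code. */
    volatile double sink = x;
    (void)sink;

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)n;
}

/* Convert the timing to cycles using an assumed clock frequency in GHz. */
double estimate_latency_cycles(uint64_t n, double ghz)
{
    return dependent_add_ns(n) * ghz;
}
```

Rounding the estimate to the nearest integer would give a candidate latency to use in place of the hardcoded value, though the figure should be checked against published numbers for the target architecture.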
