[Perf] Expand apples-to-apples kernel suite and scorecard #2

@Simonsbs

Goal

Improve runtime competitiveness beyond parity on a single kernel by expanding apples-to-apples benchmark coverage.

Scope

Add at least 8 additional kernel shapes, each with equivalent L0 and C implementations run through a shared harness:

  1. integer arithmetic chain
  2. bitwise-heavy kernel
  3. branch-heavy kernel
  4. memory load/store roundtrip
  5. pointer arithmetic loop
  6. function call chain
  7. mixed arithmetic+branch kernel
  8. small struct/aggregate pass

Acceptance

  • All kernels benchmarked in CI.
  • Median-of-N reporting per kernel.
  • Results page includes per-kernel winner and geometric mean.
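
To make the acceptance criteria concrete, here is a minimal sketch of how the scorecard numbers could be computed. It is an assumption about the reporting step, not the project's actual harness: the function names, the `timings` layout, the kernel keys, and the sample numbers in `demo` are all hypothetical. The ratio is taken as L0 time divided by C time, so a value below 1.0 means L0 is faster.

```python
import statistics

def scorecard(timings):
    """Build per-kernel rows and an overall geometric mean.

    timings: {kernel_name: {"l0": [ns, ...], "c": [ns, ...]}} (hypothetical layout).
    Returns (rows, geomean), where each row is
    (kernel, l0_median, c_median, ratio, winner) and ratio = l0_median / c_median.
    """
    rows = []
    for kernel, runs in timings.items():
        # Median-of-N: robust against occasional slow runs on shared CI machines.
        l0 = statistics.median(runs["l0"])
        c = statistics.median(runs["c"])
        ratio = l0 / c
        if l0 < c:
            winner = "L0"
        elif c < l0:
            winner = "C"
        else:
            winner = "tie"
        rows.append((kernel, l0, c, ratio, winner))
    # Geometric mean of ratios, so a 2x loss and a 2x win cancel out.
    geomean = statistics.geometric_mean([row[3] for row in rows])
    return rows, geomean

# Made-up sample data for illustration only; not real measurements.
demo = {
    "int_chain": {"l0": [120, 100, 110], "c": [50, 55, 60]},
    "bitwise": {"l0": [40, 50, 45], "c": [90, 95, 85]},
}
```

The geometric mean is used instead of the arithmetic mean because the per-kernel values are ratios: averaging them arithmetically would let one slow kernel dominate the summary.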

Metadata

Assignees

No one assigned

Labels

  • area/benchmark: Benchmark methodology and automation
  • area/perf: Compiler/runtime performance workstreams
  • priority/high: High priority
  • type/task: Implementation task
