tracel-ai/cubek


CubeK: high-performance multi-platform kernels in CubeCL

Algorithms

Algorithm      Variants
Random         bernoulli, normal, uniform
Quantization   symmetric, per-block, per-tensor, q2, q4, q8, fp4
Reduction      mean, sum, prod, max, min, arg[max|min], per-cube, per-plane
Matmul         mma, unit, tma, multi-stage, specialization, ordered, multi-rows
Convolution    mma, unit, tma, multi-stage, im2col
Attention      mma, unit, multi-rows

Contributing

If you want to contribute new kernels, please read the GUIDE.md.

Running tests

Note: This applies to most kernels. Reduce works slightly differently for now; see its README.

Command

Three test suites are available:

  • Smoke test suite: a tractable subset of representative tests that run on the CI.
  • Extended test suite: usually auto-generated combinatorial tests covering many configurations. Good to run when developing kernels. Normally kept tractable.
  • Full test suite: all generable test combinations; may be too large to compile or run practically.

Run tests with

# Replace <runtime> with cpu, cuda, rocm, wgpu, vulkan or metal

# Smoke test suite
cargo test-<runtime>

# Extended test suite
cargo test-<runtime>-extended

# Full test suite
cargo test-<runtime>-full
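The naming scheme above can be sketched as a small shell helper. The function name cubek_test_alias is hypothetical and not part of the repository; only the alias names and the runtime list come from the commands above. The helper builds the alias name and does not run anything itself.

```shell
# Hypothetical helper (not part of CubeK): builds the cargo alias name
# for a given runtime and test suite, following the pattern shown above.
cubek_test_alias() {
    runtime="$1"
    suite="${2:-smoke}"   # default to the smoke suite

    # Accept only the runtimes listed in the README.
    case "$runtime" in
        cpu|cuda|rocm|wgpu|vulkan|metal) ;;
        *) echo "unknown runtime: $runtime" >&2; return 1 ;;
    esac

    # Map the suite name to the alias suffix.
    case "$suite" in
        smoke)    echo "test-$runtime" ;;
        extended) echo "test-$runtime-extended" ;;
        full)     echo "test-$runtime-full" ;;
        *) echo "unknown suite: $suite" >&2; return 1 ;;
    esac
}

# Print the alias for the extended suite on CUDA.
cubek_test_alias cuda extended
```

Inside a checkout of the repository, the result could then be passed to cargo, for example cargo "$(cubek_test_alias cuda extended)".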

Cube test mode

You can control test behavior by setting the CUBE_TEST_MODE environment variable.
For more details, see Test Mode.

Modes

  • CUBE_TEST_MODE=correct (default)
    Tests pass if results are numerically correct or if the kernel was launched with an invalid configuration.

    • Useful when tests are auto-generated from multiple parameter combinations, where some invalid configurations are expected.
    • Failing tests display only the first index with a discrepancy.
  • CUBE_TEST_MODE=strict
    Tests pass only if they compile, run, and produce numerically accurate results.

    • Ideal for debugging to avoid false positives that can occur in correct mode.
  • CUBE_TEST_MODE=printfail
    Similar to correct mode: tests pass if results are correct or if the kernel is invalid.

    • Failing tests show all tensor discrepancies.
    • Supports filtering, e.g. CUBE_TEST_MODE=printfail:0,.,10-20 shows index 0 of the first dimension, all of the second dimension, and elements 10–20 of the third.
  • CUBE_TEST_MODE=printall
    All tests fail, displaying all tensor discrepancies.

    • Filtering works the same as in printfail.
  • CUBE_TEST_MODE=failifrun
    Only tests that compile and run will fail; others succeed.

    • Useful for tracking critical tests in large suites.
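To make the filter syntax used by printfail and printall more concrete, here is a minimal sketch that spells out what each comma-separated part of a filter selects. The function describe_filter is hypothetical and purely illustrative; it is not the parser used by the test harness, and it only mirrors the description given in the list above.

```shell
# Illustrative only: explains a filter such as "0,.,10-20", one part per dimension.
# This is NOT the real parser; it follows the README's description of the syntax.
describe_filter() {
    dim=0
    old_ifs="$IFS"
    IFS=','                 # split the filter on commas
    for part in $1; do
        case "$part" in
            .)   echo "dim $dim: all elements" ;;
            *-*) echo "dim $dim: elements ${part%-*} to ${part#*-}" ;;
            *)   echo "dim $dim: element $part" ;;
        esac
        dim=$((dim + 1))
    done
    IFS="$old_ifs"          # restore the original field separator
}

describe_filter "0,.,10-20"
```

In a real run, the mode and filter are passed through the environment together with one of the test aliases, for example CUBE_TEST_MODE=printfail:0,.,10-20 cargo test-cpu.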

License

Apache-2.0 (LICENSE-APACHE), MIT (LICENSE-MIT)