@kaiming-cheng kaiming-cheng commented Nov 13, 2025

This PR introduces an updated set of KernelAgent-generated kernels for a difficult set of problems. Previously, KernelAgent failed to generate correct kernels for the 25 ops below. In this version, the average correctness ratio is 0.69, with 14 ops reaching 100% correctness (a 56% pass rate). This PR serves as a reference for the generated kernel implementations.

Experiment Results

Average Correctness Ratio: 0.69
14 out of 25 kernels reached 100% correctness

| Index | Operator | Correctness Ratio | Speedup vs Eager |
|---|---|---|---|
| 1 | _log_softmax_backward_data.default | 0.0000% | N/A |
| 2 | _patched_sub_tensor | 0.0000% | (Failed to generate) |
| 3 | _softmax.default | 0.0000% | (Failed to generate) |
| 4 | add.Tensor | 90.0000% | N/A |
| 5 | bmm.default | 100.0000% | N/A |
| 6 | add_.Tensor | 0.0000% | (Failed to generate) |
| 7 | div.Tensor | 100.0000% | N/A |
| 8 | fill_.Tensor | 0.0000% | (Failed to generate) |
| 9 | eq.Tensor | 100.0000% | N/A |
| 10 | ge.Scalar | 0.0000% | N/A |
| 11 | gt.Tensor | 100.0000% | N/A |
| 12 | lt.Tensor | 100.0000% | N/A |
| 13 | masked_fill.Scalar | 100.0000% | N/A |
| 14 | max.dim | 100.0000% | N/A |
| 15 | maximum.default | 100.0000% | N/A |
| 16 | mean.dim | 37.5000% | N/A |
| 17 | minimum.default | 100.0000% | N/A |
| 18 | mm.default | 100.0000% | N/A |
| 19 | mul.Tensor | 57.8947% | N/A |
| 20 | pow.Scalar | 100.0000% | 2.0489x |
| 21 | reciprocal.default | 100.0000% | N/A |
| 22 | std.correction | 75.0000% | N/A |
| 23 | sum.default | 100.0000% | N/A |
| 24 | sum.dim_IntList | 56.2500% | N/A |
| 25 | where.self | 100.0000% | 1.1503x |
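
The summary figures can be reproduced from the per-operator values listed above. The snippet below is a minimal sketch, not part of the PR: the list is copied by hand from the results, and the helper name `summarize` is hypothetical. It assumes the average is an unweighted mean over all 25 operators, with failed generations counted as 0.

```python
# Correctness ratios (in percent), copied by hand from the results in index order.
# Operators that failed to generate are counted as 0.0 (assumption).
ratios_pct = [
    0.0, 0.0, 0.0, 90.0, 100.0,
    0.0, 100.0, 0.0, 100.0, 0.0,
    100.0, 100.0, 100.0, 100.0, 100.0,
    37.5, 100.0, 100.0, 57.8947, 100.0,
    100.0, 75.0, 100.0, 56.25, 100.0,
]


def summarize(ratios_pct):
    """Return (average ratio in [0, 1], number of fully correct ops, pass rate)."""
    n = len(ratios_pct)
    avg_ratio = sum(ratios_pct) / n / 100.0  # unweighted mean, percent -> ratio
    fully_correct = sum(1 for r in ratios_pct if r == 100.0)
    return avg_ratio, fully_correct, fully_correct / n


avg_ratio, fully_correct, pass_rate = summarize(ratios_pct)
print(f"average correctness ratio: {avg_ratio:.2f}")  # 0.69
print(f"fully correct: {fully_correct}/{len(ratios_pct)} ({pass_rate:.0%})")  # 14/25 (56%)
```

Under these assumptions the result matches the reported 0.69 average and 14 of 25 (56%) fully correct kernels.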

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Nov 13, 2025
@Jack-Khuu

For tracking, can you share the commit hash these were tested on?

@Laurawly
Contributor

Laurawly commented Nov 14, 2025

@kaiming-cheng We focus on testing fp16/bf16 dtypes for Triton kernels on GPU. Please refer to this PR: #111
