[gluon] Add fused_sigmoid_mul_add #6

Open
apinge wants to merge 2 commits into tingqli:main from apinge:gluon
@apinge (Collaborator) commented Mar 3, 2026

Fix the reduce bug and add fused_sigmoid_mul_add.

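Current performance on CDNA4 with Triton 3.6.0. The fused kernel is slower than the reference at 1x4096 and 2x4096 (0.74x and 0.71x) and faster at 8000x4096 (1.42x):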

 python -m pytest test_fused_sigmoid_mul_add.py -v -k benchmark -s
========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.12.3, pytest-9.0.2, pluggy-1.6.0 -- /opt/venv/bin/python
cachedir: .pytest_cache
hypothesis profile 'default'
rootdir: /root/workspace/pyhip
configfile: pytest.ini
plugins: hypothesis-6.150.2
collected 6 items / 3 deselected / 3 selected                                                                                                                                                            

test_fused_sigmoid_mul_add.py::test_benchmark[1x4096_bf16] 
[benchmark] 1x4096 torch.bfloat16
  fused: 15.74 us/iter, 1.56 GB/s
  ref:   11.70 us/iter, 2.10 GB/s  speedup 0.74x
PASSED
test_fused_sigmoid_mul_add.py::test_benchmark[2x4096_bf16] 
[benchmark] 2x4096 torch.bfloat16
  fused: 17.09 us/iter, 2.88 GB/s
  ref:   12.06 us/iter, 4.08 GB/s  speedup 0.71x
PASSED
test_fused_sigmoid_mul_add.py::test_benchmark[8000x4096_bf16] 
[benchmark] 8000x4096 torch.bfloat16
  fused: 53.86 us/iter, 3650.42 GB/s
  ref:   76.33 us/iter, 2575.99 GB/s  speedup 1.42x
PASSED

==================================================================================== 3 passed, 3 deselected in 1.40s =====================================================================================
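
As a sanity check on the log above, the reported GB/s figures are consistent with counting three bf16 tensors of the given shape (2 bytes per element) as the memory traffic per iteration, and the speedup column matches ref time divided by fused time. The three-tensor byte count is an inference from the printed numbers, not something taken from the test source, and the helper names below are hypothetical. A minimal sketch:

```python
# Sketch only: reproduces the arithmetic behind the benchmark log.
# ASSUMPTION: traffic per iteration = 3 tensors * rows * cols * 2 bytes (bf16).
# This count is inferred from the logged GB/s values, not read from the test.
BYTES_PER_BF16 = 2
TENSORS_MOVED = 3  # assumed; see note above


def effective_bandwidth_gbs(rows, cols, us_per_iter,
                            tensors=TENSORS_MOVED, elem_bytes=BYTES_PER_BF16):
    """Return effective bandwidth in GB/s (1 GB = 1e9 bytes)."""
    total_bytes = tensors * rows * cols * elem_bytes
    seconds = us_per_iter * 1e-6
    return total_bytes / seconds / 1e9


def speedup(ref_us, fused_us):
    """Speedup of the fused kernel over the reference (>1 means fused is faster)."""
    return ref_us / fused_us


# (rows, cols, fused us/iter, ref us/iter), copied from the log.
LOGGED = [
    (1, 4096, 15.74, 11.70),
    (2, 4096, 17.09, 12.06),
    (8000, 4096, 53.86, 76.33),
]

for rows, cols, fused_us, ref_us in LOGGED:
    print(
        f"{rows}x{cols}: "
        f"fused {effective_bandwidth_gbs(rows, cols, fused_us):.2f} GB/s, "
        f"ref {effective_bandwidth_gbs(rows, cols, ref_us):.2f} GB/s, "
        f"speedup {speedup(ref_us, fused_us):.2f}x"
    )
```

The recomputed values can differ slightly in the last digits from the log because the logged times are already rounded to two decimals. For the two small shapes the total traffic is only tens of kilobytes, so the per-iteration time is likely dominated by fixed overhead rather than bandwidth.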

Signed-off-by: apinge <apingelqe@outlook.com>
@apinge changed the title from "[gluon] Add fused_sigmoid_mul_add_gluon_kernel" to "[gluon] Add fused_sigmoid_mul_add" on Mar 3, 2026
Signed-off-by: apinge <apingelqe@outlook.com>