
Release Prep JAX 0.9.1#735

Open
gulsumgudukbay wants to merge 4 commits into rocm-jaxlib-v0.9.1 from release-prep-091

Conversation

@gulsumgudukbay

PR to prepare for 0.9.1 release.

gulsumgudukbay and others added 4 commits March 6, 2026 22:52
AMD CDNA3 (MI300X/gfx942) does not have a hardware tanh instruction like
NVIDIA's PTX tanh.approx. This implements approx_tanh for ROCm using:

- For f32 (and f16/bf16 via casting): Triton's __triton_hip_fast_tanhf
  which uses a fast exp-based formula: tanh(x) = (exp(2x) - 1) / (exp(2x) + 1)
- For f64: OCML's __ocml_tanh_f64 (AMD's Open Compute Math Library)
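The exp-based identity used for the f32 path can be checked numerically. This is an illustrative sketch of the formula only, not the Triton intrinsic itself (`__triton_hip_fast_tanhf` is a device-side builtin):

```python
import math

def fast_tanh(x: float) -> float:
    # Same identity the ROCm f32 lowering relies on:
    #   tanh(x) = (exp(2x) - 1) / (exp(2x) + 1)
    # Note: exp(2x) overflows for large |x|; the device intrinsic is
    # expected to handle saturation, this sketch does not.
    e2x = math.exp(2.0 * x)
    return (e2x - 1.0) / (e2x + 1.0)

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    assert abs(fast_tanh(x) - math.tanh(x)) < 1e-12
```

For moderate inputs the formula agrees with `math.tanh` to full double precision, which is why a single fast `exp` suffices in place of a dedicated hardware tanh instruction.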

Changes:
- Add f64 support to approx_tanh function
- Add ROCm platform detection in _elementwise_inline_asm_lowering
- Add _approx_tanh_rocm_lowering function for ROCm-specific lowering
- Add test_approx_tanh test with f16/bf16/f32/f64 support

See: triton-lang/triton#7780
(cherry picked from commit 39ceb95)
- Remove verbose comment in _elementwise_inline_asm_lowering
- Inline dtype_to_ir_type helper, use mlir.dtype_to_ir_type directly
- Move ir and arith_dialect imports to top-level
- Add TypeError for float64 on non-ROCm platforms
- Simplify _approx_tanh_rocm_lowering with needs_cast pattern
- Move test_approx_tanh from ops_test.py to triton_pallas_test.py
- Fix triton_pallas_test setUp to allow ROCm devices
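The "needs_cast" pattern mentioned above can be sketched outside the MLIR lowering. This is a hypothetical NumPy illustration of the dtype-widening idea, not the actual `_approx_tanh_rocm_lowering` code: narrow floats are widened to f32, the fast f32 tanh is applied, and the result is cast back (bf16 is omitted since NumPy has no native bfloat16):

```python
import numpy as np

def approx_tanh_with_cast(x: np.ndarray) -> np.ndarray:
    # Hypothetical sketch of the needs_cast pattern: widen f16 to f32,
    # run the fast exp-based tanh, then cast back to the input dtype.
    needs_cast = x.dtype == np.float16
    orig_dtype = x.dtype
    y = x.astype(np.float32) if needs_cast else x
    e2y = np.exp(2.0 * y)              # tanh(y) = (e^{2y}-1)/(e^{2y}+1)
    out = (e2y - 1.0) / (e2y + 1.0)
    return out.astype(orig_dtype) if needs_cast else out
```

Doing the arithmetic in f32 avoids the precision and overflow problems of evaluating `exp(2x)` directly in half precision, while callers still see their original dtype.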

(cherry picked from commit 600fbd3)
The test is already skipped on CUDA (b/442353988) due to HLO debug
metadata (source column numbers) being embedded in compiled output,
causing semantically identical compilations to produce different
as_text() results. The same issue occurs on ROCm.

(cherry picked from commit 70b2b99)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
(cherry picked from commit 5ec8419)

3 participants