Problem Description
The sglang benchmark randomly shows long TTFT (time to first token). This is related to Triton kernels being re-JIT-compiled.
When running without CUDA graphs, Triton JIT compilation is sometimes triggered. The current workaround is to set a very large Triton cache size. This needs a proper fix.
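One rough way to confirm that a slow request coincides with a fresh Triton compile is to compare the on-disk Triton cache before and after a benchmark run. The sketch below is only an illustration, not part of sglang or Triton: the helper names are hypothetical, and it assumes the cache lives in `TRITON_CACHE_DIR` or the default `~/.triton/cache`, with one subdirectory per compiled kernel. It only detects compiles that write new cache entries, so a kernel that is reloaded from disk will not show up.

```python
import os
from pathlib import Path


def triton_cache_dir() -> Path:
    # Assumption: TRITON_CACHE_DIR overrides the default ~/.triton/cache location.
    return Path(os.environ.get("TRITON_CACHE_DIR", Path.home() / ".triton" / "cache"))


def snapshot_cache(cache_dir: Path) -> set[str]:
    # Hypothetical helper: record the names of the per-kernel subdirectories.
    # A missing cache directory is treated as an empty cache.
    if not cache_dir.is_dir():
        return set()
    return {p.name for p in cache_dir.iterdir() if p.is_dir()}


def new_cache_entries(before: set[str], after: set[str]) -> list[str]:
    # Entries present after the run but not before suggest new JIT compiles.
    return sorted(after - before)


if __name__ == "__main__":
    cache = triton_cache_dir()
    before = snapshot_cache(cache)
    input("Run the benchmark now, then press Enter... ")
    added = new_cache_entries(before, snapshot_cache(cache))
    print(f"{len(added)} new cache entries in {cache}")
```

If new entries keep appearing after a warm-up run, that would be consistent with kernels being compiled again for new specializations during serving.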
Operating System
Linux
CPU
all
GPU
all
ROCm Version
all
ROCm Component
No response
Steps to Reproduce
No response
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response