v0.2.5
Windows build for CUDA 12.4 and PyTorch 2.6.0 nightly, with the Gloo distributed backend, cuDSS, cuDNN and cuBLAS.
The AOT (ahead-of-time) version (flashinfer_python-0.2.5-cp312-abi3-win_amd64) has all kernels precompiled inside the wheel and is intended for production.
The JIT (just-in-time) version (flashinfer_python-0.2.5-py3-none-any.whl) compiles the required kernels at runtime and is intended for development.
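
The two builds can be told apart from the tags in their file names: the AOT wheel is tagged for CPython 3.12+ on 64-bit Windows, while the JIT wheel is a pure-Python wheel that installs on any platform. The following is a minimal sketch that splits a wheel file name into its components using the standard wheel naming convention. The helper names `parse_wheel_tags` and `is_precompiled` are hypothetical and are not part of flashinfer.

```python
def parse_wheel_tags(filename: str) -> dict:
    """Split a wheel file name into name, version, and compatibility tags.

    Hypothetical helper for illustration; accepts names with or without
    the trailing ".whl" extension.
    """
    stem = filename[:-4] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    # Standard layout: name-version[-build]-python-abi-platform
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def is_precompiled(filename: str) -> bool:
    """Assume a platform-specific wheel carries compiled binaries."""
    return parse_wheel_tags(filename)["platform"] != "any"


aot = "flashinfer_python-0.2.5-cp312-abi3-win_amd64"
jit = "flashinfer_python-0.2.5-py3-none-any.whl"

print(parse_wheel_tags(aot))
print(is_precompiled(aot), is_precompiled(jit))
```

The `cp312-abi3` tags mean the AOT wheel requires CPython 3.12 or newer, whereas `py3-none-any` places no restriction beyond Python 3.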