Releases: SystemPanic/flashinfer-windows
v0.6.7.post1
Windows build for CUDA 12.4, Arch 8.6 and PyTorch 2.11
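Before installing, it can help to confirm that the local environment matches the build targets listed for a release. The following is a minimal sketch, assuming that "Arch 8.6" refers to CUDA compute capability 8.6 and that comparing major.minor versions is sufficient. The helper name `matches_build` is hypothetical and not part of flashinfer.

```python
def matches_build(torch_version, cuda_version, capability,
                  want_torch="2.11", want_cuda="12.4", want_capability=(8, 6)):
    """Return True if the local versions match the release's build targets.

    Only the major.minor part of the PyTorch version is compared, so patch
    numbers and local suffixes such as "+cu124" are ignored (an assumption).
    """
    torch_major_minor = ".".join(torch_version.split("+")[0].split(".")[:2])
    return (torch_major_minor == want_torch
            and cuda_version == want_cuda
            and tuple(capability) == tuple(want_capability))


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        if torch.cuda.is_available():
            ok = matches_build(torch.__version__, torch.version.cuda,
                               torch.cuda.get_device_capability())
            print("environment matches build targets:", ok)
        else:
            print("no CUDA device visible to PyTorch")
```

For older releases, pass the PyTorch version named in that release's notes as `want_torch`.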
v0.6.3
Windows build for CUDA 12.4, Arch 8.6 and PyTorch 2.10 / 2.11
Default wheels are for PyTorch 2.11.
For PyTorch 2.10 wheels, remove -torch210 from the wheel filename before installing.
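The renaming step can be scripted. The sketch below strips a `-torch<digits>` tag (for example `-torch210`) from a wheel filename. The release notes do not say where the tag sits in the filename, so the pattern removes it wherever it appears. The function names and the example filenames in the comments are hypothetical.

```python
import re
from pathlib import Path

# Matches a "-torch<digits>" tag such as "-torch210", "-torch271" or "-torch28".
_TORCH_TAG = re.compile(r"-torch\d+")


def strip_torch_tag(filename):
    """Return the wheel filename with any "-torch<digits>" tag removed."""
    return _TORCH_TAG.sub("", filename)


def rename_wheel(path):
    """Rename the wheel on disk and return the new path.

    The file is left untouched if its name contains no torch tag.
    """
    path = Path(path)
    target = path.with_name(strip_torch_tag(path.name))
    if target != path:
        path.rename(target)
    return target


# Hypothetical usage, the real filename comes from the release assets:
# rename_wheel("downloads/some_wheel-torch210.whl")
```

After renaming, the wheel can be installed with `pip install` followed by the new file path.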
v0.4.1
Windows build for CUDA 12.4, Arch 8.6 and PyTorch 2.7.1 / 2.8.0
Remove -torch271 / -torch28 from the wheel filename before installing.
v0.3.0
Windows build for CUDA 12.4, PyTorch 2.6.0 nightly with Gloo distributed backend, cuDSS, cuDNN and cuBLAS.
AOT version (flashinfer_python-0.3.0-cp39-abi3-win_amd64.whl) has all the kernels precompiled inside the wheel (for production).
Built with MOE, Gemma, OAI-Oss, misc and activation kernels.
JIT version (flashinfer_python-0.3.0-py3-none-any.whl) compiles the required kernels at runtime (for development).
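The two variants can be told apart from the wheel filename alone. The sketch below assumes, based on the filenames listed in these notes, that the pure-Python wheel (`py3-none-any`) is the JIT variant and that any platform-specific wheel is the AOT variant. The function names are hypothetical.

```python
def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[:-len(".whl")].split("-")
    # A wheel name has at least: distribution, version, python, abi, platform.
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    return tuple(parts[-3:])


def build_kind(filename):
    """Classify a wheel as "JIT" or "AOT" from its tags.

    Assumption: a wheel with no ABI tag and platform "any" ships no compiled
    kernels, so it is the JIT variant. Anything else is treated as AOT.
    """
    _python_tag, abi_tag, platform_tag = wheel_tags(filename)
    if abi_tag == "none" and platform_tag == "any":
        return "JIT"
    return "AOT"


print(build_kind("flashinfer_python-0.3.0-cp39-abi3-win_amd64.whl"))  # AOT
print(build_kind("flashinfer_python-0.3.0-py3-none-any.whl"))         # JIT
```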
v0.2.14.post1
Windows build for CUDA 12.4, PyTorch 2.6.0 nightly with Gloo distributed backend, cuDSS, cuDNN and cuBLAS.
AOT version (flashinfer_python-0.2.14.post1-cp39-abi3-win_amd64.whl) has all the kernels precompiled inside the wheel (for production).
JIT version (flashinfer_python-0.2.14.post1-py3-none-any.whl) compiles the required kernels at runtime (for development).
v0.2.8
Windows build for CUDA 12.4, PyTorch 2.6.0 nightly with Gloo distributed backend, cuDSS, cuDNN and cuBLAS.
AOT version (flashinfer_python-0.2.8-cp39-abi3-win_amd64.whl) has all the kernels precompiled inside the wheel (for production).
JIT version (flashinfer_python-0.2.8-py3-none-any.whl) compiles the required kernels at runtime (for development).
v0.2.7.post1
Windows build for CUDA 12.4, PyTorch 2.6.0 nightly with Gloo distributed backend, cuDSS, cuDNN and cuBLAS.
AOT version (flashinfer_python-0.2.7.post1-cp39-abi3-win_amd64.whl) has all the kernels precompiled inside the wheel (for production).
JIT version (flashinfer_python-0.2.7.post1-py3-none-any.whl) compiles the required kernels at runtime (for development).
v0.2.6.post1
Windows build for CUDA 12.4, PyTorch 2.6.0 nightly with Gloo distributed backend, cuDSS, cuDNN and cuBLAS.
AOT version (flashinfer_python-0.2.6.post1-cp39-abi3-win_amd64.whl) has all the kernels precompiled inside the wheel (for production).
JIT version (flashinfer_python-0.2.6.post1-py3-none-any.whl) compiles the required kernels at runtime (for development).
v0.2.2.post1
Windows build for CUDA 12.4, PyTorch 2.6.0 nightly with Gloo distributed backend, cuDSS, cuDNN and cuBLAS.
AOT version (flashinfer_python-0.2.2.post1-cp312-abi3-win_amd64.whl) has all the kernels precompiled inside the wheel (for production).
JIT version (flashinfer_python-0.2.2.post1-py3-none-any.whl) compiles the required kernels at runtime (for development).
v0.2.5
Windows build for CUDA 12.4, PyTorch 2.6.0 nightly with Gloo distributed backend, cuDSS, cuDNN and cuBLAS.
AOT version (flashinfer_python-0.2.5-cp312-abi3-win_amd64.whl) has all the kernels precompiled inside the wheel (for production).
JIT version (flashinfer_python-0.2.5-py3-none-any.whl) compiles the required kernels at runtime (for development).