Summary
Installation fails for flash-attn because the wheel referenced in requirements.txt does not match:
- the ABI used by publicly available PyTorch wheels,
- the CUDA / Torch version combinations available to external users.
Public wheels exist only for specific builds (e.g., torch 2.5.1 + cu121 + cxx11abi=FALSE), while the provided Flash-Attention wheel targets torch 2.6 + cu12 + cxx11abi=TRUE, which is unavailable publicly.
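The mismatch is visible directly in the wheel filename, since flash-attn release wheels encode their build tags (CUDA version, torch version, C++11 ABI flag) in the name. A stdlib-only sketch of checking those tags before pip even runs — `parse_flash_attn_wheel` is a hypothetical helper, and the filename below is illustrative of the release naming convention, not the exact wheel in requirements.txt:

```python
import re

# Release wheels encode build tags like "+cu12torch2.6cxx11abiTRUE" in the filename.
WHEEL_TAG_RE = re.compile(r"\+cu(\d+)torch([\d.]+)cxx11abi(TRUE|FALSE)")

def parse_flash_attn_wheel(filename: str) -> dict:
    """Extract the CUDA version, torch version, and C++11 ABI flag from a wheel name."""
    m = WHEEL_TAG_RE.search(filename)
    if m is None:
        raise ValueError(f"no flash-attn build tag found in {filename!r}")
    return {
        "cuda": m.group(1),
        "torch": m.group(2),
        "cxx11abi": m.group(3) == "TRUE",
    }

# Example: the wheel shape this issue describes (torch 2.6 + cu12 + cxx11abi=TRUE).
tags = parse_flash_attn_wheel(
    "flash_attn-2.7.3+cu12torch2.6cxx11abiTRUE-cp312-cp312-linux_x86_64.whl"
)
print(tags)  # {'cuda': '12', 'torch': '2.6', 'cxx11abi': True}
```

Comparing these tags against the locally installed torch build is enough to predict the install failure without attempting it.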
What happens
- Pip/uv cannot install the wheel.
- Building from source fails unless the system has:
  - the full CUDA toolkit installed,
  - gcc ≤ 12,
  - a matching C++ ABI.
- Fresh installs fail unless users manually replace the flash-attn dependency with a different wheel.
Why this matters
Users cannot install the environment from requirements.txt as written. Completing the install requires access to NVIDIA-internal wheels, or manual guesswork to find a compatible public replacement.
Requested Fixes
- Provide Flash-Attention wheels built against public torch binaries (e.g., torch 2.5.1 cu121).
- OR pin torch to a version that does match existing Flash-Attention wheels.
- OR make Flash-Attention optional with a graceful fallback.
- Update the README to document the needed CUDA/GCC toolchain if source build is required.