[FEATURE] Add FP8 Quantization to FlashAttention3 #373

@sdiazlor

Description

‼️ If you want to work on this issue: please comment below and wait until a maintainer assigns this issue to you before opening a PR to avoid several contributions on the same issue. Thanks! 😊

✨ What You’ll Do

Pruna currently includes an implementation of FlashAttention-3, but quantization is not yet supported there. Your task is to add FP8 quantization where applicable.

🤖 Useful Resources

  • https://github.com/Dao-AILab/flash-attention
  • Add a hyperparameter along the lines of `quantize=True/False` to FA3 (src/pruna/algorithms/kernels/flash_attn3.py)
  • When the hyperparameter is enabled, quantize the attention inputs to the FP8 data format
  • Update the documentation
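The quantization step in the bullets above could look roughly like the following. This is a minimal NumPy sketch of the common per-tensor FP8 (e4m3) scaling recipe (scale so the largest magnitude maps to the FP8 max, then clamp); it is not Pruna's or flash-attention's actual API, and `fp8_quantize`, `fp8_dequantize`, and the `quantize` flag are illustrative assumptions. In the real implementation the kernel would perform the actual cast to an FP8 dtype.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 e4m3


def fp8_quantize(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Simulate per-tensor FP8 e4m3 quantization.

    Computes a scale so that max(|x|) maps to FP8_E4M3_MAX, divides by it,
    and clamps to the representable range. A real kernel would additionally
    round to FP8 precision; here we only model the scaling/clamping.
    """
    amax = float(np.abs(x).max())
    scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
    x_scaled = np.clip(x / scale, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return x_scaled, scale


def fp8_dequantize(x_scaled: np.ndarray, scale: float) -> np.ndarray:
    """Recover the original dynamic range by multiplying the scale back in."""
    return x_scaled * scale


# Hypothetical attention inputs (batch, heads, head_dim); a quantize flag
# would gate this path in the algorithm.
quantize = True
q = np.random.randn(2, 8, 16).astype(np.float32)
if quantize:
    q_fp8, s = fp8_quantize(q)
    q_back = fp8_dequantize(q_fp8, s)
```

Per-tensor scaling keeps the bookkeeping minimal (one scale per input); finer-grained (e.g. per-head) scales trade extra bookkeeping for better accuracy.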

✅ Acceptance Criteria

  • It follows the style guidelines.
  • Tests & Docs: All existing and new unit tests pass, and the documentation is updated

And don’t forget to give us a ⭐️!


❓ Questions?

Feel free to jump into the #contributing Discord channel if you hit any roadblocks. Can’t wait to see your contribution! 🚀
