v0.2.10
The juiciest bits 🚀
feat: add unconstrained hyperparameter by @gsprochette in #263
Introduces target modules, so you can pass custom configs while still keeping dependencies intact.
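If you want to try it, here's a rough sketch of how a target-module style hyperparameter slots into the usual `SmashConfig` flow. The `hqq_target_modules` key below is our illustrative name, not necessarily the exact hyperparameter added in #263:

```python
# Sketch only: the target-module key name is illustrative,
# not necessarily the exact hyperparameter introduced in #263.
from pruna import SmashConfig, smash
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("sshleifer/tiny-gpt2")

smash_config = SmashConfig()
smash_config["quantizer"] = "hqq"                          # pick an algorithm as usual
smash_config["hqq_target_modules"] = ["q_proj", "v_proj"]  # hypothetical key: scope the algorithm to specific modules
smashed_model = smash(model=model, smash_config=smash_config)
```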
feat: new quantizer for vllm by @llcnt in #239
Adds new config options (patch_for_inference, default_to_hf) so vLLM models play nicer with quantization workflows.
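The two option names come straight from the PR; how they are namespaced under the quantizer is an assumption in this sketch (the quantizer name below is a placeholder, since #239 adds a new one):

```python
# Sketch only: `patch_for_inference` and `default_to_hf` are the options named
# in #239; the quantizer key and option prefixing here are assumptions.
from pruna import SmashConfig

smash_config = SmashConfig()
smash_config["quantizer"] = "my_vllm_quantizer"                  # placeholder for the new quantizer's name
smash_config["my_vllm_quantizer_patch_for_inference"] = True     # patch the model so vLLM can serve it directly
smash_config["my_vllm_quantizer_default_to_hf"] = False          # keep the vLLM path instead of falling back to HF
```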
feat: add pre-smash-hook for model preparation by @simlang in #309
Adds a hook so algorithms can prep or tweak models before smashing, making customization easier.
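Conceptually it looks something like the snippet below; the class and method names (`MyAlgorithm`, `pre_smash_hook`) are illustrative, not Pruna's exact API:

```python
# Sketch only: illustrates the idea of a pre-smash hook from #309,
# not the exact hook signature Pruna exposes.
import torch.nn as nn

class MyAlgorithm:  # stand-in for a Pruna algorithm class
    def pre_smash_hook(self, model: nn.Module) -> nn.Module:
        # Prepare the model before smashing: switch to eval mode and
        # freeze parameters the algorithm should not modify.
        model.eval()
        for p in model.parameters():
            p.requires_grad_(False)
        return model
```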
Our documentation got a huge glow-up 💅 thanks to @sdiazlor and @davidberenstein1957:
- docs: create end to end reasoning tutorial by @sdiazlor in #283
- docs: create end to end video tutorial by @sdiazlor in #233
- docs: fix discord broken link by @sdiazlor in #305
- docs: updates gtm by @davidberenstein1957 in #316
Pruning some bugs 🐞 and maintenance 🧑‍🌾
- torch.load always to cpu first by @simlang in #308 (see the sketch after this list)
- Rework Model Context by @simlang in #323
- fix: make qkv compatible with torch.compile in next diffusers release by @llcnt in #302
- fix: hqq diffusers saving and loading forget non linear layers by @llcnt in #275
- fix: namespace package conflict of optimum and optimum-quanto by @ParagEkbote in #298
- fix: deprecated call types & fixture bug by @begumcig in #313
- fix: nightly tests llmcompressor and gptq by @llcnt in #315
- fix: update datamodules for datasets v4.0.0 by @begumcig in #328
- fix: update model card tags to include 'pruna-ai' by default by @davidberenstein1957 in #334
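For the torch.load change (#308), the pattern is the standard PyTorch one: deserialize on CPU first to avoid GPU placement surprises, then move to the target device explicitly.

```python
# Generic illustration of the load-to-CPU-first pattern from #308.
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # placeholder model for the example
torch.save(model.state_dict(), "checkpoint.pt")

state_dict = torch.load("checkpoint.pt", map_location="cpu")  # always lands on CPU
model.load_state_dict(state_dict)
if torch.cuda.is_available():
    model.to("cuda")  # move explicitly once loaded
```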
Full Changelog: v0.2.9...v0.2.10