v0.2.10
The juiciest bits 🚀
feat: add unconstrained hyperparameter by @gsprochette in #263
Introduces target modules, so you can pass custom configs while still keeping dependencies intact.
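If you want to try it, here's a rough sketch of how a target-module style hyperparameter slots into the usual `SmashConfig` flow. The `hqq_target_modules` key below is our illustrative name, not necessarily the exact hyperparameter added in #263:

```python
# Sketch only: the target-module key name is illustrative,
# not necessarily the exact hyperparameter introduced in #263.
from pruna import SmashConfig, smash
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("sshleifer/tiny-gpt2")

smash_config = SmashConfig()
smash_config["quantizer"] = "hqq"                          # pick an algorithm as usual
smash_config["hqq_target_modules"] = ["q_proj", "v_proj"]  # hypothetical key: scope the algorithm to specific modules
smashed_model = smash(model=model, smash_config=smash_config)
```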
feat: new quantizer for vllm by @llcnt in #239
Adds new config options (patch_for_inference, default_to_hf) so vLLM models play nicer with quantization workflows.
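The two option names come straight from the PR; how they are namespaced under the quantizer is an assumption in this sketch (the quantizer name below is a placeholder, since #239 adds a new one):

```python
# Sketch only: `patch_for_inference` and `default_to_hf` are the options named
# in #239; the quantizer key and option prefixing here are assumptions.
from pruna import SmashConfig

smash_config = SmashConfig()
smash_config["quantizer"] = "my_vllm_quantizer"                  # placeholder for the new quantizer's name
smash_config["my_vllm_quantizer_patch_for_inference"] = True     # patch the model so vLLM can serve it directly
smash_config["my_vllm_quantizer_default_to_hf"] = False          # keep the vLLM path instead of falling back to HF
```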
feat: add pre-smash-hook for model preparation by @simlang in #309
Adds a hook so algorithms can prep or tweak models before smashing, making customization easier.
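Conceptually it looks something like the snippet below; the class and method names (`MyAlgorithm`, `pre_smash_hook`) are illustrative, not Pruna's exact API:

```python
# Sketch only: illustrates the idea of a pre-smash hook from #309,
# not the exact hook signature Pruna exposes.
import torch.nn as nn

class MyAlgorithm:  # stand-in for a Pruna algorithm class
    def pre_smash_hook(self, model: nn.Module) -> nn.Module:
        # Prepare the model before smashing: switch to eval mode and
        # freeze parameters the algorithm should not modify.
        model.eval()
        for p in model.parameters():
            p.requires_grad_(False)
        return model
```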
Our documentation got a huge glow-up 💅 thanks to @sdiazlor and @davidberenstein1957:
- docs: create end to end reasoning tutorial by @sdiazlor in #283
- docs: create end to end video tutorial by @sdiazlor in #233
- docs: fix discord broken link by @sdiazlor in #305
- docs: updates gtm by @davidberenstein1957 in #316
Pruning some bugs 🐞 and maintenance 🧑‍🌾
- torch.load always to cpu first by @simlang in #308 (see the sketch after this list)
- Rework Model Context by @simlang in #323
- fix: make qkv compatible with torch.compile in next diffusers release by @llcnt in #302
- fix: hqq diffusers saving and loading forget non linear layers by @llcnt in #275
- fix: namespace package conflict of optimum and optimum-quanto by @ParagEkbote in #298
- fix: deprecated call types & fixture bug by @begumcig in #313
- fix: nightly tests llmcompressor and gptq by @llcnt in #315
- fix: update datamodules for datasets v4.0.0 by @begumcig in #328
- fix: update model card tags to include 'pruna-ai' by default by @davidberenstein1957 in #334
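For the torch.load change (#308), the pattern is the standard PyTorch one: deserialize on CPU first to avoid GPU placement surprises, then move to the target device explicitly.

```python
# Generic illustration of the load-to-CPU-first pattern from #308.
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # placeholder model for the example
torch.save(model.state_dict(), "checkpoint.pt")

state_dict = torch.load("checkpoint.pt", map_location="cpu")  # always lands on CPU
model.load_state_dict(state_dict)
if torch.cuda.is_available():
    model.to("cuda")  # move explicitly once loaded
```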
Full Changelog: v0.2.9...v0.2.10