
v0.2.10


@begumcig released this 17 Sep 09:57

The juiciest bits 🚀

feat: add unconstrained hyperparameter by @gsprochette in #263

Introduces target modules, so you can pass custom configs while still keeping dependencies intact.
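
For a feel of how this slots into the usual smash flow, here's a minimal sketch. The `SmashConfig`/`smash` usage follows pruna's standard pattern, but the `target_modules` key and its value format are our illustration, not the confirmed API:

```python
from transformers import AutoModelForCausalLM
from pruna import SmashConfig, smash

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

smash_config = SmashConfig()
smash_config["quantizer"] = "hqq"  # any supported quantizer
# Hypothetical key: restrict smashing to specific submodules while the
# rest of the config (and its dependencies) stays intact.
smash_config["target_modules"] = ["model.decoder.layers"]

smashed_model = smash(model=model, smash_config=smash_config)
```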

feat: new quantizer for vllm by @llcnt in #239

Adds new config options (patch_for_inference, default_to_hf) so vLLM models play nicer with quantization workflows.
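
If you want to try the new options, here's a sketch of how they might be set, assuming they attach to the `SmashConfig` like other hyperparameters. The option names come from this release; the dict-style assignment, their exact semantics, and passing the vLLM handle straight to `smash` are all assumptions:

```python
from vllm import LLM
from pruna import SmashConfig, smash

llm = LLM(model="facebook/opt-125m")  # example model; any vLLM-served model

smash_config = SmashConfig()
# Assumed placement and semantics for the two options named in the release:
smash_config["patch_for_inference"] = True  # patch the model for inference after quantization
smash_config["default_to_hf"] = False       # stay in vLLM format rather than falling back to HF

smashed_llm = smash(model=llm, smash_config=smash_config)
```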

feat: add pre-smash-hook for model preparation by @simlang in #309

Adds a hook so algorithms can prep or tweak models before smashing, making customization easier.
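
Roughly, an algorithm-side hook could look like the sketch below. Everything here is a guess from the PR title: the method name, signature, and how pruna discovers the hook are not confirmed.

```python
class MyAlgorithm:
    # Hypothetical hook: name, arguments, and registration are assumptions.
    def pre_smash_hook(self, model, smash_config):
        """Prepare or tweak the model before smashing runs."""
        model.eval()  # e.g. switch off dropout/batch-norm updates
        return model  # hand the prepared model back to the pipeline
```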

Our documentation got a huge glow-up 💅 thanks to @sdiazlor and @davidberenstein1957.

Pruning some bugs 🐞 and maintenance 🧑‍🌾

  • torch.load always to cpu first by @simlang in #308 (see the sketch after this list)
  • Rework Model Context by @simlang in #323
  • fix: make qkv compatible with torch.compile in next diffusers release by @llcnt in #302
  • fix: hqq diffusers saving and loading forget non linear layers by @llcnt in #275
  • fix: namespace package conflict of optimum and optimum-quanto by @ParagEkbote in #298
  • fix: deprecated call types & fixture bug by @begumcig in #313
  • fix: nightly tests llmcompressor and gptq by @llcnt in #315
  • fix: update datamodules for datasets v4.0.0 by @begumcig in #328
  • fix: update model card tags to include 'pruna-ai' by default by @davidberenstein1957 in #334
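
On the torch.load change: the standard pattern is to load weights onto CPU first and move them to the target device afterwards. A minimal illustration of the idea, not pruna's exact code:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
torch.save(model.state_dict(), "checkpoint.pt")

# Load onto CPU first regardless of where the tensors were originally
# saved; this avoids device-mismatch and OOM errors on machines that
# lack the original GPU.
state_dict = torch.load("checkpoint.pt", map_location="cpu")
model.load_state_dict(state_dict)

# Move to the target device explicitly afterwards.
model.to("cuda" if torch.cuda.is_available() else "cpu")
```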

Full Changelog: v0.2.9...v0.2.10