Releases: kai-scheduler/KAI-Scheduler

v0.14.0

30 Mar 14:36
d6ea335

What's Changed

Added

  • Added queue validation webhook to queuecontroller with optional quota validation for parent-child relationships - AdheipSingh
  • Added support for VPA configuration for the different components of the KAI Scheduler. Users who have VPA installed on their cluster can now use it to vertically autoscale these components. - jrosenboimnvidia
  • Added FOSSA scanning for the repository context. Scans will also be performed for submitted PRs. The results can be found here. #1178 - davidLif
  • Added support for Ray subgroup topology-aware scheduling by specifying kai.scheduler/topology, kai.scheduler/topology-required-placement, and kai.scheduler/topology-preferred-placement annotations.
  • Allow subgroups to have a 0 value for "minAvailable". This means that all pods in this subgroup are "elastic extra pods". #1216 davidLif

Changed

  • Auto-enable leader election when operator.replicaCount > 1 to prevent concurrent reconciliation #1218
  • Updated Go version to v1.26.1, with appropriate upgrades to the base Docker images, linter, and controller generator. #1222 - davidLif

Fixed

  • Updated resource enumeration logic to exclude resources with count of 0. #1120
  • Fixed scheduler on k8s < 1.34 with DRA disabled.
  • Fixed pod group controller failing to track DRA GPU resources on Kubernetes 1.32-1.33 clusters. #1214
  • Fixed scheduling-constraints signature hashing for Priority and container HostPort by encoding full int32 values, preventing byte-truncation collisions and flaky signature tests.
  • Fixed rollback in scheduling simulations with DRA #1168 itsomri
  • Fixed a potential state corruption in DRA scheduling simulations #1219 itsomri
  • Fixed operator reconcile loop caused by status-only updates triggering re-reconciliation. #1229 cypres
  • Fixed scheduler not starting on k8s clusters with DRA disabled, due to the ResourceSliceTracker not syncing. #1241 cypres
  • Fixed webhook reconcile loop on AKS, by retaining the cloud-provider-injected namespaceSelector rules during reconciliation. #1292 cypres
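
The signature hashing fix above can be illustrated with a minimal sketch. It assumes a SHA-256 based signature and uses hypothetical function names (truncatedSignature, fullSignature); the actual KAI Scheduler hashing code is not shown in these notes and may differ. The sketch only demonstrates why feeding a single byte of an int32 into a hash lets distinct values collide, while encoding all four bytes does not.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// truncatedSignature is a hypothetical example of the buggy pattern:
// converting an int32 to byte keeps only the low 8 bits, so values that
// differ by a multiple of 256 produce identical hash input.
func truncatedSignature(priority, hostPort int32) [sha256.Size]byte {
	return sha256.Sum256([]byte{byte(priority), byte(hostPort)})
}

// fullSignature is a hypothetical example of the fixed pattern: each int32
// is written as a fixed-width 4-byte big-endian value before hashing.
func fullSignature(priority, hostPort int32) [sha256.Size]byte {
	var buf [8]byte
	binary.BigEndian.PutUint32(buf[0:4], uint32(priority))
	binary.BigEndian.PutUint32(buf[4:8], uint32(hostPort))
	return sha256.Sum256(buf[:])
}

func main() {
	// Priorities 1 and 257 share the same low byte (257 % 256 == 1).
	fmt.Println("truncated collide:", truncatedSignature(1, 8080) == truncatedSignature(257, 8080))
	fmt.Println("full collide:", fullSignature(1, 8080) == fullSignature(257, 8080))
}
```

Running the sketch prints `truncated collide: true` followed by `full collide: false`. Fixed-width encoding also keeps field boundaries unambiguous, which a variable-length text encoding without separators would not.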

Full Changelog: v0.13.4...v0.14.0

v0.6.18

24 Mar 14:56
5a1a15f

What's Changed

Changed

  • build: upgrade Go to 1.25.6, golangci-lint to v2.11.3, controller-gen to v0.20.1, mockgen to v0.6.0 - v0.6 by @davidLif in #1281
  • ci: add approval gatekeeper workflow for external contributor PRs by @KaiPilotBot in #1004

Full Changelog: v0.6.17...v0.6.18

v0.13.4

19 Mar 20:32
e7bafb9

What's Changed

Fixed

  • Fixed operator reconcile loop caused by status-only updates triggering re-reconciliation. #1229 cypres

Full Changelog: v0.13.3...v0.13.4

v0.13.3

18 Mar 14:22
032d764

What's Changed

  • Fixed an issue where the scheduler failed to start on clusters without DRA enabled. #1240 itsomri

Full Changelog: v0.13.2...v0.13.3

v0.13.2

17 Mar 17:07
96564f6

Fixed

  • Fixed rollback in scheduling simulations with DRA #1168 itsomri
  • Allow subgroups to have a 0 value for "minAvailable". This means that all pods in this subgroup are "elastic extra pods". #1216 davidLif
  • Fixed pod group controller failing to track DRA GPU resources on Kubernetes 1.32-1.33 clusters. #1214
  • Fixed a potential state corruption in DRA scheduling simulations #1225 itsomri

Full Changelog: v0.13.1...v0.13.2

v0.12.18

11 Mar 10:59
8637d54

What's Changed

Full Changelog: v0.12.17...v0.12.18

v0.13.1

09 Mar 15:34
a76bede

What's Changed

Fixed

  • Updated resource enumeration logic to exclude resources with count of 0. #1120
  • Fixed scheduler on k8s < 1.34 with DRA disabled.

Full Changelog: v0.13.0...v0.13.1

v0.9.15

10 Mar 09:20
46a291a

What's Changed

  • fix(scheduler): podGroup status update loop on conflict - v0.9 by @enoodle in #1166

Full Changelog: v0.9.14...v0.9.15

v0.12.17

04 Mar 16:52
6f8c8d7

What's Changed

  • fix: skip runtimeClassName injection when gpuPodRuntimeClassName is e… by @enoodle in #1130

Full Changelog: v0.12.16...v0.12.17

v0.9.14

04 Mar 16:51
29b76ef

What's Changed

  • refactor: Represent podreferences as strings v0.9 by @itsomri in #985
  • fix(scheduler): bind plugin server to localhost by @gshaibi in #996
  • ci: add approval gatekeeper workflow for external contributor PRs by @KaiPilotBot in #1003
  • fix(queue-controller): use Spec.Queue field indexer for resource aggregation (#1049) by @gshaibi in #1053
  • chore: auto-resolve CHANGELOG.md merge conflicts with union strategy by @KaiPilotBot in #1054
  • fix: skip runtimeClassName injection when gpuPodRuntimeClassName is e… by @enoodle in #1131

Full Changelog: v0.9.13...v0.9.14