Releases: kubernetes-sigs/dra-driver-cpu
v0.1.0
What's Changed
- Initial implementation of the CPU DRA driver
Highlights
- Deployment: The driver runs as a DaemonSet to manage CPU resources on every node. It requires privileged access to interact with the container runtime.
- Runtime Setup: An init container automatically enables NRI and CDI in containerd. It restarts the runtime if configuration changes are necessary.
- Resource Publication: The driver discovers the node's CPU layout. It publishes ResourceSlice objects so the Kubernetes scheduler can see available CPUs.
- Grouped Mode (Default): This mode uses DRA consumable capacity (beta in Kubernetes 1.36) to group CPUs based on NUMA node or socket. Select this mode using the --cpu-device-mode=grouped flag. The grouping can be selected by setting the --cpu-device-group-by flag to numanode or socket.
- Individual Mode: This mode exposes every physical CPU as a separate device. This is useful when an external controller (like Slurm) wants fine-grained control over CPU placement. Select this mode using the --cpu-device-mode=individual flag.
- Topology Awareness: The driver detects sockets, NUMA nodes, cores, SMT siblings, and L3 cache boundaries. It also identifies hybrid Performance and Efficiency core types. In individual mode, these are exposed as attributes. In grouped mode, the internal allocator is fully topology-aware and performs optimized placement within the allocation device (socket or NUMA node).
- NRI Plugin: The driver uses the Node Resource Interface to pin containers to exclusive CPUs. It uses the cpuset cgroup controller to enforce isolation.
- Shared CPU Pool: All containers without a DRA claim are confined to a shared pool of remaining CPUs. The driver dynamically updates this pool when exclusive workloads start or stop.
- State Recovery: The driver rebuilds its internal allocation map after a restart. It synchronizes with existing containers by reading environment variables injected via CDI.
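The grouping and shared-pool behavior described above can be illustrated with a small sketch. This is not the driver's code (the driver is not written in Python), and the topology table, function names, and CPU numbering are all hypothetical. It only shows the idea: CPUs are bucketed by NUMA node or socket, and the shared pool is whatever remains after exclusive CPUs are subtracted.

```python
from collections import defaultdict

# Hypothetical topology records: (cpu_id, socket_id, numa_node_id).
# These values are made up for illustration; a real node is discovered at runtime.
TOPOLOGY = [
    (0, 0, 0), (1, 0, 0), (2, 0, 1), (3, 0, 1),
    (4, 1, 2), (5, 1, 2), (6, 1, 3), (7, 1, 3),
]


def group_cpus(topology, group_by):
    """Bucket CPU ids by NUMA node or socket (the two group-by values)."""
    if group_by not in ("numanode", "socket"):
        raise ValueError(f"unsupported group_by: {group_by!r}")
    groups = defaultdict(list)
    for cpu, socket, numa in topology:
        key = numa if group_by == "numanode" else socket
        groups[key].append(cpu)
    return {key: sorted(cpus) for key, cpus in sorted(groups.items())}


def shared_pool(all_cpus, exclusive_cpus):
    """CPUs left for containers without a claim: all CPUs minus exclusive ones."""
    return sorted(set(all_cpus) - set(exclusive_cpus))


def to_cpuset_list(cpus):
    """Format CPU ids in the cpuset list syntax, collapsing consecutive runs."""
    parts = []
    start = prev = None
    for cpu in sorted(set(cpus)):
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu
        else:
            parts.append(str(start) if start == prev else f"{start}-{prev}")
            start = prev = cpu
    if start is not None:
        parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)


if __name__ == "__main__":
    print(group_cpus(TOPOLOGY, "socket"))
    all_cpus = [cpu for cpu, _, _ in TOPOLOGY]
    # Suppose CPUs 2 and 3 were handed out exclusively to a claim.
    print(to_cpuset_list(shared_pool(all_cpus, [2, 3])))
```

Recomputing the shared pool from the full CPU set each time an exclusive allocation starts or stops keeps the sketch stateless. How the driver tracks this internally is not covered by these notes.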
Requirements
- The Kubelet static CPU policy must be disabled on the node.
- Currently, claim-based requests also need to be specified in the Pod spec. Review the Workload Configuration Requirements section for more details.
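One way an operator might check the first requirement is to inspect the kubelet's CPU Manager checkpoint file. The sketch below is not part of the driver. It assumes the checkpoint is a JSON document with a policyName field and that it lives at the default kubelet root directory path; both are assumptions about the node setup, and the helper names are hypothetical.

```python
import json
from pathlib import Path

# Assumed default location of the kubelet CPU Manager checkpoint; nodes with a
# non-default kubelet root directory will keep it elsewhere.
DEFAULT_STATE_FILE = Path("/var/lib/kubelet/cpu_manager_state")


def static_policy_active(state_text):
    """Return True if the checkpoint text reports the static policy."""
    state = json.loads(state_text)
    return state.get("policyName") == "static"


def check_node(path=DEFAULT_STATE_FILE):
    """Return True/False for the policy check, or None if no checkpoint exists."""
    path = Path(path)
    if not path.exists():
        return None
    return static_policy_active(path.read_text())


if __name__ == "__main__":
    result = check_node()
    if result is None:
        print("no CPU Manager checkpoint found at the assumed path")
    elif result:
        print("static CPU policy is active; disable it before deploying the driver")
    else:
        print("static CPU policy is not active")
```

A missing checkpoint is reported separately rather than treated as a pass, since it may only mean the kubelet uses a different root directory.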
New Contributors
- @pravk03 made their first contribution in #2
- @catblade made their first contribution in #9
- @ffromani made their first contribution in #19
- @swatisehgal made their first contribution in #32
- @AutuSnow made their first contribution in #54
- @fmuyassarov made their first contribution in #92
Full Changelog: https://github.com/kubernetes-sigs/dra-driver-cpu/commits/v0.1.0