Releases: kubernetes-sigs/dra-driver-cpu

v0.1.0

31 Mar 14:53
a30bb34

What's Changed

  • Initial implementation of the CPU DRA driver

Highlights

  • Deployment: The driver runs as a DaemonSet to manage CPU resources on every node. It requires privileged access to interact with the container runtime.
  • Runtime Setup: An init container automatically enables NRI and CDI in containerd. It restarts the runtime if configuration changes are necessary.
  • Resource Publication: The driver discovers the node's CPU layout. It publishes ResourceSlice objects so the Kubernetes scheduler can see available CPUs.
  • Grouped Mode (Default): This mode uses DRA consumable capacity (beta in Kubernetes 1.36) to group CPUs by NUMA node or socket. Select it with the --cpu-device-mode=grouped flag, and choose the grouping by setting the --cpu-device-group-by flag to numanode or socket.
  • Individual Mode: This mode exposes every physical CPU as a separate device. This is useful when an external controller, such as Slurm, needs fine-grained control over CPU placement. Select this mode using the --cpu-device-mode=individual flag.
  • Topology Awareness: The driver detects sockets, NUMA nodes, cores, SMT siblings, and L3 cache boundaries. It also identifies hybrid Performance and Efficiency core types. In individual mode, these are exposed as attributes. In grouped mode, the internal allocator is fully topology-aware and performs optimized placement within the allocation device (socket or NUMA node).
  • NRI Plugin: The driver uses the Node Resource Interface to pin containers to exclusive CPUs. It uses the cpuset cgroup controller to enforce isolation.
  • Shared CPU Pool: All containers without a DRA claim are confined to a shared pool of remaining CPUs. The driver dynamically updates this pool when exclusive workloads start or stop.
  • State Recovery: The driver rebuilds its internal allocation map after a restart. It synchronizes with existing containers by reading environment variables injected via CDI.
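
The shared CPU pool described above is, in essence, a set difference: every CPU on the node minus the CPUs held exclusively by claims. The following is a minimal sketch of that bookkeeping, not the driver's actual code. The function names and the claim-name-to-cpuset mapping are hypothetical, and the only format assumed is the standard Linux cpuset list syntax (for example "0-3,8").

```python
from __future__ import annotations


def parse_cpuset(text: str) -> set[int]:
    """Parse a Linux cpuset list string such as "0-3,8" into a set of CPU IDs."""
    cpus: set[int] = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpuset(cpus: set[int]) -> str:
    """Format CPU IDs as a cpuset list string, collapsing consecutive runs."""
    ordered = sorted(cpus)
    ranges = []
    i = 0
    while i < len(ordered):
        start = ordered[i]
        # Extend the current run while the next CPU ID is consecutive.
        while i + 1 < len(ordered) and ordered[i + 1] == ordered[i] + 1:
            i += 1
        end = ordered[i]
        ranges.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(ranges)


def shared_pool(all_cpus: str, exclusive_claims: dict[str, str]) -> str:
    """Return the CPUs left over for containers without a DRA claim.

    exclusive_claims is a hypothetical mapping of claim name to the cpuset
    string pinned for that claim.
    """
    remaining = parse_cpuset(all_cpus)
    for cpus in exclusive_claims.values():
        remaining -= parse_cpuset(cpus)
    return format_cpuset(remaining)


if __name__ == "__main__":
    claims = {"claim-a": "2-3", "claim-b": "8-11"}
    print(shared_pool("0-15", claims))  # 0-1,4-7,12-15

    # When an exclusive workload stops, its CPUs return to the shared pool.
    del claims["claim-b"]
    print(shared_pool("0-15", claims))  # 0-1,4-15
```

Recomputing the pool from the full set of live claims, rather than adjusting it incrementally, keeps the result correct regardless of the order in which workloads start or stop.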

Requirements

  • The kubelet static CPU Manager policy must be disabled on the node.
  • Currently, claim-based requests also need to be specified in the Pod spec. Review the Workload Configuration Requirements section for more details.

Full Changelog: https://github.com/kubernetes-sigs/dra-driver-cpu/commits/v0.1.0