From deeff10d8721aa3d33ecff926eafbf77127d695f Mon Sep 17 00:00:00 2001
From: Qi Wang
Date: Thu, 12 Feb 2026 12:44:39 -0500
Subject: [PATCH] Add documentation for MemoryQoS with cgroups v2 for 1.36

Signed-off-by: Qi Wang
---
 .../manage-resources-containers.md            | 11 +++---
 .../docs/concepts/workloads/pods/pod-qos.md   | 34 ++++++++++++++-----
 .../feature-gates/MemoryQoS.md                |  6 +++-
 3 files changed, 37 insertions(+), 14 deletions(-)

diff --git a/content/en/docs/concepts/configuration/manage-resources-containers.md b/content/en/docs/concepts/configuration/manage-resources-containers.md
index 2155046bd8416..d07d485f2e107 100644
--- a/content/en/docs/concepts/configuration/manage-resources-containers.md
+++ b/content/en/docs/concepts/configuration/manage-resources-containers.md
@@ -53,11 +53,12 @@ container that over allocates memory may not be immediately killed. This means
 its `memory` limit, but if it does, it may get killed.
 
 {{< note >}}
-There is an alpha feature `MemoryQoS` which attempts to add more preemptive
-limit enforcement for memory (as opposed to reactive enforcement by the OOM
-killer). However, this effort is
-[stalled](https://github.com/kubernetes/enhancements/tree/a47155b340/keps/sig-node/2570-memory-qos#latest-update-stalled)
-due to a potential livelock situation a memory hungry can cause.
+There is an alpha feature `MemoryQoS` which adds preemptive memory throttling
+and optional memory reservation on Linux nodes using cgroup v2. Throttling is
+controlled by `memoryThrottlingFactor`, and reservation is controlled by
+`memoryReservationPolicy` (default `None`, optional `TieredReservation`).
+The kubelet logs a warning on kernels older than 5.9 because `memory.high` throttling
+can trigger a kernel livelock bug on those kernels.
 {{< /note >}}
 
 {{< note >}}
diff --git a/content/en/docs/concepts/workloads/pods/pod-qos.md b/content/en/docs/concepts/workloads/pods/pod-qos.md
index c4c3774095874..f729770145c7a 100644
--- a/content/en/docs/concepts/workloads/pods/pod-qos.md
+++ b/content/en/docs/concepts/workloads/pods/pod-qos.md
@@ -97,14 +97,32 @@ Containers in a Pod can request other resources (not CPU or memory) and still be
 
 {{< feature-state feature_gate_name="MemoryQoS" >}}
 
-Memory QoS uses the memory controller of cgroup v2 to guarantee memory resources in Kubernetes.
-Memory requests and limits of containers in pod are used to set specific interfaces `memory.min`
-and `memory.high` provided by the memory controller. When `memory.min` is set to memory requests,
-memory resources are reserved and never reclaimed by the kernel; this is how Memory QoS ensures
-memory availability for Kubernetes pods. And if memory limits are set in the container,
-this means that the system needs to limit container memory usage; Memory QoS uses `memory.high`
-to throttle workload approaching its memory limit, ensuring that the system is not overwhelmed
-by instantaneous memory allocation.
+Memory QoS uses the memory controller of cgroup v2 to guarantee memory resources in Kubernetes. When enabled, the kubelet can set `memory.high` to throttle workloads
+approaching their memory limits, and can optionally reserve memory via `memory.min` or `memory.low`.
+
+### Configuring Memory QoS
+
+Memory reservation is controlled via the kubelet configuration field `memoryReservationPolicy`:
+
+- `None` (default): the kubelet does not set `memory.min` or `memory.low` for containers and pods,
+  so the kernel does not lock any memory as a hard reservation. This is the default in order to maintain node stability.
+- `TieredReservation`: the kubelet sets tiered memory protection based on the pod's QoS class:
+  - Guaranteed pods (hard reservation): `memory.min` is set to the memory request; this memory is reserved and never reclaimed by the kernel.
+  - Burstable pods (soft reservation): `memory.low` is set to the memory request; the kernel preferentially retains this memory but may reclaim it under extreme pressure.
+  - BestEffort pods: no memory protection is set.
+
+If a container sets a memory limit, the system needs to limit that container's memory usage; Memory QoS uses `memory.high` to throttle workloads
+approaching their memory limit, ensuring that the system is not overwhelmed by instantaneous memory allocation.
+
+
+### System Requirements
+
+Memory QoS requires:
+- Linux with cgroup v2
+- Kernel version 5.9 or higher for safe `memory.high` throttling. If the MemoryQoS feature gate is enabled on an older kernel, the kubelet logs a warning because of a known kernel livelock bug triggered by `memory.high` throttling.
+
+Memory QoS requires cgroup v2. When enabled on kernels older than 5.9, the kubelet logs a warning
+because `memory.high` throttling can trigger a kernel livelock bug on those kernels.
 
 Memory QoS relies on QoS class to determine which settings to apply; however, these are different
 mechanisms that both provide controls over quality of service.
diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates/MemoryQoS.md b/content/en/docs/reference/command-line-tools-reference/feature-gates/MemoryQoS.md
index 4a08e22d16192..3d392c4639ba7 100644
--- a/content/en/docs/reference/command-line-tools-reference/feature-gates/MemoryQoS.md
+++ b/content/en/docs/reference/command-line-tools-reference/feature-gates/MemoryQoS.md
@@ -11,4 +11,8 @@ stages:
     fromVersion: "1.22"
 ---
 Enable memory protection and usage throttle on pod / container using
-cgroup v2 memory controller.
+cgroup v2 memory controller. This feature allows the kubelet to set `memory.high`
+for throttling and to configure tiered memory protection. When the `memoryReservationPolicy` kubelet
+configuration field is set to `TieredReservation`, the kubelet also sets `memory.min` / `memory.low` for memory protection.
+Requires cgroup v2; the kubelet warns on kernels older
+than 5.9 because `memory.high` throttling can trigger a livelock bug.
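
The tiered reservation rules and the kernel version check described in this patch can be sketched roughly as follows. This is an illustrative model only, not kubelet code: the function names `reservation_interface` and `warns_on_kernel` are hypothetical, and the policy values, QoS class names, and the 5.9 threshold are taken from the documentation text above.

```python
import re
from typing import Optional

# Mapping described in the docs for memoryReservationPolicy=TieredReservation.
# Hypothetical helper names; only the policy/QoS/cgroup file names come from the patch.
_TIERS = {
    "Guaranteed": "memory.min",  # hard reservation, never reclaimed
    "Burstable": "memory.low",   # soft reservation, reclaimable under extreme pressure
    "BestEffort": None,          # no memory protection
}


def reservation_interface(policy: str, qos_class: str) -> Optional[str]:
    """Return the cgroup v2 file that would be set to the memory request, or None."""
    if qos_class not in _TIERS:
        raise ValueError(f"unknown QoS class: {qos_class!r}")
    if policy == "None":
        # Default policy: neither memory.min nor memory.low is set.
        return None
    if policy == "TieredReservation":
        return _TIERS[qos_class]
    raise ValueError(f"unknown memoryReservationPolicy: {policy!r}")


def warns_on_kernel(release: str) -> bool:
    """Return True if the release string (e.g. from `uname -r`) is older than 5.9."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"cannot parse kernel release: {release!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) < (5, 9)
```

For example, under these assumptions `reservation_interface("TieredReservation", "Burstable")` yields `memory.low`, while any QoS class under the default `None` policy yields no reservation file at all.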