KEP-2570: Add documentation for MemoryQoS with cgroups v2 for 1.36 #54417
base: dev-1.36
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -53,11 +53,12 @@ container that over allocates memory may not be immediately killed. This means | |||||
| its `memory` limit, but if it does, it may get killed. | ||||||
|
|
||||||
| {{< note >}} | ||||||
| There is an alpha feature `MemoryQoS` which attempts to add more preemptive | ||||||
| limit enforcement for memory (as opposed to reactive enforcement by the OOM | ||||||
| killer). However, this effort is | ||||||
| [stalled](https://github.com/kubernetes/enhancements/tree/a47155b340/keps/sig-node/2570-memory-qos#latest-update-stalled) | ||||||
| due to a potential livelock situation a memory-hungry container can cause. | ||||||
| There is an alpha feature `MemoryQoS` which adds preemptive memory throttling | ||||||
| and optional memory reservation on Linux nodes using cgroup v2. Throttling is | ||||||
| controlled by `memoryThrottlingFactor`, and reservation is controlled by | ||||||
| `memoryReservationPolicy` (default `None`, optional `TieredReservation`). | ||||||
| Kubelet logs a warning on kernels older than 5.9 because `memory.high` throttling | ||||||
| can trigger a kernel livelock bug on older kernels. | ||||||
|
Member
Suggested change
|
||||||
| {{< /note >}} | ||||||
|
|
||||||
| {{< note >}} | ||||||
|
|
||||||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -97,14 +97,32 @@ Containers in a Pod can request other resources (not CPU or memory) and still be | |||||
|
|
||||||
| {{< feature-state feature_gate_name="MemoryQoS" >}} | ||||||
|
|
||||||
| Memory QoS uses the memory controller of cgroup v2 to guarantee memory resources in Kubernetes. | ||||||
| Memory requests and limits of containers in pod are used to set specific interfaces `memory.min` | ||||||
| and `memory.high` provided by the memory controller. When `memory.min` is set to memory requests, | ||||||
| memory resources are reserved and never reclaimed by the kernel; this is how Memory QoS ensures | ||||||
| memory availability for Kubernetes pods. And if memory limits are set in the container, | ||||||
| this means that the system needs to limit container memory usage; Memory QoS uses `memory.high` | ||||||
| to throttle workload approaching its memory limit, ensuring that the system is not overwhelmed | ||||||
| by instantaneous memory allocation. | ||||||
| Memory QoS uses the memory controller of cgroup v2 to guarantee memory resources in Kubernetes. When enabled, the kubelet can set `memory.high` to throttle workloads | ||||||
| approaching their memory limits, and can optionally reserve memory via `memory.min` or `memory.low`. | ||||||
|
|
||||||
| ### Configuring Memory QoS | ||||||
|
|
||||||
| Memory reservation is controlled via the kubelet configuration field `memoryReservationPolicy`: | ||||||
|
|
||||||
| - `None` (default): the kubelet does not set `memory.min` or `memory.low` for containers and pods, | ||||||
| ensuring no hard memory is locked by the kernel. This is the default to maintain node stability. | ||||||
| - `TieredReservation`: the kubelet sets tiered memory protection based on the pod's QoS class: | ||||||
| - Guaranteed pods (hard reservation): `memory.min` is set to memory requests, memory resources are reserved and never reclaimed by the kernel. | ||||||
| - Burstable pods (soft reservation): `memory.low` is set to memory requests, the kernel preferentially retains this memory but may reclaim it under extreme pressure. | ||||||
| - BestEffort pods: no memory protection is set. | ||||||
|
|
||||||
| If memory limits are set in the container, this means that the system needs to limit container memory usage; Memory QoS uses `memory.high` to throttle workloads | ||||||
|
Member
Suggested change
Member
Also, this piece applies regardless of `memoryReservationPolicy`. It should be moved before "### Configuring Memory QoS" or into its own subsection (e.g., "### Memory Throttling"), since throttling is independent of reservation policy.
||||||
| approaching their memory limit, ensuring that the system is not overwhelmed by instantaneous memory allocation. | ||||||
|
|
||||||
|
|
||||||
| ### System Requirements | ||||||
|
|
||||||
| Memory QoS requires: | ||||||
| - Linux with cgroup v2 | ||||||
| - Kernel version 5.9 or higher for safe memory.high throttling. If the MemoryQoS feature gate is enabled on an older kernel, the kubelet logs a warning because of a known kernel livelock bug when using `memory.high` throttling on older kernels. | ||||||
|
|
||||||
| Memory QoS requires cgroup v2. When enabled on kernels older than 5.9, kubelet logs a warning | ||||||
|
Member
This looks duplicated. You might need to remove these lines.
||||||
| because `memory.high` throttling can trigger a kernel livelock bug on older kernels. | ||||||
|
|
||||||
| Memory QoS relies on QoS class to determine which settings to apply; however, these are different | ||||||
| mechanisms that both provide controls over quality of service. | ||||||
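The tiered reservation rules listed under "Configuring Memory QoS" in this diff can be sketched as a small lookup. This is a minimal illustration only, not the kubelet's implementation: the function name `reservation_setting` and its parameters are hypothetical, and returning nothing for a zero memory request is an assumption of this sketch rather than something the PR states.

```python
from typing import Optional, Tuple


def reservation_setting(
    policy: str, qos_class: str, memory_request_bytes: int
) -> Optional[Tuple[str, int]]:
    """Return (cgroup v2 interface file, value in bytes), or None if no protection is set.

    Hypothetical helper that mirrors the policy text in the diff.
    """
    if policy == "None":
        # Default policy: neither memory.min nor memory.low is written.
        return None
    if policy != "TieredReservation":
        raise ValueError(f"unknown memoryReservationPolicy: {policy!r}")

    if memory_request_bytes <= 0:
        # Assumption of this sketch: nothing to protect without a memory request.
        return None
    if qos_class == "Guaranteed":
        # Hard reservation: the kernel never reclaims this memory.
        return ("memory.min", memory_request_bytes)
    if qos_class == "Burstable":
        # Soft reservation: retained preferentially, reclaimable under extreme pressure.
        return ("memory.low", memory_request_bytes)
    if qos_class == "BestEffort":
        return None
    raise ValueError(f"unknown QoS class: {qos_class!r}")
```

For example, calling `reservation_setting("TieredReservation", "Burstable", 256 * 1024 * 1024)` selects `memory.low` with the request size, while any call with the default policy `"None"` yields no reservation.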
|
|
||||||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -11,4 +11,8 @@ stages: | |||||
| fromVersion: "1.22" | ||||||
| --- | ||||||
| Enable memory protection and usage throttle on pod / container using | ||||||
| cgroup v2 memory controller. | ||||||
| cgroup v2 memory controller. This feature allows kubelet to set `memory.high` | ||||||
| for throttling and to configure tiered memory protection. When the `memoryReservationPolicy` kubelet | ||||||
| configuration field is set to `TieredReservation`, `memory.min` / `memory.low` memory protection is enabled. | ||||||
|
Member
Suggested change
|
||||||
| Requires cgroup v2; kubelet warns on kernels older | ||||||
| than 5.9 because `memory.high` throttling can trigger a livelock bug. | ||||||
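The kernel requirement described in the diff (a warning on kernels older than 5.9) amounts to a version comparison. The sketch below shows one way to perform that check against a kernel release string. The helper names are hypothetical, and the parsing approach is an assumption for illustration, not the kubelet's actual logic.

```python
import re
from typing import Tuple

# Minimum kernel version the PR text cites for safe memory.high throttling.
MIN_SAFE_KERNEL = (5, 9)


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2)))


def memory_high_is_safe(release: str) -> bool:
    """True when the kernel is at least 5.9, the threshold named in the docs."""
    return parse_kernel_release(release) >= MIN_SAFE_KERNEL
```

On a Linux node, the release string could be obtained with `platform.release()` from the Python standard library and passed to `memory_high_is_safe`.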
I think you would need to add a blurb about `memoryThrottlingFactor` in pod-qos.md as well. Otherwise, readers will wonder what it does.
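As background for the reviewer's point, here is a rough sketch of how `memoryThrottlingFactor` might feed into `memory.high`. It assumes the formula published with earlier alpha iterations of KEP-2570 (request plus the factor times the gap between request and limit, rounded down to a page boundary, with node allocatable memory standing in when no limit is set). The diff in this PR does not state the formula, so treat the formula, the default of 0.9, the 4096-byte page size, and the function name `memory_high` as assumptions.

```python
import math
from typing import Optional


def memory_high(
    request_bytes: int,
    limit_bytes: Optional[int],
    node_allocatable_bytes: int,
    throttling_factor: float = 0.9,  # assumed default, not stated in this PR
    page_size: int = 4096,           # assumed page size
) -> int:
    """Estimate a memory.high value, rounded down to a whole number of pages."""
    # Without a container limit, node allocatable memory acts as the ceiling.
    ceiling = limit_bytes if limit_bytes is not None else node_allocatable_bytes
    raw = request_bytes + throttling_factor * (ceiling - request_bytes)
    return math.floor(raw / page_size) * page_size
```

Under this assumed formula, a factor closer to 1.0 places `memory.high` closer to the limit, so throttling starts later; a smaller factor starts throttling earlier.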