Doc: Kubelet PSI metrics feature graduates to GA#54328
mariafromano-25 wants to merge 8 commits into kubernetes:dev-1.36
[APPROVALNOTIFIER] This PR is NOT APPROVED.
/sig node
lmktfy left a comment
I know this PR is draft, but even so I hope the advice helps.
| ### kubelet Pressure Stall Information (PSI) metrics | ||
|
|
||
| {{< feature-state for_k8s_version="v1.34" state="beta" >}} | ||
| {{< feature-state for_k8s_version="v1.36" state="stable" >}} |
Early feedback: try this
| {{< feature-state for_k8s_version="v1.36" state="stable" >}} | |
| {{< feature-state feature_gate_name="KubeletPSI" >}} |
and then look for instances of KubeletPSI anywhere in the English docs that might need an update.
For other localizations, each localization team - usually 100% volunteers - picks its own way of working.
/sig instrumentation
1755fe1 to 0b780a4
63886d8 to 0eb9a92
I followed the guide at https://github.com/kubernetes/website?tab=readme-ov-file#the-kubernetes-documentation to run the website locally. My main changes, at http://localhost:1313/docs/concepts/cluster-administration/system-metrics/#kubelet-pressure-stall-information-psi-metrics, look like this:
content/en/docs/concepts/cluster-administration/system-metrics.md
| container_pressure_io_waiting_seconds_total | ||
| ``` | ||
| *Summary API*: Exposed at the `/stats/summary` endpoint, providing both the cumulative `totals` and the moving averages (`avg10`, `avg60`, `avg300`). These averages represent the percentage of time that tasks were stalled on a resource over the respective 10-second, 60-second, and 5-minute intervals. This endpoint reports the metrics in the following format: | ||
| ``` |
Most Markdown linters expect an empty line before the ```
|
|
||
| This feature is enabled by default, by setting the `KubeletPSI` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/). The information is also exposed in the | ||
| This feature is enabled by default. Starting with Kubernetes v.1.36, the `KubeletPSI` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is locked to true and cannot be disabled. The information is also exposed in the | ||
| [Summary API](/docs/reference/instrumentation/node-metrics#psi). |
The paragraph above already mentions the Summary API.
| ``` | ||
|
|
||
| This feature is enabled by default, by setting the `KubeletPSI` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/). The information is also exposed in the | ||
| This feature is enabled by default. Starting with Kubernetes v.1.36, the `KubeletPSI` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is locked to true and cannot be disabled. The information is also exposed in the |
I do not know whether we need to keep mentioning these versions. We already have the feature gate tag above.
| container_pressure_io_stalled_seconds_total | ||
| container_pressure_io_waiting_seconds_total | ||
| ``` | ||
| *Summary API*: Exposed at the `/stats/summary` endpoint, providing both the cumulative `totals` and the moving averages (`avg10`, `avg60`, `avg300`). These averages represent the percentage of time that tasks were stalled on a resource over the respective 10-second, 60-second, and 5-minute intervals. This endpoint reports the metrics in the following format: |
I like the explanation. My suggestion is to add an example (with non-zero PSI data) to further explain how to interpret the data. An example can be more comprehensive than plain documentation.
I created an example below, let me know what you think!
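As a rough illustration of the interpretation being discussed (this is not the example from the PR itself), the sketch below takes the description in the diff at face value: each of `avg10`, `avg60`, and `avg300` is the percentage of its window during which tasks were stalled. The sample values are invented, and the `pressure_trend` helper is a hypothetical heuristic, not something the kubelet or the Summary API provides.

```python
# Window length in seconds for each moving-average field named in the PR text.
WINDOW_SECONDS = {"avg10": 10, "avg60": 60, "avg300": 300}

# Hypothetical sample: invented values, not taken from a real node or from the PR.
sample = {"avg10": 2.5, "avg60": 1.0, "avg300": 0.2}


def stalled_seconds(psi):
    """Convert each percentage into the approximate seconds stalled within its window."""
    return {name: psi[name] * window / 100.0 for name, window in WINDOW_SECONDS.items()}


def pressure_trend(psi):
    """Compare the shortest and longest windows (a simple heuristic, assumed here)."""
    if psi["avg10"] > psi["avg300"]:
        return "rising"
    if psi["avg10"] < psi["avg300"]:
        return "easing"
    return "steady"


if __name__ == "__main__":
    for name, seconds in stalled_seconds(sample).items():
        print(f"{name}: {sample[name]}% of the window, about {seconds:.2f}s stalled")
    print("trend:", pressure_trend(sample))
```

Reading the short window against the long one is one way to tell a fresh spike from sustained pressure; a real example in the docs would presumably use values captured from an actual node.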
| [Pressure Stall Information](https://docs.kernel.org/accounting/psi.html) | ||
| (PSI) for CPU, memory, and I/O usage. The information is collected at node, pod and container level. | ||
| This feature is enabled by default by setting the `KubeletPSI` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/). | ||
| Starting with Kubernetes v.1.36, the `KubeletPSI` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is locked to true and cannot be disabled. |
Similar to my comment above. I think we can add an example with non-zero PSI data here to help users understand how to interpret the PSI data.
Should I reuse the same example as the system-metrics.md page or make a new one?
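Whichever page ends up carrying the example, the cumulative counters from the diff (such as `container_pressure_io_waiting_seconds_total`) can be read in a similar spirit. The sketch below assumes the counter is a monotonically increasing number of seconds, as its name suggests, and that two scrapes with timestamps are available; the `stalled_fraction` helper and the sample values are hypothetical.

```python
def stalled_fraction(earlier, later):
    """Fraction of wall-clock time spent stalled between two (timestamp, counter) samples."""
    (t0, c0), (t1, c1) = earlier, later
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    if c1 < c0:
        # A cumulative counter should not go down; treat a drop as a reset.
        raise ValueError("counter decreased; it was probably reset")
    return (c1 - c0) / elapsed


# Hypothetical scrapes of container_pressure_io_waiting_seconds_total, 30 seconds apart.
# Each tuple is (unix timestamp in seconds, counter value in seconds); values are invented.
first = (1000.0, 12.0)
second = (1030.0, 13.5)

if __name__ == "__main__":
    fraction = stalled_fraction(first, second)
    print(f"waiting on I/O for {fraction:.1%} of the interval")
```

This is the same arithmetic a rate query over the counter would perform; it is shown here only to make the unit of the counter concrete.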
Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com>
f0962c2 to 5f08b9f
Description
Update feature status and wording as the feature goes stable.
Issue
Ref: kubernetes/enhancements#4205
Closes: #