
DRA Device Taints Beta in 1.36#54648

Open
pohly wants to merge 1 commit into kubernetes:dev-1.36 from pohly:dra-device-taints-1.36

@pohly pohly commented Feb 24, 2026

Description

DRA Device Taints and Tolerations is beta in 1.36.

Issue

KEP: kubernetes/enhancements#5055

k8s-ci-robot added the do-not-merge/work-in-progress label Feb 24, 2026
k8s-ci-robot added this to the 1.36 milestone Feb 24, 2026
k8s-ci-robot added the cncf-cla: yes label Feb 24, 2026
pohly marked this pull request as draft February 24, 2026 16:17
k8s-ci-robot added the size/XS label Feb 24, 2026

netlify bot commented Feb 24, 2026

Pull request preview available for checking

Built without sensitive environment variables

| Name | Link |
|---|---|
| 🔨 Latest commit | cab8aa6 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/kubernetes-io-main-staging/deploys/69ce21fa0754840008179192 |
| 😎 Deploy Preview | https://deploy-preview-54648--kubernetes-io-main-staging.netlify.app |

pohly force-pushed the dra-device-taints-1.36 branch from bfe252a to 9b0f634 March 31, 2026 08:21
k8s-ci-robot added the language/en label Mar 31, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign tengqm for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the needs-rebase and size/S labels and removed the size/XS label Mar 31, 2026
pohly changed the title from "WIP: DRA Device Taints Beta in 1.36" to "DRA Device Taints Beta in 1.36" Mar 31, 2026
pohly force-pushed the dra-device-taints-1.36 branch from 9b0f634 to 8ddcad7 March 31, 2026 08:28
k8s-ci-robot removed the needs-rebase label Mar 31, 2026
pohly marked this pull request as ready for review March 31, 2026 08:29
k8s-ci-robot removed the do-not-merge/work-in-progress label Mar 31, 2026
k8s-ci-robot added the size/M label and removed the size/S label Mar 31, 2026
You may also be able to mutate the incoming Pod, at admission time, to unset
the `.spec.nodeName` field and to use a node selector instead.

## DRA beta features {#beta-features}
Contributor Author

I think this separation into "GA/beta/alpha" features is not useful and leads to unnecessary churn when features graduate from alpha to beta. It's also misleading because not all beta features are on by default, or on-by-default features could be turned off. "Optional DRA features" looks like a better description.

We still need to move features out of this section when they graduate to GA, though. So perhaps we should instead use "Additional DRA features", which then can include GA features?

/cc @ritazh

#54599

Member

Thanks for the suggestion @pohly.

  1. I personally +1 and appreciated the Alpha/Beta sections because for a team/user that can only use GA things, that section makes it easy to skip. And for a team/user who is an early adopter, it's easier to find and pay more attention to those to ensure these alpha/beta features work in their environment before the features are GAed.
  2. A generic "Optional and Additional DRA features" section is hard to gate and scale in the future. e.g. which features are considered optional and additional? who makes that decision?
  3. I'm also +1 if we want each feature to just have its own section, as long as its graduation status is right below it. That makes it easier to understand how to use the feature and easier for the feature owners to go back and update.

Thoughts?

Contributor Author

If we go with "additional", we can add an introduction like this:

Additional features add advanced functionality to core DRA; usage of them is optional and/or may only be relevant with certain DRA drivers.

Some of the features are in the Alpha or Beta
feature stage.
...

I still find that better than categorizing them by their state. An explicit "this is an alpha feature" in the section is clearer than having to remember how far down the page one has scrolled.

Member

Are you suggesting something like the following?

...

## Additional DRA features
Additional features add advanced functionality to core DRA; usage of them is optional and/or may only be relevant with certain DRA drivers. Some of the features are in the Alpha or Beta feature stage.

### Feature one (GA)

### Feature two (alpha)

### Feature three (beta)
...

Contributor

How about a tabular form?

DRA Features Status

| Feature | Status | Kubernetes Version | FG Default | Notes |
|---|---|---|---|---|
| Structured Parameters | Alpha | v1.30+ | Off | Moves parameter logic from external drivers into the scheduler. |
| Device Taints | Beta | v1.36+ | On | Allows nodes with specific devices to be tainted dynamically. |

Contributor Author

pohly, Apr 1, 2026

Looking at current content makes it clear that "Beta Features" and "Alpha Features" are not useful sub-sections because not all alpha/beta features are described there. For example, prioritized list is described further up in the section about requesting devices. I think we should follow that pattern and describe features where it makes sense, not based on their status.

Contributor Author

> Feature one (GA)

That can work, as long as we explicitly set the anchor to not include the GA part.

We don't need to include the (GA/Alpha/Beta) part because it gets rendered for us automatically:

https://kubernetes.io/docs/concepts/scheduling-eviction/dynamic-resource-allocation/#admin-access

I don't think we need to be even more explicit.

Contributor Author

I switched to "Additional features" and did a pass over how the other features are described. Linking to feature gates was inconsistent or even broken. I'm now linking to the specific feature gate anchor.

Member

I agree with the change. As features graduate to stable, we should try to find a way to include information about them in the regular flow of the doc, so we don't end up with a long list of "additional features" separate from the overall structure.

Contributor Author

If there is a better place than "Additional features", then a new feature should be described there immediately. We did that for prioritized list.

"Additional features" then are the things whose usage is less common (the "advanced use cases").

k8s-ci-robot requested a review from ritazh March 31, 2026 08:33
pohly force-pushed the dra-device-taints-1.36 branch 2 times, most recently from 9bf6a45 to ebf3053 March 31, 2026 08:34
ttsuuubasa commented Mar 31, 2026

@pohly
Thanks for the suggestion.

I’m +1 on this format. It removes the need to move items from the alpha section to the beta section, and avoids the conflict resolution work that comes with such moves.
Also, as @lmktfy mentioned in #54541 (comment), even after a feature becomes stable, it would remain in the same place, which improves readability for users.

However, if we adopt this format, I agree with @ritazh that each feature should have its alpha/beta/GA status clearly indicated right below the feature name, in a consistent manner.

Regarding the idea of calling them “Optional” or “Additional” features, I feel those terms give an impression of being extra.
How about naming them “DRA sub-features” instead?
Similar to “Additional,” this naming would also allow GA features to stay in the same place without needing to be moved.

Since this change affects the overall structure and impacts the work of each feature author, I’d like to get it merged sooner rather than later.

the `.spec.nodeName` field and to use a node selector instead.

-## DRA beta features {#beta-features}
+## Optional DRA features
Contributor

+1 to Additional.

To me, "additional" sounds layered and not extra: a feature that is built on top of the core DRA framework and is a powerful extension. In particular, device taints, partitionable devices, etc. are not optional features, at least for GPUs. And as I suggested below, maybe put all the features in a table.

Contributor Author

I switched to "Additional" with:

The following sections describe DRA features that support advanced use
case. Usage of them is optional and/or may only be relevant with DRA
drivers that support them.

nojnhuh commented Apr 1, 2026

/wg device-management

k8s-ci-robot added the wg/device-management label Apr 1, 2026
nojnhuh moved this from 🆕 New to 👀 In review in Dynamic Resource Allocation Apr 1, 2026
Contributor

Should we update this to v1beta2? And also the kubectl describe output? That might help illustrate the new timeAdded field.

Contributor Author

Yes. I also noticed that the header had something about DeviceTaintRule in v1alpha3. Added v1beta2 there.

The kubectl describe output doesn't change. The timeAdded was already set on create before, it shows up there as Time Added.

pohly force-pushed the dra-device-taints-1.36 branch from ebf3053 to 5724c70 April 1, 2026 12:01
For stateful applications running on specialized hardware, it is critical to know when a device has failed or become unhealthy.
It is also helpful to find out if the device recovers.

To enable this functionality, the `ResourceHealthStatus` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/ResourceHealthStatus/)
Contributor Author

This link was broken.

@AnshumanTripathi
Contributor

Hello @pohly 👋!
I'm reaching out from the Docs team. Just checking in as we approach Docs Freeze on Wednesday 8th April 2026 (AoE) / Thursday 9th April 2026, 12:00 UTC.
This documentation appears to still be under review. To meet the Docs Freeze, this PR must have a technical review as well as lgtm and approve labels applied, without any unaddressed comments or concerns from SIG Docs.
Thank you!

@guptaNswati
Contributor

LGTM

-The following sections describe DRA features that are available in the Beta
+The following sections describe DRA features that support advanced use
+case. Usage of them is optional and/or may only be relevant with DRA
Member

Do we need the and/or here? I think just using and is sufficient.

Contributor Author

Okay.


-The following sections describe DRA features that are available in the Beta
+The following sections describe DRA features that support advanced use
+case. Usage of them is optional and/or may only be relevant with DRA
Contributor

nit: "features that support advanced use cases"

Suggested change
-case. Usage of them is optional and/or may only be relevant with DRA
+cases. Usage of them is optional and/or may only be relevant with DRA

Contributor Author

Fixed.

Changes since 1.34:
- Updating the time stamp on updates.
- Beta graduation.

To simplify feature graduation, the explicit "alpha/beta features" sections get
replaced with "Additional features". Linking to feature gates gets harmonized.
pohly force-pushed the dra-device-taints-1.36 branch from 5724c70 to cab8aa6 April 2, 2026 07:59
pohly commented Apr 2, 2026

Is this ready for a final /lgtm? I think I have addressed all pending comments.

It would be good to merge this soon because it affects other docs PRs which graduate features.

mortent commented Apr 2, 2026

/lgtm

k8s-ci-robot added the lgtm label Apr 2, 2026
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: b296ac8dde366724a307a955ad53bca62dce1d45

pohly commented Apr 2, 2026

/assign @tengqm

Ready for approval.

/priority important-soon

Blocks other docs PRs.

cc @AnshumanTripathi

k8s-ci-robot added the priority/important-soon label Apr 2, 2026

Labels

cncf-cla: yes, language/en, lgtm, priority/important-soon, size/M, wg/device-management

Projects

Status: 👀 In review
