feat: add Helm chart for driver deployment #83

Open
fmuyassarov wants to merge 4 commits into kubernetes-sigs:main from Nordix:devel/helm-charts

Conversation

@fmuyassarov
Member

@fmuyassarov fmuyassarov commented Mar 11, 2026

Add a Helm chart for driver installation. This PR adds:

  • Helm chart for driver installation
  • Documentation to describe installation and available values
  • Linter CI for the charts & schema validation

Follow-up (TODO)

  • chart packaging and publishing to ghcr.io/kubernetes-sigs/dra-driver-cpu/charts/dra-driver-cpu
  • versioned releases from tags, 0.0.0-main from main branch

Note: currently all the templates (DaemonSet, ServiceAccount, etc.) are based on what is available in install.yaml.

Fixes: #72

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 11, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: fmuyassarov
Once this PR has been reviewed and has the lgtm label, please assign klueska for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested a review from klueska March 11, 2026 22:31
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 11, 2026
@fmuyassarov fmuyassarov force-pushed the devel/helm-charts branch 2 times, most recently from 9598847 to 3ba52b2 Compare March 12, 2026 16:47
@fmuyassarov
Member Author

I’ll keep this PR as a draft until we’re ready to land it (post 0.1.0?). In the meantime, please feel free to take a look and share any thoughts.
/cc @ffromani @pravk03

@AutuSnow
Contributor

@fmuyassarov Can you add the configurations for livenessProbe and readinessProbe?

@fmuyassarov
Member Author

> livenessProbe

Yes sure.

@fmuyassarov
Member Author

> @fmuyassarov Can you add the configurations for livenessProbe and readinessProbe?

@AutuSnow I added #84 for install.yaml and will add the same here soon.

@AutuSnow
Contributor

> > @fmuyassarov Can you add the configurations for livenessProbe and readinessProbe?
>
> @AutuSnow added #84 for the install.yaml and soon will add here as well.

Thanks !!

@fmuyassarov
Member Author

fmuyassarov commented Mar 22, 2026

Health probes similar to those in #84 have been added to the chart.

@fmuyassarov fmuyassarov marked this pull request as ready for review March 22, 2026 10:59
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 22, 2026
@ffromani
Contributor

will review again shortly, thanks for the patience

@fmuyassarov
Member Author

fmuyassarov commented Apr 13, 2026

Thanks. But I would ask you not to do it yet, because I'm about to add a few more improvements in an hour or two. I will ping you once it's ready.

@fmuyassarov
Member Author

fmuyassarov commented Apr 13, 2026

> Thanks. But I would ask don't do it yet, because I'm about to add few more improvements in an hour or two. Will ping you once ready.

@ffromani
This should be ready now. I added a few more changes I had in mind.

Comment on lines +34 to +44
maintainers:
  - name: johnbelamaric
    url: https://github.com/johnbelamaric
  - name: pohly
    url: https://github.com/pohly
  - name: klueska
    url: https://github.com/klueska
  - name: ffromani
    url: https://github.com/ffromani
  - name: pravk03
    url: https://github.com/pravk03
Member Author

I'm sure we need to update this list, but perhaps @pravk03 and @ffromani can confirm. It was copied from the current OWNERS file.

@fmuyassarov
Member Author

/test pull-dra-driver-cpu-e2e-device-mode-grouped-arm64

@fmuyassarov
Member Author

/test pull-dra-driver-cpu-e2e-device-mode-grouped-arm64
/test pull-dra-driver-cpu-e2e-device-mode-individual-arm64

Comment thread deployment/helm/dra-driver-cpu/templates/deviceclass.yaml Outdated

image:
  repository: us-central1-docker.pkg.dev/k8s-staging-images/dra-driver-cpu/dra-driver-cpu
  tag: latest
Contributor

Should this be a release version?

Member Author

It can be, but it doesn't matter much here. Eventually we will publish this chart as an OCI artifact, the same way as the container image, and during that release process we will have to patch the version to match the release we are about to cut. In other words, it will be part of the release-process PR that I will submit as a follow-up once the chart lands on main.

Member Author

Or, to be exact: with image.tag left empty in values.yaml, the template falls back to .Chart.AppVersion, so the DaemonSet will use the image that matches the release version.
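
For readers less familiar with Helm, the fallback described in this thread is commonly written with the `default` template function. The fragment below is only a sketch of that pattern, not the chart's actual template; the container name is an assumption made for illustration.

```yaml
# Sketch of a DaemonSet pod spec fragment (container name is hypothetical).
containers:
  - name: dra-driver-cpu
    # Use image.tag when it is set in values.yaml;
    # otherwise fall back to the chart's appVersion.
    image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
```

With this pattern, an empty `image.tag` renders the image reference with the chart's `appVersion`, while an explicit tag passed at install time (for example via `--set image.tag=...`) takes precedence.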

Comment thread deployment/helm/dra-driver-cpu/templates/clusterrole.yaml
@pravk03
Contributor

pravk03 commented Apr 17, 2026

Thanks for looking into this! I've left a few minor comments inline. I have very limited experience with Helm, so I will lean on @ffromani and @AutuSnow who are already reviewing this PR for a more thorough review and approval.

Contributor

@AutuSnow AutuSnow left a comment

Thank you for your patience and contribution. I left a comment.

Comment thread deployment/helm/dra-driver-cpu/templates/clusterrole.yaml
Comment thread deployment/helm/dra-driver-cpu/values.yaml
Comment thread deployment/helm/dra-driver-cpu/templates/_helpers.tpl Outdated
spec:
  selectors:
    - cel:
        expression: device.driver == "dra.cpu"
Contributor

DeviceClass/dra.cpu is cluster-scoped and shared across any installation. helm uninstall will delete it, breaking other deployments or existing ResourceClaims that reference it.

Member Author

It is true that uninstalling the chart will delete the shared DeviceClass, but how is that different from install.yaml? I mean that it's not a Helm-specific problem but rather an inherent characteristic of how this driver works. This is the same behavior as kubectl delete -f install.yaml.

Contributor

Agree it mirrors install.yaml behavior. But Helm gives us a tool install.yaml doesn't — helm.sh/resource-policy: keep on the DeviceClass would prevent accidental deletion on helm uninstall while preserving create/upgrade semantics. Worth considering since DeviceClass is cluster-scoped shared state. Not a blocker if you prefer to defer.

Member Author

That's interesting, thanks for sharing. I wasn't aware of the helm.sh/resource-policy: keep annotation. The only thing I'm unsure about is that once this annotation is set, the resource becomes orphaned (according to this document), and in case of a helm upgrade we won't be able to replace the resource (when needed).

Are we okay with that?
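
To make the proposal in this thread concrete, here is a minimal sketch of a DeviceClass carrying the annotation. It is an illustration, not the chart's template: the `apiVersion` is an assumption and should match whatever resource.k8s.io version the chart already targets. The name and selector are taken from the discussion above.

```yaml
apiVersion: resource.k8s.io/v1   # assumption: use the API version the chart already targets
kind: DeviceClass
metadata:
  name: dra.cpu
  annotations:
    # Asks Helm to skip deleting this object when an operation
    # (uninstall, upgrade, rollback) would otherwise remove it.
    "helm.sh/resource-policy": keep
spec:
  selectors:
    - cel:
        expression: device.driver == "dra.cpu"
```

Per the Helm documentation, the annotation applies at the point where a Helm operation would delete the resource; the object is then left in the cluster and is no longer managed by Helm, which is the orphaning trade-off raised above.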

spec:
  selector:
    matchLabels:
      {{- include "dra-driver-cpu.selectorLabels" . | nindent 6 }}
Contributor

install.yaml selector is app: dracpu; this chart uses app.kubernetes.io/{name,instance}. DaemonSet selectors are immutable, so users cannot migrate from install.yaml → Helm in place. Either keep the legacy app: dracpu selector label for compat, or add an explicit migration note (delete+reinstall) to README.

Member Author

I may be missing something, but I’m not entirely sure why users would need to migrate from install.yaml to Helm. From our perspective, the installation method shouldn’t impact ongoing usage, and we are not expecting/requiring existing users to migrate and I don't see value in doing so.

Those who originally installed the driver using install.yaml can continue running it as is without any changes. For new installations users can choose between install.yaml and Helm, though Helm would generally be the preferred option. Could you help to clarify your question?
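
For reference, the compatibility option raised in this thread (keeping the legacy `app: dracpu` selector label) might look roughly like the fragment below. This is a sketch under assumptions, not a proposed change: it reuses the `dra-driver-cpu.selectorLabels` helper quoted above and only addresses the selector.

```yaml
# Sketch of a DaemonSet spec fragment that keeps the install.yaml selector.
spec:
  selector:
    matchLabels:
      app: dracpu   # legacy selector label from install.yaml
  template:
    metadata:
      labels:
        app: dracpu   # must be on the pods so the selector matches them
        {{- include "dra-driver-cpu.selectorLabels" . | nindent 8 }}
```

Because DaemonSet selectors are immutable, the selector would have to stay exactly as in install.yaml for an in-place change to be accepted. Adopting existing objects into a Helm release involves further steps that are not shown here.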

Comment thread deployment/helm/dra-driver-cpu/values.yaml Outdated
Comment thread deployment/helm/dra-driver-cpu/Chart.yaml Outdated
Comment thread .github/workflows/helm-lint.yaml
@fmuyassarov
Member Author

Thanks for the reviews, @AutuSnow and @pravk03. I've addressed most of the comments. PTAL.

Alternative to install.yaml with a Helm chart that exposes driver
configuration as values.yaml parameters.

Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@est.tech>

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Proposal: Helm chart for dra-driver-cpu

5 participants