compute domain plugin for KWOK nodes #162

Merged: enoodle merged 6 commits into main from erez/compute-domain-kwok on Feb 2, 2026

@enoodle enoodle commented Jan 29, 2026

No description provided.

Add new kwok-compute-domain-dra-plugin component that creates ResourceSlices
for compute domain channels on KWOK simulated nodes.

Key changes:
- New internal/kwok-compute-domain-dra-plugin package with Node controller
- Controller watches nodes with label type=kwok or annotation kwok.x-k8s.io/node=fake
- Creates ResourceSlice named kwok-<node>-compute-domain-channel for each KWOK node
- ResourceSlice contains channel-0 device for compute domain allocation
- New cmd/kwok-compute-domain-dra-plugin entrypoint
- Updated Dockerfile and Makefile to build new component
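
The node-selection and naming rules in the list above can be sketched in a few lines of Go. This is a minimal illustration under stated assumptions, not the controller code from this PR: `isKwokNode`, `resourceSliceName`, and `channelDeviceName` are hypothetical names, and plain maps stand in for a Node's labels and annotations.

```go
package main

import "fmt"

// channelDeviceName is the single device the commit message says each
// ResourceSlice contains. The constant name is hypothetical.
const channelDeviceName = "channel-0"

// isKwokNode reports whether the given metadata marks a node as KWOK-simulated:
// label type=kwok OR annotation kwok.x-k8s.io/node=fake.
// Hypothetical helper; reading from a nil map is safe in Go.
func isKwokNode(labels, annotations map[string]string) bool {
	return labels["type"] == "kwok" || annotations["kwok.x-k8s.io/node"] == "fake"
}

// resourceSliceName builds the kwok-<node>-compute-domain-channel name.
// Hypothetical helper.
func resourceSliceName(nodeName string) string {
	return fmt.Sprintf("kwok-%s-compute-domain-channel", nodeName)
}

func main() {
	labels := map[string]string{"type": "kwok"}
	if isKwokNode(labels, nil) {
		fmt.Println(resourceSliceName("node-0"), "with device", channelDeviceName)
	}
}
```

Keeping the match as a pure function of labels and annotations would make the rule easy to unit-test without a cluster; whether the actual controller is structured this way is not shown in this conversation.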

Add new Helm values block for KWOK compute-domain DRA plugin:
- kwokComputeDomainDraPlugin.enabled (default: false)
- image configuration
- resource requests/limits

Add Helm templates for KWOK compute-domain DRA plugin deployment:
- deployment.yaml: single replica Deployment with no host volumes
- serviceaccount.yaml: dedicated service account
- clusterrole.yaml: permissions for nodes and resourceslices
- clusterrolebinding.yaml: binds role to service account

All templates gated by kwokComputeDomainDraPlugin.enabled flag.

Update integration test harness:
- Enable kwokComputeDomainDraPlugin in values.yaml
- Load kwok-compute-domain-dra-plugin image into kind cluster
- Pass image tag to helm install
- Wait for kwok-compute-domain-dra-plugin deployment readiness

Add integration tests for compute-domain on KWOK nodes:
- Test manifest: compute-domain-kwok-pod.yaml with ComputeDomain CR and
  Pod targeting KWOK node with nodeSelector and tolerations
- Test: ResourceSlice created for KWOK nodes with compute-domain channels
- Test: Pod scheduled on KWOK node can allocate compute-domain claim
- Test: ComputeDomain status updated to Ready with KWOK node listed
@enoodle enoodle force-pushed the erez/compute-domain-kwok branch from 8ccd519 to e8eb57b January 29, 2026 10:36
@enoodle enoodle marked this pull request as ready for review February 1, 2026 12:17
}

if err := r.createOrUpdateResourceSlice(ctx, &node); err != nil {
	log.Printf("Failed to create/update ResourceSlice for node %s: %v", node.Name, err)
Will the log level here automatically be ERROR? If not, should we set it explicitly?

Contributor Author

You are right. I will look at all the logs here to verify they use the correct conventions.

replace log with klog for structured logging
@enoodle enoodle merged commit 454f059 into main Feb 2, 2026
6 of 7 checks passed
@enoodle enoodle deleted the erez/compute-domain-kwok branch February 2, 2026 10:07