Run 37138 Add VPA support for KAI-Scheduler#1119
Run 37138 Add VPA support for KAI-Scheduler#1119JRosenboimNVIDIA wants to merge 30 commits intoNVIDIA:mainfrom
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughAdds comprehensive Vertical Pod Autoscaler (VPA) support to the KAI Scheduler system. Introduces VPA configuration API types, schema definitions, defaulting logic, RBAC permissions, and operand implementations to manage VPA resources for all scheduler components. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~35 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Comment |
a06b0a6 to
6d0cc1c
Compare
18d869d to
d8603ca
Compare
Merging this branch changes the coverage (11 decrease, 6 increase)
Coverage by fileChanged files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code. Changed unit test files
|
Merging this branch changes the coverage (11 decrease, 6 increase)
Coverage by fileChanged files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code. Changed unit test files
|
Merging this branch changes the coverage (8 decrease, 9 increase)
Coverage by fileChanged files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code. Changed unit test files
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (10)
cmd/operator/app/app.go (1)
6-34: Normalize the import block to the 3-group convention.Since this change updates the import list, please regroup imports into exactly: standard library, external dependencies, internal packages.
Proposed import regrouping
import ( "os" + apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1" + "k8s.io/apimachinery/pkg/runtime" + utilruntime "k8s.io/apimachinery/pkg/util/runtime" + clientgoscheme "k8s.io/client-go/kubernetes/scheme" + _ "k8s.io/client-go/plugin/pkg/client/auth" + vpav1 "k8s.io/autoscaler/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1" nvidiav1 "github.com/NVIDIA/gpu-operator/api/nvidia/v1" monitoringv1 "github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring/v1" - vpav1 "k8s.io/autoscaler/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1" + ctrl "sigs.k8s.io/controller-runtime" + "sigs.k8s.io/controller-runtime/pkg/healthz" + "sigs.k8s.io/controller-runtime/pkg/log/zap" + "sigs.k8s.io/controller-runtime/pkg/manager" + "sigs.k8s.io/controller-runtime/pkg/metrics/server" "github.com/NVIDIA/KAI-scheduler/cmd/operator/config" kaiv1 "github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1" kaiv1alpha1 "github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1alpha1" "github.com/NVIDIA/KAI-scheduler/pkg/operator/controller" "github.com/NVIDIA/KAI-scheduler/pkg/operator/operands" - - apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1" - "sigs.k8s.io/controller-runtime/pkg/manager" - "sigs.k8s.io/controller-runtime/pkg/metrics/server" - - // Import all Kubernetes client auth plugins (e.g. Azure, GCP, OIDC, etc.) - // to ensure that exec-entrypoint and run can make use of them. - _ "k8s.io/client-go/plugin/pkg/client/auth" - - "k8s.io/apimachinery/pkg/runtime" - utilruntime "k8s.io/apimachinery/pkg/util/runtime" - clientgoscheme "k8s.io/client-go/kubernetes/scheme" - ctrl "sigs.k8s.io/controller-runtime" - "sigs.k8s.io/controller-runtime/pkg/healthz" - "sigs.k8s.io/controller-runtime/pkg/log/zap" // +kubebuilder:scaffold:imports )As per coding guidelines: “Organize imports in three groups separated by blank lines: (1) Standard library, (2) External dependencies, (3) Internal packages”.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/operator/app/app.go` around lines 6 - 34, The import block must be reorganized into three groups (standard library, external dependencies, internal packages) separated by blank lines: move the single standard library import "os" into the first group; place all k8s/sigs/prometheus/metrics/apiextensions/client-go/ctrl/healthz/log/zap and other third-party imports (e.g., nvidiav1, monitoringv1, vpav1, apiextensionsv1, clientgoscheme, utilruntime, runtime, manager, server, ctrl, healthz, zap, and the side-effect auth import) into the second group; put the repo-local imports (github.com/NVIDIA/KAI-scheduler/cmd/operator/config, kaiv1, kaiv1alpha1, controller, operands) into the third group; ensure a single blank line separates each group and retain the // +kubebuilder:scaffold:imports and the side-effect import (_ "k8s.io/client-go/plugin/pkg/client/auth") in the appropriate external group.pkg/apis/kai/v1/pod_grouper/pod_grouper_test.go (1)
24-24: Add one test for non-nil global VPA inheritance.Line 24 and Line 33 only validate the
nilglobal VPA path. SinceSetDefaultsWhereNeedednow accepts global VPA, add a case where global VPA is set and local VPA is nil to verify inheritance behavior.Also applies to: 33-33
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/apis/kai/v1/pod_grouper/pod_grouper_test.go` at line 24, Add a new unit test in pod_grouper_test.go that covers the non-nil global VPA inheritance path: create a PodGrouper (podGrouper) with a non-nil GlobalVPA (populate fields you expect to be inherited) and a nil LocalVPA, call podGrouper.SetDefaultsWhereNeeded(&replicaCount, nil), and assert that the resulting pod spec/fields reflect values inherited from GlobalVPA (use the same assertion style as the existing nil-global-VPA tests to validate expected fields). Ensure the test targets the SetDefaultsWhereNeeded method and verifies the specific fields you expect to be copied from GlobalVPA into the pod/group.pkg/apis/kai/v1/scheduler/scheduler_test.go (1)
23-66: Add one non-nilglobal VPA test case for the new defaulting path.All updated invocations pass
nilas the new argument, so regressions in global VPA propagation/override behavior won’t be caught here.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/apis/kai/v1/scheduler/scheduler_test.go` around lines 23 - 66, Add a test that exercises the non-nil global VPA path by calling Scheduler.SetDefaultsWhereNeeded with a real, minimal global VPA object instead of nil; create a Scheduler{}, a replicaCount value, and a small VPA object (the same type expected by SetDefaultsWhereNeeded) and pass it as the second argument, then assert that fields which should be influenced by a global VPA (e.g., resource requests/limits or replica-related defaults on scheduler.Service / SchedulerService / Replicas) reflect the VPA-driven defaults/overrides; locate the call site in the tests using SetDefaultsWhereNeeded and add this one case to validate propagation/override behavior.pkg/apis/kai/v1/pod_grouper/pod_grouper.go (1)
78-80: Avoid sharing the sameVPASpecpointer between global and component config.Use a deep copy when inheriting
globalVPAto prevent unintended cross-component mutation through shared state.♻️ Suggested change
- if pg.VPA == nil { - pg.VPA = globalVPA - } + if pg.VPA == nil && globalVPA != nil { + pg.VPA = globalVPA.DeepCopy() + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/apis/kai/v1/pod_grouper/pod_grouper.go` around lines 78 - 80, The code assigns the global VPASpec pointer directly to component config (pg.VPA = globalVPA), which shares mutable state; instead, when pg.VPA is nil copy the globalVPA into a new VPASpec instance (deep copy) and assign that to pg.VPA so each component gets its own independent VPASpec; locate the assignment for pg.VPA in pod_grouper.go and replace the direct pointer assignment with logic that creates a deep copy of globalVPA (copying all fields, nested structs/slices/maps) before assigning to pg.VPA.pkg/apis/kai/v1/binder/binder_test.go (1)
26-91: Add coverage for non-nilglobalVPAbehavior.All updated calls at Line 26, Line 38, Line 45, Line 67, and Line 91 pass
nilforglobalVPA, so inheritance/override semantics introduced by the new parameter are still untested in this suite. Add one case for inheriting global VPA and one case proving binder-local VPA takes precedence.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/apis/kai/v1/binder/binder_test.go` around lines 26 - 91, Add tests covering non-nil globalVPA behavior: create two new It blocks in binder_test.go that call Binder.SetDefaultsWhereNeeded with a non-nil globalVPA value and assert VPA inheritance/override semantics. First test: provide a globalVPA (with some VPA settings) and a Binder with no local VPA, call SetDefaultsWhereNeeded(&replicaCount?, &globalVPA) and Expect the resulting Binder to have the global VPA settings applied. Second test: provide the same globalVPA but initialize Binder with its own local VPA differing from global, call SetDefaultsWhereNeeded(..., &globalVPA) and Expect the Binder to retain its local VPA (local takes precedence). Use the existing Binder, SetDefaultsWhereNeeded, and globalVPA identifiers so assertions target the Binder's VPA-related fields (the local VPA struct/field names used in the code) to verify inheritance and override.pkg/apis/kai/v1/queue_controller/queue_controller.go (1)
7-12: Imports should be organized with external dependencies before internal packages.Per coding guidelines, imports should be: (1) Standard library, (2) External dependencies, (3) Internal packages. Currently, the internal
commonpackage import precedesk8s.iopackages.♻️ Suggested import organization
import ( - "github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1/common" v1 "k8s.io/api/core/v1" "k8s.io/apimachinery/pkg/api/resource" "k8s.io/utils/ptr" + + "github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1/common" )As per coding guidelines: "Organize imports in three groups separated by blank lines: (1) Standard library, (2) External dependencies, (3) Internal packages"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/apis/kai/v1/queue_controller/queue_controller.go` around lines 7 - 12, Reorder the import block in queue_controller.go so external dependencies come before internal packages: keep standard library (if any) first, then group external imports like "k8s.io/api/core/v1", "k8s.io/apimachinery/pkg/api/resource", "k8s.io/utils/ptr" together, followed by the internal import "github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1/common"; update the import grouping and add single blank lines between the three groups to match the project's import style.pkg/operator/operands/known_types/known_types_test.go (1)
6-17: Imports should be organized into three groups per coding guidelines.The imports mix standard library, external dependencies, and internal packages. Per guidelines, organize into: (1) Standard library, (2) External dependencies, (3) Internal packages.
♻️ Suggested import organization
import ( "testing" . "github.com/onsi/ginkgo/v2" . "github.com/onsi/gomega" - vpav1 "k8s.io/autoscaler/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1" - - kaiv1 "github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1" - metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" "k8s.io/apimachinery/pkg/types" + vpav1 "k8s.io/autoscaler/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1" + + kaiv1 "github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1" )As per coding guidelines: "Organize imports in three groups separated by blank lines: (1) Standard library, (2) External dependencies, (3) Internal packages"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/operands/known_types/known_types_test.go` around lines 6 - 17, The import block in known_types_test.go is not grouped per guidelines; reorder the imports into three groups separated by blank lines: (1) standard library imports (e.g., testing), (2) external dependencies (e.g., github.com/onsi/ginkgo/v2, github.com/onsi/gomega, k8s.io/... packages like metav1, types, vpav1), and (3) internal packages (e.g., github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1 referred to as kaiv1); update the import block so each group is contiguous and separated by a blank line and keep the same import identifiers used in the file.pkg/apis/kai/v1/binder/binder.go (1)
7-14: Imports should be organized into proper groups.External k8s.io packages are split around the internal package import. Group all external dependencies together before internal packages.
♻️ Suggested import organization
import ( + v1 "k8s.io/api/core/v1" + "k8s.io/apimachinery/pkg/api/resource" "k8s.io/utils/ptr" "github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1/common" "github.com/NVIDIA/KAI-scheduler/pkg/common/constants" - v1 "k8s.io/api/core/v1" - "k8s.io/apimachinery/pkg/api/resource" )As per coding guidelines: "Organize imports in three groups separated by blank lines: (1) Standard library, (2) External dependencies, (3) Internal packages"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/apis/kai/v1/binder/binder.go` around lines 7 - 14, The import block in binder.go is misgrouped: move all external dependencies (k8s.io/* and other third-party imports like "k8s.io/utils/ptr", v1 "k8s.io/api/core/v1", "k8s.io/apimachinery/pkg/api/resource") together, and place internal project imports ("github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1/common" and "github.com/NVIDIA/KAI-scheduler/pkg/common/constants") in a separate group below them, with a blank line separating groups so the import groups follow the standard library, external, internal ordering.pkg/apis/kai/v1/queue_controller/queue_controller_test.go (1)
24-24: Test updated for new signature - consider adding VPA-specific tests.The call site is correctly updated to pass
nilfor the globalVPA parameter. However, there are no tests verifying that VPA defaults are applied when a non-nil globalVPA is provided.Would you like me to suggest additional test cases for VPA defaulting behavior?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/apis/kai/v1/queue_controller/queue_controller_test.go` at line 24, Add unit tests in queue_controller_test.go that exercise queueController.SetDefaultsWhereNeeded(replicaCount, globalVPA) with a non-nil globalVPA to verify VPA-specific defaults are applied: create a test that constructs a queueController instance and a non-nil globalVPA object, call SetDefaultsWhereNeeded(&replicaCount, &globalVPA), and assert the expected VPA-related fields on the controller (or its PodTemplate/annotations) were set; also keep the existing test that passes nil for globalVPA to ensure behavior differs when globalVPA is absent. Use the symbols queueController, SetDefaultsWhereNeeded, replicaCount, and globalVPA to find and update/add tests.deployments/kai-scheduler/crds/kai.scheduler_configs.yaml (1)
1209-1215: Consider addingminimum: 1constraint tominReplicas.The description states "Only positive values are allowed" but the schema does not enforce this with a
minimumconstraint. Without it, users can specify 0 or negative values that will pass CRD validation but may cause unexpected VPA behavior at runtime.If this CRD is auto-generated from Go types, add a
+kubebuilder:validation:Minimum=1marker to the source type.Proposed schema fix (apply to all 8 VPA blocks)
minReplicas: description: |- Minimal number of replicas which need to be alive for Updater to attempt pod eviction (pending other checks like PDB). Only positive values are allowed. Overrides global '--min-replicas' flag. format: int32 + minimum: 1 type: integer🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deployments/kai-scheduler/crds/kai.scheduler_configs.yaml` around lines 1209 - 1215, The schema for the field minReplicas lacks a minimum constraint despite the description saying "Only positive values are allowed"; update the CRD and source types to enforce this by adding a minimum: 1 constraint for the minReplicas field (and apply the same change to all other VPA blocks), or if this CRD is generated from Go types add the marker +kubebuilder:validation:Minimum=1 to the Go field that produces minReplicas so the generated kai.scheduler_configs.yaml includes minimum: 1 for minReplicas.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CHANGELOG.md`:
- Around line 9-15: The CHANGELOG entry for the v0.15.0 release is placed
directly under "## [v0.15.0] - 2026-03-05" which likely breaks the CI validator
that expects changes to be added under "## [Unreleased]" before cutting a
release; move or duplicate the new bullet lines (the VPA support lines currently
under "## [v0.15.0] - 2026-03-05") into the "## [Unreleased]" section (or update
the Unreleased section to include these bullets), then ensure the release
section "## [v0.15.0] - 2026-03-05" is only populated by your release tooling or
the cut step so the validator recognizes the PR update.
In `@deployments/kai-scheduler/crds/kai.scheduler_configs.yaml`:
- Around line 1220-1226: Update the CRD/documentation to add VPA and Kubernetes
compatibility notes for the InPlaceOrRecreate update mode: document that
InPlaceOrRecreate depends on the in-place pod resize feature (Alpha in
Kubernetes 1.27–1.32 and requires the feature gate enabled, Beta from 1.33+) and
that it requires VPA version 1.0+; add this note near the enum declaration for
the update mode (the enum value "InPlaceOrRecreate") and in any user-facing docs
or schema description for update mode so users are warned about cluster and VPA
version requirements before enabling it.
In `@hack/setup-e2e-cluster.sh`:
- Around line 76-83: Replace the floating metrics-server manifest URL used when
INSTALL_VPA is true with a pinned v0.8.1 release URL (so the kubectl apply
targets the tested metrics-server v0.8.1 instead of releases/latest), and verify
the subsequent kubectl patch targeting the metrics-server deployment (the JSON
patch adding "--kubelet-insecure-tls" to the container args) still applies
cleanly against that release; keep the INSTALL_VPA conditional and the kubectl
wait as-is.
In `@pkg/apis/kai/v1/common/vpa.go`:
- Around line 31-35: The defaulting currently only sets vpav1.PodUpdatePolicy
when UpdatePolicy is nil; update VPASpec.SetDefaultsWhereNeeded to also check if
v.UpdatePolicy != nil && v.UpdatePolicy.UpdateMode == nil and set UpdateMode to
vpav1.UpdateModeInPlaceOrRecreate (use the same pattern as when UpdatePolicy is
nil), referencing symbols: VPASpec, SetDefaultsWhereNeeded, UpdatePolicy,
UpdateMode, and vpav1.UpdateModeInPlaceOrRecreate; then add a unit test in
pkg/apis/kai/v1/common/vpa_test.go that constructs &VPASpec{UpdatePolicy:
&vpav1.PodUpdatePolicy{}} calls SetDefaultsWhereNeeded and asserts
UpdatePolicy.UpdateMode equals pointer-to vpav1.UpdateModeInPlaceOrRecreate.
In `@pkg/apis/kai/v1/scheduler/scheduler.go`:
- Around line 71-73: When resolving the effective VPA for the scheduler, after
inheriting globalVPA into s.VPA (the existing block that sets s.VPA = globalVPA
when s.VPA == nil), always run defaulting on the resulting VPASpec by calling
VPASpec.SetDefaultsWhereNeeded() on s.VPA (or on globalVPA if assigned) so that
any partially-specified local s.VPA gets its missing fields (e.g.,
updatePolicy.updateMode) populated; locate the logic around s.VPA and globalVPA
in scheduler.go and invoke VPASpec.SetDefaultsWhereNeeded() on the effective VPA
object immediately after the inheritance/assignment.
In `@pkg/operator/operands/known_types/verticalpodautoscalers.go`:
- Around line 33-40: The current InitWithManager block treats any error from
mgr.GetFieldIndexer().IndexField as "VPA CRD not available" and returns nil,
which can hide other failures; update the IndexField error handling in
InitWithManager to (1) detect a CRD-missing/no-match error (e.g., use k8s
api/errors.IsNotFound or meta.IsNoMatchError / apiextensions-specific check) and
only skip registration and leave vpaAvailable false in that case, and (2) for
all other errors, log the full error via log.FromContext(ctx).Error(...) and
return the error instead of nil so actual failures surface; refer to IndexField,
mgr.GetFieldIndexer(), CollectableOwnerKey, vpaIndexer, vpaAvailable and
InitWithManager when making the change.
---
Nitpick comments:
In `@cmd/operator/app/app.go`:
- Around line 6-34: The import block must be reorganized into three groups
(standard library, external dependencies, internal packages) separated by blank
lines: move the single standard library import "os" into the first group; place
all k8s/sigs/prometheus/metrics/apiextensions/client-go/ctrl/healthz/log/zap and
other third-party imports (e.g., nvidiav1, monitoringv1, vpav1, apiextensionsv1,
clientgoscheme, utilruntime, runtime, manager, server, ctrl, healthz, zap, and
the side-effect auth import) into the second group; put the repo-local imports
(github.com/NVIDIA/KAI-scheduler/cmd/operator/config, kaiv1, kaiv1alpha1,
controller, operands) into the third group; ensure a single blank line separates
each group and retain the // +kubebuilder:scaffold:imports and the side-effect
import (_ "k8s.io/client-go/plugin/pkg/client/auth") in the appropriate external
group.
In `@deployments/kai-scheduler/crds/kai.scheduler_configs.yaml`:
- Around line 1209-1215: The schema for the field minReplicas lacks a minimum
constraint despite the description saying "Only positive values are allowed";
update the CRD and source types to enforce this by adding a minimum: 1
constraint for the minReplicas field (and apply the same change to all other VPA
blocks), or if this CRD is generated from Go types add the marker
+kubebuilder:validation:Minimum=1 to the Go field that produces minReplicas so
the generated kai.scheduler_configs.yaml includes minimum: 1 for minReplicas.
In `@pkg/apis/kai/v1/binder/binder_test.go`:
- Around line 26-91: Add tests covering non-nil globalVPA behavior: create two
new It blocks in binder_test.go that call Binder.SetDefaultsWhereNeeded with a
non-nil globalVPA value and assert VPA inheritance/override semantics. First
test: provide a globalVPA (with some VPA settings) and a Binder with no local
VPA, call SetDefaultsWhereNeeded(&replicaCount?, &globalVPA) and Expect the
resulting Binder to have the global VPA settings applied. Second test: provide
the same globalVPA but initialize Binder with its own local VPA differing from
global, call SetDefaultsWhereNeeded(..., &globalVPA) and Expect the Binder to
retain its local VPA (local takes precedence). Use the existing Binder,
SetDefaultsWhereNeeded, and globalVPA identifiers so assertions target the
Binder's VPA-related fields (the local VPA struct/field names used in the code)
to verify inheritance and override.
In `@pkg/apis/kai/v1/binder/binder.go`:
- Around line 7-14: The import block in binder.go is misgrouped: move all
external dependencies (k8s.io/* and other third-party imports like
"k8s.io/utils/ptr", v1 "k8s.io/api/core/v1",
"k8s.io/apimachinery/pkg/api/resource") together, and place internal project
imports ("github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1/common" and
"github.com/NVIDIA/KAI-scheduler/pkg/common/constants") in a separate group
below them, with a blank line separating groups so the import groups follow the
standard library, external, internal ordering.
In `@pkg/apis/kai/v1/pod_grouper/pod_grouper_test.go`:
- Line 24: Add a new unit test in pod_grouper_test.go that covers the non-nil
global VPA inheritance path: create a PodGrouper (podGrouper) with a non-nil
GlobalVPA (populate fields you expect to be inherited) and a nil LocalVPA, call
podGrouper.SetDefaultsWhereNeeded(&replicaCount, nil), and assert that the
resulting pod spec/fields reflect values inherited from GlobalVPA (use the same
assertion style as the existing nil-global-VPA tests to validate expected
fields). Ensure the test targets the SetDefaultsWhereNeeded method and verifies
the specific fields you expect to be copied from GlobalVPA into the pod/group.
In `@pkg/apis/kai/v1/pod_grouper/pod_grouper.go`:
- Around line 78-80: The code assigns the global VPASpec pointer directly to
component config (pg.VPA = globalVPA), which shares mutable state; instead, when
pg.VPA is nil copy the globalVPA into a new VPASpec instance (deep copy) and
assign that to pg.VPA so each component gets its own independent VPASpec; locate
the assignment for pg.VPA in pod_grouper.go and replace the direct pointer
assignment with logic that creates a deep copy of globalVPA (copying all fields,
nested structs/slices/maps) before assigning to pg.VPA.
In `@pkg/apis/kai/v1/queue_controller/queue_controller_test.go`:
- Line 24: Add unit tests in queue_controller_test.go that exercise
queueController.SetDefaultsWhereNeeded(replicaCount, globalVPA) with a non-nil
globalVPA to verify VPA-specific defaults are applied: create a test that
constructs a queueController instance and a non-nil globalVPA object, call
SetDefaultsWhereNeeded(&replicaCount, &globalVPA), and assert the expected
VPA-related fields on the controller (or its PodTemplate/annotations) were set;
also keep the existing test that passes nil for globalVPA to ensure behavior
differs when globalVPA is absent. Use the symbols queueController,
SetDefaultsWhereNeeded, replicaCount, and globalVPA to find and update/add
tests.
In `@pkg/apis/kai/v1/queue_controller/queue_controller.go`:
- Around line 7-12: Reorder the import block in queue_controller.go so external
dependencies come before internal packages: keep standard library (if any)
first, then group external imports like "k8s.io/api/core/v1",
"k8s.io/apimachinery/pkg/api/resource", "k8s.io/utils/ptr" together, followed by
the internal import "github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1/common";
update the import grouping and add single blank lines between the three groups
to match the project's import style.
In `@pkg/apis/kai/v1/scheduler/scheduler_test.go`:
- Around line 23-66: Add a test that exercises the non-nil global VPA path by
calling Scheduler.SetDefaultsWhereNeeded with a real, minimal global VPA object
instead of nil; create a Scheduler{}, a replicaCount value, and a small VPA
object (the same type expected by SetDefaultsWhereNeeded) and pass it as the
second argument, then assert that fields which should be influenced by a global
VPA (e.g., resource requests/limits or replica-related defaults on
scheduler.Service / SchedulerService / Replicas) reflect the VPA-driven
defaults/overrides; locate the call site in the tests using
SetDefaultsWhereNeeded and add this one case to validate propagation/override
behavior.
In `@pkg/operator/operands/known_types/known_types_test.go`:
- Around line 6-17: The import block in known_types_test.go is not grouped per
guidelines; reorder the imports into three groups separated by blank lines: (1)
standard library imports (e.g., testing), (2) external dependencies (e.g.,
github.com/onsi/ginkgo/v2, github.com/onsi/gomega, k8s.io/... packages like
metav1, types, vpav1), and (3) internal packages (e.g.,
github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1 referred to as kaiv1); update
the import block so each group is contiguous and separated by a blank line and
keep the same import identifiers used in the file.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: bbb16830-ba51-4dd0-b38d-2e3d0da70eb6
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (50)
CHANGELOG.mdcmd/operator/app/app.godeployments/kai-scheduler/crds/kai.scheduler_configs.yamldeployments/kai-scheduler/templates/kai-config.yamldeployments/kai-scheduler/templates/rbac/operator.yamldeployments/kai-scheduler/values.yamlgo.modhack/setup-e2e-cluster.shpkg/apis/kai/v1/admission/admission.gopkg/apis/kai/v1/admission/admission_test.gopkg/apis/kai/v1/admission/zz_generated.deepcopy.gopkg/apis/kai/v1/binder/binder.gopkg/apis/kai/v1/binder/binder_test.gopkg/apis/kai/v1/binder/zz_generated.deepcopy.gopkg/apis/kai/v1/common/vpa.gopkg/apis/kai/v1/common/vpa_test.gopkg/apis/kai/v1/common/zz_generated.deepcopy.gopkg/apis/kai/v1/config_types.gopkg/apis/kai/v1/global.gopkg/apis/kai/v1/node_scale_adjuster/node_scale_adjuster.gopkg/apis/kai/v1/node_scale_adjuster/node_scale_adjuster_test.gopkg/apis/kai/v1/node_scale_adjuster/zz_generated.deepcopy.gopkg/apis/kai/v1/pod_group_controller/pod_group_controller.gopkg/apis/kai/v1/pod_group_controller/pod_group_controller_test.gopkg/apis/kai/v1/pod_group_controller/zz_generated.deepcopy.gopkg/apis/kai/v1/pod_grouper/pod_grouper.gopkg/apis/kai/v1/pod_grouper/pod_grouper_test.gopkg/apis/kai/v1/pod_grouper/zz_generated.deepcopy.gopkg/apis/kai/v1/queue_controller/queue_controller.gopkg/apis/kai/v1/queue_controller/queue_controller_test.gopkg/apis/kai/v1/queue_controller/zz_generated.deepcopy.gopkg/apis/kai/v1/scheduler/scheduler.gopkg/apis/kai/v1/scheduler/scheduler_test.gopkg/apis/kai/v1/scheduler/zz_generated.deepcopy.gopkg/apis/kai/v1/zz_generated.deepcopy.gopkg/operator/controller/config_controller.gopkg/operator/controller/schedulingshard_controller.gopkg/operator/operands/admission/admission.gopkg/operator/operands/binder/binder.gopkg/operator/operands/common/vpa.gopkg/operator/operands/common/vpa_test.gopkg/operator/operands/deployable/deployable_test.gopkg/operator/operands/known_types/known_types.gopkg/operator/operands/known_types/known_types_test.gopkg/operator/operands/known_types/verticalpodautoscalers.gopkg/operator/operands/node_scale_adjuster/node_scale_adjuster.gopkg/operator/operands/pod_group_controller/pod_group_controller.gopkg/operator/operands/pod_grouper/pod_grouper.gopkg/operator/operands/queue_controller/queue_controller.gopkg/operator/operands/scheduler/scheduler.go
Merging this branch changes the coverage (8 decrease, 9 increase)
Coverage by fileChanged files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code. Changed unit test files
|
Description
Example scheduler config:
Result:
Checklist
Summary by CodeRabbit
Release Notes
--install-vpaflag to enable automated VPA installation during test cluster setup.