[SANDBOX-1399][SANDBOX-1395] apply all cluster resources at once and read the prior state from the nstemplateset status instead of labels#700
Conversation
WalkthroughRefactors cluster-resources reconciliation to a template-driven, single-pass apply/cleanup flow; simplifies Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Reconciler
participant ClusterMgr as clusterResources.ensure()
participant Processor as processTierTemplate()
participant Applier as objectApplier
participant K8s as Kubernetes API
Reconciler->>ClusterMgr: ensure(ctx, NSTemplateSet)
ClusterMgr->>Processor: resolve tier template -> newObjects, currentObjects
Processor-->>ClusterMgr: newObjects, currentObjects
ClusterMgr->>Applier: newObjectApplier(newObjects, currentObjects)
Applier->>Applier: Apply each new object (create/update)
Applier->>K8s: create/update resource
K8s-->>Applier: success / error
Applier->>Applier: update NSTemplateSet status (Provisioning/Provisioned/Updating)
Applier->>Applier: Cleanup stale objects (delete)
Applier->>K8s: delete resource
K8s-->>Applier: deleted / error
Applier-->>ClusterMgr: result (nil or error)
ClusterMgr-->>Reconciler: returns error only (no requeue flag)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Skipping CI for Draft Pull Request. |
Codecov Report❌ Patch coverage is ❌ Your changes status has failed because you have indirect coverage changes. Learn more about Unexpected Coverage Changes and reasons for indirect coverage changes. Additional details and impacted files@@ Coverage Diff @@
## master #700 +/- ##
==========================================
+ Coverage 80.81% 81.21% +0.39%
==========================================
Files 48 48
Lines 4478 4445 -33
==========================================
- Hits 3619 3610 -9
+ Misses 698 683 -15
+ Partials 161 152 -9
🚀 New features to boost your workflow:
|
MatousJobanek
left a comment
There was a problem hiding this comment.
first quick review - overall looks good 👍 haven't reviewed the tests yet
| changeStatusIfNeeded := func() error { | ||
| if !statusChanged { | ||
| if firstDeployment { | ||
| err = r.setStatusProvisioningIfNotUpdating(ctx, nsTmplSet) | ||
| } else { | ||
| err = r.setStatusUpdatingIfNotProvisioning(ctx, nsTmplSet) | ||
| } | ||
| if err != nil { | ||
| return false, r.wrapErrorWithStatusUpdateForClusterResourceFailure(gvkCtx, nsTmplSet, err, | ||
| "failed to process template for the cluster resources with the name '%s'", nsTmplSet.Spec.ClusterResources.TemplateRef) | ||
| return err | ||
| } | ||
| statusChanged = true | ||
| } | ||
| return nil | ||
| } |
There was a problem hiding this comment.
do we need to run this in the loop? I know that it's executed only once, but my point is why not running
if firstDeployment {
err = r.setStatusProvisioningIfNotUpdating(ctx, nsTmplSet)
} else {
err = r.setStatusUpdatingIfNotProvisioning(ctx, nsTmplSet)
}
if err != nil {
return err
}simply here?
There was a problem hiding this comment.
Because the list of newObjs might be empty or only contain resources that should not be created. We'd be setting the status prematurely in those cases. There are pre-existing unit tests for this, so I wanted to keep the behavior as is because I frankly still don't fully see all the consequences changes like this might trigger.
There was a problem hiding this comment.
if the newObjs doesn't contain anything, then there will most like be still something to be deleted (the refs don't match so we expect that there has been some change).
Anyway, I don't think that there would be any consequences on anything in the whole chain (this is a very rare corner-case), which means that it would be completely OK to simplify this.
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
controllers/nstemplateset/nstemplateset_controller_test.go (1)
1175-1176: Bug: wrong namespace asserted as the “second” oneThis test asserts deletion of the dev namespace twice. The “second” namespace should be stage.
Apply this diff:
- secondtNSName := fmt.Sprintf("%s-dev", spacename) + secondtNSName := fmt.Sprintf("%s-stage", spacename)
🧹 Nitpick comments (8)
controllers/nstemplateset/nstemplateset_controller_test.go (1)
817-819: Nit: fix comment typo“updgraded” → “upgraded”.
controllers/nstemplateset/nstemplateset_controller.go (1)
219-233: Finalizer removal after issuing cluster-resource deletes may leave residualsIf any cluster-scoped resource has its own finalizer,
Client.Deletecan succeed while the object remains terminating. Removing the NSTemplateSet finalizer immediately could leave orphaned cluster resources.
- Verify whether any cluster resources you apply can carry finalizers (e.g., CRDs like Idler).
- If yes, consider requeue-until-deleted semantics (similar to namespaces) or at least checking for DeletionTimestamp/NotFound before removing the finalizer.
controllers/nstemplateset/cluster_resources_test.go (6)
356-359: Avoid variable shadowing of shared fixtures (rename local emptyCrq).Local emptyCrq shadows the fixture name used in templates, reducing clarity. Rename the local to avoid confusion.
- emptyCrq := newClusterResourceQuota("empty", "advanced") + emptyCRQObj := newClusterResourceQuota("empty", "advanced") @@ - manager, cl := prepareClusterResourcesManager(t, nsTmplSet, crq, crb, codeNs, previousTierTemplate, emptyCrq) + manager, cl := prepareClusterResourcesManager(t, nsTmplSet, crq, crb, codeNs, previousTierTemplate, emptyCRQObj)
391-394: Avoid variable shadowing of shared fixtures (rename local emptyCrq).Same shadowing issue here; rename for clarity.
- emptyCrq := newClusterResourceQuota(spaceName, "withemptycrq") - emptyCrq.Name = "for-empty" - manager, cl := prepareClusterResourcesManager(t, nsTmplSet, emptyCrq, crq, crb, codeNs, previousTierTemplate) + emptyCRQObj := newClusterResourceQuota(spaceName, "withemptycrq") + emptyCRQObj.Name = "for-empty" + manager, cl := prepareClusterResourcesManager(t, nsTmplSet, emptyCRQObj, crq, crb, codeNs, previousTierTemplate)
119-127: Guard empty feature string to avoid [""] slice edge case.Splitting an empty string yields []string{""}. While harmless here, guarding improves intent and avoids future footguns if logic changes.
- enabledFeatures := strings.Split(testRun.enabledFeatures, ",") + var enabledFeatures []string + if testRun.enabledFeatures != "" { + enabledFeatures = strings.Split(testRun.enabledFeatures, ",") + } for _, f := range allTierFeatures { - if slices.Contains(enabledFeatures, f) { + if slices.Contains(enabledFeatures, f) { clusterAssertion.HasResource(fmt.Sprintf("%s-for-%s", f, spacename), "av1.ClusterResourceQuota{}) } else { clusterAssertion.HasNoResource(fmt.Sprintf("%s-for-%s", f, spacename), "av1.ClusterResourceQuota{}) } }
323-324: Fix misleading inline comment.The comment contradicts the assertion. Either remove it or adjust to reflect that the finalizer is present.
- HasFinalizer(). // finalizer was not added and nothing else was done + HasFinalizer(). // finalizer present; controller reported UnableToTerminate below
756-756: Polish test title for clarity.Minor grammar fix.
- t.Run("update to abcde13 fails because it find the template", func(t *testing.T) { + t.Run("update to abcde13 fails because it cannot find the template", func(t *testing.T) {
10-12: Prefer stdlib slices (Go 1.21+) or alias k8s slices to avoid future conflicts.If you target Go 1.21+, switch to the stdlib “slices” package. If not, consider aliasing the k8s import (e.g., kslices) to reduce confusion when migrating.
What Go version does this repo target? If 1.21+, I can propose a small diff to use stdlib slices; otherwise I can suggest the alias update across the file.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
controllers/nstemplateset/cluster_resources.go(3 hunks)controllers/nstemplateset/cluster_resources_test.go(13 hunks)controllers/nstemplateset/nstemplateset_controller.go(2 hunks)controllers/nstemplateset/nstemplateset_controller_test.go(13 hunks)test/cluster_assertion.go(1 hunks)test/nstemplateset_assertion.go(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
controllers/nstemplateset/nstemplateset_controller_test.go (3)
test/nstemplateset_assertion.go (5)
AssertThatNSTemplateSet(35-41)Provisioning(224-230)Updating(232-238)Provisioned(216-222)NewTierTemplateName(212-214)test/cluster_assertion.go (2)
AssertThatCluster(20-25)WithLabel(45-51)test/namespace_assertion.go (1)
AssertThatNamespace(32-38)
controllers/nstemplateset/cluster_resources_test.go (2)
test/nstemplateset_assertion.go (6)
AssertThatNSTemplateSet(35-41)Provisioning(224-230)Provisioned(216-222)UnableToProvisionClusterResources(258-265)Updating(232-238)UpdateFailed(240-247)test/cluster_assertion.go (4)
AssertThatCluster(20-25)Containing(60-66)WithoutLabel(53-58)WithLabel(45-51)
controllers/nstemplateset/cluster_resources.go (1)
controllers/nstemplateset/nstemplatetier.go (1)
SpaceName(100-100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Build & push operator bundles for e2e tests
- GitHub Check: Build & push operator bundles for e2e tests
🔇 Additional comments (6)
test/cluster_assertion.go (1)
53-58: LGTM: handy assertion for label absenceThe
WithoutLabeloption is clear and useful for negative label checks.controllers/nstemplateset/nstemplateset_controller_test.go (2)
615-715: Assertions align with the “apply-all-at-once” flowGood end-to-end coverage for cluster resources first, then namespace internals; matches the new reconciliation behavior.
1477-1483: LGTM: helpful status helper for cross-tier clusterresources
withStatusClusterResourcesInTierneatly sets up cross-tier last-applied refs for upgrade tests.test/nstemplateset_assertion.go (2)
160-166: LGTM: added nil-guard for Spec.ClusterResourcesThe extra NotNil check prevents panics in assertions.
168-174: LGTM: status-based clusterresources assertion
HasLastAppliedClusterResourcesTemplateRefcomplements the spec-based assertion well.controllers/nstemplateset/cluster_resources.go (1)
5-7: OK — go.mod targets Go 1.23,slicesimport is safego.mod declares "go 1.23.0" (>=1.21), so the standard-library
slicespackage is supported; no change needed.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
controllers/nstemplateset/cluster_resources.go (1)
115-120: Potential performance issue with O(n²) nested loop for object matchingThe nested loop to find and remove matching objects has O(n²) time complexity. With many cluster resources, this could cause performance issues.
Consider using a map for O(1) lookups:
- // by removing the new objects from the current objects, we are going to be left with the objects - // that should be removed from the cluster after we're done with this loop - for oldIdx, curObj := range currentObjects { - if commonclient.SameGVKandName(curObj, newObj) { - currentObjects = slices.Delete(currentObjects, oldIdx, oldIdx+1) - break - } - } + // by removing the new objects from the current objects, we are going to be left with the objects + // that should be removed from the cluster after we're done with this loop + currentObjectsMap := make(map[string]int) + for i, curObj := range currentObjects { + key := fmt.Sprintf("%s/%s", curObj.GetObjectKind().GroupVersionKind().String(), curObj.GetName()) + currentObjectsMap[key] = i + } + + newObjKey := fmt.Sprintf("%s/%s", newObj.GetObjectKind().GroupVersionKind().String(), newObj.GetName()) + if idx, found := currentObjectsMap[newObjKey]; found { + currentObjects = slices.Delete(currentObjects, idx, idx+1) + delete(currentObjectsMap, newObjKey) + }However, note that this approach requires careful index management after deletions.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
controllers/nstemplateset/cluster_resources.go(3 hunks)controllers/nstemplateset/cluster_resources_test.go(13 hunks)make/podman.mk(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
controllers/nstemplateset/cluster_resources_test.go (2)
test/nstemplateset_assertion.go (6)
AssertThatNSTemplateSet(35-41)Provisioning(224-230)Provisioned(216-222)UnableToProvisionClusterResources(258-265)Updating(232-238)UpdateFailed(240-247)test/cluster_assertion.go (4)
AssertThatCluster(20-25)Containing(60-66)WithoutLabel(53-58)WithLabel(45-51)
controllers/nstemplateset/cluster_resources.go (1)
controllers/nstemplateset/nstemplatetier.go (1)
SpaceName(100-100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build & push operator bundles for e2e tests
🔇 Additional comments (27)
make/podman.mk (1)
17-17: Good dependency: ensure auth before pushing.Adding podman-logged-in as a prerequisite for podman-push is the right call to gate pushes on auth.
controllers/nstemplateset/cluster_resources_test.go (18)
10-12: LGTM - Import additions for new functionality.The new imports
k8s.io/client-go/kubernetes/schemeandk8s.io/utils/strings/slicesare appropriately added to support the enhanced test functionality. These align with the PR's shift to template-driven resource handling.
41-56: LGTM - Test case correctly validates provisioning status.This test case effectively validates that the new single-reconciliation approach correctly sets the NSTemplateSet status to "Provisioning" and creates all cluster resources in one pass. The assertions for all expected resources (ClusterResourceQuota, ClusterRoleBinding, and Idlers) are comprehensive.
103-128: LGTM - Feature-gated resource testing is well-structured.The test correctly validates that only resources for enabled features are deployed while ensuring feature-gated resources are properly excluded. The use of
slices.Containsfor feature checking is appropriate and the parameterized approach covers all relevant scenarios.
138-138: LGTM - Consistent API usage.The transition from the previous multi-step approach to calling
manager.ensure(ctx, nsTmplSet)directly aligns with the PR's objective to apply all cluster resources in a single reconciliation.
150-155: LGTM - Status-based resource management test.The test properly validates the new status-based approach by using
withStatusClusterResources("abcde11")to indicate already-provisioned resources, which aligns with the PR's shift from label-based to status-based resource tracking.
195-195: LGTM - Consistent API usage throughout tests.All test cases consistently use
manager.ensure(ctx, nsTmplSet)which aligns with the single-reconciliation approach described in the PR objectives.Also applies to: 214-214, 361-361, 396-396, 424-424, 456-456, 488-488, 517-517, 546-546, 584-584, 612-612, 648-648, 672-672, 699-699, 716-716, 743-743, 762-762, 809-809
199-199: LGTM - Updated error message reflects new template-based approach.The error message correctly references "TierTemplate for the to-be-applied cluster resources" which aligns with the PR's shift to template-driven resource handling.
222-222: LGTM - Detailed error message for cluster resource application failures.The error message provides comprehensive details including resource name, GVK, and failure context, which will aid in debugging cluster resource application issues.
231-232: LGTM - Test environment setup moved to appropriate location.Moving the environment variable setup to the beginning of the test function ensures it's properly configured for all test cases in the
TestDeleteClusterResourcessuite.
237-237: LGTM - Updated NSTemplateSet setup with status-based approach.The test correctly uses both
withClusterResources("abcde11")andwithStatusClusterResources("abcde11")which reflects the new status-based resource tracking approach described in the PR.
244-244: LGTM - Consistent deletion API usage.All deletion test cases consistently use
manager.delete(ctx, nsTmplSet)which maintains API consistency across the test suite.Also applies to: 264-264, 288-288, 301-301, 317-317
250-250: LGTM - Correct assertion for ClusterRoleBinding deletion.The assertion correctly verifies the absence of the ClusterRoleBinding with the expected name format
spacename+"-tekton-view".
255-255: LGTM - Status-based resource tracking in deletion tests.The tests properly use
withStatusClusterResources("abcde11")to indicate the current state of cluster resources, which aligns with the PR's shift from label-based to status-based tracking.Also applies to: 281-281
336-336: LGTM - Comprehensive variable naming consistency.The consistent use of
spaceNameinstead ofspacenamethroughout the promotion and update tests improves code readability and follows Go naming conventions. All resource assertions correctly reference the updated variable names.Also applies to: 350-358, 365-370, 381-393, 400-410, 419-434, 445-467, 478-497, 513-533, 544-558, 571-594, 604-623, 635-639, 643-656, 661-679, 684-707, 711-724, 736-754, 758-775, 787-816
343-348: LGTM - Template creation for previous state simulation.The creation of
previousTierTemplateobjects correctly simulates the previous state of cluster resources, which is essential for testing the status-based comparison logic introduced in this PR.Also applies to: 375-380, 413-418, 438-443, 471-476, 537-542, 565-570, 598-603, 730-735, 790-795
353-353: LGTM - Status-based resource tracking implementation.The use of
withStatusClusterResourcesInTierandwithStatusClusterResourcesproperly implements the new status-based approach where the controller determines resource membership by comparing current and previous templateRef values stored in status.Also applies to: 384-384, 448-448, 481-481, 543-543, 575-575, 604-604, 665-665, 688-688, 802-802
369-371: Wrong type asserted for non-existence; should check CRQ, not CRB.This assertion verifies absence of a ClusterRoleBinding named "for-empty", but based on the test context and the previous review comment, this should be checking for a ClusterResourceQuota instead.
Apply this diff:
AssertThatCluster(t, cl). HasResource("for-"+spaceName, "av1.ClusterResourceQuota{}). - HasNoResource("for-empty", "av1.ClusterResourceQuota{}) + HasNoResource("for-empty", "av1.ClusterResourceQuota{})
590-590: LGTM - Updated error messages reflect new resource handling.The error messages correctly reference the new resource naming conventions and provide detailed failure context that will help with troubleshooting cluster resource application failures.
Also applies to: 619-619, 749-749
controllers/nstemplateset/cluster_resources.go (8)
46-48: Already flagged: Don't short‑circuit when templateRef is unchangedThis early return prevents re-applying missing/modified cluster resources when templateRef hasn't changed, breaking drift healing. Since you don't watch cluster resources, periodic reconciles will no-op and won't self-heal.
188-209: Already flagged: Fix variable scope issueThe
errvariable from the shortifdeclaration is not accessible outside the if-block, causing a compile error on line 195.
171-209: Good addition of explicit delete functionalityThe new
deletemethod provides a clean way to explicitly remove all cluster-scoped resources when an NSTemplateSet is deleted. The implementation correctly processes the existing template to determine what resources to remove and handles errors appropriately.
93-106: Status management approach looks reasonableThe
changeStatusIfNeededclosure properly differentiates between first deployment and update scenarios, setting appropriate statuses only once. This addresses the previous review concern about premature status setting.
68-82: Template processing logic is well-structuredThe approach of processing both new and old tier templates to determine the desired state and current state is clean and logical. Error handling includes appropriate context for debugging.
117-117: Fix slices.Delete usage - requires slice reassignmentThe
slices.Deletefunction "modifies the contents of the slice" and "operates in place", but you must reassign the result to update the slice length properly.Apply this diff:
- currentObjects = slices.Delete(currentObjects, oldIdx, oldIdx+1) + currentObjects = slices.Delete(currentObjects, oldIdx, oldIdx+1)Actually, looking more carefully at the code, the assignment is already present. However, there's still a logic issue - after deleting an element, the indices of subsequent elements shift, but the loop continues with the original indices. This can cause elements to be skipped.
- for oldIdx, curObj := range currentObjects { - if commonclient.SameGVKandName(curObj, newObj) { - currentObjects = slices.Delete(currentObjects, oldIdx, oldIdx+1) - break - } - } + for oldIdx := 0; oldIdx < len(currentObjects); oldIdx++ { + curObj := currentObjects[oldIdx] + if commonclient.SameGVKandName(curObj, newObj) { + currentObjects = slices.Delete(currentObjects, oldIdx, oldIdx+1) + break + } + }Likely an incorrect or invalid review comment.
30-30: Signature change is safe — no callers depend on the removed booleanVerified: the only production caller is the Reconciler which calls r.clusterResources.ensure(ctx, nsTmplSet) and handles only an error (controllers/nstemplateset/nstemplateset_controller.go:139); tests in controllers/nstemplateset/cluster_resources_test.go also call it as error-only. No callers require the old (bool, error) return.
141-143: Verified — deleteObsoleteObjects is compatible with bulk deletion. The function iterates currentObjs and deletes entries not present in newObjects; cluster_resources.ensure prunes currentObjects against newObjs before calling deleteObsoleteObjects(ctx, r.Client, currentObjects, nil), so passing nil is correct. See controllers/nstemplateset/nstemplateset_controller.go:238–258.
c04a460 to
013cff1
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
controllers/nstemplateset/cluster_resources_test.go (1)
89-103: Subtest state leak: reuse of nsTmplSet mutates annotations across iterations.Create a fresh NSTemplateSet per subtest; otherwise “no enabled features” can inherit prior features.
Apply:
- if testRun.enabledFeatures != "" { - nsTmplSet.Annotations = map[string]string{ - toolchainv1alpha1.FeatureToggleNameAnnotationKey: testRun.enabledFeatures, - } - } - manager, fakeClient := prepareClusterResourcesManager(t, nsTmplSet) + // fresh NSTemplateSet per subtest to avoid leakage from previous iterations + nsTmplSet := newNSTmplSet(namespaceName, spacename, "advanced", + withNamespaces("abcde11", "dev"), withClusterResources("abcde11")) + if testRun.enabledFeatures != "" { + nsTmplSet.Annotations = map[string]string{ + toolchainv1alpha1.FeatureToggleNameAnnotationKey: testRun.enabledFeatures, + } + } + manager, fakeClient := prepareClusterResourcesManager(t, nsTmplSet)
🧹 Nitpick comments (3)
controllers/nstemplateset/cluster_resources.go (2)
24-26: Fix outdated method doc: signature no longer returns (bool, error).Apply:
-// Returns `true, nil` if something was changed, `false, nil` if nothing changed, `false, err` if an error occurred +// Returns nil on success, or an error on failure.
122-130: Avoid err shadowing in apply error path.Shadowing
errreduces clarity. Use a different local or reuseerr.Apply:
- if _, err = r.apply(ctx, nsTmplSet, newTierTemplate, newObj); err != nil { - err := fmt.Errorf("failed to apply changes to the cluster resource %s, %s: %w", newObj.GetName(), newObj.GetObjectKind().GroupVersionKind().String(), err) - return r.wrapErrorWithStatusUpdate(ctx, nsTmplSet, failureStatusReason, err, "failure while syncing cluster resources") - } + if _, err = r.apply(ctx, nsTmplSet, newTierTemplate, newObj); err != nil { + wrapErr := fmt.Errorf("failed to apply changes to the cluster resource %s, %s: %w", + newObj.GetName(), newObj.GetObjectKind().GroupVersionKind().String(), err) + return r.wrapErrorWithStatusUpdate(ctx, nsTmplSet, failureStatusReason, wrapErr, "failure while syncing cluster resources") + }controllers/nstemplateset/cluster_resources_test.go (1)
756-757: Fix test name grammar.Apply:
- t.Run("update to abcde13 fails because it find the template", func(t *testing.T) { + t.Run("update to abcde13 fails because it cannot find the template", func(t *testing.T) {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
controllers/nstemplateset/cluster_resources.go(3 hunks)controllers/nstemplateset/cluster_resources_test.go(13 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:46-49
Timestamp: 2025-09-24T10:52:00.999Z
Learning: In the NSTemplateSet cluster resources controller, the current approach compares template references (oldTemplateRef vs newTemplateRef) rather than comparing actual cluster resource contents. This assumes TierTemplates are immutable - any content changes result in new TierTemplate versions with new references. The short-circuit when template refs are equal is intentional based on this design, as it cannot detect changes within the same TierTemplate.
📚 Learning: 2025-09-24T10:52:00.999Z
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:46-49
Timestamp: 2025-09-24T10:52:00.999Z
Learning: In the NSTemplateSet cluster resources controller, the current approach compares template references (oldTemplateRef vs newTemplateRef) rather than comparing actual cluster resource contents. This assumes TierTemplates are immutable - any content changes result in new TierTemplate versions with new references. The short-circuit when template refs are equal is intentional based on this design, as it cannot detect changes within the same TierTemplate.
Applied to files:
controllers/nstemplateset/cluster_resources_test.gocontrollers/nstemplateset/cluster_resources.go
🧬 Code graph analysis (2)
controllers/nstemplateset/cluster_resources_test.go (2)
test/nstemplateset_assertion.go (6)
AssertThatNSTemplateSet(35-41)Provisioning(224-230)Provisioned(216-222)UnableToProvisionClusterResources(258-265)Updating(232-238)UpdateFailed(240-247)test/cluster_assertion.go (4)
AssertThatCluster(20-25)Containing(60-66)WithoutLabel(53-58)WithLabel(45-51)
controllers/nstemplateset/cluster_resources.go (1)
controllers/nstemplateset/nstemplatetier.go (1)
SpaceName(100-100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build & push operator bundles for e2e tests
🔇 Additional comments (2)
controllers/nstemplateset/cluster_resources.go (2)
46-48: Short-circuit skips drift healing when refs match. Confirm this is intended.With
oldTemplateRef == newTemplateRef, no reconcile occurs, so externally deleted/modified cluster resources won’t be re-applied. Given the immutable TierTemplate assumption, this may be OK, but confirm we don’t need self-healing here (or add a periodic resync/watch later).
5-7: Stdlibslicesis fine — repo targets Go 1.23.0
go.mod setsgo 1.23.0, so the standard-libraryslicespackage is supported; no change required.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (4)
controllers/nstemplateset/cluster_resources.go (4)
22-24: Update function doc to match new signature (no bool return).Comment still mentions a (bool, error) return. Please align the doc with the current signature.
Apply:
-// ensure ensures that the cluster resources exist. -// Returns `true, nil` if something was changed, `false, nil` if nothing changed, `false, err` if an error occurred +// ensure ensures that the cluster resources exist. +// Returns an error on failure; nil on success.
112-118: Optional: avoid O(n²) scan by indexing currentObjects.Not critical given small sets, but you can pre-index currentObjects by GVK+name for O(1) deletes to simplify and speed up the loop.
I can provide a small helper to build a map keyed by GroupVersionKind+Name and reconstruct the slice after deletions if you’d like.
121-129: Avoid err shadowing; improve readability.The inner
err := fmt.Errorf(...)shadows the outer err. Use a different name.- if _, err = r.apply(ctx, nsTmplSet, newTierTemplate, newObj); err != nil { - err := fmt.Errorf("failed to apply changes to the cluster resource %s, %s: %w", newObj.GetName(), newObj.GetObjectKind().GroupVersionKind().String(), err) - return r.wrapErrorWithStatusUpdate(ctx, nsTmplSet, failureStatusReason, err, "failure while syncing cluster resources") + if _, err = r.apply(ctx, nsTmplSet, newTierTemplate, newObj); err != nil { + wrapErr := fmt.Errorf("failed to apply changes to the cluster resource %s, %s: %w", newObj.GetName(), newObj.GetObjectKind().GroupVersionKind().String(), err) + return r.wrapErrorWithStatusUpdate(ctx, nsTmplSet, failureStatusReason, wrapErr, "failure while syncing cluster resources") }
181-181: Nit: use GetName() for consistency.Minor consistency improvement only.
- currentObjects, err := currentTierTemplate.process(r.Scheme, map[string]string{SpaceName: nsTmplSet.Name}) + currentObjects, err := currentTierTemplate.process(r.Scheme, map[string]string{SpaceName: nsTmplSet.GetName()})
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
controllers/nstemplateset/cluster_resources.go(3 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:46-49
Timestamp: 2025-09-24T10:52:00.999Z
Learning: In the NSTemplateSet cluster resources controller, the current approach compares template references (oldTemplateRef vs newTemplateRef) rather than comparing actual cluster resource contents. This assumes TierTemplates are immutable - any content changes result in new TierTemplate versions with new references. The short-circuit when template refs are equal is intentional based on this design, as it cannot detect changes within the same TierTemplate.
📚 Learning: 2025-09-24T10:52:00.999Z
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:46-49
Timestamp: 2025-09-24T10:52:00.999Z
Learning: In the NSTemplateSet cluster resources controller, the current approach compares template references (oldTemplateRef vs newTemplateRef) rather than comparing actual cluster resource contents. This assumes TierTemplates are immutable - any content changes result in new TierTemplate versions with new references. The short-circuit when template refs are equal is intentional based on this design, as it cannot detect changes within the same TierTemplate.
Applied to files:
controllers/nstemplateset/cluster_resources.go
📚 Learning: 2025-09-24T11:19:39.211Z
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:187-196
Timestamp: 2025-09-24T11:19:39.211Z
Learning: In controllers/nstemplateset/cluster_resources.go, the delete method specifically handles cluster-scoped resources only (not namespaced resources), so using types.NamespacedName{Name: toDelete.GetName()} in Client.Get calls is correct since cluster resources don't have namespaces. However, using runtimeclient.ObjectKeyFromObject() is preferred for future-proofing.
Applied to files:
controllers/nstemplateset/cluster_resources.go
🧬 Code graph analysis (1)
controllers/nstemplateset/cluster_resources.go (1)
controllers/nstemplateset/nstemplatetier.go (1)
SpaceName(100-100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build & push operator bundles for e2e tests
🔇 Additional comments (5)
controllers/nstemplateset/cluster_resources.go (5)
45-47: Early return disables drift healing when templateRef is unchanged — confirm intent.Per your stated design (TierTemplates treated as immutable; content changes use new refs), this short-circuit is intentional. Note, though, it also prevents re-creating cluster resources if they are manually deleted or drifted without a ref change, even if a reconcile is triggered.
If that’s acceptable for now, keep as-is; otherwise, consider re-applying idempotently while deferring status changes until a real mutation is detected (using the createdOrModified flag from apply).
Do you want to keep this behavior until TierTemplateRevisions land, or should we enable self-healing now? I can propose a minimal refactor that avoids status churn.
140-142: Good reuse of deleteObsoleteObjects.Switching to the shared helper aligns behavior with namespaces/space-roles and reduces duplication.
189-191: LGTM: use ObjectKeyFromObject for Get.Correct for cluster-scoped resources and more future-proof than constructing NamespacedName manually.
5-7: Verified: go.mod uses Go 1.23.0 — stdlibslices(requires ≥1.21) is supported.
30-47: Confirmed: periodic resync used; no additional watch required.SetupWithManager explicitly avoids watching cluster-scoped resources and relies on controller-runtime's periodic resync (manager.Options.Cache.SyncPeriod) — see controllers/nstemplateset/nstemplateset_controller.go lines 65–73; the early return in cluster_resources.ensure on identical TemplateRef is intentional.
There was a problem hiding this comment.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
controllers/nstemplateset/cluster_resources.go(3 hunks)controllers/nstemplateset/cluster_resources_test.go(13 hunks)controllers/nstemplateset/features.go(1 hunks)controllers/nstemplateset/features_test.go(1 hunks)controllers/nstemplateset/nstemplateset_controller_test.go(14 hunks)controllers/nstemplateset/status.go(1 hunks)
👮 Files not reviewed due to content moderation or server errors (3)
- controllers/nstemplateset/nstemplateset_controller_test.go
- controllers/nstemplateset/features.go
- controllers/nstemplateset/features_test.go
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:46-49
Timestamp: 2025-09-24T10:52:01.009Z
Learning: In the NSTemplateSet cluster resources controller, the current approach compares template references (oldTemplateRef vs newTemplateRef) rather than comparing actual cluster resource contents. This assumes TierTemplates are immutable - any content changes result in new TierTemplate versions with new references. The short-circuit when template refs are equal is intentional based on this design, as it cannot detect changes within the same TierTemplate.
📚 Learning: 2025-09-24T11:19:39.229Z
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:187-196
Timestamp: 2025-09-24T11:19:39.229Z
Learning: In controllers/nstemplateset/cluster_resources.go, the delete method specifically handles cluster-scoped resources only (not namespaced resources), so using types.NamespacedName{Name: toDelete.GetName()} in Client.Get calls is correct since cluster resources don't have namespaces. However, using runtimeclient.ObjectKeyFromObject() is preferred for future-proofing.
Applied to files:
controllers/nstemplateset/cluster_resources_test.gocontrollers/nstemplateset/cluster_resources.gocontrollers/nstemplateset/nstemplateset_controller_test.go
📚 Learning: 2025-09-24T10:52:01.009Z
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:46-49
Timestamp: 2025-09-24T10:52:01.009Z
Learning: In the NSTemplateSet cluster resources controller, the current approach compares template references (oldTemplateRef vs newTemplateRef) rather than comparing actual cluster resource contents. This assumes TierTemplates are immutable - any content changes result in new TierTemplate versions with new references. The short-circuit when template refs are equal is intentional based on this design, as it cannot detect changes within the same TierTemplate.
Applied to files:
controllers/nstemplateset/cluster_resources.gocontrollers/nstemplateset/nstemplateset_controller_test.go
🧬 Code graph analysis (3)
controllers/nstemplateset/cluster_resources_test.go (2)
test/nstemplateset_assertion.go (6)
AssertThatNSTemplateSet(35-41)Provisioning(224-230)Provisioned(216-222)Updating(232-238)UnableToProvisionClusterResources(258-265)UpdateFailed(240-247)test/cluster_assertion.go (4)
AssertThatCluster(20-25)Containing(60-66)WithoutLabel(53-58)WithLabel(45-51)
controllers/nstemplateset/cluster_resources.go (1)
controllers/nstemplateset/nstemplatetier.go (1)
SpaceName(100-100)
controllers/nstemplateset/nstemplateset_controller_test.go (3)
test/nstemplateset_assertion.go (5)
AssertThatNSTemplateSet(35-41)Provisioning(224-230)Updating(232-238)Provisioned(216-222)NewTierTemplateName(212-214)test/cluster_assertion.go (2)
AssertThatCluster(20-25)WithLabel(45-51)test/namespace_assertion.go (1)
AssertThatNamespace(32-38)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build & push operator bundles for e2e tests
| if oldTemplateRef == newTemplateRef && !featuresChanged(nsTmplSet) { | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Restore drift healing when templateRef is unchanged
Short-circuiting when oldTemplateRef == newTemplateRef means we never even look at (let alone re-apply) the expected objects. If any cluster-scoped resource is deleted or mutated out-of-band, we won’t recreate or fix it because we bail out before the reconciliation logic runs. The previous implementation healed this drift; the new guard regresses that behaviour.
Please keep the “no work” fast-path only after you’ve proven that every object that should exist actually does (and no deletions are pending). One way to do that is to process the templates, detect whether any create/update/delete is required, and only then early-return; alternatively, remove the short-circuit and restructure the status transition so it’s triggered only when we truly perform work, avoiding spurious status flips. Either way, we must retain the ability to reconcile missing resources under the same templateRef.
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
controllers/nstemplateset/cluster_resources_test.go(13 hunks)controllers/nstemplateset/features_test.go(1 hunks)controllers/nstemplateset/status.go(2 hunks)controllers/nstemplateset/status_test.go(2 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:46-49
Timestamp: 2025-09-24T10:52:01.009Z
Learning: In the NSTemplateSet cluster resources controller, the current approach compares template references (oldTemplateRef vs newTemplateRef) rather than comparing actual cluster resource contents. This assumes TierTemplates are immutable - any content changes result in new TierTemplate versions with new references. The short-circuit when template refs are equal is intentional based on this design, as it cannot detect changes within the same TierTemplate.
📚 Learning: 2025-09-24T11:19:39.229Z
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:187-196
Timestamp: 2025-09-24T11:19:39.229Z
Learning: In controllers/nstemplateset/cluster_resources.go, the delete method specifically handles cluster-scoped resources only (not namespaced resources), so using types.NamespacedName{Name: toDelete.GetName()} in Client.Get calls is correct since cluster resources don't have namespaces. However, using runtimeclient.ObjectKeyFromObject() is preferred for future-proofing.
Applied to files:
controllers/nstemplateset/cluster_resources_test.go
🧬 Code graph analysis (1)
controllers/nstemplateset/cluster_resources_test.go (2)
test/nstemplateset_assertion.go (6)
AssertThatNSTemplateSet(35-41)Provisioning(224-230)Provisioned(216-222)Updating(232-238)UnableToProvisionClusterResources(258-265)UpdateFailed(240-247)test/cluster_assertion.go (4)
AssertThatCluster(20-25)Containing(60-66)WithoutLabel(53-58)WithLabel(45-51)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build & push operator bundles for e2e tests
🔇 Additional comments (11)
controllers/nstemplateset/features_test.go (1)
101-158: LGTM! Good test coverage for feature change detection.The new
TestFeaturesChangedtest provides comprehensive coverage for all feature change scenarios including edge cases with duplicates.controllers/nstemplateset/status.go (2)
134-140: Clean implementation of cluster resources drift detection.The simplified comparison logic using nil checks and TemplateRef comparison aligns well with the PR's goal of using status-based membership determination instead of labels. The logic correctly handles all four cases: both nil, both non-nil with same/different refs, and one nil/one non-nil.
This aligns with the learning from metlos about TierTemplates being immutable, where content changes result in new versions with new references.
109-131: Approve feature comparison logicSorted and deduplicated feature lists are compared bidirectionally and covered by existing tests (e.g., “should detect duplicates” in status_test.go), so no further changes needed.
controllers/nstemplateset/cluster_resources_test.go (8)
184-186: LGTM! Test correctly sets up the feature-1 CRQ.The test correctly creates a ClusterResourceQuota with the name "feature-1-for-" + spacename to test cleanup of resources from disabled features. The resource is properly configured for the test scenario.
296-297: Verify correct resource type in assertion.Good that this was already fixed based on past review comments - the test now correctly checks for absence of
ClusterResourceQuotainstead ofClusterRoleBindingfor the "for-empty" resource.
36-56: LGTM! Test validates status transitions correctly.The test properly validates that the controller sets the status to provisioning and creates the expected cluster resources (ClusterResourceQuota, ClusterRoleBinding, and Idler resources).
149-174: Test properly validates no-op when resources exist.Good test for the scenario where all cluster resources are already created with matching status. The test correctly verifies that the status remains "Provisioned" and all expected resources still exist.
176-201: Excellent test for feature toggle cleanup.This test effectively validates the cleanup of cluster resources when features are disabled. It correctly:
- Sets up a scenario where feature-1 was enabled (in status) but is now disabled (feature-2 in annotation)
- Creates a feature-1 resource that should be cleaned up
- Verifies the resource is deleted and status changes to "Updating"
806-843: Good test coverage for feature removal scenarios.The test properly validates that when a feature annotation is removed from the NSTemplateSet, the corresponding cluster resource is deleted. This aligns well with the new status-based tracking approach.
370-374: Fix test parameter usage - pass both key and value.The
test.WithParamscall appears to only pass the value without the key. Based on the codebase patterns, it should pass both the parameter name constant and the value.Apply this fix consistently throughout the file:
- test.WithParams(spacename), + test.WithParams(SpaceName, spacename),
500-502: Fix inconsistent parameter usage in test templates.Multiple instances of
test.WithParams(spacename)need to be fixed to include the parameter key.- test.WithParams(spacename), + test.WithParams(SpaceName, spacename),
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
openshift-ci/Dockerfile.tools(3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:46-49
Timestamp: 2025-09-24T10:52:01.009Z
Learning: In the NSTemplateSet cluster resources controller, the current approach compares template references (oldTemplateRef vs newTemplateRef) rather than comparing actual cluster resource contents. This assumes TierTemplates are immutable - any content changes result in new TierTemplate versions with new references. The short-circuit when template refs are equal is intentional based on this design, as it cannot detect changes within the same TierTemplate.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build & push operator bundles for e2e tests
43ee550 to
ef59420
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
controllers/nstemplateset/nstemplateset_controller_test.go (1)
1169-1176: Fix second namespace assertion to check 'stage', not 'dev'The “second namespace” verification repeats dev; it should assert stage deletion.
Apply:
- secondtNSName := fmt.Sprintf("%s-dev", spacename) + secondtNSName := fmt.Sprintf("%s-stage", spacename)
🧹 Nitpick comments (1)
controllers/nstemplateset/cluster_resources.go (1)
114-119: Potential inefficiency with repeated slice searchUsing
slices.Deleteinside a loop can be O(n²) in the worst case, as each deletion requires shifting elements and the outer loop searches through the slice again for each new object.Consider optimizing by building a map or using a different approach:
- // by removing the new objects from the current objects, we are going to be left with the objects - // that should be removed from the cluster after we're done with this loop - for oldIdx, curObj := range currentObjects { - if commonclient.SameGVKandName(curObj, newObj) { - currentObjects = slices.Delete(currentObjects, oldIdx, oldIdx+1) - break - } - } + // Mark objects that should not be deleted by filtering them out later + toKeep := make(map[string]bool) + for _, newObj := range newObjs { + if shouldCreate(newObj, nsTmplSet) { + key := fmt.Sprintf("%s/%s", newObj.GetObjectKind().GroupVersionKind().String(), newObj.GetName()) + toKeep[key] = true + } + } + + // Filter currentObjects to only those not in toKeep + filtered := currentObjects[:0] + for _, curObj := range currentObjects { + key := fmt.Sprintf("%s/%s", curObj.GetObjectKind().GroupVersionKind().String(), curObj.GetName()) + if !toKeep[key] { + filtered = append(filtered, curObj) + } + } + currentObjects = filtered
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
controllers/nstemplateset/cluster_resources.go(3 hunks)controllers/nstemplateset/cluster_resources_test.go(13 hunks)controllers/nstemplateset/features.go(1 hunks)controllers/nstemplateset/features_test.go(1 hunks)controllers/nstemplateset/nstemplateset_controller.go(2 hunks)controllers/nstemplateset/nstemplateset_controller_test.go(14 hunks)controllers/nstemplateset/status.go(2 hunks)controllers/nstemplateset/status_test.go(2 hunks)test/cluster_assertion.go(1 hunks)test/nstemplateset_assertion.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- controllers/nstemplateset/nstemplateset_controller.go
- test/nstemplateset_assertion.go
- controllers/nstemplateset/features_test.go
- test/cluster_assertion.go
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:46-49
Timestamp: 2025-09-24T10:52:01.009Z
Learning: In the NSTemplateSet cluster resources controller, the current approach compares template references (oldTemplateRef vs newTemplateRef) rather than comparing actual cluster resource contents. This assumes TierTemplates are immutable - any content changes result in new TierTemplate versions with new references. The short-circuit when template refs are equal is intentional based on this design, as it cannot detect changes within the same TierTemplate.
📚 Learning: 2025-09-24T10:52:01.009Z
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:46-49
Timestamp: 2025-09-24T10:52:01.009Z
Learning: In the NSTemplateSet cluster resources controller, the current approach compares template references (oldTemplateRef vs newTemplateRef) rather than comparing actual cluster resource contents. This assumes TierTemplates are immutable - any content changes result in new TierTemplate versions with new references. The short-circuit when template refs are equal is intentional based on this design, as it cannot detect changes within the same TierTemplate.
Applied to files:
controllers/nstemplateset/cluster_resources.gocontrollers/nstemplateset/nstemplateset_controller_test.go
📚 Learning: 2025-09-24T11:19:39.229Z
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:187-196
Timestamp: 2025-09-24T11:19:39.229Z
Learning: In controllers/nstemplateset/cluster_resources.go, the delete method specifically handles cluster-scoped resources only (not namespaced resources), so using types.NamespacedName{Name: toDelete.GetName()} in Client.Get calls is correct since cluster resources don't have namespaces. However, using runtimeclient.ObjectKeyFromObject() is preferred for future-proofing.
Applied to files:
controllers/nstemplateset/cluster_resources.gocontrollers/nstemplateset/nstemplateset_controller_test.gocontrollers/nstemplateset/cluster_resources_test.go
🧬 Code graph analysis (3)
controllers/nstemplateset/cluster_resources.go (1)
controllers/nstemplateset/nstemplatetier.go (1)
SpaceName(100-100)
controllers/nstemplateset/nstemplateset_controller_test.go (3)
test/nstemplateset_assertion.go (5)
AssertThatNSTemplateSet(35-41)Provisioning(224-230)Updating(232-238)Provisioned(216-222)NewTierTemplateName(212-214)test/cluster_assertion.go (2)
AssertThatCluster(20-25)WithLabel(45-51)test/namespace_assertion.go (1)
AssertThatNamespace(32-38)
controllers/nstemplateset/cluster_resources_test.go (2)
test/nstemplateset_assertion.go (6)
AssertThatNSTemplateSet(35-41)Provisioning(224-230)Provisioned(216-222)Updating(232-238)UnableToProvisionClusterResources(258-265)UpdateFailed(240-247)test/cluster_assertion.go (3)
Containing(60-66)WithoutLabel(53-58)WithLabel(45-51)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build & push operator bundles for e2e tests
🔇 Additional comments (21)
controllers/nstemplateset/status.go (3)
112-131: Symmetric feature drift check looks correctSet-style comparison in both directions with normalization is solid. Returning the cleaned, deduped list for status is a nice touch.
135-139: Nil-aware clusterResources drift check LGTMConcise and correct for the intended “compare by TemplateRef only” design. Based on learnings.
5-5: stdlib ‘slices’ import is supported
go.mod specifies Go 1.23, satisfying the ≥1.21 requirement.controllers/nstemplateset/features.go (1)
28-32: Thin wrapper is fineSimple delegate to the status helper; no issues.
controllers/nstemplateset/status_test.go (1)
277-343: Good coverage for feature toggles sync (including duplicates)Assertions match the normalized, deduped output and symmetric comparison behavior.
controllers/nstemplateset/nstemplateset_controller_test.go (2)
615-641: Provisioning expectations updated appropriatelyAsserting CRQ/CRB/Idlers created in the first pass and withholding TemplateRef/Tier labels until inner resources are applied aligns with the new single-pass logic. Based on learnings.
1418-1422: Helper to seed Status.FeatureToggles is appropriateThis improves test setup clarity around status-vs-annotation scenarios.
controllers/nstemplateset/cluster_resources_test.go (6)
41-56: Status transitions and base objects creation assertions LGTMValidates initial provisioning path and aligns with unified apply of cluster resources.
148-174: No-op path when already provisioned is correctEnsures idempotency and preserves Provisioned condition with existing resources.
216-231: Error path for missing TierTemplate is asserted wellMessage and UnableToProvisionClusterResources condition match the controller’s behavior.
246-250: Creation failure path assertion looks goodChecks error propagation and status condition; consistent with the new apply flow.
295-298: Correct resource type in absence assertionAsserting HasNoResource for CRQ “for-empty” is now correct.
184-186: Seed CRQ with correct space and feature annotationThe fixture creates a CRQ with empty SpaceLabel and only overrides Name. It may not be associated with the user space for cleanup. Seed labels and feature annotation explicitly.
Apply:
- crq := newClusterResourceQuota("", "advanced") - crq.Name = "feature-1-for-" + spacename // manually create the resource with the name matching the feature resource + crq := newClusterResourceQuota(spacename, "advanced", + withFeatureAnnotation("feature-1"), + withName("feature-1-for-"+spacename))controllers/nstemplateset/cluster_resources.go (8)
5-6: LGTM!Adding necessary imports for improved functionality.
29-47: LGTM! Template reference comparison with feature detectionThe short-circuit logic correctly compares template references and includes feature change detection. Based on learnings, this approach assumes TierTemplates are immutable with versioning through new template references, which is the intended design pattern.
49-81: LGTM! Comprehensive template processing with proper error handlingThe template retrieval and processing logic handles both new and old templates correctly, with contextual error wrapping for cluster resource failures.
83-105: LGTM! Well-structured status managementThe status change detection and conditional updates properly differentiate between first deployment and updates, with appropriate failure status reasons.
107-130: LGTM! Efficient object processing with proper status updatesThe object processing loop correctly applies new objects and removes them from the current objects list for later cleanup. The status change is properly handled before applying resources.
132-143: LGTM! Proper cleanup of obsolete objectsThe logic correctly handles deletion of remaining objects with appropriate status updates before deletion.
171-208: LGTM! Comprehensive delete implementation with proper error handlingThe delete method correctly processes the current template to determine objects to delete, handles missing resources gracefully, and includes proper logging and error handling.
187-206: Delete error handling is correct Theelse ifensures that a NotFound error handled in the first branch is never re‐evaluated in the second; no change required.Likely an incorrect or invalid review comment.
|
/retest I've seen Not sure what might be causing the flakiness if it is introduced by this PR yet. |
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
controllers/nstemplateset/cluster_resources_test.go (1)
373-373: Fix template parameter usage: missing key intest.WithParams.Multiple calls to
test.WithParams(spacename)are missing the parameter key. The function expects key-value pairs, but only the value is provided. This would cause template parameter substitution to fail.Based on previous review feedback, replace all occurrences:
-test.WithParams(spacename) +test.WithParams(SpaceName, spacename)This ensures the
SpaceNameconstant is used as the key andspacenameas the value.Also applies to: 405-405, 443-443, 468-468, 501-501, 567-567, 595-595, 628-628, 760-760, 820-820
🧹 Nitpick comments (1)
controllers/nstemplateset/cluster_resources_test.go (1)
34-34: Inconsistent variable naming:spacenamevsspaceName.The file uses both
spacename(line 34) andspaceName(line 363) to represent the space name. This inconsistency can lead to confusion and potential typos. Choose one convention and apply it consistently throughout the file.Consider standardizing on
spaceName(camelCase) as it follows Go naming conventions:- spacename := "johnsmith" + spaceName := "johnsmith"Then update all references to use the standardized name.
Also applies to: 363-363
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
controllers/nstemplateset/cluster_resources_test.go(13 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
controllers/nstemplateset/cluster_resources_test.go (2)
test/nstemplateset_assertion.go (6)
AssertThatNSTemplateSet(35-41)Provisioning(224-230)Provisioned(216-222)Updating(232-238)UnableToProvisionClusterResources(258-265)UpdateFailed(240-247)test/cluster_assertion.go (4)
AssertThatCluster(20-25)Containing(60-66)WithoutLabel(53-58)WithLabel(45-51)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build & push operator bundles for e2e tests
🔇 Additional comments (2)
controllers/nstemplateset/cluster_resources_test.go (2)
148-174: LGTM: Idempotency test correctly validates no-op when resources are up-to-date.The test properly verifies that when
status.clusterresources.templaterefmatchesspec.clusterresources.templateref, no changes are made. This aligns with the PR's shift from label-based to status-based membership tracking.
176-201: LGTM: Feature cleanup logic correctly removes resources for disabled features.The test validates that resources for features no longer in the enabled list are properly cleaned up. The manual creation of the feature-gated CRQ (lines 184-186) correctly simulates a previously deployed feature resource, and the assertion confirms it's removed.
… nstemplateset status instead of labels
… if this method got repurposed for namespaced-objects also
Co-authored-by: Matous Jobanek <mjobanek@redhat.com>
Co-authored-by: Matous Jobanek <mjobanek@redhat.com>
by splitting it up into more functions and structs.
2f42908 to
725c159
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (3)
controllers/nstemplateset/cluster_resources_test.go (1)
121-129: Guard against empty feature list to avoid sentinel ""Minor clarity: avoid Split on empty string to prevent a [""] slice.
- enabledFeatures := strings.Split(testRun.enabledFeatures, ",") + var enabledFeatures []string + if testRun.enabledFeatures != "" { + enabledFeatures = strings.Split(testRun.enabledFeatures, ",") + }controllers/nstemplateset/nstemplateset_controller.go (1)
235-239: Fix function comment: return type is error only, not (bool, error)The comment describes boolean returns, but deleteObsoleteObjects returns only error.
-// deleteObsoleteObjects takes template objects of the current tier and of the new tier (provided as newObjects param), -// compares their names and GVKs and deletes those ones that are in the current template but are not found in the new one. -// return `true, nil` if an object was deleted, `false, nil`/`false, err` otherwise +// deleteObsoleteObjects takes template objects of the current tier and of the new tier (provided as newObjects), +// compares their names and GVKs, and deletes objects present in currentObjs but not found in newObjects. +// Returns nil on success or an error if a deletion fails.controllers/nstemplateset/nstemplateset_controller_test.go (1)
836-849: Remove duplicate withStatusClusterResources optionwithStatusClusterResources("abcde11") is applied twice in this setup. It’s harmless but redundant.
nsTmplSet := newNSTmplSet(namespaceName, spacename, "advanced", withNamespaces("abcde12", "dev"), withClusterResources("abcde12"), - withStatusClusterResources("abcde11"), withSpaceRoles(map[string][]string{ "advanced-admin-abcde12": {spacename}, "advanced-viewer-abcde12": {"zorro", spacename, "viewer"}, }), withStatusNamespaces("abcde11", "dev", "stage"), withStatusClusterResources("abcde11"),
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
controllers/nstemplateset/cluster_resources.go(3 hunks)controllers/nstemplateset/cluster_resources_test.go(11 hunks)controllers/nstemplateset/features.go(1 hunks)controllers/nstemplateset/features_test.go(1 hunks)controllers/nstemplateset/nstemplateset_controller.go(2 hunks)controllers/nstemplateset/nstemplateset_controller_test.go(14 hunks)controllers/nstemplateset/status.go(2 hunks)controllers/nstemplateset/status_test.go(2 hunks)test/cluster_assertion.go(1 hunks)test/nstemplateset_assertion.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- test/cluster_assertion.go
- controllers/nstemplateset/features.go
- controllers/nstemplateset/status_test.go
- controllers/nstemplateset/status.go
- controllers/nstemplateset/features_test.go
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:46-49
Timestamp: 2025-09-24T10:52:01.009Z
Learning: In the NSTemplateSet cluster resources controller, the current approach compares template references (oldTemplateRef vs newTemplateRef) rather than comparing actual cluster resource contents. This assumes TierTemplates are immutable - any content changes result in new TierTemplate versions with new references. The short-circuit when template refs are equal is intentional based on this design, as it cannot detect changes within the same TierTemplate.
📚 Learning: 2025-09-24T10:52:01.009Z
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:46-49
Timestamp: 2025-09-24T10:52:01.009Z
Learning: In the NSTemplateSet cluster resources controller, the current approach compares template references (oldTemplateRef vs newTemplateRef) rather than comparing actual cluster resource contents. This assumes TierTemplates are immutable - any content changes result in new TierTemplate versions with new references. The short-circuit when template refs are equal is intentional based on this design, as it cannot detect changes within the same TierTemplate.
Applied to files:
controllers/nstemplateset/cluster_resources.go
📚 Learning: 2025-09-24T11:19:39.229Z
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:187-196
Timestamp: 2025-09-24T11:19:39.229Z
Learning: In controllers/nstemplateset/cluster_resources.go, the delete method specifically handles cluster-scoped resources only (not namespaced resources), so using types.NamespacedName{Name: toDelete.GetName()} in Client.Get calls is correct since cluster resources don't have namespaces. However, using runtimeclient.ObjectKeyFromObject() is preferred for future-proofing.
Applied to files:
controllers/nstemplateset/cluster_resources.go
🧬 Code graph analysis (3)
controllers/nstemplateset/nstemplateset_controller_test.go (3)
test/nstemplateset_assertion.go (5)
AssertThatNSTemplateSet(35-41)Provisioning(231-237)Updating(239-245)Provisioned(223-229)NewTierTemplateName(219-221)test/cluster_assertion.go (2)
AssertThatCluster(20-25)WithLabel(45-51)test/namespace_assertion.go (1)
AssertThatNamespace(32-38)
controllers/nstemplateset/cluster_resources.go (1)
controllers/nstemplateset/nstemplatetier.go (1)
SpaceName(100-100)
controllers/nstemplateset/cluster_resources_test.go (2)
test/nstemplateset_assertion.go (6)
AssertThatNSTemplateSet(35-41)Provisioning(231-237)Provisioned(223-229)Updating(239-245)UnableToProvisionClusterResources(265-272)UpdateFailed(247-254)test/cluster_assertion.go (5)
AssertThatCluster(20-25)HasDeletionTimestamp(68-72)Containing(60-66)WithoutLabel(53-58)WithLabel(45-51)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build & push operator bundles for e2e tests
🔇 Additional comments (2)
controllers/nstemplateset/cluster_resources.go (1)
34-37: Early return disables drift healing when templateRef/features are unchangedWith this guard, we skip reconciling cluster resources entirely if old/new refs and features are equal. Given we don’t watch cluster-scoped resources, periodic reconciles also no-op, so missing/modified resources won’t self‑heal.
If this is intentional (immutable TierTemplates and no drift-heal under same ref), please confirm. If drift-heal is desired, consider processing templates and only short‑circuit after detecting “no create/update/delete” needed, or idempotently re-apply expected objects while avoiding status flips when no actual changes occur.
Based on learnings
test/nstemplateset_assertion.go (1)
168-174: New status assertions look goodBoth helpers are correct and improve test clarity.
Also applies to: 183-188
MatousJobanek
left a comment
There was a problem hiding this comment.
Looks good, great job 👍 🚀
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: MatousJobanek, metlos The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
controllers/nstemplateset/cluster_resources.go (1)
236-252: LGTM! Clean obsolete resource cleanup.The
Cleanupmethod properly:
- Defers status changes until work is needed (line 240)
- Deletes obsolete resources via
deleteObsoleteObjects(line 247)- Handles errors with appropriate status updates
Consider moving the comment from lines 245-246 closer to the
deleteObsoleteObjectscall for improved readability:func (oa *objectApplier) Cleanup(ctx context.Context) error { if len(oa.currentObjects) > 0 { - // we'll be making changes to the cluster, because there are still some unprocessed objects. - // let's reflect that in the status, if it was not yet updated. if err := oa.changeStatusIfNeeded(ctx); err != nil { return err } } - // what we're left with here is the list of currently existing objects that are no longer present in the template. - // we need to delete them + // what we're left with in currentObjects is the list of existing objects that are no longer + // present in the new template and need to be deleted if err := deleteObsoleteObjects(ctx, oa.r.Client, oa.currentObjects, nil); err != nil { return oa.r.wrapErrorWithStatusUpdate(ctx, oa.nstt, oa.failureStatusReason, err, "failure while syncing cluster resources") }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
controllers/nstemplateset/cluster_resources.go(3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-24T10:52:01.009Z
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:46-49
Timestamp: 2025-09-24T10:52:01.009Z
Learning: In the NSTemplateSet cluster resources controller, the current approach compares template references (oldTemplateRef vs newTemplateRef) rather than comparing actual cluster resource contents. This assumes TierTemplates are immutable - any content changes result in new TierTemplate versions with new references. The short-circuit when template refs are equal is intentional based on this design, as it cannot detect changes within the same TierTemplate.
Applied to files:
controllers/nstemplateset/cluster_resources.go
📚 Learning: 2025-09-24T11:19:39.229Z
Learnt from: metlos
PR: codeready-toolchain/member-operator#700
File: controllers/nstemplateset/cluster_resources.go:187-196
Timestamp: 2025-09-24T11:19:39.229Z
Learning: In controllers/nstemplateset/cluster_resources.go, the delete method specifically handles cluster-scoped resources only (not namespaced resources), so using types.NamespacedName{Name: toDelete.GetName()} in Client.Get calls is correct since cluster resources don't have namespaces. However, using runtimeclient.ObjectKeyFromObject() is preferred for future-proofing.
Applied to files:
controllers/nstemplateset/cluster_resources.go
🧬 Code graph analysis (1)
controllers/nstemplateset/cluster_resources.go (1)
controllers/nstemplateset/nstemplatetier.go (1)
SpaceName(100-100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build & push operator bundles for e2e tests
🔇 Additional comments (8)
controllers/nstemplateset/cluster_resources.go (8)
3-20: LGTM!The new imports (
fmt,slices,commonclient) support the refactored all-at-once reconciliation flow.
22-60: Well-structured refactoring to single-pass reconciliation.The shift from one-resource-per-reconcile to processing all cluster resources at once is cleanly implemented. The short-circuit logic (lines 35-37) correctly handles both template reference changes and feature toggle changes (line 70), addressing drift concerns while respecting TierTemplate immutability.
62-73: LGTM! Feature-aware change detection.The inclusion of
featuresChanged(nstt)on line 70 ensures that cluster resources are reconciled when feature toggles change, even if the templateRef remains the same. This properly addresses drift scenarios for feature-gated resources.Based on learnings
75-95: Good extraction of tier template processing logic.The
processTierTemplatehelper cleanly encapsulates template retrieval and processing, making it reusable for both spec and status comparisons.
97-118: LGTM!The
applymethod correctly maintains labels for metadata while the reconciliation logic now relies on spec/status templateRef comparison for membership determination.
121-153: LGTM! Clean deletion flow.The refactored
deletemethod properly leveragesprocessTierTemplateto derive the list of resources to remove and handles edge cases (not found, already deleting) correctly. The use ofruntimeclient.ObjectKeyFromObjecton line 134 is future-proof.Based on learnings
155-191: Excellent objectApplier pattern introduction.The
objectAppliertype cleanly encapsulates the apply/cleanup lifecycle with proper separation of concerns:
- Deferred status changes (only when work is actually performed)
- Clear distinction between first deployment vs update scenarios
- Proper bookkeeping of current vs new objects
209-234: LGTM! Proper object application with feature-toggle awareness.The
Applymethod correctly:
- Respects feature toggles via
shouldCreate(line 210)- Updates bookkeeping by removing matched objects from
currentObjects(lines 216-221)- Uses
commonclient.SameGVKandNamefor comparison (line 217)- Defers status changes until actually needed (line 223)
The use of
slices.Deletewith an immediatebreak(line 219) is safe for concurrent modification during iteration.
|
/retest |



When applying cluster-scoped resourcest the NSTemplateSet controller currently does the following:
This PR changes those 2 aspects such that:
spec.clusterresources.templateref) with the last-applied templateref (status.clusterresources.templateref).Those 2 changes required quite massive changes it the unit tests. I took the opportunity to slightly clean those up, too (removing a couple of redundant tests).
This change is getting us closer to unifying the way we deploy the cluster- and namespace-scoped resources.
Note that the e2e tests are currently not passing, so I'm opening this just as a draft PR to gather some preliminary thoughts and suggestions.
Jira:
Summary by CodeRabbit
New Features
Bug Fixes
Tests