Merge https://github.com/kubernetes-sigs/cluster-api-provider-aws:v2.10.1 (fd1ef27) into main #588
…ot/cherry-pick-5793-to-release-2.10 [release-2.10] 🐛 fix: bumps golangci-lint to work with go 1.24+
…nd tests This PR changes the default value of HostAffinity from `host` to `default`, which matches the AWS platform default and is likely a more sensible value when the user has no preference. It also improves the API's Go doc comments to explain the effects of the settings, and adds more unit tests to pin down the exact behaviour described in the updated docs.
…ot/cherry-pick-5801-to-release-2.10 [release-2.10] 🐛 fix: change HostAffinity default 'host'->'default' improved API doc and tests
Relaxes the validation for ROSA NodePool autoscaling to allow users to specify a minimum of 0 replicas, enabling scale-to-zero scenarios. MaxReplicas remains with a minimum of 1. Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
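The relaxed bounds described above can be sketched as a small validation function. This is only an illustration: the `autoscaling` type and `validateAutoscaling` function are hypothetical names, the real project may enforce these limits through CRD schema markers instead, and the `MinReplicas <= MaxReplicas` check is an added assumption that the commit message does not mention.

```go
package main

import (
	"errors"
	"fmt"
)

// autoscaling is a hypothetical stand-in for the NodePool autoscaling
// settings; the real API type in the repository may differ.
type autoscaling struct {
	MinReplicas int
	MaxReplicas int
}

// validateAutoscaling accepts MinReplicas == 0 (scale-to-zero) while still
// requiring MaxReplicas >= 1. The min <= max check is an assumption.
func validateAutoscaling(a autoscaling) error {
	if a.MinReplicas < 0 {
		return errors.New("minReplicas must be >= 0")
	}
	if a.MaxReplicas < 1 {
		return errors.New("maxReplicas must be >= 1")
	}
	if a.MinReplicas > a.MaxReplicas {
		return errors.New("minReplicas must not exceed maxReplicas")
	}
	return nil
}

func main() {
	// Scale-to-zero is accepted.
	fmt.Println(validateAutoscaling(autoscaling{MinReplicas: 0, MaxReplicas: 3}))
	// A maximum of zero is still rejected.
	fmt.Println(validateAutoscaling(autoscaling{MinReplicas: 0, MaxReplicas: 0}))
}
```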
…ot/cherry-pick-5816-to-release-2.10 [release-2.10] 🌱 Allow ROSA NodePool autoscaling MinReplicas to be 0
Signed-off-by: serngawy <serngawy@gmail.com>
…ot/cherry-pick-5786-to-release-2.10 [release-2.10] ✨ ROSA Add logForward config AND ImageTypes
Signed-off-by: serngawy <serngawy@gmail.com>
…ot/cherry-pick-5842-to-release-2.10 [release-2.10] 🐛 Fix flaky test TestROSARoleConfigReconcileExist
The webhook server should use the TLS config specified in the manager options, so that users who set TLS fields on the manager see their preference honoured by the webhook server as well as the metrics server.
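A rough sketch of the idea, using only the standard library: one list of TLS config mutators is built once and handed to both servers, so they cannot drift apart. The `serverOptions` type and the `sharedTLSOptions` and `buildTLSConfig` helpers are hypothetical and are not the project's code. They only mirror the general pattern of passing a `[]func(*tls.Config)` to each server, and the TLS 1.3 minimum is an arbitrary example preference.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// serverOptions is a hypothetical stand-in for a per-server option struct.
type serverOptions struct {
	TLSOpts []func(*tls.Config)
}

// sharedTLSOptions builds the TLS preferences once, as if derived from the
// manager flags. The TLS 1.3 minimum is only an example value.
func sharedTLSOptions() []func(*tls.Config) {
	return []func(*tls.Config){
		func(c *tls.Config) { c.MinVersion = tls.VersionTLS13 },
	}
}

// buildTLSConfig applies every mutator, in order, to a fresh config.
func buildTLSConfig(opts serverOptions) *tls.Config {
	cfg := &tls.Config{}
	for _, apply := range opts.TLSOpts {
		apply(cfg)
	}
	return cfg
}

func main() {
	tlsOpts := sharedTLSOptions()

	// Both servers receive the same mutators.
	metrics := serverOptions{TLSOpts: tlsOpts}
	webhook := serverOptions{TLSOpts: tlsOpts}

	same := buildTLSConfig(metrics).MinVersion == buildTLSConfig(webhook).MinVersion
	fmt.Println("webhook matches metrics:", same)
}
```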
…ot/cherry-pick-5848-to-release-2.10 [release-2.10] 🐛 fix: use tlsconfig from the manager options for the webhook server
Walkthrough
This pull request introduces support for ROSA log forwarding capabilities, updates host affinity behavior and validation across machine and control plane types, adds image type support for ROSA machine pools, and updates multiple dependencies. Changes span workflow configurations, API type definitions, validation webhooks, CRD schemas, controller reconciliation logic, and generated code.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Hi @cloud-team-rebase-bot[bot]. Thanks for your PR. I'm waiting for an openshift member to verify that this patch is reasonable to test. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. The full list of commands accepted by this bot can be found here.
# Conflicts:
#	.github/dependabot.yml
#	.github/workflows/dependabot.yml
#	OWNERS_ALIASES
# Conflicts:
#	.github/PULL_REQUEST_TEMPLATE.md
#	.github/dependabot.yml
# Conflicts:
#	.github/workflows/codeql-analysis.yml
#	.github/workflows/dependabot.yml
# Conflicts:
#	.github/workflows/codeql-analysis.yml
#	.github/workflows/dependabot.yml
#	OWNERS_ALIASES
… image to be consistent with ART for 4.18 Reconciling with https://github.com/openshift/ocp-build-data/tree/827ab4ccce9cbbcf82c9dbaf6398b61d6cff8d7a/images/ose-aws-cluster-api-controllers.yml
… image to be consistent with ART for 4.19 Reconciling with https://github.com/openshift/ocp-build-data/tree/a39508c86497b4e5e463d7b2c78e51e577be9e7d/images/ose-aws-cluster-api-controllers.yml
# Conflicts:
#	openshift/infrastructure-components.yaml
Signed-off-by: Nolan Brubaker <nolan@nbrubaker.com>
… image to be consistent with ART for 4.20 Reconciling with https://github.com/openshift/ocp-build-data/tree/8f77fc475c95f9d98c348deb2feb88f5952d7357/images/ose-aws-cluster-api-controllers.yml
… image to be consistent with ART for 4.21 Reconciling with https://github.com/openshift/ocp-build-data/tree/4fbe3fab45239dc4be6f5d9d98a0bf36e0274ec9/images/ose-aws-cluster-api-controllers.yml
… image to be consistent with ART for 4.22 Reconciling with https://github.com/openshift/ocp-build-data/tree/087d1930e36b609f77d73bd8a313d85c940cff4d/images/ose-aws-cluster-api-controllers.yml
d8e9ede to 8077add
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@exp/controllers/rosamachinepool_controller.go`:
- Around line 545-549: The computeSpecDiff routine is causing reconcile churn
because an empty desiredSpec.ImageType (omitted in the RosaMachinePool CR) does
not match OCM's default "Default"; update computeSpecDiff to normalize
desiredSpec.ImageType before comparison by setting desiredSpec.ImageType =
string(cmv1.ImageTypeDefault) when it is empty so the diff treats omitted and
OCM-defaulted values as equal (alternatively, add the same defaulting to the
Default() webhook or ensure buildNodePoolFromSpec logic that sets ImageType for
Windows/Default aligns with computeSpecDiff normalization).
In `@main.go`:
- Around line 147-150: The code calls flags.GetManagerOptions and logs
setupLog.Error when err != nil but continues execution, which can lead to
dereferencing the returned metricsOptions and panics; update the error path in
main after the call to flags.GetManagerOptions (variables: tlsOptions,
metricsOptions, err) to fail fast by exiting the process (e.g., return from main
or call os.Exit(1)) immediately after logging the error so no later code (that
uses metricsOptions) runs with an invalid value.
🧹 Nitpick comments (2)
exp/controllers/rosaroleconfig_controller_test.go (2)
381-382: Avoid reusing the outer-scope `err` variable inside the Eventually closure.
Line 381 assigns to the `err` variable declared at line 341 (from `CreateNamespace`). This is confusing and inconsistent with line 604, which uses inline error checking. Use the same pattern:
- err = reconciler.Client.Get(ctx, req.NamespacedName, updatedRoleConfig)
- g.Expect(err).ToNot(HaveOccurred())
+ g.Expect(reconciler.Client.Get(ctx, req.NamespacedName, updatedRoleConfig)).ToNot(HaveOccurred())
396-396: Add an explicit `.WithPolling()` for consistency.
This Eventually block is missing `.WithPolling(500 * time.Millisecond)`, which is present in `TestROSARoleConfigReconcileExist` at line 630. Without it, the default polling interval (10ms) will cause many more Reconcile calls than necessary.
- }).WithTimeout(30 * time.Second).Should(Succeed())
+ }).WithTimeout(30 * time.Second).WithPolling(500 * time.Millisecond).Should(Succeed())
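To make the effect of the polling interval concrete, here is a small model that counts how often a condition would be checked before it first becomes true. It uses virtual time only and does not call Gomega. `pollsUntilReady` is a hypothetical helper, and the 2-second readiness delay is an arbitrary assumption chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// pollsUntilReady counts how many times a condition is checked when it is
// polled every interval starting at t=0 and first holds at readyAfter.
// It models polling with virtual time; interval must be positive.
func pollsUntilReady(readyAfter, interval time.Duration) int {
	polls := 0
	for elapsed := time.Duration(0); ; elapsed += interval {
		polls++
		if elapsed >= readyAfter {
			return polls
		}
	}
}

func main() {
	ready := 2 * time.Second // assumed time until the expected state is visible

	fmt.Println("10ms interval: ", pollsUntilReady(ready, 10*time.Millisecond))
	fmt.Println("500ms interval:", pollsUntilReady(ready, 500*time.Millisecond))
}
```

In a test where every poll triggers a Reconcile call, the shorter interval multiplies the number of calls without making the test pass any sooner in a meaningful way.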
	if rosaMachinePoolSpec.ImageType == string(cmv1.ImageTypeWindows) {
		npBuilder = npBuilder.ImageType(cmv1.ImageTypeWindows)
	} else if rosaMachinePoolSpec.ImageType == string(cmv1.ImageTypeDefault) {
		npBuilder = npBuilder.ImageType(cmv1.ImageTypeDefault)
	}
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, let's examine the actual code at lines 545-549
cat -n exp/controllers/rosamachinepool_controller.go | sed -n '540,555p'
Repository: openshift/cluster-api-provider-aws
Length of output: 742
🏁 Script executed:
#!/bin/bash
# Find computeSpecDiff implementation
rg -n "func computeSpecDiff" -A 20 -g'*.go'
Repository: openshift/cluster-api-provider-aws
Length of output: 2079
🏁 Script executed:
#!/bin/bash
# Search for RosaMachinePool defaulting logic
rg -n "RosaMachinePool.*Default|func.*Default.*RosaMachinePool" -g'*.go' -i
Repository: openshift/cluster-api-provider-aws
Length of output: 834
🏁 Script executed:
#!/bin/bash
# Look for ImageType handling and defaults
rg -n "ImageType.*Default|DefaultImageType|ImageTypeDefault" -g'*.go' -B 2 -A 2
Repository: openshift/cluster-api-provider-aws
Length of output: 2431
🏁 Script executed:
#!/bin/bash
# Check for NodePoolToRosaMachinePoolSpec implementation
rg -n "NodePoolToRosaMachinePoolSpec" -A 15 -g'*.go'
Repository: openshift/cluster-api-provider-aws
Length of output: 7984
🏁 Script executed:
#!/bin/bash
# Check the full Default() webhook implementation
cat -n exp/api/v1beta2/rosamachinepool_webhook.go | sed -n '161,200p'
Repository: openshift/cluster-api-provider-aws
Length of output: 602
🏁 Script executed:
#!/bin/bash
# Check the RosaMachinePool spec definition for ImageType
cat -n exp/api/v1beta2/rosamachinepool_types.go | sed -n '75,95p'
Repository: openshift/cluster-api-provider-aws
Length of output: 1096
Normalize ImageType before computing spec diff to prevent reconcile churn.
When a RosaMachinePool CR omits ImageType, the field remains empty. However, OCM defaults ImageType to "Default" when creating the NodePool. On reconciliation, computeSpecDiff detects a mismatch (empty string vs "Default") and repeatedly triggers UpdateNodePool calls even when the CR has not changed. This causes unnecessary API traffic and reconcile churn.
The Default() webhook does not currently default ImageType, and buildNodePoolFromSpec (lines 545-549) skips empty strings. Add normalization in computeSpecDiff to align empty desiredSpec.ImageType with the OCM default before comparison, or default ImageType to "Default" in the webhook.
Example normalization in computeSpecDiff
 func computeSpecDiff(desiredSpec expinfrav1.RosaMachinePoolSpec, nodePool *cmv1.NodePool) string {
 	currentSpec := utils.NodePoolToRosaMachinePoolSpec(nodePool)
+	if desiredSpec.ImageType == "" {
+		desiredSpec.ImageType = currentSpec.ImageType
+	}
 	ignoredFields := []string{
🤖 Prompt for AI Agents
In `@exp/controllers/rosamachinepool_controller.go` around lines 545 - 549, The
computeSpecDiff routine is causing reconcile churn because an empty
desiredSpec.ImageType (omitted in the RosaMachinePool CR) does not match OCM's
default "Default"; update computeSpecDiff to normalize desiredSpec.ImageType
before comparison by setting desiredSpec.ImageType =
string(cmv1.ImageTypeDefault) when it is empty so the diff treats omitted and
OCM-defaulted values as equal (alternatively, add the same defaulting to the
Default() webhook or ensure buildNodePoolFromSpec logic that sets ImageType for
Windows/Default aligns with computeSpecDiff normalization).
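The normalization suggested in this comment can be sketched in isolation as follows. This is a minimal, self-contained illustration and not the controller's code: `machinePoolSpec`, `specChanged`, and `defaultImageType` are hypothetical names, the spec is reduced to two fields, and the literal strings `"Default"` and `"Windows"` are assumed to match the OCM image type values.

```go
package main

import "fmt"

// machinePoolSpec is a hypothetical, trimmed-down stand-in for the real
// RosaMachinePoolSpec; only two fields are modelled here.
type machinePoolSpec struct {
	InstanceType string
	ImageType    string
}

// defaultImageType mirrors the "Default" value the review says OCM applies
// when ImageType is omitted (an assumption taken from the review text).
const defaultImageType = "Default"

// specChanged reports whether desired differs from current after treating an
// omitted ImageType as the default. desired is passed by value, so the
// normalization never leaks back into the caller's object.
func specChanged(desired, current machinePoolSpec) bool {
	if desired.ImageType == "" {
		desired.ImageType = defaultImageType
	}
	return desired != current
}

func main() {
	current := machinePoolSpec{InstanceType: "m5.xlarge", ImageType: "Default"}

	// Omitted ImageType: no diff, so no repeated update call.
	fmt.Println(specChanged(machinePoolSpec{InstanceType: "m5.xlarge"}, current))
	// Explicit, different ImageType: a real diff is still detected.
	fmt.Println(specChanged(machinePoolSpec{InstanceType: "m5.xlarge", ImageType: "Windows"}, current))
}
```

Normalizing a copy inside the comparison keeps the user's CR untouched, whereas defaulting in the webhook would persist the value on the object; either approach removes the spurious diff.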
	tlsOptions, metricsOptions, err := flags.GetManagerOptions(managerOptions)
	if err != nil {
		setupLog.Error(err, "Unable to start manager: invalid flags")
	}
Fail fast on invalid manager flags.
If err is non-nil at lines 147-150, execution continues and *metricsOptions is dereferenced at line 186, which can panic or run with an invalid config. Exit after logging.
🛠️ Suggested fix
 	if err != nil {
 		setupLog.Error(err, "Unable to start manager: invalid flags")
+		os.Exit(1)
 	}
📝 Committable suggestion
-	tlsOptions, metricsOptions, err := flags.GetManagerOptions(managerOptions)
-	if err != nil {
-		setupLog.Error(err, "Unable to start manager: invalid flags")
-	}
+	tlsOptions, metricsOptions, err := flags.GetManagerOptions(managerOptions)
+	if err != nil {
+		setupLog.Error(err, "Unable to start manager: invalid flags")
+		os.Exit(1)
+	}
🤖 Prompt for AI Agents
In `@main.go` around lines 147 - 150, The code calls flags.GetManagerOptions and
logs setupLog.Error when err != nil but continues execution, which can lead to
dereferencing the returned metricsOptions and panics; update the error path in
main after the call to flags.GetManagerOptions (variables: tlsOptions,
metricsOptions, err) to fail fast by exiting the process (e.g., return from main
or call os.Exit(1)) immediately after logging the error so no later code (that
uses metricsOptions) runs with an invalid value.
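The fail-fast pattern requested here can be shown with a stripped-down program. Everything in it is a hypothetical stand-in: `metricsOptions`, `getManagerOptions`, and `run` are not the project's functions, and the empty-address check is just a convenient way to produce an error. The point is only that the error path returns before the pointer is dereferenced.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// metricsOptions is a hypothetical stand-in for the options returned by the
// flag-parsing helper.
type metricsOptions struct {
	BindAddress string
}

// getManagerOptions returns a nil pointer together with an error when the
// input is invalid, which is what makes continuing after the error unsafe.
func getManagerOptions(addr string) (*metricsOptions, error) {
	if addr == "" {
		return nil, errors.New("metrics bind address must not be empty")
	}
	return &metricsOptions{BindAddress: addr}, nil
}

// run returns early on invalid options, so opts is never dereferenced while nil.
func run(addr string) (string, error) {
	opts, err := getManagerOptions(addr)
	if err != nil {
		return "", fmt.Errorf("unable to start manager: invalid flags: %w", err)
	}
	return opts.BindAddress, nil
}

func main() {
	addr, err := run(":8080")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1) // fail fast instead of continuing with a nil pointer
	}
	fmt.Println("metrics bind address:", addr)
}
```

Returning the error from a `run` function and exiting in `main` is one common way to structure this; calling `os.Exit(1)` directly after logging, as the suggestion does, has the same effect.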
No description provided.