Conversation
// during the cycling process from being removed by Cluster Autoscaler before
// the corresponding old nodes are fully terminated.
// See: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#how-can-i-prevent-cluster-autoscaler-from-scaling-down-a-particular-node
const clusterAutoscalerScaleDownDisabledAnnotation = "cluster-autoscaler.kubernetes.io/scale-down-disabled"
Could we generalise this behaviour and give the annotation via the CNR instead?
A few reasons why I didn't:
- The annotation is tied to a node lifecycle, not a CNR lifecycle. If the annotations were in the CNR spec, deleting a CNR before cleanup could leave stale annotations.
- Autoscaler reads annotations from a node object, not the CNR. If we put them in the CNR spec, we would have to sync them to nodes.
- CNRs can affect overlapping nodegroups; having annotations in the CNR spec can create race conditions and ownership issues.
> If we put them in the CNR spec, we would have to sync them to nodes.
This is exactly what I'm suggesting. We are already doing this in the code, I am just proposing that the annotation(s) get defined in the CNR rather than hardcoded.
> CNRs can affect overlapping nodegroups
Each CNR is generated by the observer for a single nodegroup. If a CNR affects nodes across nodegroups, that would be a misconfiguration somewhere.
I don't think we need to be generic here; the annotations are very specific to cluster-autoscaler. If we need another label, we probably want a code change anyway, because it will need to be used under specific conditions.
What this comment has caused me to re-evaluate is that we should add a field on the CNR to enable/disable this behaviour and default it to true. That way if any users of the software have an issue with this particular annotation/workflow, they can disable the feature and keep their cluster working.
If we have the annotation defined in the CNR, then that achieves it. It wouldn't be enabled by default, because the annotations need to be added for the behaviour to take effect. I'd argue this feature is just about adding annotations to new instances and then removing them later. We are doing it for cluster-autoscaler, of course, but it can apply more widely.
I added the option to opt out of it in the nodegroup (not the CNR); see 67d6d0f#commitcomment-175885640 for the reasoning.
I like putting it on the nodegroup; it feels like something that should be associated with a base nodegroup config.
Cluster Autoscaler Annotation Management - Feature Summary
Adds annotation management to coordinate between Cyclops and Cluster Autoscaler during node cycling, preventing Cluster Autoscaler from removing new nodes before old nodes are drained.
Problem
During node cycling, Cluster Autoscaler can remove new nodes before Cyclops finishes draining old nodes, causing cycling to fail or nodes to be terminated prematurely.
Solution
- Adds the cluster-autoscaler.kubernetes.io/scale-down-disabled: "true" annotation to new nodes during the ScalingUp phase
- Adds cyclops.atlassian.com/annotation-managed to track ownership (distinguishes Cyclops-managed vs pre-existing annotations)
- cyclops.atlassian.com/disable-annotation-management: "true" disables management (default: enabled)

Key Changes
pkg/controller/cyclenoderequest/transitioner/util.go:
- addScaleDownDisabledAnnotation() - Adds annotation + marker (preserves pre-existing)
- cleanupScaleDownDisabledAnnotations() - Removes annotations only if marker present
- shouldManageAnnotations() - Checks NodeGroup opt-out annotation
- getNodeGroup() - Finds matching NodeGroup resource

pkg/controller/cyclenoderequest/transitioner/transitions.go:
- transitionScalingUp() - Adds annotations (if enabled)
- transitionSuccessful() / transitionHealing() - Removes annotations (if added)

pkg/controller/cyclenoderequest/transitioner/transitioner.go:
- New constants: nodeGroupAnnotationKey, cyclopsManagedAnnotation

Opt-Out Configuration
Why NodeGroup? Persistent configuration (survives CNR deletion), aligned with node lifecycle, prevents conflicts with overlapping CNRs.
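The marker-based ownership logic described above can be sketched as plain functions over an annotation map. This is a minimal illustration, not the actual Cyclops code: the real implementation operates on Node and NodeGroup objects through the Kubernetes API, and only the annotation keys and function names are taken from the summary; the bodies are assumptions about the behaviour it describes.

```go
package main

import "fmt"

const (
	scaleDownDisabledAnnotation = "cluster-autoscaler.kubernetes.io/scale-down-disabled"
	cyclopsManagedAnnotation    = "cyclops.atlassian.com/annotation-managed"
	disableAnnotationManagement = "cyclops.atlassian.com/disable-annotation-management"
)

// addScaleDownDisabledAnnotation sets the scale-down-disabled annotation and,
// when it was not already present, records a marker so cleanup knows Cyclops
// owns it. A pre-existing annotation is left unmarked and thus preserved.
func addScaleDownDisabledAnnotation(annotations map[string]string) {
	if _, exists := annotations[scaleDownDisabledAnnotation]; !exists {
		annotations[cyclopsManagedAnnotation] = "true"
	}
	annotations[scaleDownDisabledAnnotation] = "true"
}

// cleanupScaleDownDisabledAnnotations removes the annotation only when the
// marker shows Cyclops added it, so user-set annotations survive cycling.
func cleanupScaleDownDisabledAnnotations(annotations map[string]string) {
	if annotations[cyclopsManagedAnnotation] == "true" {
		delete(annotations, scaleDownDisabledAnnotation)
		delete(annotations, cyclopsManagedAnnotation)
	}
}

// shouldManageAnnotations checks the NodeGroup opt-out annotation;
// management defaults to enabled when the annotation is absent.
func shouldManageAnnotations(nodeGroupAnnotations map[string]string) bool {
	return nodeGroupAnnotations[disableAnnotationManagement] != "true"
}

func main() {
	// Node with no prior annotation: Cyclops adds it, then fully cleans up.
	a := map[string]string{}
	addScaleDownDisabledAnnotation(a)
	cleanupScaleDownDisabledAnnotations(a)
	fmt.Println(len(a)) // 0

	// Node where the user had already set the annotation: it is preserved.
	b := map[string]string{scaleDownDisabledAnnotation: "true"}
	addScaleDownDisabledAnnotation(b)
	cleanupScaleDownDisabledAnnotations(b)
	fmt.Println(b[scaleDownDisabledAnnotation]) // true
}
```

The marker is what lets cleanup distinguish "Cyclops put this here" from "the operator put this here", which is the ownership concern raised in the review thread.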
Design Decisions
Verification