# Race condition in taint patching can clobber concurrent taint modifications #179
## Summary
The `addTaintBySpec` and `removeTaintBySpec` functions in `node_controller.go` use `client.StrategicMergeFrom()` without optimistic locking. Because the patch is computed from a potentially stale copy of the node, concurrent taint modifications made by other controllers can be silently overwritten.
## Background
When NRC adds or removes a taint, it:
- Creates a deep copy of the node at time T1
- Modifies the taints list
- Patches the node
If another controller (e.g., Karpenter, CSI drivers, cloud-controller-manager) modifies the node's taints between T1 and the patch, NRC's patch will overwrite those changes with stale state.
## Current Code
```go
func (r *RuleReadinessController) removeTaintBySpec(ctx context.Context, node *corev1.Node, taintSpec corev1.Taint, ruleName string) error {
	patch := client.StrategicMergeFrom(node.DeepCopy()) // No optimistic locking
	var newTaints []corev1.Taint
	for _, taint := range node.Spec.Taints {
		if taint.Key != taintSpec.Key || taint.Effect != taintSpec.Effect {
			newTaints = append(newTaints, taint)
		}
	}
	node.Spec.Taints = newTaints
	if err := r.Patch(ctx, node, patch); err != nil {
		return err
	}
	// ...
}
```

## Race Condition Example
```
T1: NRC caches node taints: [karpenter.sh/unregistered, xyz.co/taint]
T2: Karpenter removes karpenter.sh/unregistered → [xyz.co/taint]
T3: NRC removes xyz.co/taint using stale T1 state
    → patches with [karpenter.sh/unregistered]  // Karpenter's removal reverted!
```
This affects any taint modified by other controllers during the window between NRC's read and write.
## Proposed Fix
Use `client.MergeFromWithOptions` with optimistic locking, as Karpenter does for both add and remove taint operations:

https://github.com/kubernetes-sigs/karpenter/blob/main/pkg/controllers/state/statenode.go#L483-L533

```go
// Karpenter's RequireNoScheduleTaint (handles both add AND remove):
stored := node.DeepCopy()
// ... taint modifications (add or remove) ...
if !equality.Semantic.DeepEqual(stored, node) {
	// We use client.MergeFromWithOptimisticLock because patching
	// a list with a JSON merge patch can cause races due to the
	// fact that it fully replaces the list on a change
	return kubeClient.Patch(ctx, node,
		client.MergeFromWithOptions(stored, client.MergeFromWithOptimisticLock{}))
}
```

The fix for NRC would be:

```go
// Fails with a conflict if the node changed, allowing a retry
patch := client.MergeFromWithOptions(node.DeepCopy(), client.MergeFromWithOptimisticLock{})
```

With optimistic locking, if the node's `resourceVersion` changed between read and write, the patch fails with a conflict error, allowing the controller to retry with fresh state.
## Impact
Nodes can get stuck with startup taints that should have been removed, and taint changes made by other controllers can be silently reverted.