allow setting label to nodes about to be upgraded/restarted #3204

@ibotty

Description

Because there is no agreed-upon way to signal to operators that a node is being drained, operators handle it in different ways.
Rook detects a node drain by observing the pods on the node. This works fine, but feels a bit fragile.
The problem is that some operators (e.g. the Zalando PostgreSQL Operator) "detect" drains by watching the node's labels. Whenever an expected label (e.g. "node-ready=true") is no longer set, the operator will (try to) fail over to another DB pod on another node.
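
To make the label-based approach concrete, here is a minimal sketch of how an operator could decide to fail over when a readiness label disappears. It is not the Zalando operator's actual code: the function names are hypothetical, and the `node-ready` key is only the example label mentioned above.

```python
from typing import Mapping

# Hypothetical readiness label key, taken from the example in this issue.
READY_LABEL = "node-ready"


def node_is_ready(labels: Mapping[str, str], key: str = READY_LABEL) -> bool:
    """Return True while the readiness label is present and set to "true"."""
    return labels.get(key) == "true"


def should_fail_over(old_labels: Mapping[str, str],
                     new_labels: Mapping[str, str]) -> bool:
    """Fail over only when the node goes from ready to not ready."""
    return node_is_ready(old_labels) and not node_is_ready(new_labels)
```

An operator watching node update events would call `should_fail_over` with the label maps from before and after the update. A node that never carried the label does not trigger a failover in this sketch.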

This is a feature request to update the node's labels when a reboot is about to happen.

Steps to reproduce the issue:

  1. update some machineconfig,
  2. observe machine-config-daemon trying to drain a node,
  3. observe the drain failing because there is a PDB on a pod on that node,
  4. meanwhile, some operator does not know that the machine is about to be rebooted and does not update the PDB (directly or indirectly),
  5. the node does not get drained.
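
Why the drain gets stuck can be illustrated with a simplified model of a PodDisruptionBudget that uses `minAvailable`. This is only a sketch with hypothetical function names: real PDBs can also use `maxUnavailable` and percentages, and the actual accounting is done by the Kubernetes disruption controller.

```python
def disruptions_allowed(current_healthy: int, min_available: int) -> int:
    """Number of pods that may be evicted without violating minAvailable."""
    return max(0, current_healthy - min_available)


def can_evict(current_healthy: int, min_available: int) -> bool:
    """An eviction is admitted only if at least one disruption is allowed."""
    return disruptions_allowed(current_healthy, min_available) > 0
```

With a single healthy pod and `minAvailable` of 1, `can_evict(1, 1)` is false, so the eviction keeps being refused until something (for example the workload's operator) moves the pod or changes the PDB.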

Describe the results you expected:

  1. update some machineconfig,
  2. machine-config-daemon updates the label machineconfiguration.openshift.io/pending-restart from =false to =true,
    3a. an operator removes active workload from the node, removing/updating PDBs that affect the node,
    3b. machine-config-daemon drains the node,
  4. the node reboots successfully,
  5. machine-config-daemon sets the label machineconfiguration.openshift.io/pending-restart=false.
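
The label updates in the flow above could be expressed as JSON merge patches against the Node object. The sketch below only builds the patch body and shows how an operator might read the label. The label name is the one proposed in this issue and is not an existing API; the helper names are hypothetical, and sending the patch (for example with `kubectl patch node <name> --type merge -p '<body>'`) is left out.

```python
import json
from typing import Mapping

# Label proposed in this issue; an assumption, not an existing API.
PENDING_RESTART_LABEL = "machineconfiguration.openshift.io/pending-restart"


def pending_restart_patch(pending: bool) -> str:
    """Build a JSON merge patch body that sets the proposed label on a Node."""
    value = "true" if pending else "false"
    return json.dumps({"metadata": {"labels": {PENDING_RESTART_LABEL: value}}})


def is_pending_restart(labels: Mapping[str, str]) -> bool:
    """Operator-side check: is this node marked as about to restart?"""
    return labels.get(PENDING_RESTART_LABEL) == "true"
```

In this sketch a missing label is treated the same as `false`, so operators would behave as they do today on clusters where the label is never set.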

Labels: lifecycle/frozen (indicates that an issue or PR should not be auto-closed due to staleness)
