
Allow administrators to guide upgrade ordering #2059

@cgwalters

Today the MCO makes no attempt to order the candidate nodes when choosing which one to update next. One problem we're thinking about is that workloads can quite possibly be disrupted multiple times during a single OS upgrade. This matters particularly in bare metal scenarios, where a node might host a lot of pods, some of them expensive to reschedule (like CNV).

When we drain a node, its pods are rescheduled across the remaining nodes. We then upgrade one of those nodes, quite possibly moving one of the same workload pods again, and so on.

One idea here is to add the minimal hooks needed for a separate controller to influence the upgrade ordering today.

For example, suppose we supported a label machineconfig.openshift.io/upgrade-weight=42 and the node controller picked the candidate node with the highest weight. The separate controller could then also mark the $number nodes that are next in the upgrade ordering as unschedulable, ensuring that pods drained from the current node don't land on them.
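To make the proposal concrete, here is a minimal Go sketch of the selection logic. It is not MCO code: the `Node` type, `weightOf`, `orderByWeight`, and `planUpgrade` are hypothetical names, and treating a missing or unparsable label as weight 0 and breaking ties by node name are assumptions that the proposal does not specify.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// upgradeWeightLabel is the label proposed in this issue; it is not an existing MCO API.
const upgradeWeightLabel = "machineconfig.openshift.io/upgrade-weight"

// Node is a hypothetical stand-in for the few node fields this sketch needs.
type Node struct {
	Name   string
	Labels map[string]string
}

// weightOf returns the node's upgrade weight.
// Assumption: a missing or unparsable label counts as weight 0.
func weightOf(n Node) int {
	w, err := strconv.Atoi(n.Labels[upgradeWeightLabel])
	if err != nil {
		return 0
	}
	return w
}

// orderByWeight returns a copy of candidates sorted by descending weight.
// Assumption: ties are broken by node name so the ordering is deterministic.
func orderByWeight(candidates []Node) []Node {
	ordered := make([]Node, len(candidates))
	copy(ordered, candidates)
	sort.SliceStable(ordered, func(i, j int) bool {
		wi, wj := weightOf(ordered[i]), weightOf(ordered[j])
		if wi != wj {
			return wi > wj
		}
		return ordered[i].Name < ordered[j].Name
	})
	return ordered
}

// planUpgrade returns the node to update next and up to n following nodes
// that a separate controller could mark unschedulable before the drain.
func planUpgrade(candidates []Node, n int) (next *Node, cordon []Node) {
	ordered := orderByWeight(candidates)
	if len(ordered) == 0 {
		return nil, nil
	}
	rest := ordered[1:]
	if n < 0 {
		n = 0
	}
	if n > len(rest) {
		n = len(rest)
	}
	return &ordered[0], rest[:n]
}

func main() {
	candidates := []Node{
		{Name: "worker-a", Labels: map[string]string{upgradeWeightLabel: "10"}},
		{Name: "worker-b", Labels: map[string]string{upgradeWeightLabel: "42"}},
		{Name: "worker-c"}, // no label: treated as weight 0 in this sketch
		{Name: "worker-d", Labels: map[string]string{upgradeWeightLabel: "42"}},
	}
	next, cordon := planUpgrade(candidates, 2)
	fmt.Println("update next:", next.Name)
	for _, n := range cordon {
		fmt.Println("mark unschedulable:", n.Name)
	}
}
```

In this sketch the node controller would only need the `weightOf` comparison; computing the set of nodes to mark unschedulable would live in the separate controller, which keeps the MCO change small.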

Without excess capacity, or changing the scheduler to more strongly prefer packing nodes, it seems hard to avoid multiple disruptions, but the label would allow this baseline integration.

Labels: lifecycle/frozen (indicates that an issue or PR should not be auto-closed due to staleness)