The CI team is trying to use new AWS m5d.xlarge instances, which have two NVMe disks attached. We crafted a custom machineconfig that sets up a RAID partition across them to enable that.
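For concreteness, a minimal sketch of that kind of machineconfig, assuming Ignition spec 2.2; the object name, RAID level, device paths, and filesystem choice below are illustrative assumptions, not the exact config we used:

```sh
# Sketch only: RAIDs two instance-store NVMe disks and puts a filesystem on
# the array. Device names (/dev/nvme1n1, /dev/nvme2n1) are assumptions.
cat <<'EOF' | oc apply -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-nvme-raid
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      raid:
        - name: data
          level: raid0
          devices:
            - /dev/nvme1n1
            - /dev/nvme2n1
      filesystems:
        - name: data
          mount:
            device: /dev/md/data
            format: xfs
EOF
```

A systemd mount unit (not shown) would still be needed to actually mount the array at boot.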
We added this to the main worker pool. The MCD will fail to roll out the partitioning on the existing workers, but that's fine because the plan was to "roll" the worker pool: land the new MC in the pool, have new workers come online with that config, then scale down the old workers (sketched below).
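Roughly, the roll would look like this; the machineset and node names are hypothetical:

```sh
# New workers come up with the merged worker config, old ones are drained away
# rather than updated in place.
oc scale machineset ci-worker-m5d --replicas=3 -n openshift-machine-api
oc adm drain ip-10-0-1-23.ec2.internal --ignore-daemonsets --delete-local-data
oc scale machineset ci-worker-old --replicas=0 -n openshift-machine-api
```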
However, there are a few issues here.
First, this whole thing would obviously be a lot better if we had machineset-specific machineconfigs. That would solve a bunch of races and be much more elegant.
What we're seeing right now is that one new m5d node went OutOfDisk=true because it was booted with just a 16G root volume from the old config. That unschedulable node then blocks rollout of further changes.
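The stuck state is visible from the node conditions and the pool status; a quick way to check (node name hypothetical):

```sh
oc get nodes                                        # the m5d node shows up unschedulable
oc describe node ip-10-0-1-23.ec2.internal | grep -i outofdisk
oc get machineconfigpool worker                     # pool reports it is not fully updated
```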
I think we can unstick ourselves here by deleting that node and getting the MCO to roll out the new config.
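Something like the following should do it; the machine and node names here are hypothetical:

```sh
# Deleting the backing Machine removes the bad node, and the machineset
# replaces it with one built from the new config.
oc delete machine ci-worker-abc123 -n openshift-machine-api
# Or, if only the stale Node object needs to go:
oc delete node ip-10-0-1-23.ec2.internal
```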