Cloud-Config didn't get applied after boot #2020

@computeralex92

Description

The Cloud-Config on AWS EC2 instances was not applied, so the instance configuration and the required software/agents were not deployed.

Impact

Due to our dynamic infrastructure, instances get started without their config, which can cause missing resources in AWS ECS clusters, etc.

Environment and steps to reproduce

  1. Set-up: Flatcar 4459.2.3 / AWS EC2
  2. Task: Normal boot of a normal instance based on EC2
  3. Action(s):
    a. The instance is created with a Cloud-Config
    b. The following Cloud-Config is in use (only the part relevant to the error shown below is included):
#cloud-config
coreos:
  units:
    - name: update-engine.service
      # command: restart
      ## If updates should be disabled: mask the unit after next restart and stop it for the initial start
      mask: true
      command: stop
  4. Error:
    The cloudinit service (oem-cloudinit.service) hangs exactly here:
Feb 19 10:36:36 ip-172-31-61-31 bash[1936]: 2026/02/19 10:36:36 Ensuring runtime unit file "etcd.service" is unmasked
Feb 19 10:36:36 ip-172-31-61-31 bash[1936]: 2026/02/19 10:36:36 Ensuring runtime unit file "etcd2.service" is unmasked
Feb 19 10:36:36 ip-172-31-61-31 bash[1936]: 2026/02/19 10:36:36 Ensuring runtime unit file "fleet.service" is unmasked
Feb 19 10:36:36 ip-172-31-61-31 bash[1936]: 2026/02/19 10:36:36 Ensuring runtime unit file "locksmithd.service" is unmasked
Feb 19 10:36:36 ip-172-31-61-31 bash[1936]: 2026/02/19 10:36:36 Calling unit command "stop" on "update-engine.service"
Feb 19 14:49:45 ip-172-31-61-31 systemd[1]: oem-cloudinit.service: Main process exited, code=killed, status=15/TERM
Feb 19 14:49:45 ip-172-31-61-31 systemd[1]: oem-cloudinit.service: Failed with result 'signal'.
Feb 19 14:49:45 ip-172-31-61-31 systemd[1]: Stopped oem-cloudinit.service - Run cloudinit.

It seems that after the stop command was sent to update-engine.service, the cloudinit service never continued: no result was logged for the command, and the service was only killed by a restart about four hours later (10:36 to 14:49).
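To spot this state on other instances, the log can be scanned for unit commands that were called but never reported a result. The following is a minimal sketch, assuming the log lines keep the `Calling unit command` / `Result of` wording shown in this issue; the function name `pending_unit_commands` is hypothetical and not part of coreos-cloudinit.

```python
import re

# Patterns for the coreos-cloudinit log lines as they appear in this issue.
CALL_RE = re.compile(r'Calling unit command "([^"]+)" on "([^"]+)"')
RESULT_RE = re.compile(r'Result of "([^"]+)" on "([^"]+)"')


def pending_unit_commands(lines):
    """Return (command, unit) pairs that were called but never logged a result."""
    pending = []
    for line in lines:
        call = CALL_RE.search(line)
        if call:
            pending.append(call.groups())
            continue
        result = RESULT_RE.search(line)
        if result and result.groups() in pending:
            pending.remove(result.groups())
    return pending


# Two lines copied from the failing instance's log in this issue.
failing_log = [
    'Feb 19 10:36:36 ip-172-31-61-31 bash[1936]: 2026/02/19 10:36:36 '
    'Ensuring runtime unit file "locksmithd.service" is unmasked',
    'Feb 19 10:36:36 ip-172-31-61-31 bash[1936]: 2026/02/19 10:36:36 '
    'Calling unit command "stop" on "update-engine.service"',
]

print(pending_unit_commands(failing_log))
# prints [('stop', 'update-engine.service')]
```

On an affected instance, the same function could be fed the output of `journalctl -u oem-cloudinit.service` line by line; a non-empty result would point at the command the service is stuck on.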

Expected behavior

The config is applied without any error, and the log continues past the stop command (this is from a healthy instance):

Feb 19 15:02:40 ip-172-31-63-252 bash[1992]: 2026/02/19 15:02:40 Ensuring runtime unit file "etcd.service" is unmasked
Feb 19 15:02:40 ip-172-31-63-252 bash[1992]: 2026/02/19 15:02:40 Ensuring runtime unit file "etcd2.service" is unmasked
Feb 19 15:02:40 ip-172-31-63-252 bash[1992]: 2026/02/19 15:02:40 Ensuring runtime unit file "fleet.service" is unmasked
Feb 19 15:02:40 ip-172-31-63-252 bash[1992]: 2026/02/19 15:02:40 Ensuring runtime unit file "locksmithd.service" is unmasked
Feb 19 15:02:41 ip-172-31-63-252 bash[1992]: 2026/02/19 15:02:41 Calling unit command "stop" on "update-engine.service"
Feb 19 15:02:41 ip-172-31-63-252 bash[1992]: 2026/02/19 15:02:41 Result of "stop" on "update-engine.service": done
Feb 19 15:02:41 ip-172-31-63-252 bash[1992]: 2026/02/19 15:02:41 Calling unit command "stop" on "locksmithd.service"
Feb 19 15:02:41 ip-172-31-63-252 bash[1992]: 2026/02/19 15:02:41 Result of "stop" on "locksmithd.service": done
Feb 19 15:02:41 ip-172-31-63-252 bash[1992]: 2026/02/19 15:02:41 Calling unit command "start" on "sonarqube-sysctl.service"
Feb 19 15:02:41 ip-172-31-63-252 bash[1992]: 2026/02/19 15:02:41 Result of "start" on "sonarqube-sysctl.service": done
Feb 19 15:02:41 ip-172-31-63-252 bash[1992]: 2026/02/19 15:02:41 Calling unit command "start" on "daemon-aws-ecs-agent.service"
Feb 19 15:02:46 ip-172-31-63-252 bash[1992]: 2026/02/19 15:02:46 Result of "start" on "daemon-aws-ecs-agent.service": done
Feb 19 15:02:46 ip-172-31-63-252 bash[1992]: 2026/02/19 15:02:46 Calling unit command "start" on "daemon-filebeat.service"
Feb 19 15:03:03 ip-172-31-63-252 bash[1992]: 2026/02/19 15:03:03 Result of "start" on "daemon-filebeat.service": done
Feb 19 15:03:03 ip-172-31-63-252 bash[1992]: 2026/02/19 15:03:03 Calling unit command "start" on "daemon-cadvisor.service"
Feb 19 15:03:20 ip-172-31-63-252 bash[1992]: 2026/02/19 15:03:20 Result of "start" on "daemon-cadvisor.service": done
Feb 19 15:03:20 ip-172-31-63-252 bash[1992]: 2026/02/19 15:03:20 Calling unit command "start" on "daemon-node-exporter.service"
Feb 19 15:03:24 ip-172-31-63-252 bash[1992]: 2026/02/19 15:03:24 Result of "start" on "daemon-node-exporter.service": done
Feb 19 15:03:24 ip-172-31-63-252 bash[1992]: 2026/02/19 15:03:24 Calling unit command "start" on "rpc-statd.service"
Feb 19 15:03:24 ip-172-31-63-252 bash[1992]: 2026/02/19 15:03:24 Result of "start" on "rpc-statd.service": done
Feb 19 15:03:24 ip-172-31-63-252 bash[1992]: 2026/02/19 15:03:24 Calling unit command "start" on "mnt-environment.mount"
Feb 19 15:03:26 ip-172-31-63-252 bash[1992]: 2026/02/19 15:03:26 Result of "start" on "mnt-environment.mount": done
Feb 19 15:03:26 ip-172-31-63-252 bash[1992]: 2026/02/19 15:03:26 Calling unit command "start" on "mnt-efs.mount"
Feb 19 15:03:26 ip-172-31-63-252 bash[1992]: 2026/02/19 15:03:26 Result of "start" on "mnt-efs.mount": done
Feb 19 15:03:26 ip-172-31-63-252 systemd[1]: oem-cloudinit.service: Deactivated successfully.
Feb 19 15:03:26 ip-172-31-63-252 systemd[1]: Finished oem-cloudinit.service - Run cloudinit.
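For comparison with the failing instance, the time each unit command takes can be read off the healthy log by pairing every `Calling unit command` line with its `Result of` line. This is a rough sketch under the assumption that the inner timestamp format (`2026/02/19 15:02:41`) is stable; `unit_command_durations` is a hypothetical helper name.

```python
import re
from datetime import datetime

STAMP = r'(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})'
CALL_RE = re.compile(STAMP + r' Calling unit command "([^"]+)" on "([^"]+)"')
RESULT_RE = re.compile(STAMP + r' Result of "([^"]+)" on "([^"]+)"')
STAMP_FORMAT = "%Y/%m/%d %H:%M:%S"


def unit_command_durations(lines):
    """Map (command, unit) to the seconds between its call and its result."""
    started = {}
    durations = {}
    for line in lines:
        call = CALL_RE.search(line)
        if call:
            stamp, command, unit = call.groups()
            started[(command, unit)] = datetime.strptime(stamp, STAMP_FORMAT)
            continue
        result = RESULT_RE.search(line)
        if result:
            stamp, command, unit = result.groups()
            begin = started.pop((command, unit), None)
            if begin is not None:
                end = datetime.strptime(stamp, STAMP_FORMAT)
                durations[(command, unit)] = (end - begin).total_seconds()
    return durations


# Four lines copied from the healthy instance's log in this issue.
healthy_log = [
    'Feb 19 15:02:41 ip-172-31-63-252 bash[1992]: 2026/02/19 15:02:41 '
    'Calling unit command "stop" on "update-engine.service"',
    'Feb 19 15:02:41 ip-172-31-63-252 bash[1992]: 2026/02/19 15:02:41 '
    'Result of "stop" on "update-engine.service": done',
    'Feb 19 15:02:46 ip-172-31-63-252 bash[1992]: 2026/02/19 15:02:46 '
    'Calling unit command "start" on "daemon-filebeat.service"',
    'Feb 19 15:03:03 ip-172-31-63-252 bash[1992]: 2026/02/19 15:03:03 '
    'Result of "start" on "daemon-filebeat.service": done',
]

for (command, unit), seconds in unit_command_durations(healthy_log).items():
    print(f"{command} {unit}: {seconds:.0f}s")
```

In the healthy log the `stop` on update-engine.service returns within the same second, whereas on the failing instance no result was logged for it at all.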

Additional information

Normally, a reboot of the instance fixes the issue and the config is applied.
Yes, we are aware that we should switch to Ignition, but the config has been working for years and, due to an ongoing migration to a different environment, we have tried to avoid spending resources on a move to Ignition.
