Intermittent AWS AMI corruptions #840

@dongsupark

Description

During the recent release process, we hit an unexplained failure: os/kola/aws fails only for the AWS arm64 image of Stable 3227.2.2.

Console log of the Kola test says:

[    4.245983] systemd-fsck[680]: ROOT contains a file system with errors, check forced.
ROOT: fsck 0.0% complete...
[    4.340402] device-mapper: verity: sha256 using implementation "sha256-ce"
ROOT: fsck 81.4% complete...
[    4.316076] systemd-fsck[680]: ROOT: Directory inode 7252, block #0, offset 0: directory corrupted
[    4.317332] systemd-fsck[680]: ROOT: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
[    4.317526] systemd-fsck[680]: (i.e., without -a or -p options)
[    4.324259] systemd-fsck[674]: fsck failed with exit status 4.
[FAILED] Failed to start File Syste…ck on /dev/disk/by-label/ROOT.

Rerunning the specific kola tests did not help.
Rerunning the whole vm-matrix to regenerate the AMIs and then running the kola tests again did not help either.
Unsurprisingly, manually launching an EC2 instance from the problematic AMI does not work either.
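
When triaging repeated runs, it can help to check console logs automatically for the failure signature shown above. The following is only a minimal sketch in Python, not part of kola or the release tooling. The function name `find_fsck_failures` and both regular expressions are hypothetical and derived solely from the log lines in this report.

```python
import re

# Hypothetical patterns, derived only from the console log in this report.
FSCK_EXIT_RE = re.compile(r"systemd-fsck\[\d+\]: fsck failed with exit status (\d+)")
CORRUPTION_RE = re.compile(
    r"systemd-fsck\[\d+\]: (\S+): (.*corrupted.*|UNEXPECTED INCONSISTENCY.*)"
)


def find_fsck_failures(console_log: str) -> dict:
    """Collect the fsck exit status and corruption messages from a console log."""
    exit_status = None
    messages = []
    for line in console_log.splitlines():
        match = FSCK_EXIT_RE.search(line)
        if match:
            # Remember the exit status reported by systemd-fsck.
            exit_status = int(match.group(1))
            continue
        match = CORRUPTION_RE.search(line)
        if match:
            # Keep the filesystem label and the message text.
            messages.append(f"{match.group(1)}: {match.group(2)}")
    return {"exit_status": exit_status, "messages": messages}
```

Passing the console output of a failed instance as a string returns the reported exit status, plus any lines that mention corruption or an unexpected inconsistency. According to fsck(8), exit status 4 means that filesystem errors were left uncorrected.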

Impact

AWS kola tests for arm64 cannot run at all.

Environment and steps to reproduce

There is no simple way to reproduce this issue.
It happens only in this specific case (Stable, arm64), not in other channels and not on other architectures.
We have seen a similar issue earlier this year, but not on Stable and not on arm64.

Metadata

Status: Implemented