Skip to content

Automatically terminate EC2 instances when the Kubelet stops reporting #67

@jferris

Description

@jferris

We have seen several instances where:

  • All pods on a node are marked as "Terminating"
  • The underlying node reports that the Kubelet has stopped reporting
  • EC2 healthchecks for the node are passing

We would like to deploy a mechanism to automatically detect and terminate these instances rather than doing it manually.

https://mission-control.thoughtbot.com/branch/main/aws/book/debug/cluster-errors.html#kubelet-stopped-posting-node-status

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions