Support AlwaysEjectOneHost configuration in Outlier Detection #8140

@Inode1

Description:

Envoy Gateway's Outlier Detection configuration does not currently expose the AlwaysEjectOneHost field (always_eject_one_host) that exists in Envoy's native outlier detection configuration. When enabled, this field lets outlier detection eject at least one host regardless of the value of max_ejection_percent, which is useful in load balancing scenarios where the percentage limit alone would block every ejection.

Motivation for alwaysEjectOneHost

  • Percentage-based ejection limits (max_ejection_percent) can round down to zero for small clusters (e.g. 2–9 replicas at Envoy's default of 10%), making outlier detection ineffective even when a host is clearly unhealthy.

  • Many real-world deployments use dynamic autoscaling, where the number of replicas changes frequently and cannot be reliably tuned with static percentage thresholds.

  • In practice, failures often affect a single instance at a time (e.g. node failure, bad pod, transient network issue). Ensuring at least one host can be ejected significantly reduces user-facing impact in these common scenarios.

  • Increasing max_ejection_percent globally (e.g. to 50%) to compensate is undesirable, as it increases the risk of excessive ejections in larger clusters and widens the blast radius during real incidents. See the upstream issue "Outlier_detection: enforce max_ejection_percentage", envoy#27909 (comment).

  • Historically, Envoy’s outlier detection behavior guaranteed that at least one host would be ejected regardless of max_ejection_percent (see "outlier_detection: add always_eject_one_host", envoy#34796). Removing this behavior without an opt-in replacement can cause silent regressions.

  • Introducing an opt-in field preserves backward compatibility.
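
The rounding problem described above can be illustrated with a small calculation. This is a simplified model, not Envoy's actual implementation: it assumes the ejection cap is the host count times max_ejection_percent, rounded down, and the function name max_ejectable is hypothetical.

```python
def max_ejectable(hosts: int, max_ejection_percent: int,
                  always_eject_one_host: bool = False) -> int:
    """Simplified model of how many hosts a percentage cap allows to be ejected.

    Assumption: the cap is floor(hosts * percent / 100). Envoy's real
    logic differs in detail; this only illustrates the rounding effect.
    """
    allowed = hosts * max_ejection_percent // 100
    if always_eject_one_host and hosts > 0:
        # Opt-in behavior: at least one host may be ejected.
        allowed = max(allowed, 1)
    return allowed


if __name__ == "__main__":
    print("hosts  cap@10%  cap@10%+always_eject_one_host  cap@50%")
    for n in (2, 5, 9, 10, 100):
        print(n, max_ejectable(n, 10), max_ejectable(n, 10, True),
              max_ejectable(n, 50))
```

Under this model, a 9-host cluster at 10% allows no ejections unless the opt-in flag is set, while raising the limit to 50% would allow 50 ejections in a 100-host cluster. That is the trade-off the bullets above describe.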

Relevant Links:
https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/cluster/v3/outlier_detection.proto#config-cluster-v3-outlierdetection

Labels: area/api (API-related issues)