Description
Starting with Kubernetes 1.20, the exec probe timeout is enforced:
Before Kubernetes 1.20, the field timeoutSeconds was not respected for exec probes: probes continued running indefinitely, even past their configured deadline, until a result was returned.
So if this probe command was not designed and tested to finish in under 1 second, the agent may start being killed under heavy load or resource starvation, because the liveness probe will start failing on timeout:
kubernetes-configs/logging-agent.yaml
Lines 46 to 64 in f01ceca
    livenessProbe:
      exec:
        command:
        - /bin/sh
        - -c
        - |
          LIVENESS_THRESHOLD_SECONDS=${LIVENESS_THRESHOLD_SECONDS:-300}; STUCK_THRESHOLD_SECONDS=${LIVENESS_THRESHOLD_SECONDS:-900}; if [ ! -e /var/run/google-fluentd/buffers ]; then
            exit 1;
          fi; touch -d "${STUCK_THRESHOLD_SECONDS} seconds ago" /tmp/marker-stuck; if [[ -z "$(find /var/run/google-fluentd/buffers -type f -newer /tmp/marker-stuck -print -quit)" ]]; then
            rm -rf /var/run/google-fluentd/buffers;
            exit 1;
          fi; touch -d "${LIVENESS_THRESHOLD_SECONDS} seconds ago" /tmp/marker-liveness; if [[ -z "$(find /var/run/google-fluentd/buffers -type f -newer /tmp/marker-liveness -print -quit)" ]]; then
            exit 1;
          fi;
      failureThreshold: 3
      initialDelaySeconds: 600
      periodSeconds: 60
      successThreshold: 1
      timeoutSeconds: 1
I recommend bumping timeoutSeconds to a much larger value, after testing how long the probe command actually takes to run.
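A rough sketch of what the changed probe settings might look like is below. The value 30 is only an assumed placeholder, not a measured or tested number. The right value should come from timing the probe command on a loaded node. The exec command itself is left out here and would stay as it is in logging-agent.yaml.

```yaml
livenessProbe:
  # exec.command is unchanged from logging-agent.yaml and omitted here.
  failureThreshold: 3
  initialDelaySeconds: 600
  periodSeconds: 60
  successThreshold: 1
  # Assumption: 30 is an illustrative placeholder. Pick the real value
  # after measuring how long the probe command takes under load.
  timeoutSeconds: 30
```

With failureThreshold: 3 and periodSeconds: 60 left unchanged, the container would still be restarted only after three consecutive failed or timed-out probes.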