IPC handshake may block indefinitely during container startup
Description
The IPC handshake between urunc create and urunc start can block indefinitely if the peer process never connects or sends the expected message.
AwaitMessage() currently performs an unbounded wait while:
- accepting the Unix socket connection
- reading the IPC message
If the peer process is interrupted or delayed (for example by a containerd restart, an OOM kill, or heavy load on the node), the waiting process may never exit.
Impact
- Orphaned urunc --reexec processes
- Containers stuck in ContainerCreating
- Gradual resource leaks on the node
- No clear error surfaced to the caller
Reproduction hints (non-deterministic)
This behavior is timing-dependent, but has been observed when:
- Restarting containerd between urunc create and urunc start
- Terminating the urunc start process during startup
- Running on a heavily loaded node
Minimal repro outline (best-effort)
- Start container creation using urunc.
- Interrupt the startup sequence before urunc start completes.
- Observe that the IPC helper process remains blocked indefinitely.
Expected behavior
Container startup should either succeed or fail with a clear error.
It should not hang indefinitely in failure paths.
Related work
There is an open PR that adds a bounded timeout to the IPC handshake to avoid unbounded blocking. This issue is intended to document the problem and gather feedback on the appropriate behavior.