fix: detect and exit processing when a MultiPool worker terminates/crashes #1426

Open
vlaci wants to merge 1 commit into main from vlaci/pool-shutdown

Conversation

@vlaci vlaci commented Mar 12, 2026

process_until_done() and _clear_output_queue() block on .get(). If a worker dies, the expected result never arrives and the main process hangs forever.

Use multiprocessing.connection.wait() [1] to wait on both the output queue pipe and the worker process sentinels [2], so that a worker exiting wakes the main process instead of leaving it blocked.
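For illustration, here is a minimal sketch of that waiting pattern, not the code in this PR. `get_result`, `WorkerDied`, `echo_worker` and `crashing_worker` are hypothetical names, and the sketch assumes CPython, where the read end of a `multiprocessing.Queue` pipe is reachable through the private `_reader` attribute.

```python
import multiprocessing as mp
import os
from multiprocessing.connection import wait


class WorkerDied(RuntimeError):
    """Hypothetical error: a worker exited before delivering a result."""


def get_result(output_queue, workers, timeout=None):
    """Return the next result, or raise if a worker exits first."""
    # Assumption: Queue._reader is a private CPython attribute holding the
    # read end of the queue's pipe; it is not a documented API.
    reader = output_queue._reader
    sentinels = [worker.sentinel for worker in workers]

    # Wakes up when a result is readable OR when any worker process ends.
    ready = wait([reader, *sentinels], timeout)

    if reader in ready:
        # Prefer data: a worker may have queued a result just before exiting.
        return output_queue.get()
    if ready:
        raise WorkerDied("a worker exited without producing a result")
    raise TimeoutError("no result and no worker exit within the timeout")


def echo_worker(output_queue, value):
    # Well-behaved worker: delivers one result, then exits normally.
    output_queue.put(value)


def crashing_worker():
    # Simulates a hard crash: exits immediately without producing a result.
    os._exit(1)
```

With a plain `output_queue.get()`, the `crashing_worker` case would block forever; here the sentinel becomes ready when the process ends and the caller gets an exception instead.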

Replace the global pool registry and custom SIGTERM/SIGINT handler with a plain raise SystemExit(128 + signum). This works because SystemExit propagates through connection.wait() (the underlying poll(2) returns EINTR and CPython runs the signal handler before retrying, per PEP 475 [3]; because the handler raises, the call is not retried) and then through __exit__, which calls close(immediate=True). SIGINT needs no custom handler, as Python's default handler already raises KeyboardInterrupt, which propagates the same way.
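A rough sketch of that propagation path, under stated assumptions: `PoolSketch` and `run_until_signalled` are hypothetical stand-ins rather than the real pool, the code must run in the main thread on a POSIX system, and the signal is self-delivered with `signal.pthread_kill` purely for demonstration.

```python
import signal
import threading
from multiprocessing import Pipe
from multiprocessing.connection import wait


def exit_on_signal(signum, frame):
    # 128 + N is the conventional exit status for "terminated by signal N".
    raise SystemExit(128 + signum)


class PoolSketch:
    """Hypothetical stand-in for the pool; it only records how it was closed."""

    def __init__(self):
        self.closed_immediately = False

    def close(self, *, immediate=False):
        self.closed_immediately = immediate

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close(immediate=True)
        return False  # never swallow SystemExit / KeyboardInterrupt


def run_until_signalled(delay=0.2, timeout=10.0):
    """Block in wait(), receive SIGTERM, and report (exit code, closed flag)."""
    previous = signal.signal(signal.SIGTERM, exit_on_signal)
    if previous is None:  # handler was not installed from Python
        previous = signal.SIG_DFL
    reader, writer = Pipe(duplex=False)  # never written to, so wait() blocks
    pool = PoolSketch()
    # Deliver SIGTERM to the main thread while it is blocked in wait().
    timer = threading.Timer(
        delay,
        signal.pthread_kill,
        args=(threading.main_thread().ident, signal.SIGTERM),
    )
    timer.start()
    try:
        with pool:
            wait([reader], timeout)
    except SystemExit as exc:
        return exc.code, pool.closed_immediately
    finally:
        timer.cancel()
        signal.signal(signal.SIGTERM, previous)
        reader.close()
        writer.close()
    return None, pool.closed_immediately
```

The exception raised by the handler leaves `wait()` without a retry, unwinds through the `with` block, and `__exit__` gets its chance to close the pool before the process exits.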

Wrap the graceful shutdown path in close() with except BaseException to fall back to immediate termination if anything goes wrong mid-teardown.
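The shape of that fallback might look roughly like the sketch below. The three graceful steps use the method names visible in the diff; `TeardownSketch`, `fail_while_draining` and `_terminate_workers` are hypothetical, and the methods only record what ran instead of touching real queues or processes.

```python
class TeardownSketch:
    """Hypothetical pool stand-in that records which teardown steps ran."""

    def __init__(self, fail_while_draining=False):
        self.fail_while_draining = fail_while_draining
        self.steps = []

    def _clear_input_queue(self):
        self.steps.append("clear_input_queue")

    def _request_workers_to_quit(self):
        self.steps.append("request_workers_to_quit")

    def _clear_output_queue(self):
        if self.fail_while_draining:
            # Stand-in for anything going wrong mid-teardown, e.g. Ctrl-C.
            raise KeyboardInterrupt
        self.steps.append("clear_output_queue")

    def _terminate_workers(self):
        # Hypothetical name for the immediate-termination path.
        self.steps.append("terminate_workers")

    def close(self, *, immediate=False):
        if not immediate:
            try:
                self._clear_input_queue()
                self._request_workers_to_quit()
                self._clear_output_queue()
            except BaseException:
                # BaseException also covers KeyboardInterrupt and SystemExit,
                # which a plain `except Exception` would let escape.
                immediate = True
        if immediate:
            self._terminate_workers()
```

Note that this sketch swallows the exception after falling back, so the caller never learns that the graceful path failed.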


This is an alternative to #1422

Footnotes

  1. https://docs.python.org/3/library/multiprocessing.html#multiprocessing.connection.wait

  2. https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Process.sentinel

  3. https://peps.python.org/pep-0475/

@vlaci vlaci force-pushed the vlaci/pool-shutdown branch from 73e9cb4 to d119dfa Compare March 12, 2026 13:56
@vlaci vlaci self-assigned this Mar 12, 2026
@vlaci vlaci force-pushed the vlaci/pool-shutdown branch from d119dfa to 1209048 Compare March 12, 2026 14:01
@qkaiser qkaiser self-requested a review March 12, 2026 14:41
@qkaiser qkaiser added the bug, python, and help wanted labels Mar 12, 2026
@qkaiser qkaiser left a comment

I like this very much.

self._clear_input_queue()
self._request_workers_to_quit()
self._clear_output_queue()
except BaseException:
Do we want to capture this exception and re-raise it at the end of close() if one was caught? For example, a Ctrl-C/SIGTERM could arrive while we're draining the queues or waiting for workers to quit.
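To make the question concrete, here is one possible shape, offered only as a sketch. `close_then_reraise` and its two callable parameters are hypothetical and stand in for the graceful path and the immediate-termination path of close().

```python
def close_then_reraise(graceful_shutdown, terminate_workers):
    """Try the graceful path; on failure terminate, then re-raise the cause."""
    pending = None
    try:
        graceful_shutdown()
    except BaseException as exc:
        # Remember what interrupted us (e.g. KeyboardInterrupt, SystemExit)
        # and fall back to immediate termination.
        pending = exc
        terminate_workers()
    if pending is not None:
        # Cleanup is done; let the caller see the original interruption.
        raise pending
```

The trade-off is that close() may then raise from inside `__exit__`, which would replace any exception already propagating through the `with` block.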

@vlaci vlaci force-pushed the vlaci/pool-shutdown branch from 1209048 to 51c6446 Compare March 20, 2026 15:18
@vlaci vlaci changed the base branch from main to e2e-tests March 20, 2026 15:18
Base automatically changed from e2e-tests to main March 20, 2026 16:14