Add kernel execution cancellation via SIGUSR1 signal#339
Open
Edwardvaneechoud wants to merge 2 commits intofeauture/kernel-implementationfrom
Open
Add kernel execution cancellation via SIGUSR1 signal#339Edwardvaneechoud wants to merge 2 commits intofeauture/kernel-implementationfrom
Edwardvaneechoud wants to merge 2 commits intofeauture/kernel-implementationfrom
Conversation
When a user cancels a flow with a running python_script node, the kernel code previously kept running until completion or the 300s httpx timeout. This adds SIGUSR1-based interruption so kernel executions can be stopped promptly. - kernel_runtime: Register SIGUSR1 handler that raises KeyboardInterrupt during exec(), with _is_executing guard to ignore signals outside of code execution. The /execute endpoint now catches KeyboardInterrupt and returns a cancellation response. - KernelManager: Add interrupt_execution_sync() that sends SIGUSR1 to the kernel container via Docker API (container.kill). - FlowNode: Add _kernel_cancel_context attribute set during kernel execution. cancel() now checks this context and triggers the kernel interrupt alongside the existing worker fetcher cancellation. - FlowGraph: Set/clear _kernel_cancel_context around execute_sync() in add_python_script._func(). https://claude.ai/code/session_018zriRonXcPshWgksMcZeCY
- Remove test that raised KeyboardInterrupt through TestClient (leaked through the ASGI thread boundary causing pytest exit code 130) - Reset _is_executing flag in conftest between tests - Trim verbose comments and docstrings https://claude.ai/code/session_018zriRonXcPshWgksMcZeCY
|
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR implements execution cancellation support for kernel-based code execution by sending SIGUSR1 signals to running containers. When a user cancels a node execution, the system now interrupts the kernel's running code instead of waiting for it to complete.
Key Changes
Kernel Manager: Added
interrupt_execution_sync()andinterrupt_execution()methods to send SIGUSR1 signals to kernel containers, with proper state validation and error handling.Kernel Runtime: Implemented signal handler infrastructure:
_cancel_signal_handler()that raisesKeyboardInterruptwhen SIGUSR1 is received during code executionexec()calls with_is_executingflag to track execution stateKeyboardInterruptexception handling in the/executeendpoint to return a cancellation responseFlowNode: Enhanced
cancel()method to:_kernel_cancel_contextattributeFlowGraph: Set and clear
_kernel_cancel_contexton nodes during kernel execution to enable cancellation.Tests: Added comprehensive test coverage for:
Implementation Details
The cancellation mechanism works by:
FlowNode.cancel()callsKernelManager.interrupt_execution_sync()_is_executingflag)KeyboardInterruptto abort the running code/executeendpoint catchesKeyboardInterruptand returns a cancellation responseThis approach is non-blocking and works within the constraints of Python's signal handling (main thread only in production, tested component-wise in test environments).