Children Failed to Join #2

@ruidan-li

Hello!

I am currently using the kinesis-python (https://github.com/NerdWalletOSS/kinesis-python) library, which uses your offspring library to run multiple shard readers, and it turns out that the terminated children sometimes fail to join. My main process does some work and calls sys.exit() when it receives SIGINT or SIGTERM. The log shows "Caught signal 15" (https://github.com/borgstrom/offspring/blob/master/src/offspring/process.py#L112). After that, self.end() is called, and so is sys.exit() in run(). However, the children sometimes never join, and after tracing, I found that the process is stuck at os.waitpid() (in multiprocessing/forking.py).

While trying to figure out what is going on, I placed the sys.exit() call in the signal_handler, and that works. So I am wondering whether it would be possible to refactor SubprocessLoop.run and the signal_handler so that self.end() and sys.exit() are called inside the signal_handler instead. I would also be happy to hear your thoughts on this issue!
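
The proposed change could look roughly like the minimal sketch below. It assumes a loop-style worker: `LoopWorker` and its method bodies are hypothetical stand-ins for illustration, not offspring's actual code, and the real cleanup in end() may involve more than setting a flag.

```python
import signal
import sys
import time


class LoopWorker:
    """Hypothetical stand-in for a loop-style child; not offspring's actual class."""

    def __init__(self):
        self.ended = False

    def end(self):
        # Cleanup hook. Assumption: the real cleanup may do more than this.
        self.ended = True

    def signal_handler(self, signum, frame):
        # Proposed change: clean up and exit directly in the handler,
        # instead of relying on run() to do both after the signal arrives.
        self.end()
        sys.exit(0)

    def run(self):
        # Install the handler for both signals mentioned in the report.
        signal.signal(signal.SIGTERM, self.signal_handler)
        signal.signal(signal.SIGINT, self.signal_handler)
        while True:
            time.sleep(0.1)  # placeholder for the real per-iteration work
```

Since sys.exit() raises SystemExit, calling it from the handler unwinds the main thread from wherever it was interrupted, so the exit does not depend on the loop reaching a particular point.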

Thanks,
Ruidan
