Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Swap the order of terminating workers: from worker first to childs first #1092

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ichuan
Copy link

@ichuan ichuan commented Oct 31, 2018

Currently when circus terminating a worker, it sends signal to worker, and then to each child process.

But according to send_signal_child , when sending signals to a child, psutil's Process.children() is used to live query the children of the parent worker process, which is signaled death before. This is the cause that child process don't receive signal.

A workaround is changing the order, and send signal to childs first.

May be a fix for #1044

@coveralls
Copy link

Coverage Status

Coverage increased (+0.01%) to 62.866% when pulling 3896985 on ichuan:master into 583e2c1 on circus-tent:master.

Copy link
Contributor

@k4nar k4nar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks you for your contribution!

Could you try to add a unit test to ensure this behavior? This is such a sensitive part of the code, I would be more confident if this was tested :) .

@bugchecker
Copy link

bugchecker commented Nov 4, 2019

If self.send_signal_process called by self.kill_process, recursive first time will be False. So process.children() return only direct children. If parent process terminated but not terminate all its childs, in the future after graceful_timeout self.send_signal_process will be called once again with recursive=True. But process.children() will not be able to return the children because parent process will not exist and the children will be "orphaned".

I suppose for that case full list of the children should be saved previously, before first calling of the self.send_signal_process in self.kill_process (and used for detect is_alive and sending signals to children). One more thought: when was detected is_alive=False for parent process, do check is_alive for its children and send SIGTERM signals if alive (to not wait graceful_timeout and not terminate its emergency).

Tested with npm start which starts and finishes (after SIGTERM) sh -c "..." but does not finish children started by sh -c "...".

Copy link

@bugchecker bugchecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants