Bitbucket Server 7.0 has shipped with NuProcess enabled by default, and the system uses it for the majority of the `git` processes it runs. Predominantly the system uses `NuProcessBuilder.run`, but it does sometimes use `start` to run processes asynchronously. Most of the time that works fine. However, we've found that if any `NuProcessHandler` for an asynchronous process "misbehaves" by blocking for an unexpectedly long time, it can cause a cascading failure that brings down the entire system and leaves zombie processes behind. (If the handler eventually becomes unblocked the system can recover, but we've seen cases where, for various reasons, the handler never frees up.) A misbehaving handler for a single process can effectively kill a pump thread, meaning it no longer pumps any of the processes assigned to it.
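For illustration, here's a minimal sketch of the failure mode, assuming a recent NuProcess version. The queue-draining handler is hypothetical (it's not our actual code), but it shows how a single blocking callback ties up a shared pump thread:

```java
import com.zaxxer.nuprocess.NuAbstractProcessHandler;
import com.zaxxer.nuprocess.NuProcess;
import com.zaxxer.nuprocess.NuProcessBuilder;

import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BlockingHandlerExample {

    // A handler that "misbehaves": onStdout is invoked on a shared pump
    // thread, and put() blocks when the queue is full. While it is blocked,
    // the pump thread cannot service ANY other process assigned to it.
    static class QueueingHandler extends NuAbstractProcessHandler {
        private final BlockingQueue<byte[]> output = new ArrayBlockingQueue<>(4);

        @Override
        public void onStdout(ByteBuffer buffer, boolean closed) {
            byte[] chunk = new byte[buffer.remaining()];
            buffer.get(chunk);
            try {
                output.put(chunk); // blocks if no consumer is draining the queue
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        NuProcessBuilder builder =
                new NuProcessBuilder(Arrays.asList("git", "cat-file", "--batch"));
        builder.setProcessListener(new QueueingHandler());
        NuProcess process = builder.start(); // asynchronous; pumped on a shared thread
    }
}
```

Because `onStdout` runs on a pump thread shared with other processes, the blocked `put()` stalls every process assigned to that thread, not just this one.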
Support for asynchronous processes is valuable, though, and we want to keep access to it; it's a great feature of NuProcess. We've done what we can to "harden" our handlers, to ensure they never block (or block only very briefly), but at this point I'm wondering whether you might be open to accepting a patch that makes the way NuProcess manages its pump threads configurable. The default implementation could remain the current round-robin assignment across a fixed set of threads, while an alternative implementation (the one Bitbucket Server would use) would use something like an `ExecutorService` to pump a single process per thread, with that thread potentially reused for a different (single) process after the first one exits.
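To make the proposal concrete, here's a rough sketch of the kind of pluggable strategy we have in mind. The `PumpStrategy` interface and every name in it are hypothetical, purely for illustration; nothing like this exists in NuProcess today:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical SPI, only meant to illustrate the shape of the proposal.
interface PumpStrategy {
    // Called when a new asynchronous process starts; the strategy decides
    // which thread will run the pump loop for it.
    void pump(Runnable pumpLoop);
}

// Proposed alternative: one pump task per process, submitted to an
// ExecutorService. A blocked handler then stalls only its own process,
// and idle threads are reused (or reaped) by the executor.
class ExecutorPumpStrategy implements PumpStrategy {
    private final ExecutorService executor = Executors.newCachedThreadPool();

    @Override
    public void pump(Runnable pumpLoop) {
        executor.submit(pumpLoop); // dedicated thread for as long as the process lives
    }
}
```

The default strategy would keep the current behavior (round-robin assignment of processes to a fixed set of pump threads), so existing users would see no change.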
Such an approach is significantly "safer" for us, at the cost of additional threads. However, given we're switching away from an approach built on Java's `ProcessBuilder`, which requires 2-3 threads per process (plus the JVM-managed reaper thread), using a single thread per process with NuProcess still represents a considerable savings for the system.
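For context, this is the usual `ProcessBuilder` pattern behind that 2-3 threads figure (a sketch, not our actual code): stdout and stderr each need a drain thread to avoid pipe deadlocks, and interactive processes need a third thread to feed stdin:

```java
import java.io.IOException;
import java.io.InputStream;

public class ProcessBuilderThreads {
    public static void main(String[] args) throws IOException, InterruptedException {
        Process process = new ProcessBuilder("git", "cat-file", "--batch").start();

        // Thread 1: drain stdout so the child can't block on a full pipe.
        Thread stdout = drain(process.getInputStream());
        // Thread 2: drain stderr for the same reason.
        Thread stderr = drain(process.getErrorStream());
        // (A third thread is often needed to feed stdin without deadlocking.)

        process.waitFor(); // plus the JVM's own process-reaper thread
        stdout.join();
        stderr.join();
    }

    private static Thread drain(InputStream in) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[8192];
            try {
                while (in.read(buf) != -1) {
                    // consume or forward the output
                }
            } catch (IOException ignored) {
            }
        });
        t.start();
        return t;
    }
}
```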