Resource requirement tweaks #107
base: dev
Conversation
Warning: a newer version of the nf-core template is available. Your pipeline is using an old version of the nf-core template: 2.14.1. For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.
I've made the modifier controlling the proportion of total available memory allocated to the JVM maximum heap configurable by users, with defaults defined at the module level. For processes where large amounts of memory are required I've increased the default proportion above the suggested 75%. Similarly, for processes with small memory allocations the default has also been increased, since the modifier doesn't scale well at the low end imo. I'd rather not sacrifice usable memory to optimise where not strictly needed - please let me know if these defaults are a problem with your setup and we can adjust as necessary. I'll run some tests to ensure these changes do indeed work; let's discuss any points you'd like reviewed in the meantime!
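To make the intent concrete, here is a minimal sketch of what a module's script block might compute. The variable names, the 64 GB task allocation, and the 90% default are illustrative assumptions, not the actual values used in this PR:

```shell
# Hypothetical module script: derive -Xmx from the task's memory
# allocation and a configurable heap proportion.
TASK_MEM_GB=64     # in Nextflow this would come from ${task.memory.toGiga()}
HEAP_PROP_PCT=90   # module-level default, overridable by the user
HEAP_GB=$(( TASK_MEM_GB * HEAP_PROP_PCT / 100 ))
echo "java -Xmx${HEAP_GB}g -jar tool.jar"
```

Exposing the proportion as a parameter with per-module defaults lets memory-hungry tools claim a larger heap share while still leaving the knob adjustable per deployment.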
Looks great, I think it's a good solution to make this setting configurable.
Solves #106.
When testing oncoanalyser against some more difficult samples I ran into erratic behaviour, with stages running out of memory.
In general this was caused by the Java process allocating 95% of available memory and then invoking an external (usually R) script. This would push the container over the memory limits set in `base.config`, and the container would be killed by the supervisor (in my case k8s).

Even without an external application (outside of Java) being invoked this can happen. The `-Xmx` parameter only sets the size of the heap; the GC operates outside of it. So if the heap fills up, the GC kicks in, consuming the remaining memory and causing the whole container to go OOM.

Also, some stages, like purple and sage, are set to a resource class that is too low for the amount of work they need to do, especially for more difficult samples.
Therefore we don't set the size of the heap higher than 75% of the container memory limit when running hmftools (see pipeline5, where the second argument is the heap memory and the third is the total memory). After this change the stability of the pipeline greatly improved: I went from sage+purple pretty much always going OOM to the pipeline completing successfully even for difficult samples.
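As a back-of-the-envelope illustration of the 75% rule (the 32 GiB container limit here is an assumed example, not a value from the pipeline):

```shell
# A 75% heap cap leaves a quarter of the container limit as headroom
# for GC overhead, off-heap allocations, and spawned R processes.
CONTAINER_GB=32
HEAP_GB=$(( CONTAINER_GB * 75 / 100 ))      # passed to java as -Xmx
HEADROOM_GB=$(( CONTAINER_GB - HEAP_GB ))   # memory outside the heap
echo "heap=${HEAP_GB}G headroom=${HEADROOM_GB}G"
```

The headroom is what absorbs everything `-Xmx` does not govern, which is exactly what was pushing the containers past their limits before.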
I also bumped the process requirements for purple, sage, and orange. This brings them more in line with the resource requirements set in pipeline5. These extra resources are necessary to finish more difficult samples in a reasonable amount of time and without going OOM.