-
Hi. I removed the 'bug' label, because there's no bug here in the future framework. I also turned it into a GitHub Discussion, since this is more of a general discussion around R's memory management system and nothing to be fixed in the future framework. It sounds like future.callr might do what you want: `plan(future.callr::callr)`. It'll launch a new, fresh R process for each future, and when the future is resolved, the worker process is ended. The tradeoff is, obviously, more overhead from starting up and shutting down the background workers.
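For illustration, a minimal sketch of that setup (assuming the future and future.callr packages are installed; the work inside the future is just a placeholder):

```r
library(future)
plan(future.callr::callr)

## Each future runs in a brand-new R process; once the future is resolved,
## that process exits and its memory goes back to the OS.
f <- future({
  x <- rnorm(1e7)  # large temporary allocation lives only in the worker
  mean(x)          # only this small result is sent back
})
value(f)
```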
-
Hi hi, I really like this package; packages like furrr use it a lot, and now I'm digging into this package itself. Very powerful. Recently I noticed my scripts were using a lot of RAM, so I wanted to inspect it. It turns out R has some serious problems handling its own memory: because R doesn't organize its own memory very well, memory cannot be released even when it "is free". Here is some info:
https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-is-R-apparently-not-releasing-memory_003f
R would have to move objects from higher RAM addresses to lower ones in order to release memory back to the OS, but it seems it just doesn't do that. As a result, R's memory footprint mostly only grows: we can remove or free objects (well, we can, but the RAM will not be unloaded), and the garbage collector doesn't help here.
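A quick way to see this behavior (a rough sketch; the exact numbers depend on your platform and its memory allocator):

```r
x <- numeric(1e8)  # allocate roughly 800 MB
print(gc())        # gc() reports the memory as in use

rm(x)
print(gc())        # gc() now reports that memory as free inside R...

## ...but inspecting the process from the outside (e.g. with `top` or `ps`)
## will typically still show the R process holding on to it, because the
## freed pages are not returned to the OS.
```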
After checking this, I found we can use multiprocessing to compute things in a separate process and release the memory that can be cleaned; in my tests this is a lot of memory. Loading a file of maps, I noticed that of the 6 GB of RAM in use, 5 GB was temporary data used to load the file and only 1 GB was the loaded file itself; those 5 GB should be cleaned up, but the garbage collector can't give them back. OK, now back to the issue.
One way to keep the RAM clean is with multiprocessing: open a new R instance, do the work there, and return the result; the child process is then closed and its garbage goes with it. Great. I have been playing with future and furrr; here are the results.
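For example, something like this (a rough sketch, assuming future.callr is installed; `readRDS()` and the file name are hypothetical stand-ins for the map-loading code):

```r
library(future)
plan(future.callr::callr)

## The load happens in a throwaway R process; only the returned object is
## copied back into this session, and all the temporary allocations used
## during loading die with the worker.
f <- future(readRDS("maps.rds"))  # hypothetical file name
maps <- value(f)
```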
OS: Linux, Gentoo 64-bit
future: 1.25.0
The behavior seems to depend on the plan:
- `multicore`: every worker is forked, and the RAM is cleaned up when the child ends. Great.
- `multisession` and `cluster`: all the children execute, but none of them release the RAM (and furrr on top is a huge RAM eater). Setting the plan again cleans the RAM, probably because it closes the worker processes and opens new ones.
- nested futures with `multicore`: the RAM is not released either, and in this case, even setting the plan again does not unload it.

The code I used for these tests:
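(The original code block did not survive the copy; the following is only a rough sketch of the kind of test described, with hypothetical sizes, so the workers' memory can be watched from outside, e.g. with `top`, between runs:)

```r
library(future)
library(furrr)

## Allocate a large temporary object in each worker and return only a tiny
## result; afterwards, check whether the workers' RAM has been released.
run_test <- function() {
  future_map(1:4, function(i) {
    x <- numeric(1e8)  # ~800 MB per worker; should be garbage afterwards
    sum(x)
  })
}

plan(multicore, workers = 4)     # forked children (not available on Windows)
invisible(run_test())

plan(multisession, workers = 4)  # persistent background R sessions
invisible(run_test())            # inspect the workers' RSS here
```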
This matters a lot when we work with a lot of data. In particular, I was constructing maps with furrr + `multisession`, and the process was using more than 60 GB of RAM; I even had to add more swap to be able to continue.

As far as I know, `multisession` doesn't close the new processes, so that it can recycle them. An option to not recycle them, and instead close them and open new ones every time, would be great: more time, but a lot less RAM usage.

In the case of the nested future, I have no idea why the RAM has not been freed after the process ends.
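For reference, the plan-reset workaround mentioned above looks roughly like this (a sketch):

```r
library(future)

plan(multisession, workers = 4)
## ...heavy future/furrr work here...

plan(sequential)                 # shuts down the background workers
plan(multisession, workers = 4)  # fresh workers, so their RAM starts clean
```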
future is great, and I think this can help balance the workload. It's ideal to have fast scripts, but sometimes we need to sacrifice time to save other resources so everything can keep working.
Thx!