REQ - Guidance on writing R code that will run in parallel #10
Comments
@ciarag01 @jakeybob @Moohan @rmccreath Just alerting you to this issue. My plan is to draft some guidance, as it's clear that most R code in PHS is written as single-threaded and doesn't take advantage of the multiple CPUs available in a Posit Workbench session. Using those CPUs could result in significant performance improvements when processing large datasets.
If you see any queries coming in requesting guidance on parallel processing, you can tell people that this is on our radar and that we're developing guidance for it.
I'm not sure what the best route is here. All the different available methods make it quite thorny. I don't think And, I feel like any guidance along the lines of "here are several different ways you can do this" won't be well received. So do we choose one way to recommend...? This would be better for consistency and support/training but a) I'm not convinced this is the best idea and b) even if it is, I don't know which method would be the best to pick... Should probably sidestep the
Thought this might be the best place to ask this question, and if no one knows it's just another thing to add to future guidance! If I use For example, if I have a session with 8 CPUs and 4GB of RAM, will this be shared among the 'sessions' or will it spawn new nodes for the new sessions, in which case what limits/specs do they have?
I suspect this will run in the current session only and the workers spawned will be more equivalent to "background jobs" (running as independent R processes but sharing the parent session total resources) than "workbench jobs" (starting new sessions with their own resources). Only one way to find out for sure though – give it a punt and see what happens? 😀
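One rough way to "give it a punt" is sketched below. It assumes the workers are started with the `multisession` plan from {future}; that is an assumption, since the function being asked about isn't named above. It compares the process ID and host name of each worker with those of the parent session. Different PIDs on the same host would be consistent with the "background jobs" picture, but this does not by itself show how memory limits are applied.

```r
library(future)
library(parallelly)

# Cores this session is allowed to use (parallelly respects container/scheduler limits)
n_cores <- availableCores()
message("Cores available to this session: ", n_cores)

# ASSUMPTION: workers are background R sessions started via plan(multisession)
plan(multisession, workers = 2)

main_pid  <- Sys.getpid()
main_host <- Sys.info()[["nodename"]]

# Ask each future to report where it is running
futures <- lapply(1:2, function(i) {
  future(list(pid = Sys.getpid(), host = Sys.info()[["nodename"]]))
})
worker_info <- lapply(futures, value)

worker_pids  <- vapply(worker_info, function(x) x$pid, integer(1))
worker_hosts <- vapply(worker_info, function(x) x$host, character(1))

message("Parent PID: ", main_pid, " on ", main_host)
message("Worker PIDs: ", paste(worker_pids, collapse = ", "))
message("Worker hosts: ", paste(unique(worker_hosts), collapse = ", "))

# Shut the workers down again
plan(sequential)
```

Watching the session's memory and CPU usage while a heavier job runs would be the next step for checking how the 8 CPUs and 4GB are actually shared.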
For example:
- {parallelly}
- {furrr} package, rather than {purrr}
- {multidplyr} backend with {dplyr}
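As a rough illustration of the {furrr} option, here is a minimal sketch of swapping a {purrr} call for its {furrr} equivalent. The `slow_square()` function and the choice of two workers are hypothetical, made up for the example; the `multisession` plan is one possible backend, not a recommendation.

```r
library(purrr)
library(furrr)
library(future)
library(parallelly)

# Hypothetical stand-in for an expensive per-element computation
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

# Use at most 2 workers, and never more cores than the session is allowed
n_workers <- min(2L, availableCores())
plan(multisession, workers = n_workers)

# Single-threaded version with {purrr}
seq_result <- map_dbl(1:8, slow_square)

# Parallel version: same call shape, with the future_ prefix from {furrr}
par_result <- future_map_dbl(1:8, slow_square)

# Both approaches should return the same values
same_values <- identical(seq_result, par_result)
print(same_values)

# Return to sequential processing and release the workers
plan(sequential)
```

The appeal of this route is that existing {purrr} code only needs a plan and a changed function name, so the single-threaded and parallel versions stay easy to compare.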
Other useful links: