Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch multi-core to parallel package #151

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

mariusbarth
Copy link

Hi @richarddmorey,

I started playing around with the code on my fork because I was frustrated that there was no multi-core option on Windows machines (I am working on Linux myself, but collaborating with Windows folks and sharing dynamic documents with them). I originally intended to implement a multi-core option for Windows users, and therefore switched to the parallel package that provides multi-core facilities on Windows and Unix-alikes, and is shipped with base R. Unfortunately, the parallel package only provides PSOCK clusters on Windows, and these turned out to hamper performance (compared to single-core performance).
Therefore, I again switched off the multi-core option on Windows and moved to fork clusters for Unix-alikes.

While I failed my ultimate goal of providing a multi-core option for Windows users, I found that my new implementation comes with some benefits for users of unix-alikes, so I thought it might be worthwhile to adopt the changes.

  • It is faster: With a quad-core CPU, code runs faster by a magnitude of 2-3 (compared to 1.5-2 with the old implementation)
  • It is possible to provide a cluster in advance, which makes BayesFactor ready for HPC computing.
  • Dependencies to foreach and doMC are not necessary, anymore.

I don't know if I missed something important, so feel free to dismiss the code changes suggested here. ;)

Here is an example of how to run code on a pre-specified cluster:

library(parallel)
library(BayesFactor)

# Create a default cluster that can be used by anovaBF()
cl <- makeForkCluster(4L)
setDefaultCluster(cl)
getDefaultCluster()
data(puzzles)

# duplicate data set so that the computations last a bit longer:
puzzles$set <- 1
puzzles2 <- puzzles
puzzles2$set <- 2
puzzles <- rbind(puzzles, puzzles2)
puzzles$set <- factor(puzzles$set)

library(microbenchmark)

out <- microbenchmark(
  out_a <- anovaBF(RT ~ shape*color*set + ID, data = puzzles, whichRandom = "ID", progress = FALSE),
  out_b <- anovaBF(formula = RT ~ shape*color*set + ID, data = puzzles, whichRandom = "ID", progress = FALSE, multicore = TRUE),
  times = 20
)

@crsh
Copy link

crsh commented Feb 15, 2024

Hi Richard, I had some issues with doMC on my Mac (computation starts but never seems to finish), so I was wondering whether this PR is still under consideration. Cheers.

@richarddmorey
Copy link
Owner

I think this got lost in the shuffle of things several years ago, but I don't see any reason why it can't be implemented. However, importantly it needs to work under several other softare packages that might us it (JASP, jamovi) and I'm not sure how such a change would interact with them. @vandenman - do you have a view on this?

@vandenman
Copy link
Collaborator

@richarddmorey I checked and JASP only uses lmBF from the ANOVA functions (not anovaBF or generalTestBF). I think jamovi also only uses those functions (based on this search). So this should be safe to merge on our end.

@mariusbarth
Copy link
Author

Hi @richarddmorey, I just resolved the tiny merge conflict that came from a differing RoxygenNote in the DESCRIPTION file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants