Parallelizing PALM #20
Comments
Permutations lend themselves to running in parallel. One option is to use several CPU cores in parallel (e.g. parfor); the other is to use the GPU (gpuArray). The latter can be a big win, but be aware that transferring data to and from GPU memory is slow, and that some consumer GPUs have very slow double-precision maths (fp64) relative to single precision (fp32). So if single precision is acceptable, you will get better performance by explicitly using singles, e.g. G = gpuArray(single(X));. Below is code from tests I did with Ben Torkian to explore this. With datasets typical of my team, the GPU far outperformed the CPU. On the other hand, it does require a CUDA-capable (Nvidia) GPU and the correct MATLAB toolbox.
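For illustration only (this is not the test code mentioned above, and it is not part of PALM), a minimal sketch of the single-precision GPU idea might look like the following. The random data, every variable name other than `X` and `G`, and the simplified statistic (a difference of group sums rather than a proper t statistic) are assumptions made for the example.

```matlab
% Hypothetical data: 40 subjects x 10000 voxels, two groups of 20.
rng(1);
X = randn(40, 10000);
labels = [ones(20, 1); -ones(20, 1)];   % +1 = group A, -1 = group B
n = numel(labels);
nPerm = 1000;

% Build all permuted label vectors on the CPU, one permutation per row.
P = zeros(nPerm, n, 'single');
for p = 1:nPerm
    P(p, :) = labels(randperm(n));
end

% Copy the data to the GPU once, as single precision (fp32).
% Requires the Parallel Computing Toolbox and a CUDA-capable Nvidia GPU.
G = gpuArray(single(X));

% One matrix product evaluates every permutation at every voxel:
% D(p, v) = sum(group A) - sum(group B) under permutation p.
D = gpuArray(P) * G;

% Reduce on the GPU and transfer only the small result back to the host.
maxStat = gather(max(abs(D), [], 2));   % nPerm x 1 maximum statistics
```

Keeping the data on the GPU and calling `gather` only on the reduced result limits the slow host-to-device traffic described above.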
@neurolabusc So a custom code modification is required for PALM to use CUDA?
Permutation thresholding is an example of an embarrassingly parallel task that can be accelerated by running different permutations on different cores simultaneously. One can use parfor to use multiple CPU cores at the same time. Alternatively, if you have an Nvidia CUDA-compatible graphics card, you can use gpuArray. Both require the Parallel Computing Toolbox.
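As a rough sketch of the parfor route (not PALM's own code), a max-statistic permutation test could be spread across CPU workers as shown here. The data, the variable names, and the simplified statistic (a difference of group sums) are assumptions for the example.

```matlab
% Hypothetical data: 40 subjects x 10000 voxels, two groups of 20.
rng(1);
X = randn(40, 10000);
labels = [ones(20, 1); -ones(20, 1)];   % +1 = group A, -1 = group B
n = numel(labels);
nPerm = 1000;

% Observed maximum statistic with the true labelling.
observed = max(abs(labels' * X));

% Each iteration is independent, so parfor can hand permutations to
% different workers (requires the Parallel Computing Toolbox).
maxStat = zeros(nPerm, 1);
parfor p = 1:nPerm
    perm = labels(randperm(n));           % shuffle the group labels
    maxStat(p) = max(abs(perm' * X));     % maximum over voxels
end

% Family-wise error corrected p-value for the observed maximum.
pFWE = (1 + sum(maxStat >= observed)) / (nPerm + 1);
```

Because no iteration depends on another, the only shared data are the read-only inputs `X` and `labels`, which is what makes the task embarrassingly parallel.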
Is there any way to run permutations in parallel? FSL randomise has the randomise_parallel script that runs multiple copies of randomise in parallel and then combines the resulting maps into one single file. Is a similar procedure possible with PALM?