Parallelize var_inf #46

joselotl · 2023-09-07T02:46:56Z

This should be the last summarization algorithm to be parallelized as it needs all of the p(z) to be loaded in memory at once.

sschmidt23 · 2023-09-07T06:21:32Z

We should ask Markus Rau whether there are any sampling shortcuts that work while loading only subsamples of the data into memory. He may know of some statistical tricks that make that possible. I hope so, as I don't see how we can load hundreds of millions to billions of galaxies at once.

joselotl · 2023-09-07T17:36:49Z

It doesn't all need to be loaded on the same node. My idea was to make sure that we have enough nodes to hold all of the p(z) between them. My guess is that the full catalog will take around 10 TB in total, which means using around 20 CPU nodes on Perlmutter.
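
For reference, a rough sketch of the arithmetic behind that node count, plus one way the catalog rows could be split so that each node only holds its own slice of the p(z). This is illustrative only: the 512 GB per-node figure is an assumption about the CPU nodes, and `nodes_needed` and `row_range` are hypothetical helper names, not anything in the existing code. The catalog size in the usage lines is made up.

```python
import math

TB = 10**12
GB = 10**9


def nodes_needed(total_bytes, bytes_per_node):
    """Smallest number of nodes whose combined memory holds total_bytes."""
    return math.ceil(total_bytes / bytes_per_node)


def row_range(rank, n_ranks, n_rows):
    """Half-open [start, stop) slice of catalog rows owned by one rank.

    Rows are split as evenly as possible; the first (n_rows % n_ranks)
    ranks each take one extra row.
    """
    base, extra = divmod(n_rows, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# Assumption: about 10 TB of p(z) in total and 512 GB of memory per CPU node.
n_nodes = nodes_needed(10 * TB, 512 * GB)

# Hypothetical catalog size, only to show how the rows would be divided.
n_galaxies = 1_000_000_000
slices = [row_range(rank, n_nodes, n_galaxies) for rank in range(n_nodes)]
```

In practice some memory per node has to be left for the OS and for working arrays, so the real count would probably end up somewhat above this lower bound.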

joselotl self-assigned this Sep 7, 2023

joselotl linked a pull request Nov 16, 2023 that will close this issue

46 parallelize var inf #77

Merged

4 tasks

joselotl closed this as completed in #77 Nov 20, 2023
