Update documentation in `AbstractDataset` to show proper creation of a custom pytorch dataset #255

mjo22 · 2024-08-08T15:14:12Z

Most importantly, we need to update `__getitem__`, since a `torch.utils.data.Dataset` cannot return arbitrary pytrees.
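To make the constraint concrete, here is a minimal sketch of a dataset whose `__getitem__` returns a flat dict of tensors, which the default `DataLoader` collation can batch. `ToyParticleDataset` and its `image` and `defocus` fields are hypothetical stand-ins for illustration, not the actual `AbstractDataset` or `RelionDataset` API.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class ToyParticleDataset(Dataset):
    """Hypothetical stand-in for a custom dataset (not the library's class)."""

    def __init__(self, n_images: int = 8, shape: tuple[int, int] = (4, 4)):
        rng = np.random.default_rng(0)
        self.images = rng.normal(size=(n_images, *shape)).astype(np.float32)
        self.defocus = rng.uniform(1.0, 2.0, size=n_images).astype(np.float32)

    def __len__(self) -> int:
        return len(self.images)

    def __getitem__(self, index: int) -> dict[str, torch.Tensor]:
        # Return a flat dict of tensors rather than a custom pytree,
        # so the default collate function knows how to stack samples.
        return {
            "image": torch.from_numpy(self.images[index]),
            "defocus": torch.tensor(self.defocus[index]),
        }


loader = DataLoader(ToyParticleDataset(), batch_size=4)
batch = next(iter(loader))
print(batch["image"].shape, batch["defocus"].shape)
```

Anything richer than tensors, arrays, numbers, and the standard containers would presumably need a custom `collate_fn`.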

However, this also means thinking about how to load data without making unnecessary array copies. Taking the `RelionDataset` as an example, some questions are:

1. Should we explicitly convert between JAX arrays and torch tensors in `__getitem__`? This would give us the control to make sure the conversion is copy-free, on either GPU or CPU (see https://jax.readthedocs.io/en/latest/jax.dlpack.html).
2. Do we need an `is_cpu_array` boolean in the `RelionDataset` to force JAX arrays to be read on the CPU? We definitely do not want to move data to the GPU and back to the CPU unnecessarily.
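A rough sketch of what both points could look like together, assuming a recent JAX and PyTorch where `torch.from_dlpack` accepts any object implementing `__dlpack__`. The helper name `jax_to_torch` is made up for illustration, and the array is created under `jax.default_device` so it never touches the GPU.

```python
import jax
import jax.numpy as jnp
import torch


def jax_to_torch(array: jax.Array) -> torch.Tensor:
    # Hypothetical helper: hand the JAX buffer to torch through the
    # DLPack protocol. The tensor lives on the same device as the array.
    return torch.from_dlpack(array)


cpu = jax.devices("cpu")[0]

# Create the array directly on the CPU, even if a GPU is the default
# device, so there is no GPU round trip before conversion.
with jax.default_device(cpu):
    image = jnp.arange(12, dtype=jnp.float32).reshape(3, 4)

tensor = jax_to_torch(image)
print(tensor.device, tensor.dtype, tuple(tensor.shape))
```

An `is_cpu_array`-style flag could then simply choose whether loading happens inside the `jax.default_device(cpu)` context.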

mjo22 · 2024-08-09T16:02:39Z

Progress on point 2 is in #257. We should test that the conversion actually works out to be copy-free. It may also be possible to go from a JAX array on the GPU to a torch tensor on the GPU in a copy-free way using this.
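One possible way to check this, sketched under the assumption that comparing raw buffer addresses is a good enough test: `jax.Array.unsafe_buffer_pointer()` and `torch.Tensor.data_ptr()` both return the address of the underlying memory. The function name `is_copy_free` is hypothetical.

```python
import jax
import jax.numpy as jnp
import torch


def is_copy_free(array: jax.Array, tensor: torch.Tensor) -> bool:
    # Equal addresses mean both objects view the same memory,
    # so the conversion did not allocate a new buffer.
    return array.unsafe_buffer_pointer() == tensor.data_ptr()


with jax.default_device(jax.devices("cpu")[0]):
    array = jnp.ones((16, 16), dtype=jnp.float32)

tensor = torch.from_dlpack(array)
copy_free = is_copy_free(array, tensor)
print("copy-free:", copy_free)
```

The same comparison should in principle carry over to GPU arrays, provided JAX and torch are using the same device.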

mjo22 assigned mjo22 and DSilva27 Aug 8, 2024

mjo22 added the documentation label Sep 11, 2024
