Dataloader is the bottleneck in training? #9

mengxuyiGit · 2024-08-27T23:48:37Z

Thanks for the great work and code release!

However, I noticed a significant difference in training speed between loading the original uncompressed GObjaverse data and reading from the processed h5 data: loading from h5 is about 6 times slower than loading the uncompressed data. GPU utilization is also very low.

Is this normal?

apchenstu · 2024-08-28T05:44:54Z

Thank you for the kind words! How slow is it? It is supposed to be faster than the uncompressed version since it doesn't need to be decompressed, and the dataset is stored in a hierarchical structure. The GPU utilization is around 90%-100% on my side.
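
To put a number on "how slow", one option is to time the data loader alone, with no model forward or backward pass, once for the h5 dataset and once for the uncompressed one. The following is a minimal sketch, not code from this repository: `time_batches` is a hypothetical helper, and it assumes only that the loader is an ordinary Python iterable (a PyTorch `DataLoader` would qualify).

```python
import time
from typing import Iterable, Tuple


def time_batches(loader: Iterable, max_batches: int = 50) -> Tuple[int, float]:
    """Iterate over `loader` without touching the model.

    Returns (batches_seen, mean_seconds_per_batch).
    `time_batches` is a hypothetical helper for illustration only.
    """
    count = 0
    start = time.perf_counter()
    for _ in loader:
        count += 1
        if count >= max_batches:
            break
    elapsed = time.perf_counter() - start
    # Avoid dividing by zero when the loader yields nothing.
    mean = elapsed / count if count else 0.0
    return count, mean


if __name__ == "__main__":
    # Stand-in for a real loader; replace with the h5-backed and the
    # uncompressed loaders to compare the two.
    fake_loader = (list(range(1000)) for _ in range(100))
    n, sec = time_batches(fake_loader, max_batches=20)
    print(f"{n} batches, {sec * 1000:.3f} ms per batch")
```

If the mean time per batch measured this way is close to the full training step time, the loader is likely the bottleneck; if it is much smaller, the slowdown is probably elsewhere.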
