Slow Dataset Preprocessing due to CPU affinity (?) issues #118
Comments
Hi! Thanks for your contribution, great first issue!
That's odd indeed. I recommend using Lightning Studio to prepare your dataset.
I can paper over the issue by setting the CPU affinity manually (fish shell incoming).
I'd rather not, thanks.
Hey @mgolub2, PyTorch Geometric supports CPU affinity mapping for their dataloader: https://github.com/pyg-team/pytorch_geometric/blob/e9648df16dcb6dde0e09b5736b1b2da5d68db2ad/docs/source/advanced/cpu_affinity.rst#L80. If you are interested, you could take inspiration from it and contribute native CPU affinity support to litdata, so you don't need to hack around it.
🐛 Bug
I'm attempting to train a model using litgpt and the openwebtext dataset. I launch the run as normal, following their examples, and the dataset preprocessing starts.
However, the workers are all using a single core!
Checking the worker PIDs, the CPU affinity appears to be set incorrectly (?).
I don't know why that would be, though. This is fairly new behavior; running the prepare_data portion of openwebtext was quite fast a few weeks ago.
To Reproduce
litgpt pretrain --config config_hub/pretrain/tinyllama
Expected behavior
Data preparation should use multiple cores
Environment
How installed (conda, pip, source): source
Additional context
Possibly this is some weird interplay between threading, affinity, and the cxx11 ABI? I will test in a more normal configuration soon.