Hi,
I'm trying to train gpt2-xl, but I keep getting OOM errors, even with batch size set to 1, gradient accumulation set to 8/16/512, contiguous_gradients set to false, and allgather_bucket_size / reduce_bucket_size set to 2e2.
I can see in nvidia-smi that I'm only reaching about half of the GPU's memory capacity - around 12 GB.
My system:
- RTX 3090 with 24 GB VRAM
- 80 GB RAM
- 5600X CPU, if that matters
- WSL2 on Windows 10
Thanks.
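For reference, the settings above correspond to a DeepSpeed config along these lines. The bucket sizes, gradient accumulation, and contiguous_gradients values are the ones from my attempts; the ZeRO stage and fp16 fields are assumptions for illustration, since I haven't listed the full config here:

```json
{
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 16,
  "fp16": {
    "enabled": true
  },
  "zero_optimization": {
    "stage": 2,
    "contiguous_gradients": false,
    "allgather_bucket_size": 2e2,
    "reduce_bucket_size": 2e2
  }
}
```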