Environment Details
Please indicate the following details about the environment in which you found the bug:
SDV version: 1.0.0
Python version: 3.9
Operating System: Linux
Error Description
I am getting an out-of-memory error even with a very small dataset (a single table of 50k records), both on a very large AWS EC2 instance (r5a.12xlarge) and on different GPU machines. I am not able to use CUDA here.
File ~/miniconda3/envs/sdv_0172/lib/python3.9/site-packages/torch/_tensor.py:396, in Tensor.backward(self, gradient, retain_graph, create_graph, inputs)
    387 if has_torch_function_unary(self):
    388     return handle_torch_function(
    389         Tensor.backward,
    390         (self,),
   (...)
    394         create_graph=create_graph,
    395         inputs=inputs)
--> 396 torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)

File ~/miniconda3/envs/sdv_0172/lib/python3.9/site-packages/torch/autograd/__init__.py:173, in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    168     retain_graph = create_graph
    170 # The reason we repeat same the comment below is that
    171 # some Python versions print out the first line of a multi-line function
    172 # calls in the traceback and some print out the last line
--> 173 Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
    174     tensors, grad_tensors, retain_graph, create_graph, inputs,
    175     allow_unreachable=True, accumulate_grad=True)

RuntimeError: [enforce fail at alloc_cpu.cpp:66] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 3530813440 bytes. Error code 12 (Cannot allocate memory)
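For context, a minimal sketch of the kind of single-table fit that can hit this allocator error, along with the knobs that usually shrink the per-step training tensors. This assumes SDV 1.x with a CTGANSynthesizer and a pandas DataFrame named data; the file name, smaller batch_size, and layer widths are illustrative values, not a confirmed fix for this report:

import pandas as pd
from sdv.metadata import SingleTableMetadata
from sdv.single_table import CTGANSynthesizer

# Hypothetical input: the ~50k-row single table (file name is a placeholder).
data = pd.read_csv('table.csv')

# Detect single-table metadata from the DataFrame (SDV 1.x workflow).
metadata = SingleTableMetadata()
metadata.detect_from_dataframe(data=data)

# A smaller batch size and narrower generator/discriminator layers reduce the
# size of the tensors allocated during the backward pass; cuda=False forces
# CPU-only training.
synthesizer = CTGANSynthesizer(
    metadata,
    epochs=100,
    batch_size=100,               # CTGAN expects a multiple of pac (default 10)
    generator_dim=(128, 128),
    discriminator_dim=(128, 128),
    cuda=False,
)
synthesizer.fit(data)
synthetic = synthesizer.sample(num_rows=1000)

Lowering batch_size and the layer widths reduces the per-step tensor sizes; whether that avoids this particular ~3.5 GB allocation depends on what is driving the memory use, which is why the shape of the table matters (see the question below).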
Hi @harsh-sengar, thanks for filing this issue. You mention there are 50K records in your dataset. Can you tell me how many columns it has? This will help us debug.
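In case it helps, a quick way to pull those numbers out of the training DataFrame: the column count plus the number of distinct values in each non-numeric column, since high-cardinality categorical columns tend to dominate CTGAN's memory use once they are one-hot encoded. The DataFrame name and file name below are assumptions about the reporter's setup:

import pandas as pd

# Hypothetical: the same DataFrame that was passed to the synthesizer.
data = pd.read_csv('table.csv')

print(f'rows: {len(data)}, columns: {data.shape[1]}')

# Distinct values per non-numeric column, largest first.
categorical = data.select_dtypes(exclude='number')
print(categorical.nunique().sort_values(ascending=False))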
I'll close this issue off since it's been a few weeks. Please feel free to reply if there is more to discuss and I can reopen the issue for investigation.