
OOM issue #29

Open
JunhyeongDoyle opened this issue Jun 22, 2023 · 3 comments

Comments

@JunhyeongDoyle

Hi! Thanks for sharing this awesome work.

I'm trying to train a model with the DyNeRF data, but I keep encountering an OOM issue.

(I adjusted down_sample and num_steps in the config for the initial DyNeRF training.)

'save_outputs': True,
 'scene_bbox': [[-3.0, -1.8, -1.2], [3.0, 1.8, 1.2]],
 'scheduler_type': 'warmup_cosine',
 'single_jitter': False,
 'time_smoothness_weight': 0.001,
 'time_smoothness_weight_proposal_net': 1e-05,
 'train_fp16': True,
 'use_proposal_weight_anneal': True,
 'use_same_proposal_network': False,
 'valid_every': 30000}
2023-06-23 04:43:45,251|    INFO| Loading Video360Dataset with downsample=4.0
Loading train data: 100%|███████████████████████████████████████████████████████████████| 19/19 [00:41<00:00,  2.20s/it]
2023-06-23 04:44:53,937|    INFO| Computed 1953572400 ISG weights in 24.48s.
killed

When I checked with dmesg, the following error appeared:

[2916867.742639] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=user.slice,mems_allowed=0-1,global_oom,task_memcg=/user.slice/user-1004.slice/session-1589.scope,task=python,pid=3555848,uid=1004
[2916867.742721] Out of memory: Killed process 3555848 (python) total-vm:167336276kB, anon-rss:123357620kB, file-rss:4kB, shmem-rss:8kB, UID:1004 pgtables:243752kB oom_score_adj:0
[2916871.967326] oom_reaper: reaped process 3555848 (python), now anon-rss:0kB, file-rss:0kB, shmem-rss:8kB
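
For a rough sense of scale (my own back-of-the-envelope estimate, not something printed by the code): the 1953572400 ISG weights reported in the log are only about 7 GiB at float32, while dmesg shows the process resident at well over 100 GiB when it was killed, so most of the memory presumably goes to the preprocessing step holding the decoded video frames in RAM rather than to the weight tensor itself.

# Hypothetical sanity check of the numbers reported above (Python)
num_weights = 1_953_572_400                 # from the "Computed ... ISG weights" log line
weights_gib = num_weights * 4 / 2**30       # assuming float32 (4 bytes per weight)
anon_rss_gib = 123_357_620 / 2**20          # anon-rss from dmesg is reported in kB
print(f"ISG weight tensor: ~{weights_gib:.1f} GiB")     # ~7.3 GiB
print(f"Process RSS at kill: ~{anon_rss_gib:.1f} GiB")  # ~117.6 GiB
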
@sarafridov
Owner

The preprocessing step for dynerf is pretty CPU-memory-intensive; I remember getting similar issues if I tried to run too many of these in parallel or without downsampling. I uploaded my precomputed sampling weights for some of the scenes (the .pt files here for salmon and in flamesteak_explicit and searsteak_explicit), so you can try downloading those weights into your data folders and then running the actual training step to see if it works. Note that the salmon scene is a bit more memory-intensive than the others.
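
If it helps, here is a minimal sketch of what I mean, assuming a scene folder like data/dynerf/flame_steak and a weight file named isg_weights.pt (both are placeholders; check the dataset loader for the exact paths and filenames it expects). The idea is to drop the downloaded .pt file next to the scene's videos and confirm it loads, so the loader can pick it up and the expensive ISG computation is skipped at training time.

import torch

scene_dir = "data/dynerf/flame_steak"          # hypothetical path; use your own data folder
weights_path = f"{scene_dir}/isg_weights.pt"   # hypothetical filename; match what the loader expects

# Load on CPU just to confirm the downloaded file is intact before launching training.
weights = torch.load(weights_path, map_location="cpu")
print(type(weights), getattr(weights, "shape", None))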

@JunhyeongDoyle
Author

@sarafridov Thanks for sharing :)

@JokerYan

Thank you very much for sharing. It is very helpful.
