Illegal memory access while using custom images #26

Open
adithya-Avataar opened this issue Mar 28, 2023 · 8 comments

Comments

@adithya-Avataar

adithya-Avataar commented Mar 28, 2023

Hi,

I am trying to use DPVO to estimate poses for my object. I have a continuous sequence of images surrounding the object from all directions; the directory contains about 115 images in all. When I run demo.py on these images, I get the following error:

  File "/DPVO/demo.py", line 92, in <module>
    pred_traj = run(cfg, args.network, args.imagedir, args.calib, args.stride, args.skip, args.viz, args.timeit, args.save_reconstruction)
  File "/root/miniconda3/envs/dpvo/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/DPVO/demo.py", line 51, in run
    slam(t, image, intrinsics)
  File "/DPVO/dpvo/dpvo.py", line 394, in __call__
    self.update()
  File "/DPVO/dpvo/dpvo.py", line 278, in update
    self.network.update(self.net, ctx, corr, None, self.ii, self.jj, self.kk)
  File "/root/miniconda3/envs/dpvo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/DPVO/dpvo/net.py", line 80, in forward
    ix, jx = fastba.neighbors(kk, jj)
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1

When I run the code as
CUDA_LAUNCH_BLOCKING=1 python demo.py --save_reconstruction --save_trajectory --imagedir=images3_jpg/ --calib=custom_calib.txt --stride=1
it runs without any errors, but the saved trajectory file contains NaN pose values beyond index 15.
I have observed the same with multiple other custom image directories as well.
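
To pin down where the saved poses turn into NaN, a small check along these lines may help. It is only a minimal sketch: it assumes the trajectory file is plain text with one pose per line of whitespace-separated numbers (for example TUM-style `timestamp tx ty tz qx qy qz qw`), and the helper name `first_nan_row` is hypothetical, not part of DPVO.

```python
import math


def first_nan_row(lines):
    """Return the index of the first pose line that contains a NaN, or None.

    `lines` is any iterable of strings, e.g. an open text file.
    Assumed layout: whitespace-separated numbers, one pose per line.
    """
    for index, line in enumerate(lines):
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if any(math.isnan(float(value)) for value in fields):
            return index
    return None
```

Passing an open file object (for example `first_nan_row(open(path))`, with `path` pointing at the saved trajectory) reports the first affected line, which can then be matched against the frame where tracking breaks down.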

@lahavlipson
Collaborator

Are the images sequential, i.e. from a video?

@adithya-Avataar
Author

adithya-Avataar commented Mar 29, 2023

Hi, thank you very much for your quick response. Yes, they are sequential. I checked the order by printing the filenames in the image stream. (They are a directory of images, and they are in order after sorting.)

When I resize the images to 2K (1920, 1080), the code runs without errors, but the generated trajectory seems to be very wrong.
Some more info:

Image resolution: 4K (4032, 3024)
Hardware:
GPU: NVIDIA A10G
Architecture: Ampere
Compute capability: 8.6

I have also tried running the Docker version, but I see the same error.
Could you please tell me if there is anything I could try?

@lahavlipson
Collaborator

You could try adjusting the stride, lowering the image resolution further, increasing the patch lifetime or optimization window, though these decisions often depend on the degree of camera motion.

Regarding the memory access error, if you're able to share the images I can investigate the cause (assuming you're permitted to do so).

@adithya-Avataar
Author

adithya-Avataar commented Mar 29, 2023

Hi,
I have tried doubling the optimization window, the number of patches, and the patch lifetime. I get an output, but it is still not as expected. I have also tried reducing the image size further, down to 1024*768, but the output is still not on par with DROID-SLAM.

I have attached the images I have been trying here: images
I have attached the calibration file as well: calib

Again, thank you very much.

@lahavlipson
Collaborator

The numerical issue disappears after disabling mixed precision. I set the stride=1 and shrunk the image resolution and intrinsics by 50%.
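
For anyone repeating the 50% downscale: the calibration has to shrink together with the images. The following is a minimal sketch, assuming a pinhole calibration line of the form `fx fy cx cy` optionally followed by distortion coefficients; the helper names and the numbers in the usage note are hypothetical and not taken from this thread's calibration file.

```python
def scale_intrinsics(fx, fy, cx, cy, factor):
    """Scale pinhole intrinsics to match an image resized by `factor`.

    Focal lengths and the principal point are measured in pixels, so they
    scale with the image. This simple form ignores the half-pixel offset
    that some pixel-center conventions would add to cx and cy.
    """
    return fx * factor, fy * factor, cx * factor, cy * factor


def scale_calib_line(line, factor):
    """Scale the first four fields (fx fy cx cy) of a calibration line.

    Any trailing fields are assumed to be distortion coefficients defined
    in normalized coordinates, so they are passed through unchanged.
    """
    fields = line.split()
    fx, fy, cx, cy = (float(value) for value in fields[:4])
    scaled = scale_intrinsics(fx, fy, cx, cy, factor)
    return " ".join([str(value) for value in scaled] + fields[4:])
```

With made-up values, `scale_calib_line("3000 3000 2016 1512 0.1 -0.2", 0.5)` gives `"1500.0 1500.0 1008.0 756.0 0.1 -0.2"`. The same factor must be applied to both image axes; a resize that changes the aspect ratio would need separate factors for x and y.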

DPVO:

image

DROID:

image

@adithya-Avataar
Author

Thank you very much. I will try these settings.

@adithya-Avataar
Author

Hi @lahavlipson, I tried the settings you mentioned, and they work. However, my output (the predicted poses) varies a lot between runs on the same set of images with the same hyperparameters, and only one of the runs gives the expected output. Is this an implementation issue on my end, or is this expected behaviour? I have attached the output of a few runs on the same object that I shared before. Please let me know if I am doing something wrong.
Screenshot from 2023-04-11 12-36-15
Screenshot from 2023-04-11 12-37-32
Screenshot from 2023-04-11 12-38-18
Screenshot from 2023-04-11 12-39-04
Screenshot from 2023-04-11 12-39-15
Screenshot from 2023-04-11 12-40-15
Screenshot from 2023-04-11 12-42-01
Screenshot from 2023-04-11 12-42-42
Screenshot from 2023-04-11 12-43-21
Screenshot from 2023-04-11 12-44-09
Screenshot from 2023-04-11 12-44-14
Screenshot from 2023-04-11 12-44-54

@lahavlipson
Collaborator

DPVO selects patch centroids randomly, so variance in the output like what you've shown is possible. The chosen scale of the scene is most likely related to the randomly initialized depth.

For more predictable behavior, you can increase the number of patches tracked per frame.
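
As an illustration of why repeated runs can differ, here is a toy sketch of random centroid sampling. It is not DPVO's actual implementation: the function name, the one-pixel border, and the use of Python's `random` module are all assumptions made for the example. It only shows that an unseeded sampler picks different patch locations each run, while a fixed seed makes the selection repeatable.

```python
import random


def sample_patch_centroids(width, height, num_patches, seed=None):
    """Draw `num_patches` random (x, y) centroids inside a 1-pixel border.

    With seed=None the generator is seeded from system entropy, so each
    call returns a different set. A fixed integer seed gives the same set
    every time.
    """
    rng = random.Random(seed)
    return [
        (rng.randint(1, width - 2), rng.randint(1, height - 2))
        for _ in range(num_patches)
    ]
```

Whether fixing a seed makes a full DPVO run deterministic is not confirmed in this thread, since other parts of the pipeline may also be non-deterministic. Tracking more patches per frame, as suggested above, reduces the influence of any single random choice.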
