Segmentation fault when run demo and some fix #9

dicarne opened this issue Sep 8, 2022 · 6 comments

dicarne commented Sep 8, 2022

OS: Ubuntu 20.04 LTS
gcc and g++: 10.3.0
Conda environment: same as environment.yml
cuDNN: 8.3.2
Driver Version: 510.47.03
CUDA Version: 11.6

After installing the dependencies and downloading the model as described in the README, I ran the README
demo command and it failed with a segmentation fault, without any other error message.

$ CUDA_LAUNCH_BLOCKING=1 python demo.py --imagedir=movies/IMG_0494.MOV --calib=calib/iphone.txt --stride=5 --viz
Running with config...
BUFFER_SIZE: 2048
GRADIENT_BIAS: False
KEYFRAME_INDEX: 4
KEYFRAME_THRESH: 15.0
MIXED_PRECISION: True
MOTION_DAMPING: 0.5
MOTION_MODEL: DAMPED_LINEAR
OPTIMIZATION_WINDOW: 10
PATCHES_PER_FRAME: 96
PATCH_LIFETIME: 13
REMOVAL_WINDOW: 22
[1]    310720 segmentation fault (core dumped)  CUDA_LAUNCH_BLOCKING=1 python demo.py --imagedir=movies/IMG_0494.MOV   --viz

I found a way to fix it, in DPViewer/dpviewer/viewer.cpp:
[image]
and
[image]

After these changes the demo runs without error, but only sometimes.
I then found that the following change helps: before it, a segmentation fault still occurred with some probability; after it, there were no more segmentation faults. Presumably this is due to pointer initialization.
[image]

Since I commented out the code that calculates the transformMatrix, the camera is no longer updated.
I also found that whenever I called that calculation inside this thread's loop, it always crashed. Therefore, I created a member function specifically for calculating and updating the transformMatrix from the main thread.

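  1. Add a new function that updates the transformMatrix (don't forget to write the function definition and the mutex definition):
    ![image](https://user-images.githubusercontent.com/10357789/189071806-af14827f-b38c-4409-9c3a-082e4d853f3c.png)

  2. Lock the mutex when the render thread draws the points and poses:
    ![image](https://user-images.githubusercontent.com/10357789/189072063-6d1adc70-02a4-427c-869d-69c3dec6bec8.png)

  3. Call the new function from Python:
    ![image](https://user-images.githubusercontent.com/10357789/189072252-d1945b7c-81d8-4f13-ba55-7592bc0c03bb.png)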
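
The three steps above amount to a common pattern: one thread computes the new transforms and publishes them under a mutex, and the render thread holds the same mutex while it reads them. The actual fix is in C++ in viewer.cpp (see the linked commits); the Python sketch below only illustrates the locking pattern. `SharedPoses`, `update_transforms`, `snapshot`, and `translation_matrix` are hypothetical names and are not part of DPViewer.

```python
import threading


def translation_matrix(x, y, z):
    """Build a 4x4 homogeneous transform for a pure translation."""
    return [
        [1.0, 0.0, 0.0, x],
        [0.0, 1.0, 0.0, y],
        [0.0, 0.0, 1.0, z],
        [0.0, 0.0, 0.0, 1.0],
    ]


class SharedPoses:
    """Hypothetical stand-in for viewer state shared by two threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._transforms = []

    def update_transforms(self, positions):
        """Main thread: compute outside the lock, publish under it."""
        fresh = [translation_matrix(*p) for p in positions]
        with self._lock:
            self._transforms = fresh

    def snapshot(self):
        """Render thread: take a consistent copy under the lock."""
        with self._lock:
            return list(self._transforms)


if __name__ == "__main__":
    shared = SharedPoses()
    stop = threading.Event()

    def render_loop():
        # Stand-in for the draw loop; it only ever sees complete updates.
        while not stop.is_set():
            for matrix in shared.snapshot():
                assert len(matrix) == 4

    renderer = threading.Thread(target=render_loop)
    renderer.start()
    for step in range(1000):
        shared.update_transforms([(float(step), 0.0, 0.0)] * 3)
    stop.set()
    renderer.join()
    print(len(shared.snapshot()), "poses published")
```

Computing the matrices before taking the lock keeps the critical section short, so the render thread is blocked only for the duration of a reference swap.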

Now the visualization view renders the camera pose normally.

However, although point cloud data was being used, my visualization view showed only the camera poses and the video, not any points.

I'm not very familiar with GL programming, so I simply replaced the point-rendering function.
[image]

Finally, I can now run the visualization interface properly on my computer and update the poses and point cloud dynamically!

Here are all the changes: dicarne@5f17684 dicarne@677b056


Some unanswered questions: why, even with the same environment configuration the author describes, was I unable to run this project's code until I modified it? I tried several computers and operating systems, and also Docker, but all attempts failed. I'm curious what environment configuration allows it to run directly.

@lahavlipson
Collaborator

We just created an official docker for DPVO with instructions on how to set it up: https://github.com/princeton-vl/DPVO_Docker

Hopefully this will be much easier; if you run into problems with the docker please follow up and I will try my best to help.

oscarfossey commented Oct 20, 2022

Hello, thanks for the great repo! I have the same issue when using the Dockerfile, although I'm not sure it has the same cause, because the output mentions XDG_RUNTIME_DIR.

(dpvo) root@laptop:/DPVO# python demo.py --imagedir=movies/IMG_0494.MOV --calib=calib/iphone.txt --stride=5 --viz 
Running with config...
BUFFER_SIZE: 2048
GRADIENT_BIAS: False
KEYFRAME_INDEX: 4
KEYFRAME_THRESH: 15.0
MIXED_PRECISION: True
MOTION_DAMPING: 0.5
MOTION_MODEL: DAMPED_LINEAR
OPTIMIZATION_WINDOW: 10
PATCHES_PER_FRAME: 96
PATCH_LIFETIME: 13
REMOVAL_WINDOW: 22
error: XDG_RUNTIME_DIR not set in the environment.
Segmentation fault (core dumped)
(dpvo) root@laptop:/DPVO# 

EDIT: I changed the repo link in the Dockerfile to the one from @dicarne and it worked.

https://github.com/dicarne/DPVO

Thank you very much both of you!
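
For readers who hit only the `XDG_RUNTIME_DIR not set` message: the XDG Base Directory specification describes this variable as a per-user runtime directory that should be accessible only by its owner. The sketch below sets it for the current process if it is missing. This addresses only that message; as noted above, it may be unrelated to the segmentation fault, which in this thread went away after switching to the fork. `ensure_xdg_runtime_dir` is a hypothetical helper, not part of DPVO.

```python
import os
import tempfile


def ensure_xdg_runtime_dir(env=os.environ):
    """Set XDG_RUNTIME_DIR to a private temp directory if it is unset or empty."""
    current = env.get("XDG_RUNTIME_DIR")
    if current:
        return current
    # mkdtemp creates a directory readable and writable only by the creator.
    path = tempfile.mkdtemp(prefix="xdg-runtime-")
    env["XDG_RUNTIME_DIR"] = path
    return path


if __name__ == "__main__":
    # Would have to run before the viewer is created in the same process.
    print("XDG_RUNTIME_DIR =", ensure_xdg_runtime_dir())
```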

@Launch-on-Titania

Hi, I'm facing the same issue with the author's Docker environment. Have you solved it?

@caoxudong0513

Can you install DPVO in your environment? What can I do?
My CUDA version is 11.6. Conda environment:

name: dpvo
channels:
  - pyg
  - pytorch
  - nvidia
  - conda-forge
  - defaults
dependencies:
  - pip
  - python=3.8
  - pytorch=2.3.1
  - pytorch-scatter=2.1.2
  - pytorch-cuda=12.1
  - torchvision=0.18
  - pip:
    - tensorboard
    - numba
    - tqdm
    - einops
    - pypose
    - kornia
    - numpy==1.23.5
    - plyfile
    - evo
    - opencv-python
    - yacs

`(dpvo) root@ubuntu:/home/code/DPVO# pip install .
Processing /home/code/DPVO
Preparing metadata (setup.py) ... done
Building wheels for collected packages: dpvo
Building wheel for dpvo (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [84 lines of output]
running bdist_wheel
running build
running build_py
creating build/lib.linux-x86_64-cpython-38/dpvo
copying dpvo/dpvo.py -> build/lib.linux-x86_64-cpython-38/dpvo
copying dpvo/logger.py -> build/lib.linux-x86_64-cpython-38/dpvo
copying dpvo/patchgraph.py -> build/lib.linux-x86_64-cpython-38/dpvo
copying dpvo/plot_utils.py -> build/lib.linux-x86_64-cpython-38/dpvo
copying dpvo/blocks.py -> build/lib.linux-x86_64-cpython-38/dpvo
copying dpvo/config.py -> build/lib.linux-x86_64-cpython-38/dpvo
copying dpvo/extractor.py -> build/lib.linux-x86_64-cpython-38/dpvo
copying dpvo/ba.py -> build/lib.linux-x86_64-cpython-38/dpvo
copying dpvo/__init__.py -> build/lib.linux-x86_64-cpython-38/dpvo
copying dpvo/stream.py -> build/lib.linux-x86_64-cpython-38/dpvo
copying dpvo/projective_ops.py -> build/lib.linux-x86_64-cpython-38/dpvo
copying dpvo/net.py -> build/lib.linux-x86_64-cpython-38/dpvo
copying dpvo/utils.py -> build/lib.linux-x86_64-cpython-38/dpvo
creating build/lib.linux-x86_64-cpython-38/dpvo/fastba
copying dpvo/fastba/ba.py -> build/lib.linux-x86_64-cpython-38/dpvo/fastba
copying dpvo/fastba/__init__.py -> build/lib.linux-x86_64-cpython-38/dpvo/fastba
creating build/lib.linux-x86_64-cpython-38/dpvo/lietorch
copying dpvo/lietorch/groups.py -> build/lib.linux-x86_64-cpython-38/dpvo/lietorch
copying dpvo/lietorch/group_ops.py -> build/lib.linux-x86_64-cpython-38/dpvo/lietorch
copying dpvo/lietorch/run_tests.py -> build/lib.linux-x86_64-cpython-38/dpvo/lietorch
copying dpvo/lietorch/gradcheck.py -> build/lib.linux-x86_64-cpython-38/dpvo/lietorch
copying dpvo/lietorch/broadcasting.py -> build/lib.linux-x86_64-cpython-38/dpvo/lietorch
copying dpvo/lietorch/__init__.py -> build/lib.linux-x86_64-cpython-38/dpvo/lietorch
creating build/lib.linux-x86_64-cpython-38/dpvo/data_readers
copying dpvo/data_readers/rgbd_utils.py -> build/lib.linux-x86_64-cpython-38/dpvo/data_readers
copying dpvo/data_readers/factory.py -> build/lib.linux-x86_64-cpython-38/dpvo/data_readers
copying dpvo/data_readers/augmentation.py -> build/lib.linux-x86_64-cpython-38/dpvo/data_readers
copying dpvo/data_readers/frame_utils.py -> build/lib.linux-x86_64-cpython-38/dpvo/data_readers
copying dpvo/data_readers/__init__.py -> build/lib.linux-x86_64-cpython-38/dpvo/data_readers
copying dpvo/data_readers/base.py -> build/lib.linux-x86_64-cpython-38/dpvo/data_readers
copying dpvo/data_readers/tartan.py -> build/lib.linux-x86_64-cpython-38/dpvo/data_readers
creating build/lib.linux-x86_64-cpython-38/dpvo/altcorr
copying dpvo/altcorr/correlation.py -> build/lib.linux-x86_64-cpython-38/dpvo/altcorr
copying dpvo/altcorr/__init__.py -> build/lib.linux-x86_64-cpython-38/dpvo/altcorr
running build_ext
Traceback (most recent call last):
File "", line 2, in
File "", line 34, in
File "/home/code/DPVO/setup.py", line 9, in
setup(
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/setuptools/__init__.py", line 117, in setup
return distutils.core.setup(**attrs)
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 183, in setup
return run_commands(dist)
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 199, in run_commands
dist.run_commands()
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 954, in run_commands
self.run_command(cmd)
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/setuptools/dist.py", line 999, in run_command
super().run_command(command)
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
cmd_obj.run()
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/setuptools/command/bdist_wheel.py", line 410, in run
self.run_command("build")
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
self.distribution.run_command(command)
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/setuptools/dist.py", line 999, in run_command
super().run_command(command)
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
cmd_obj.run()
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
self.distribution.run_command(command)
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/setuptools/dist.py", line 999, in run_command
super().run_command(command)
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
cmd_obj.run()
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 98, in run
_build_ext.run(self)
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
self.build_extensions()
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 522, in build_extensions
_check_cuda_version(compiler_name, compiler_version)
File "/root/miniconda3/envs/dpvo/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 417, in _check_cuda_version
raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
RuntimeError:
The detected CUDA version (11.6) mismatches the version that was used to compile
PyTorch (12.1). Please make sure to use the same CUDA versions.

  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for dpvo
Running setup.py clean for dpvo
Failed to build dpvo
`
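
The `RuntimeError` at the end of the log says the CUDA toolkit found at build time (11.6) does not match the CUDA version PyTorch was built with (12.1, from `pytorch-cuda=12.1` in the environment above). A minimal sketch for checking the two versions before running `pip install .` follows. It is a standalone check, not PyTorch's own logic; `parse_nvcc_release` and `compare_cuda` are hypothetical helpers, and it looks for `nvcc` on `PATH`, which may differ from the toolkit the build actually picks up.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract 'major.minor' from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def compare_cuda(toolkit_version, torch_cuda_version):
    """Classify two 'major.minor' strings as a match or a mismatch."""
    tk_major, tk_minor = toolkit_version.split(".")[:2]
    pt_major, pt_minor = torch_cuda_version.split(".")[:2]
    if tk_major != pt_major:
        return "major mismatch"
    if tk_minor != pt_minor:
        return "minor mismatch"
    return "match"


if __name__ == "__main__":
    nvcc = shutil.which("nvcc")
    toolkit = None
    if nvcc is not None:
        result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
        toolkit = parse_nvcc_release(result.stdout)
    try:
        import torch
        torch_cuda = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        torch_cuda = None
    print("toolkit CUDA:", toolkit)
    print("PyTorch CUDA:", torch_cuda)
    if toolkit and torch_cuda:
        print("result:", compare_cuda(toolkit, torch_cuda))
```

If the result is a major mismatch, as in the log above, the usual remedies are to install a CUDA toolkit with the same major version as PyTorch or to install a PyTorch build that targets the toolkit already present.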

SakuraLiHe commented Dec 27, 2024

@dicarne I have changed the file, how can I rebuild it?
Traceback (most recent call last):
File "/DPVO/demo.py", line 87, in <module>
(poses, tstamps), (points, colors, calib) = run(cfg, args.network, args.imagedir, args.calib, args.stride, args.skip, args.viz, args.timeit)
File "/root/miniconda3/envs/dpvo/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/DPVO/demo.py", line 50, in run
slam(t, image, intrinsics)
File "/DPVO/dpvo/dpvo.py", line 388, in __call__
self.viewer.loop()
AttributeError: 'dpviewerx.Viewer' object has no attribute 'loop'

dicarne commented Dec 30, 2024

It's been a while, so I don't remember clearly, but I believe I ran pip install . in the viewer directory (DPViewer).
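
An `AttributeError` like the one quoted above usually means Python is still loading an older build of the compiled extension. A minimal sketch for checking which file is loaded and whether the expected method exists is shown below. The module and class names `dpviewerx` and `Viewer` and the method name `loop` are taken from the traceback; `inspect_extension` is a hypothetical helper.

```python
import importlib


def inspect_extension(module_name, class_name, required):
    """Return (module_path, missing_names) for a class in an importable module."""
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    missing = [name for name in required if not hasattr(cls, name)]
    return getattr(module, "__file__", None), missing


if __name__ == "__main__":
    try:
        path, missing = inspect_extension("dpviewerx", "Viewer", ["loop"])
    except ImportError as exc:
        print("dpviewerx is not importable:", exc)
    else:
        print("loaded from:", path)
        print("missing methods:", missing or "none")
```

If `loop` is still reported missing after the rebuild, the printed path shows which installed copy is being imported.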
