Corrupted dataset file: [depth_undistorted.tar.gz], unable to extract #109

Open
YuzheHao opened this issue Oct 2, 2024 · 0 comments

YuzheHao commented Oct 2, 2024

I am trying to fine-tune the LightGlue model on the MegaDepth dataset.
However, the dataset downloaded from the ETH website seems to have missing and corrupted files:

  • In scene_info.zip ---> the npz files with the following IDs cannot be found:

    • 0012
    • 0070
    • 0036
    • 0407
  • In depth_undistorted.tar.gz ---> the archive cannot be fully extracted:

    • Only the files in one folder, 0065, can be extracted
    • Only 500 MB of h5 files can be extracted from the 148 GB compressed tar.gz file
    • Extraction stops after 0065:

    Two-step extraction:

    # (1) Extract with gzip first, then untar
    $ gunzip -kv depth_undistorted.tar.gz
    (after running for a while)
    gzip: depth_undistorted_backup.tar.gz: invalid compressed data--format violated

    Direct one-step extraction:

    # (2) Extract directly with tar
    $ tar -xvzf depth_undistorted.tar.gz
    depth_undistorted/
    depth_undistorted/0065/
    depth_undistorted/0065/169836871_03dcb437c5_o.h5
    .....
    depth_undistorted/0065/2700367171_f26cbce468_o.h5
    tar: Skipping to next header
    tar: Exiting with failure status due to previous errors
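
To narrow down where the archive breaks, one option is to stream the gzip data through Python's `zlib` module and count how much decompresses cleanly before an error appears. The following is only a minimal sketch: the helper name `probe_gzip` and the chunk size are illustrative choices, not part of the dataset tooling, and it only checks the first gzip member in the file.

```python
import zlib


def probe_gzip(path, chunk_size=1 << 20):
    """Stream-decompress a .gz file and report how far it gets.

    Returns (ok, compressed_bytes_read, decompressed_bytes, message).
    The decompressed data is only counted, never written to disk.
    """
    # wbits = 16 + MAX_WBITS makes zlib expect a gzip header and trailer.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    read = 0
    produced = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            read += len(chunk)
            try:
                produced += len(decomp.decompress(chunk))
            except zlib.error as exc:
                # The damage lies somewhere inside the chunk just read.
                return False, read, produced, f"zlib error: {exc}"
            if decomp.eof:
                # The first gzip member ended and its CRC/length check passed.
                return True, read, produced, "ok"
    return False, read, produced, "truncated: file ended before the gzip stream did"


if __name__ == "__main__":
    # File name taken from the report above; adjust the path as needed.
    print(probe_gzip("depth_undistorted.tar.gz"))
```

The reported compressed offset is approximate (accurate to one chunk). Comparing it between two downloads, or between two machines, may help tell whether the damage sits at the same position each time.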

I tried various extraction methods, and none of them worked.
I also re-downloaded the dataset files once to rule out a corrupted download on my side.
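
A quick way to compare the two downloads is to hash both copies: if the digests match, the corruption is more likely in the hosted file than in the transfer. This is a sketch only; the helper name `sha256_of_file` and the two file paths are hypothetical, and no reference checksum from the dataset site is assumed here.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Hypothetical paths for the first and second download.
    first = sha256_of_file("download_1/depth_undistorted.tar.gz")
    second = sha256_of_file("download_2/depth_undistorted.tar.gz")
    print("identical" if first == second else "different")
```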

About two months ago, there was a period when the ETH dataset download site was down and could not be opened. Maybe something went wrong during that period?

I would appreciate it very much if you could check the data or provide a way for me to deal with this problem!
Thank you very much!

@YuzheHao changed the title from "Corrupted file: [depth_undistorted.tar.gz], unable to extract" to "Corrupted dataset file: [depth_undistorted.tar.gz], unable to extract" on Oct 2, 2024