I ran the following command to download the Italian dataset from MuAViC:
python get_data.py --root-path ./esperanza/ --src-lang it
However, at some point during the run the script was interrupted. The full error trace is below:
Traceback (most recent call last):
  File "/home/dgimeno/phd/muavic/utils.py", line 62, in download_file
    wget.download(url, out=str(download_path / filename), bar=custom_bar)
  File "/home/dgimeno/anaconda3/envs/muavic/lib/python3.8/site-packages/wget.py", line 506, in download
    (fd, tmpfile) = tempfile.mkstemp(".tmp", prefix=prefix, dir=".")
  File "/home/dgimeno/anaconda3/envs/muavic/lib/python3.8/tempfile.py", line 331, in mkstemp
    return _mkstemp_inner(dir, prefix, suffix, flags, output_type)
  File "/home/dgimeno/anaconda3/envs/muavic/lib/python3.8/tempfile.py", line 250, in _mkstemp_inner
    fd = _os.open(file, flags, 0o600)
FileNotFoundError: [Errno 2] No such file or directory: './esperanza/metadata/it_metadata.tgz88g65ab3.tmp'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "get_data.py", line 115, in <module>
    main(args)
  File "get_data.py", line 84, in main
    prepare_mtedx(args)
  File "get_data.py", line 26, in prepare_mtedx
    preprocess_mtedx_video(
  File "/home/dgimeno/phd/muavic/mtedx_utils.py", line 220, in preprocess_mtedx_video
    video_metadata = load_video_metadata(
  File "/home/dgimeno/phd/muavic/utils.py", line 110, in load_video_metadata
    download_extract_file_if_not(
  File "/home/dgimeno/phd/muavic/utils.py", line 89, in download_extract_file_if_not
    download_file(url, download_path)
  File "/home/dgimeno/phd/muavic/utils.py", line 65, in download_file
    raise HTTPError(e.url, e.code, message, e.hdrs, e.fp)
AttributeError: 'FileNotFoundError' object has no attribute 'url'
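For context, the second exception in the trace comes from the error handler itself: it reads `e.url`, which exists on `HTTPError` but not on `FileNotFoundError`, so the original error ends up masked. Below is a minimal sketch of one way to avoid that, not the repository's actual code. The function name `download_file_safely`, the `fetch` parameter (a stand-in for `wget.download`), and the error message are all assumptions made for illustration. Creating the target directory first is also only a guess at what might prevent the `FileNotFoundError`.

```python
import urllib.error
from pathlib import Path


def download_file_safely(url, download_path, fetch):
    """Hypothetical wrapper; `fetch(url, out)` stands in for wget.download."""
    download_path = Path(download_path)
    # Assumption: the target directory may be missing, so create it up front.
    download_path.mkdir(parents=True, exist_ok=True)
    filename = url.rsplit("/", 1)[-1]
    target = download_path / filename
    try:
        fetch(url, str(target))
    except urllib.error.HTTPError as e:
        # Only HTTPError carries url/code/hdrs/fp, so only it is re-wrapped.
        message = f"Download failed for {url}"  # illustrative message
        raise urllib.error.HTTPError(e.url, e.code, message, e.hdrs, e.fp) from e
    # Any other exception (e.g. FileNotFoundError) propagates unchanged,
    # so the real cause stays visible in the trace.
    return target
```

With a handler like this, a filesystem problem would surface as the original `FileNotFoundError` instead of an unrelated `AttributeError`.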
Thank you for raising this issue and so sorry for the late reply!
I couldn't replicate your error on my machine. However, I would suggest deleting tgz88g65ab3.tmp from your video files. I think this file wasn't fully downloaded, which is why it has the .tmp suffix. Once it is deleted, the script should recognize that the file is missing and try to download it again.
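The cleanup suggested above can be sketched as a small helper that lists leftover temporary files under the download root and optionally deletes them. This is an illustrative sketch only: `remove_partial_downloads` is a hypothetical name, and matching on `*.tmp` assumes that interrupted downloads always leave files with that suffix, as the trace suggests.

```python
from pathlib import Path


def remove_partial_downloads(root, pattern="*.tmp", dry_run=True):
    """Return leftover temporary files under `root`; delete them unless dry_run."""
    # Search recursively, since the partial file may sit in a subdirectory.
    leftovers = sorted(p for p in Path(root).rglob(pattern) if p.is_file())
    for path in leftovers:
        if not dry_run:
            path.unlink()
    return leftovers
```

Calling `remove_partial_downloads("./esperanza")` only reports what it finds. Passing `dry_run=False` deletes the files, after which the download command can be run again.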
No worries for the late reply :) Thanks, your suggestion worked!
However, I would like to point out that the number of videos available for download is decreasing. Consequently, one day there will not be enough videos for further research to provide fair comparisons with previous studies in audio-visual or visual-only settings. The audio waveforms are not a problem, since they come from the MTEDx corpus.
Although I understand what this means and the infrastructure it may require, I think the database should be distributed in a different way, similar to LRS3, rather than depending on the availability of video clips on YouTube.