Audio data is not removed after download in edge case #4

Open
Bklieger opened this issue Jun 23, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@Bklieger
Owner

If the user closes the Streamlit app window after the YouTube video is downloaded but before the Whisper transcription completes, the downloaded audio is never deleted from the downloads folder. Leftover files accumulate, so storage use grows unnecessarily over time. To patch, a PR should be made with a fix that deletes any downloaded files after a user closes the window.

In main.py:

Line 398:

""" Downloads audio files """

if input_method == "YouTube link":
    display_status("Downloading audio from YouTube link ....")
    audio_file_path = download_video_audio(youtube_link, display_download_status)
[...]

Line 417:

""" Transcribes audio using Whisper, which may take 3-10 seconds on average.
During this time the user could close the window and stop the program, so the
removal call below is never reached. """

display_status("Transcribing audio in background....")
transcription_text = transcribe_audio(audio_file)
display_statistics()

Line 421:

""" Removes downloaded audio files from download folder """

delete_download(audio_file_path)
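
One possible direction for the window described above is to tie the deletion to the transcription step with `try`/`finally`, so the file is removed whenever the call unwinds, whether it returns normally or raises. The sketch below is only an illustration: `transcribe_then_delete` is a hypothetical helper that is not in main.py, and the `transcribe` argument stands in for `transcribe_audio`. It also assumes that closing the window interrupts the script by raising inside it, which this issue does not confirm, and it cannot help if the process is killed outright.

```python
import os
from typing import Callable


def transcribe_then_delete(audio_file_path: str, transcribe: Callable[[str], str]) -> str:
    """Run `transcribe` on the file, then delete the file even if `transcribe` raises.

    Hypothetical helper for illustration; not part of main.py.
    """
    try:
        return transcribe(audio_file_path)
    finally:
        # Executed on normal return and when an exception unwinds this call.
        if os.path.exists(audio_file_path):
            os.remove(audio_file_path)
```

With a helper like this, the separate `delete_download(audio_file_path)` call after transcription would no longer be the only place where the file gets removed.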
@Bklieger Bklieger added the bug Something isn't working label Jun 23, 2024
Bklieger added a commit that referenced this issue Jun 23, 2024
After the file is read and stored, the original audio file can be deleted. This fixes all cases in #4 that occur when the user closes the app window once the transcription has started.
@Bklieger
Owner Author

This bug has been partially mitigated (78feeb7) by deleting the audio files as soon as the audio data has been read. If the user closes the window after the download is complete and the file has been read, no audio files remain in the downloads folder.

However, the fix is incomplete: if a user closes the app window before the download and processing are complete, data is left in the downloads folder and is not deleted automatically.

To patch, a PR should still be made with a fix that will delete any downloaded files after a user closes the window.
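
Because a closed window may leave no chance to run cleanup code at all, one fallback that does not depend on detecting the close is to sweep the downloads folder for stale files, for example at app startup or before each new download. The following is a minimal sketch under stated assumptions: `sweep_stale_downloads` and its parameters are hypothetical names, not existing functions in this project, and the age threshold is something a PR would need to choose so that files belonging to sessions still in progress are not removed.

```python
import os
import time
from typing import List, Optional


def sweep_stale_downloads(
    folder: str, max_age_seconds: float, now: Optional[float] = None
) -> List[str]:
    """Delete regular files in `folder` last modified more than `max_age_seconds` ago.

    Hypothetical helper for illustration. Returns the paths that were removed.
    """
    now = time.time() if now is None else now
    removed: List[str] = []
    if not os.path.isdir(folder):
        return removed
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue  # leave subdirectories untouched
        if now - os.path.getmtime(path) > max_age_seconds:
            os.remove(path)
            removed.append(path)
    return removed
```

A call such as `sweep_stale_downloads("downloads", 600)` would remove files untouched for more than ten minutes; the folder name and threshold here are placeholders.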
