A Streamlit application, written in Python, that lets users upload audio or video files, transcribes the content, and generates concise summaries for easy reference.
- Upload and process video/audio files for transcription
- Automatic transcription of audio content using Wav2Vec 2.0
- Summarize lengthy transcriptions into brief, digestible content
- User-friendly interface with real-time processing and feedback
Screenshots: Home Screen | Transcription Output | Summary Output
- Clone the repository:

      git clone https://github.com/yourusername/VideoAudioSummarizationApp.git
      cd VideoAudioSummarizationApp

- Install dependencies:

      pip install -r requirements.txt

- Run the Streamlit app:

      streamlit run app.py

- Open the application in your browser (typically at http://localhost:8501).
- Upload a video or audio file in .mp4, .wav, or .mp3 format.
- The app will extract audio (if a video file is uploaded), transcribe it, and display the transcription.
- View the summarized content in the Summary section.
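
The processing order described in the steps above can be sketched roughly as follows. This is an illustration, not the code in app.py: the function names are hypothetical, and the use of the ffmpeg command-line tool for audio extraction is an assumption, since this README does not say which tool the app uses.

```python
from pathlib import Path

# File formats named in the usage steps.
VIDEO_EXTENSIONS = {".mp4"}
AUDIO_EXTENSIONS = {".wav", ".mp3"}


def classify_upload(filename: str) -> str:
    """Return 'video' or 'audio'; raise ValueError for unsupported files."""
    suffix = Path(filename).suffix.lower()
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    raise ValueError(f"Unsupported file type: {filename}")


def plan_steps(filename: str) -> list[str]:
    """List the processing steps for an upload, in order."""
    steps = ["transcribe", "summarize"]
    if classify_upload(filename) == "video":
        # Video files need their audio track pulled out first.
        steps.insert(0, "extract_audio")
    return steps


def ffmpeg_extract_command(video_path: str, wav_path: str) -> list[str]:
    """Build an ffmpeg command (assumed tool) that writes 16 kHz mono audio."""
    return [
        "ffmpeg", "-y",       # overwrite the output file if it exists
        "-i", video_path,     # input video
        "-vn",                # drop the video stream
        "-ac", "1",           # mono
        "-ar", "16000",       # 16 kHz sample rate
        wav_path,
    ]
```

The command list can be passed to `subprocess.run(..., check=True)` on a machine where ffmpeg is installed.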
To add sample audio or video files to GitHub:
- Place sample files in a directory within the project, such as sample_files/.
- In your README, provide links to these files for easy access.
- Use these sample files for demo purposes or to facilitate testing and contributions.
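
If the sample directory grows, the README links can be generated rather than typed by hand. The helper below is a hypothetical sketch (`sample_links` is not part of this project) that assumes the sample_files/ layout suggested above and the three formats the app accepts.

```python
from pathlib import Path
from urllib.parse import quote

SUPPORTED = {".mp4", ".wav", ".mp3"}


def sample_links(sample_dir: str = "sample_files") -> list[str]:
    """Return one markdown list item per supported file in sample_dir."""
    root = Path(sample_dir)
    if not root.is_dir():
        return []
    links = []
    for path in sorted(root.iterdir()):
        if path.is_file() and path.suffix.lower() in SUPPORTED:
            # Percent-encode the file name so spaces do not break the link.
            links.append(f"- [{path.name}]({root.name}/{quote(path.name)})")
    return links


if __name__ == "__main__":
    print("\n".join(sample_links()))
```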
- Python - The programming language used.
- Streamlit - For the interactive web application.
- Hugging Face Transformers - For speech-to-text and summarization models.
- Librosa - For audio processing.
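
As a rough illustration of how these libraries can fit together, here is a minimal sketch of transcription with Wav2Vec 2.0 followed by chunked summarization. It is not the project's actual code. The checkpoint names (`facebook/wav2vec2-base-960h`, `sshleifer/distilbart-cnn-12-6`), the 400-word chunk size, and the summary length limits are assumptions chosen for the example, and PyTorch is assumed to be installed as the Transformers backend.

```python
def split_into_chunks(text: str, max_words: int = 400) -> list[str]:
    """Split a transcript into word-bounded chunks a summarizer can handle."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def transcribe(path: str, model_name: str = "facebook/wav2vec2-base-960h") -> str:
    """Transcribe an audio file with a Wav2Vec 2.0 CTC model (assumed checkpoint)."""
    # Heavy imports stay inside the function so the helper above works alone.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # Resample to 16 kHz mono, the rate this checkpoint was trained on.
    speech, _ = librosa.load(path, sr=16_000, mono=True)
    processor = Wav2Vec2Processor.from_pretrained(model_name)
    model = Wav2Vec2ForCTC.from_pretrained(model_name)
    inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    # Greedy CTC decoding: take the most likely token at each frame.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]


def summarize(text: str, model_name: str = "sshleifer/distilbart-cnn-12-6") -> str:
    """Summarize each chunk separately and join the partial summaries."""
    from transformers import pipeline

    chunks = split_into_chunks(text)
    if not chunks:
        return ""
    summarizer = pipeline("summarization", model=model_name)
    parts = [
        summarizer(chunk, max_length=120, min_length=20, truncation=True)[0]["summary_text"]
        for chunk in chunks
    ]
    return " ".join(parts)
```

Chunking is used here because summarization models accept a limited input length. Summarizing chunk by chunk keeps long transcripts within that limit, at the cost of some coherence between chunks.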
Contributions are welcome! To contribute, please submit a pull request and follow the standard GitHub workflow.
- Hugging Face community for providing state-of-the-art NLP models.
- Inspiration from various NLP resources for implementing the summarization feature.
Lav Kalsi