Commit
Switched from command line to Textual GUI, switched from .env to config.yaml
Showing 13 changed files with 615 additions and 240 deletions.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
# These are supported funding model platforms | ||
|
||
github: [ptmrio] | ||
patreon: # Replace with a single Patreon username | ||
open_collective: # Replace with a single Open Collective username | ||
ko_fi: # Replace with a single Ko-fi username | ||
tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel | ||
community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry | ||
liberapay: # Replace with a single Liberapay username | ||
issuehunt: # Replace with a single IssueHunt username | ||
lfx_crowdfunding: # Replace with a single LFX Crowdfunding project-name e.g., cloud-foundry | ||
polar: # Replace with a single Polar username | ||
buy_me_a_coffee: # Replace with a single Buy Me a Coffee username | ||
custom: ['https://www.paypal.com/paypalme/Petermeir'] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,8 @@ | ||
build/ | ||
env/ | ||
venv/ | ||
.env | ||
__pycache__/ | ||
*.pyc | ||
config.yaml | ||
setup.ps1 | ||
build.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,129 +1,139 @@ | ||
# VideoSubtitler | ||
# Video Subtitler & Text-to-Speech Generator | ||
|
||
VideoSubtitler is an essential tool for content creators who need high-quality captions for their videos. Using OpenAI's Whisper model for Speech to Text and Chat Completions, this project extracts audio from video files, transcribes the audio, and corrects dialects while maintaining the timing of the original video. Perfect for YouTube creators, VideoSubtitler ensures your captions are accurate and contextually appropriate, enhancing accessibility and viewer engagement. | ||
## Overview | ||
|
||
This project provides two standalone tools for **Video Subtitling** and **Text-to-Speech** generation using OpenAI's models. The **Video Subtitler** extracts and transcribes audio from video files, while the **Text-to-Speech Generator** converts text files into speech. Both tools are distributed as `.exe` files for easy use, and they leverage OpenAI’s **Whisper** for transcription and **TTS** for generating speech. | ||
|
||
## Features | ||
|
||
- Extracts audio from video files | ||
- Automatically splits the audio into smaller segments (Whisper model max filesize: 25 MB) | ||
- Transcribes audio using OpenAI's Whisper model | ||
- Corrects dialects in the transcription using Chat Completions (context-aware) | ||
### Video Subtitler | ||
|
||
## Prerequisites | ||
- Transcribes audio and video files using OpenAI's **Whisper** speech-to-text model. | ||
- Supports video formats like `.mp4`, `.mov`, `.mpeg`, and more. | ||
- Extracts audio from video files and splits large audio files for optimal transcription. | ||
- Offers transcription correction using GPT for refined and accurate text output. | ||
- Customizable with language and prompt settings. | ||
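The splitting step mentioned in the feature list can be sketched as follows. This is a minimal illustration, not the project's actual implementation: `plan_segments`, its parameters, and the 10% safety margin are hypothetical, and it assumes a roughly constant bitrate so that file size scales with duration. The 25 MB figure is the Whisper upload limit cited in the earlier README.

```python
import math

# 25 MB upload limit cited for the Whisper API in the earlier README.
WHISPER_LIMIT_BYTES = 25 * 1024 * 1024


def plan_segments(size_bytes, duration_s, limit_bytes=WHISPER_LIMIT_BYTES, safety=0.9):
    """Return (start, end) times in seconds for equally long segments.

    Hypothetical helper: assumes constant bitrate, so each segment's size
    is proportional to its duration. `safety` leaves headroom under the limit.
    """
    if size_bytes <= 0 or duration_s <= 0:
        raise ValueError("size_bytes and duration_s must be positive")
    budget = limit_bytes * safety
    count = max(1, math.ceil(size_bytes / budget))
    step = duration_s / count
    return [(i * step, min((i + 1) * step, duration_s)) for i in range(count)]


if __name__ == "__main__":
    # A 60 MB, 20-minute recording is planned as three segments.
    for start, end in plan_segments(60 * 1024 * 1024, 1200):
        print(f"{start:.0f}s to {end:.0f}s")
```

The resulting time ranges could then be passed to FFmpeg or another audio tool to cut the file; that step is left out here because the project's own approach is not shown in this commit.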
|
||
### Get an OpenAI API Key | ||
![Text-to-Speech Generator](https://github.com/ptmrio/video-subtitler/blob/main/screenshots/tts.png) | ||
|
||
1. Go to the [OpenAI API Keys page](https://platform.openai.com/api-keys). | ||
2. Log in or sign up for an account. | ||
3. Click on "Create new secret key" and copy your API key. | ||
### Text-to-Speech Generator | ||
|
||
### Install FFmpeg | ||
- Converts text files into speech using OpenAI's **TTS** model. | ||
- Customizable voice selection and speed settings for more control over the output. | ||
- Outputs audio as `.mp3` files. | ||
|
||
#### Windows | ||
![Video Subtitler](https://github.com/ptmrio/video-subtitler/blob/main/screenshots/video-subtitler.png) | ||
|
||
- **Option 1: Download from the official website**: | ||
1. Download the latest build from the [official website](https://ffmpeg.org/download.html). | ||
2. Extract the ZIP file to a location on your computer, e.g., `C:\ffmpeg`. | ||
3. Add the `bin` directory of `ffmpeg` to your system's PATH. | ||
## Prerequisites | ||
|
||
- **Option 2: Install using Chocolatey**: | ||
1. Install Chocolatey from [chocolatey.org](https://chocolatey.org/install). | ||
2. Open Command Prompt as Administrator and run: | ||
```cmd | ||
choco install ffmpeg | ||
``` | ||
### FFmpeg Installation | ||
|
||
FFmpeg is required to handle video and audio processing. Install FFmpeg based on your operating system: | ||
|
||
- **Windows Option 1: Install using winget (recommended)**: | ||
|
||
- **Option 3: Install using winget**: | ||
1. Open Command Prompt as Administrator and run: | ||
```cmd | ||
winget install ffmpeg | ||
``` | ||
#### macOS | ||
- **Windows Option 2: Download from the official website**: | ||
1. Download the latest build from the [official website](https://ffmpeg.org/download.html). | ||
2. Extract the ZIP file to a location on your computer, e.g., `C:\ffmpeg`. | ||
3. Add the `bin` directory of `ffmpeg` to your system's PATH. | ||
- Install FFmpeg using [Homebrew](https://brew.sh/): | ||
- **macOS**: Install via Homebrew: | ||
```bash | ||
brew install ffmpeg | ||
``` | ||
|
||
#### Linux | ||
|
||
- Install FFmpeg using your package manager: | ||
- **Linux**: Install via APT: | ||
```bash | ||
sudo apt-get install ffmpeg | ||
``` | ||
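After installing FFmpeg, it can help to confirm that the executable is reachable on the PATH before running either tool. The snippet below is a small sketch using only the Python standard library; the helper names `find_ffmpeg` and `ffmpeg_version` are illustrative and not part of this project.

```python
import shutil
import subprocess


def find_ffmpeg(executable="ffmpeg"):
    """Return the full path to the executable, or None if it is not on PATH."""
    return shutil.which(executable)


def ffmpeg_version(executable="ffmpeg"):
    """Return the first line printed by `ffmpeg -version`, or None if unavailable."""
    path = find_ffmpeg(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    version = ffmpeg_version()
    print(version if version else "FFmpeg was not found on PATH")
```

If the check reports that FFmpeg is missing on Windows, the most likely cause is that the `bin` directory was not added to PATH or that the terminal was opened before the PATH change took effect.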
|
||
## Usage (Windows Executable) | ||
|
||
1. Download the latest release from the [Releases](https://github.com/ptmrio/video-subtitler/releases) page. | ||
2. Extract the ZIP file to a directory of your choice. | ||
3. Open the directory and create a `.env` file with your OpenAI API key: | ||
```plaintext | ||
OPENAI_API_KEY=your_openai_api_key_here | ||
``` | ||
4. Open Command Prompt in the directory where `video-subtitler.exe` is located. | ||
5. Run the executable with the required arguments: | ||
```cmd | ||
video-subtitler.exe --file="path_to_your_file.mp4" --language="DE" | ||
``` | ||
|
||
6. After processing, the transcription will be saved to a file named `your_file.transcription.json` in the same directory as the original video or audio file. You can open this file with any text editor, copy the transcription, and paste it into a timing editor like YouTube Studio to auto-match the timing of your captions. | ||
|
||
## Installation (Universal with Python) | ||
|
||
### Clone the Repository | ||
|
||
1. Clone the repository: | ||
```bash | ||
git clone https://github.com/ptmrio/video-subtitler.git | ||
cd video-subtitler | ||
``` | ||
### OpenAI API Key | ||
|
||
- Obtain an OpenAI API key from [OpenAI's platform](https://platform.openai.com/signup). | ||
- Set this API key in the `config.yaml` file as described in the configuration section. | ||
|
||
## Installation | ||
|
||
### Windows | ||
|
||
1. **Download the ZIP**: | ||
|
||
- Go to the [Releases](https://github.com/ptmrio/video-subtitler/releases) page on GitHub and download the latest `.zip` file. | ||
- Extract the contents to a directory on your system. | ||
|
||
2. **Set Up Configuration**: | ||
- Rename the `config.example.yaml` file to `config.yaml` and set up the configuration parameters. | ||
- At minimum, set the `openai.api_key` parameter to your OpenAI API key. | ||
- The `stt_prompt` parameter can be used to teach Video Subtitler about special words or phrases (trademarks, unusual names) that may appear in your video. | ||
- Example `config.yaml`: | ||
```yaml | ||
openai: | ||
  api_key: sk-XXX | ||
  stt_model: whisper-1 | ||
  tts_model: tts-1 | ||
  completions_model: gpt-4o | ||
  temperature: 0 | ||
default: | ||
  language: EN | ||
  stt_prompt: PhraseVault, Video Subtitler | ||
  tts_voice: echo | ||
  tts_speed: 1 | ||
``` | ||
### Install Dependencies | ||
### macOS/Linux | ||
1. Install the required Python packages: | ||
```bash | ||
pip install -r requirements.txt | ||
``` | ||
1. **Clone the Repository**: | ||
- Clone the repository to your local machine: | ||
```bash | ||
git clone https://github.com/ptmrio/video-subtitler.git | ||
cd video-subtitler | ||
``` | ||
|
||
2. Create a `.env` file from the provided example: | ||
```bash | ||
cp .env.example .env | ||
``` | ||
2. **Set Up Configuration**: | ||
- See step 2 of the Windows installation instructions. | ||
|
||
3. Add your OpenAI API key to the `.env` file: | ||
```plaintext | ||
OPENAI_API_KEY=your_openai_api_key_here | ||
``` | ||
3. **Install Dependencies**: | ||
- Install the required Python packages: | ||
```bash | ||
pip install -r requirements.txt | ||
``` | ||
|
||
4. **Run the Application**: | ||
- Run the application using Python: | ||
```bash | ||
python video_subtitler.py | ||
``` | ||
or | ||
```bash | ||
python text_to_speech.py | ||
``` | ||
|
||
### Running the Python Script | ||
|
||
1. Run the script with the required arguments: | ||
```bash | ||
python video-subtitler.py --file="path_to_your_file.mp4" --language="DE" | ||
``` | ||
## Usage | ||
|
||
## Parameters | ||
### Text-to-Speech Generator | ||
|
||
- `--file` (str, required): Path to the audio or video file. | ||
- `--model` (str, default="whisper-1"): Model to use for transcription. | ||
- `--language` (str): Language of the input audio (ISO-639-1 format). | ||
- `--prompt` (str): Optional text to guide the model's style. | ||
- `--response_format` (str, default="json"): Format of the transcript output. | ||
- `--temperature` (float, default=0): Sampling temperature between 0 and 1. | ||
1. Navigate to the downloaded and extracted folder and run the `text-to-speech.exe` file. | ||
2. Enter the path to your `.txt` file, adjust the voice and speed if needed, and click **Generate Speech**. | ||
3. The application will convert the text into speech and save the result as a `.tts.mp3` file. | ||
|
||
## Contributing | ||
### Video Subtitler | ||
|
||
Contributions are welcome! Please open an issue or submit a pull request. | ||
1. Navigate to the downloaded and extracted folder and run the `video-subtitler.exe` file. | ||
2. Provide the path to your audio or video file and configure optional settings such as language or custom prompts. | ||
3. Click **Transcribe** to begin transcription. The resulting transcription will be saved as a `.transcription.txt` file. | ||
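The output naming described in the usage steps can be illustrated with a short sketch. It assumes the output is written next to the input file and that the suffix replaces the original extension, as the earlier README described for `your_file.transcription.json`; the helper `output_path` is hypothetical and the tools may name files differently.

```python
from pathlib import Path


def output_path(source, suffix):
    """Build an output path beside `source`, swapping its extension for `suffix`.

    Hypothetical helper: assumes the result is saved in the same directory
    as the input and keeps the input's base name.
    """
    src = Path(source)
    return src.with_name(src.stem + suffix)


if __name__ == "__main__":
    print(output_path("videos/clip.mp4", ".transcription.txt"))
    print(output_path("notes.txt", ".tts.mp3"))
```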
|
||
## License | ||
|
||
This project is licensed under the MIT License. See the [LICENSE](https://github.com/ptmrio/video-subtitler/blob/main/LICENSE) file for details. | ||
This project is licensed under the MIT License. See the `LICENSE` file for more details. | ||
|
||
## Donations | ||
|
||
If you find this project useful, consider donating to support its development: | ||
|
||
- [PayPal](https://paypal.me/Petermeir) | ||
If you find this project useful, consider donating to support its development. | ||
|
||
Thank you for your support! |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
openai: | ||
  api_key: sk-XXX | ||
  stt_model: whisper-1 | ||
  tts_model: tts-1 | ||
  completions_model: gpt-4o | ||
  temperature: 0 | ||
default: | ||
  language: EN | ||
  stt_prompt: PhraseVault, Video Subtitler | ||
  tts_voice: echo | ||
  tts_speed: 1 |
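A few basic checks on this configuration can catch common mistakes, such as leaving the placeholder key in place. The sketch below is an assumption-laden illustration rather than the project's own validation: `validate_config` is a hypothetical helper, it expects the file to have already been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`), and it assumes `temperature` sits under `openai` and `tts_speed` under `default`. The 0 to 1 temperature range comes from the earlier README; the 0.25 to 4.0 speed range is taken from OpenAI's documented limits for speech generation and should be verified against the current API reference.

```python
def validate_config(cfg):
    """Return a list of human-readable problems found in a parsed config dict."""
    problems = []
    openai_cfg = cfg.get("openai") or {}
    defaults = cfg.get("default") or {}

    # The example file ships with the placeholder key "sk-XXX".
    key = str(openai_cfg.get("api_key") or "")
    if not key or key == "sk-XXX":
        problems.append("openai.api_key is missing or still the placeholder")

    # Assumed location and range: openai.temperature between 0 and 1.
    temperature = openai_cfg.get("temperature", 0)
    if not 0 <= temperature <= 1:
        problems.append("openai.temperature should be between 0 and 1")

    # Assumed range for default.tts_speed: 0.25 to 4.0.
    speed = defaults.get("tts_speed", 1)
    if not 0.25 <= speed <= 4.0:
        problems.append("default.tts_speed should be between 0.25 and 4.0")

    return problems


if __name__ == "__main__":
    example = {
        "openai": {"api_key": "sk-XXX", "temperature": 0},
        "default": {"language": "EN", "tts_voice": "echo", "tts_speed": 1},
    }
    for problem in validate_config(example):
        print(problem)
```

Returning a list of messages instead of raising on the first error makes it possible to show all configuration problems at once, which suits a GUI that reports them in a single dialog.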
Binary file not shown.
Binary file not shown.