Merge pull request #130 from aedocw/main

Trigger container build
aedocw · Dec 25, 2023 · 024ae19
2 parents 0581d40 + 38983ee
commit 024ae19

Showing 12 changed files with 503 additions and 159 deletions.
diff --git a/.gitignore b/.gitignore
@@ -16,6 +16,7 @@ var
 sdist
 develop-eggs
 .installed.cfg
+.vscode
 
 # Installer logs
 pip-log.txt

diff --git a/.vscode/settings.json b/.vscode/settings.json
diff --git a/README.md b/README.md
@@ -2,7 +2,12 @@ This script takes an epub (or text file) and reads it to an m4b audiobook file,
 
 I recognize this is not very user friendly, but I wanted to share in case folks thought it was useful. If there are a few more people than myself that find this is useful I will keep working on turning it into something that could be used by someone without dev experience.
 
-**NOTE: BIG UPDATE for XTTS!** The Coqui team released v2 of their XTTS model and the quality is amazing! This latest release includes significant refactoring, and uses streaming inference for XTTS. Suggested usage is to include up to three wav file speaker samples, up to 30 seconds each. Check out the XTTS sample to get an idea of the quality you can expect.
+**NOTE:** The Coqui team released their curated XTTS voice models recently, and they sound great. A recent update here
+allows you to use these voices. You can generate samples of all the voices by running `python utils/generate-speaker-samples.py`. Check these voices out, they're almost all amazing sounding! (GPU required)
+
+Example usage: `epub2tts my-book.epub --engine xtts --speaker "Damien Black"`
+
+**NOTE:** The Coqui team released v2 of their XTTS model and the quality is amazing! This latest release includes significant refactoring, and uses streaming inference for XTTS. Suggested usage is to include up to three wav file speaker samples, up to 30 seconds each. Check out the XTTS sample to get an idea of the quality you can expect. Also take a look in the utils directory for notes on finetuning your model for exceptional results. (GPU required)
 
 Example usage: `epub2tts my-book.epub --start 4 --end 20 --xtts shadow-1.wav,shadow-2.wav,shadow-3.wav`
 
@@ -17,7 +22,7 @@ Usage:
 
   TEXT: `epub2tts my-book.txt`
 
-To use Coqui XTTS, add: `--xtts <sample-1.wav>,<sample-2.wav>,<sample-3.wav>` (GPU required, slow but sounds amazing!)
+To use Coqui XTTS, add: `--xtts <sample-1.wav>,<sample-2.wav>,<sample-3.wav> --language 'en' book.epub` (slow but sounds amazing!)
 
 To use OpenAI TTS, add: `--openai <your API key>` (Use speaker option to specify voice other than onyx: `--speaker shimmer`)