Lots of warnings when running prepare.sh #108
Comments
@gpawlowsky1979 Hi, what changes did you make in order to train it? I mean the hyperparameters. Also, how many distinct speakers did you use for training?
After 70 epochs the results are better, and the voices now more closely resemble the ones used as input. Perhaps I had unrealistic expectations about how good the generated voices would sound.
I think the results might have been better if the dataset had been properly prepared, without so many word count mismatches. I trained it on the LibriTTS dataset. Here are the parameters I used:
@gpawlowsky1979 I'm also training the model.
When running prepare.sh to prepare the LibriTTS dataset, I got lots of warnings like this:
It looks like every single line in the dataset has this kind of problem, so I don't think it's something that can safely be ignored. I got similar warnings later when using infer.py.
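One way to check how widespread the problem really is would be to tally the affected records directly. The snippet below is only a minimal sketch, assuming the warnings are the word count mismatches mentioned elsewhere in this thread and that each record exposes a raw and a normalized transcript. The field names `text` and `normalized_text` and the helper `count_word_mismatches` are hypothetical, not part of prepare.sh or any library.

```python
from collections import Counter


def count_word_mismatches(records):
    """Tally records whose raw and normalized transcripts differ in word count.

    `records` is an iterable of dicts with hypothetical keys
    "text" and "normalized_text" (assumed names, not a real manifest schema).
    """
    tally = Counter()
    for rec in records:
        raw_words = rec["text"].split()
        norm_words = rec["normalized_text"].split()
        key = "match" if len(raw_words) == len(norm_words) else "mismatch"
        tally[key] += 1
    return tally


# Toy records for illustration only, not taken from LibriTTS.
sample = [
    {"text": "Hello world", "normalized_text": "hello world"},
    {"text": "It costs $5", "normalized_text": "it costs five dollars"},
]
result = count_word_mismatches(sample)
print(result)
```

If nearly all records land in the mismatch bucket, that would support the suspicion that the warnings reflect a systematic preparation issue and not a few noisy lines.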
Despite these warnings, I was able to train the model, and after 60 epochs (20 AR + 40 NAR) it can generate intelligible speech, but the output doesn't much resemble the voices I use as input. This might be due to underfitting, but I'm concerned it may also be related to the dataset warnings I mentioned. I also had to reduce the max-duration parameter a bit in order to run on a 16GB GPU.
Here's my tensorboard image after 40 epochs on NAR:
Has anybody had any luck getting speech generation that really resembles the input voices after training on LibriTTS?
Also, what's the difference between vall-e and vall-f models? I haven't found much information about vall-f. Is it any better than vall-e?