Crash in MAS #24
Comments
For MAS debugging: I am not good at C++ yet, but I can suggest one thing. Run only MAS (encoder + spectrogram) batch by batch; everything else can be deleted. At some batch it will fail. Open that batch and run one sample at a time, and you will find the sample that causes the problem.
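The batch-then-sample search suggested above can be sketched roughly as follows. This is only an illustration: `find_failing_samples`, `run_mas`, and `fake_run_mas` are hypothetical names, not functions from the repository, and the failure condition in `fake_run_mas` is a stand-in. Note also that `try`/`except` only catches Python exceptions. A hard crash such as a seg fault kills the interpreter, so in that case each batch would have to be run in a separate process instead.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def find_failing_samples(
    batches: Iterable[Sequence],
    run_mas: Callable[[Sequence], None],
) -> List[Tuple[int, int]]:
    """Return (batch_index, sample_index) pairs for which run_mas raises."""
    failing = []
    for b_idx, batch in enumerate(batches):
        try:
            run_mas(batch)  # fast path: try the whole batch at once
            continue
        except Exception:
            pass
        # The batch failed, so retry it one sample at a time.
        for s_idx, sample in enumerate(batch):
            try:
                run_mas([sample])
            except Exception:
                failing.append((b_idx, s_idx))
    return failing


def fake_run_mas(samples: Sequence[Tuple[int, int]]) -> None:
    """Hypothetical stand-in for the real MAS call (illustration only).

    Each sample is (text_length, mel_length); it fails when the text is
    longer than the spectrogram.
    """
    for text_len, mel_len in samples:
        if text_len > mel_len:
            raise ValueError("text longer than mel")


batches = [[(3, 10), (5, 20)], [(4, 9), (12, 8), (2, 7)]]
bad = find_failing_samples(batches, fake_run_mas)
```

Here `bad` points at the second sample of the second batch, which is the only one that the stand-in rejects.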
Thank you for your fast answer.
Sure, I'll fix AlignerNet synthesis this week. What is the error during inference?
Oh, you have edited your answer. It was a crash. If you need more info, I can try to run it again and tell you. But I think I may have made a mistake. Maybe you can push the changes to a separate branch and I will compare.
Someone else tried AlignerNet and it worked OK for them. So I am not sure how to debug without an error message. If it's a crash, maybe it's a dataset issue? Does AlignerNet train on the small subset?
But AlignerNet isn't used during inference.
True. If it crashes during training, I think it is a dataset issue.
No, it doesn't crash during training. I got the crash after I loaded a trained checkpoint in the synthesis method.
Send some random input to the duration predictor. Does it predict something?
Is the error something like "out of memory"?
This evening I will try to run it again and tell you more details.
If you want, we can talk on Telegram (https://t.me/TeraSpace). I speak Ukrainian and Russian.
@p0p4k I have tried again and yes, on inference I got an out-of-memory error, the same as @Tera2Space mentioned.
So maybe try inference on a short sentence? Does that work? If it does, then it is just a memory issue and not a code issue.
It is a short sentence. No, it doesn't work. It runs for a very long time and then crashes. https://drive.google.com/file/d/1WaIYiloaf3oDVtkWb5LH8YN0XWW2KXbR/view?usp=drivesdk
Yep, same. I believe that AlignerNet didn't converge, so the duration predictor learned wrong alignments; at inference the audio becomes very long and causes the out-of-memory error.
1500 GiB of VRAM... I think it's a code issue, because it happens at evaluation during training, whereas with MAS it works fine.
Give the model some random durations instead of using the duration predictor and see what the output looks like (one integer duration per phoneme).
Sorry, but I don't know how to do that.
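For anyone else unsure how to try the random-duration test, here is a minimal sketch in plain Python, so that it runs without the model. `random_durations` and `expand_by_duration` are hypothetical helpers, not part of the repository, and the frame range of 1 to 10 is an arbitrary assumption. In the real synthesis code the same expansion would be done on tensors (for example with something like `torch.repeat_interleave`), with the random integers used in place of the duration predictor's output.

```python
import random
from typing import List, Sequence


def random_durations(
    num_phonemes: int,
    min_frames: int = 1,   # assumed lower bound, chosen for illustration
    max_frames: int = 10,  # assumed upper bound, chosen for illustration
    seed: int = 0,
) -> List[int]:
    """One random integer duration (in mel frames) per phoneme."""
    rng = random.Random(seed)
    return [rng.randint(min_frames, max_frames) for _ in range(num_phonemes)]


def expand_by_duration(encoder_outputs: Sequence, durations: Sequence[int]) -> list:
    """Repeat each phoneme's encoder output durations[i] times."""
    if len(encoder_outputs) != len(durations):
        raise ValueError("need exactly one duration per phoneme")
    frames = []
    for vec, d in zip(encoder_outputs, durations):
        frames.extend([vec] * d)
    return frames


# Five phonemes, each represented by a dummy 2-dimensional vector.
phoneme_vectors = [[float(i), float(i)] for i in range(5)]
durs = random_durations(len(phoneme_vectors))
frames = expand_by_duration(phoneme_vectors, durs)
```

If the decoder produces audio of a sensible length from these durations, that would point towards the duration predictor (or the alignments it was trained on) rather than the decoder.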
I will try later to clamp the output of the aligner.
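One way such a clamp might look is sketched below in plain Python. `clamp_durations` and its limits (`max_per_phoneme=50`, `max_total=2000`) are assumptions made up for illustration, not values from the repository. The idea is only to keep a badly converged aligner or duration predictor from requesting an enormous number of frames.

```python
import math
from typing import List, Sequence


def clamp_durations(
    durations: Sequence[float],
    max_per_phoneme: int = 50,  # assumed cap per phoneme, in frames
    max_total: int = 2000,      # assumed cap on the whole utterance
) -> List[int]:
    """Round predicted durations to integers and keep them in a sane range."""
    clamped = []
    for d in durations:
        if math.isnan(d):
            clamped.append(1)  # treat NaN as the minimum duration
        else:
            # Clamp before rounding so that infinities are handled too.
            clamped.append(int(round(min(max(d, 1), max_per_phoneme))))
    total = sum(clamped)
    if total > max_total:
        # Scale everything down proportionally, keeping at least one frame each.
        clamped = [max(1, d * max_total // total) for d in clamped]
    return clamped


safe = clamp_durations([0.2, 3.6, 1e6, float("inf"), float("nan")])
```

This hides the symptom (very long audio and the out-of-memory error) rather than fixing the alignment itself, so it is mainly useful for checking whether the rest of the pipeline works.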
Now I’m wondering whether the problem might be that we use the text encoder outputs as input to AlignerNet after they have been passed through a convolution (to get the same dimensions as a mel frame). While I was testing the pitch predictor, it didn't work when conditioned on the output of the text encoder, but when I tried to use x_emb directly it worked. I will test this, and if it works I will create a PR.
Very interesting 🤔
Also, here is a wild guess: do you mind trying the numpy version of the maximum_path search? A long time ago I also had problems with seg faults. Running the training under gdb showed that they were related to maximum_path, and using the numpy version fixed it.
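For reference, the dynamic program behind a monotonic alignment search can be sketched in pure Python as below. This is an independent illustration for a single sample, not the repository's `maximum_path` and not the numpy version mentioned above. It is slow, but an out-of-range index raises a Python exception instead of crashing the process, and it explicitly rejects inputs where the text is longer than the spectrogram (one possible, unconfirmed, source of trouble).

```python
from typing import List, Sequence

NEG_INF = float("-inf")


def maximum_path(value: Sequence[Sequence[float]]) -> List[List[int]]:
    """Best monotonic alignment for one sample.

    value[i][j] is the score of aligning text token i with mel frame j.
    Returns a 0/1 matrix of the same shape with exactly one 1 per frame.
    """
    t_x = len(value)
    t_y = len(value[0]) if t_x else 0
    if t_x == 0 or t_x > t_y:
        raise ValueError("need 1 <= text length <= mel length")

    # q[i][j]: best total score of a monotonic path ending at (token i, frame j).
    q = [[NEG_INF] * t_y for _ in range(t_x)]
    q[0][0] = value[0][0]
    for j in range(1, t_y):
        for i in range(min(j + 1, t_x)):
            stay = q[i][j - 1]                             # same token as previous frame
            move = q[i - 1][j - 1] if i > 0 else NEG_INF   # advance to the next token
            q[i][j] = value[i][j] + max(stay, move)

    # Backtrack from the last token at the last frame.
    path = [[0] * t_y for _ in range(t_x)]
    i = t_x - 1
    for j in range(t_y - 1, -1, -1):
        path[i][j] = 1
        if i > 0 and (i == j or q[i - 1][j - 1] > q[i][j - 1]):
            i -= 1
    return path


scores = [
    [0.0, -5.0, -5.0, -5.0],
    [-5.0, 0.0, 0.0, -5.0],
    [-5.0, -5.0, -5.0, 0.0],
]
alignment = maximum_path(scores)
durations = [sum(row) for row in alignment]  # frames assigned to each token
```

Comparing the output of a reference like this with the compiled search on the failing batch could help confirm whether the crash really comes from `maximum_path`.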
We are experiencing a strange issue. With one of our big datasets (about 300 hours), MAS randomly crashes. The core dump shows the following line:
We have tried everything, but nothing helped.
The only thing that helped was replacing MAS with AlignerNet, but then there was another issue: a crash at inference. Maybe the synthesis method requires some changes too?
I have successfully trained pflowttss on a single-speaker dataset, which is a subset of this bigger dataset, and it sounds great. A demo is here - https://tts.patriotyk.name
Also, I have built a Docker image and pushed it to a registry. It can be used to reproduce this issue; you just need to pull and run it. I can share the URL in a private message if you need it.