espeak-ng-mbrola sound is worse than espeak-ng-mbrola-generic (or espeak-mbrola-generic) #949
Comments
espeak-ng-mbrola is not supposed to sound worse than espeak-ng-mbrola-generic. They should sound exactly the same, since the code is the same: libespeak-ng is identical, and in the non-generic case it calls the external mbrola tool, which is essentially the same pipeline as in the -generic case. If a difference makes the non-generic module worse, it should be tracked down and fixed; it is probably something dumb, such as default parameters that for whatever reason don't end up being the same. In the end, espeak-ng-mbrola is supposed to be better, not in terms of audio quality (which is expected to be exactly the same) but in terms of flexibility (audio pipelining, stopping, etc.).
Apparently espeak-ng doesn't report the proper audio rate (22 kHz instead of 16 kHz).
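To illustrate why a wrong reported rate makes the voice sound "weird and fast", here is a minimal sketch of the arithmetic. It assumes the mbrola voice really produces 16000 Hz audio and that the backend plays it at the reported 22050 Hz; the function name `playback_distortion` is made up for this example and is not part of speech-dispatcher or espeak-ng.

```python
import math


def playback_distortion(true_rate_hz: int, reported_rate_hz: int, n_samples: int):
    """Describe what happens when audio is played at the wrong sample rate.

    Returns (real_duration_s, heard_duration_s, speed_factor, pitch_shift_semitones).
    """
    # Duration the audio was synthesized for.
    real_duration = n_samples / true_rate_hz
    # Duration actually heard when the backend trusts the reported rate.
    heard_duration = n_samples / reported_rate_hz
    # Playback is sped up (and pitch raised) by this ratio.
    speed_factor = reported_rate_hz / true_rate_hz
    pitch_shift = 12 * math.log2(speed_factor)
    return real_duration, heard_duration, speed_factor, pitch_shift


# One second of 16 kHz audio played as if it were 22.05 kHz audio.
real, heard, speed, semitones = playback_distortion(16000, 22050, 16000)
print(f"real={real:.3f}s heard={heard:.3f}s speed=x{speed:.3f} pitch=+{semitones:.2f} semitones")
```

Under these assumptions the speech is played about 1.38 times too fast, with a correspondingly raised pitch, which matches the description of the symptom in this thread.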
espeak_ng_GetSampleRate does not report the rate of mbrola voices. Fixes brailcom#949
I believe this is now fixed.
@sthibaul indeed, thanks! However, (trying a patched 0.11.4 ATM, I'll try testing true master at some point) now the sentence end is cut off. I don't know if it's a direct consequence of this or it just reveals a side effect, but it's affecting both spd-say and Orca.
Was it not the case before patching?
No, the sound was weird and fast but not cut off at the end, at least not that I can hear. I didn't look into it, but maybe there's another discrepancy with the sample rate leading to an incorrect timing computation or something? Or maybe a bug dropping the last samples now has more impact, since they span more time?
That would completely depend on your configuration. Here with master and the pulse backend, I'm not noticing anything.
Does it also cut off with
Does the cut-off show up in parecord too?
I don't have -2, but it doesn't happen with -4. However, this voice always sounds a bit weird (with the generic or not), and didn't change with the patching.
Yes. I tried debugging this a bit, and the issue seems to be that espeak sends a spurious sample rate change event, or that it's not handled in the correct order versus the sample collection. With
While with
I believe this likely explains the issue if the last samples are not actually at rate 22050.
If I hack the code to force the sample rate to 16000, the sound is good, with no cutoff, using french-mbrola-1.
I took a moment to look into this a bit, and I don't know the solution but possibly espeak-ng (1.51) is the issue. Its own code (in speech.c's
but only until you replace any
OK, leaving the espeak-ng bug for now and compensating here: we assume that the event list starts with the proper sample rate change, and we ignore the others.
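The compensation described above can be sketched as follows. This is only an illustration of the idea, not the actual speech-dispatcher code: the `Event` type, its `kind` values, and `collect_audio` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Event:
    # Hypothetical event record: kind is "samplerate" or "audio".
    kind: str
    rate: int = 0
    samples: bytes = b""


def collect_audio(events: Iterable[Event]) -> Tuple[Optional[int], bytes]:
    """Keep the first sample rate seen and ignore any later rate changes."""
    rate: Optional[int] = None
    audio = bytearray()
    for ev in events:
        if ev.kind == "samplerate":
            if rate is None:
                # Trust only the first rate change in the event list.
                rate = ev.rate
            # Later (assumed spurious) rate changes are deliberately dropped.
        elif ev.kind == "audio":
            audio.extend(ev.samples)
    return rate, bytes(audio)


events = [
    Event("samplerate", rate=16000),
    Event("audio", samples=b"\x00\x01"),
    Event("samplerate", rate=22050),  # spurious change, ignored
    Event("audio", samples=b"\x02\x03"),
]
rate, audio = collect_audio(events)
print(rate, len(audio))
```

The trade-off is that a legitimate mid-utterance rate change would also be ignored, which is acceptable only under the stated assumption that the first event carries the proper rate.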
Steps to reproduce
Compare sound output between espeak-ng-mbrola (beware of #902) and espeak-ng-mbrola-generic: the espeak-ng-mbrola one is a lot less human-like.
I used the following command to capture a sample (using French mbrola voices):
Obtained behavior
The espeak-ng-mbrola one is a lot less human-like; the espeak-ng-mbrola-generic one sounds "better".
Expected behavior
This is actually OK if it's not a bug (it might just be that the mbrola synthesizer is better at this, which is fine); but speech-dispatcher lists espeak-ng-mbrola as "better" than espeak-ng-mbrola-generic (in module_compare() from src/server/speechd.c). It might be true for the feature set, but it's not for (my) ears. IMO the sorting should take into account the perceived voice quality as well as other factors, especially when two modules otherwise look so similar to the user.