Wrong pinyin tone #1
Comments
Thanks for this. But I'm not sure this is actually wrong. Regular Chinese tone changes are not written in pinyin, according to https://resources.allsetlearning.com/chinese/pronunciation/Tone_change_rules#Why_Tone_Changes_Are_Not_Written. Instead, I think it's better to distinguish the two: original vs. rule-applied.
cedict.txt
I think some simple rules can help. I'm working on them. I'll be back in a few hours.
@begeekmyfriend I've added a second pronunciation with the tone change rules applied. Upgrade the library to check it, and please let me know if it is okay. Thanks for pointing this out.
For example:
More examples:
@Jackiexiao Can you clarify what you mean? It's confusing. The current results are for the strings above: 有一次 第一次。 十一二岁来到戏校 同年十一月 一九八二年英文版 欧洲统一步伐 吉林省一号工程 一是选拔优秀干部 Which parts are incorrect?
Well it is really confusing when you first learn Chinese on
According to https://en.wikipedia.org/wiki/Standard_Chinese_phonology#Tone_sandhi
So are rules 1 and 2 applied word-internally only? In other words, when 一 is followed by a fourth-tone character that belongs to a separate word, is 一 read with the first tone rather than the second tone?
That is right, based on what you have learned.
Here is another interesting example:
I'm looking at the literature about the tone change rules. Unfortunately, most of them are not clear about the boundaries. But some say the tone change rules MAY work across word boundaries. If my understanding is correct, things are more complicated. If we just think all the tone change rules including third tone, 一, and 不 occur word-internally, things are simple, but I'm not sure if that's true.
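The word-internal reading of the 一 and 不 rules discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the library's actual implementation: the function names `tone_of` and `apply_yi_bu_sandhi` are hypothetical, the input is assumed to be a single already-segmented word with numbered pinyin, only the ordinal prefix 第 is treated as blocking the change, and numerals such as 十一月 and third-tone sandhi are not handled.

```python
from __future__ import annotations


def tone_of(syllable: str) -> int:
    """Return the trailing tone digit of a numbered-pinyin syllable (5 = neutral)."""
    return int(syllable[-1]) if syllable[-1].isdigit() else 5


def apply_yi_bu_sandhi(chars: str, pinyins: list[str]) -> list[str]:
    """Apply the 一/不 tone changes inside ONE word (hypothetical helper).

    Assumption: `chars` and `pinyins` are aligned one syllable per character,
    and the citation tones are yi1 and bu4.
    """
    out = list(pinyins)
    for i, ch in enumerate(chars):
        if i + 1 >= len(chars):
            continue  # word-final syllable: nothing follows, keep citation tone
        next_tone = tone_of(pinyins[i + 1])
        if ch == "一" and pinyins[i] == "yi1":
            if i > 0 and chars[i - 1] == "第":
                continue  # ordinal such as 第一次: 一 keeps the first tone
            if next_tone == 4:
                out[i] = "yi2"  # before a fourth tone
            elif next_tone in (1, 2, 3):
                out[i] = "yi4"  # before a first, second, or third tone
        elif ch == "不" and pinyins[i] == "bu4" and next_tone == 4:
            out[i] = "bu2"  # before a fourth tone
    return out


print(apply_yi_bu_sandhi("有一次", ["you3", "yi1", "ci4"]))
print(apply_yi_bu_sandhi("第一次", ["di4", "yi1", "ci4"]))
```

Whether the same function should also be run across word boundaries is exactly the open question in this thread; the sketch deliberately sidesteps it by taking one word at a time.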
I do not think Chinese pinyin conversion can ever be made totally correct. There are no rules, only conventions. An enormous pinyin dictionary is indispensable for this issue. That is all we can do about it.
Okay. I've updated it to 0.9.9.3. I tried to refine the rules. Feel free to check it.
Hi Kyubyong, have you considered using machine learning, such as a CRF, to predict the tone change of 一? Thanks.
I have found a well-designed Chinese pinyin dictionary from espeak with 21567 single characters plus 36098 compound exceptions (including 332 added 'yi' exceptions, 10720 added 'bu' exceptions, and 9713 extra 2-syllable words for 3rd-tone sandhi blocking). Would you like to replace the original one with it, @Kyubyong?
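One way such an exception dictionary could sit in front of the rules is sketched below. Everything here is an assumption made for illustration: the `EXCEPTIONS` mapping, its two entries, and the `pronounce` function are hypothetical and do not reflect the espeak file format or this library's API.

```python
from typing import Callable, Dict, List

# Hypothetical exception entries: words whose reading must not go through
# the general rules (一 keeps the first tone in numerals and word-finally).
EXCEPTIONS: Dict[str, List[str]] = {
    "十一月": ["shi2", "yi1", "yue4"],
    "统一": ["tong3", "yi1"],
}


def pronounce(
    word: str,
    base: List[str],
    rule: Callable[[str, List[str]], List[str]],
    exceptions: Dict[str, List[str]] = EXCEPTIONS,
) -> List[str]:
    """Return the exception reading if the word is listed, else apply `rule`."""
    if word in exceptions:
        return list(exceptions[word])  # copy so callers cannot mutate the table
    return rule(word, base)


def naive_rule(word: str, base: List[str]) -> List[str]:
    """Stand-in rule for the demo: 一 before a fourth tone becomes yi2."""
    out = list(base)
    for i in range(len(word) - 1):
        if word[i] == "一" and base[i + 1].endswith("4"):
            out[i] = "yi2"
    return out


print(pronounce("十一月", ["shi2", "yi1", "yue4"], naive_rule))
print(pronounce("一次", ["yi1", "ci4"], naive_rule))
```

Checking the dictionary first keeps the rules simple: every lexical convention that breaks them becomes a data entry rather than another special case in code.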
It is hard to get the correct tone every time for some characters.
As for "不":
As for consecutive third tones:
As for "子":
As for "个":
As for "头":
Even the same character in the same word can be pronounced differently depending on the speaker's emotion.
Should be 'yi4 xin1 yi2 yi4'.
See mozillazg/phrase-pinyin-data#20