Delimiter handling exception #931

Open
shenghuang147 opened this issue Jun 14, 2024 · 5 comments

@shenghuang147

Summary

Delimiters are not processed correctly when no space follows them, as observed when running the spd-say -w "hello,this,is,a,test" command.

Steps to Reproduce

  • generic.conf => GenericDelimiters ","
  • Run the command spd-say -w "hello,this,is,a,test"

Expected Behavior

The message should be split correctly at each delimiter, regardless of whether there is whitespace around the delimiter. For the input "hello,this,is,a,test", each fragment ("hello", "this", "is", "a", "test") should be returned in order.

Actual Behavior

The function does not segment the message correctly when there are no spaces around the delimiters. It behaves as if the delimiters are not present and returns the entire message as a single segment.

Importance of Fix

In many languages, it is not common to add spaces after punctuation marks, including commas. Addressing this issue is critical to ensure proper functioning across different language conventions and text formatting styles.

Log

epos-generic.log

 Sat Jun 15 02:30:17 2024 [635271]: Added voice zh-CN-XiaoxiaoNeural

Sat Jun 15 02:30:17 2024 [635296]: Added voice zh-CN-XiaoxiaoNeural

Sat Jun 15 02:30:17 2024 [635310]: Added voice zh-CN-XiaoyiNeural

Sat Jun 15 02:30:17 2024 [635351]: Configuration (pre) has been read from "/etc/speech-dispatcher/modules/epos-generic.conf"

Sat Jun 15 02:30:17 2024 [635369]: GenericMaxChunkLength = 300

Sat Jun 15 02:30:17 2024 [635376]: GenericDelimiters = ,?!;

Sat Jun 15 02:30:17 2024 [635383]: GenericExecuteSynth = printf %s '$DATA' | edge-tts --voice $VOICE --file /dev/stdin 2>/dev/null | play -t mp3 -q --ignore-length -

Sat Jun 15 02:30:17 2024 [635390]: GenericCmdDependency = printf

Sat Jun 15 02:30:17 2024 [635397]: GenericPortDependency = 0

Sat Jun 15 02:30:17 2024 [635548]: Generic: creating new thread for generic_speak

Sat Jun 15 02:30:17 2024 [635724]: generic: speaking thread starting.......

Sat Jun 15 02:30:17 2024 [635917]: Opening audio output system
Sat Jun 15 02:30:17 2024 [636259]: Opening audio output system
Sat Jun 15 02:30:17 2024 [721160]: Using pulse audio output method
Sat Jun 15 02:30:18 2024 [178217]: speak()

Sat Jun 15 02:30:18 2024 [178226]: Setting language zh-cn
Sat Jun 15 02:30:18 2024 [178233]: Requested option by key zh-cn not found.

Sat Jun 15 02:30:18 2024 [178241]: Setting voice type 1
Sat Jun 15 02:30:18 2024 [178248]: There are no voices in the table for language=zh

Sat Jun 15 02:30:18 2024 [178255]: Invalid voice type specified or no voice available!
Sat Jun 15 02:30:18 2024 [178261]: Setting voice type 1
Sat Jun 15 02:30:18 2024 [178268]: There are no voices in the table for language=zh

Sat Jun 15 02:30:18 2024 [178274]: Invalid voice type specified or no voice available!
Sat Jun 15 02:30:18 2024 [178285]: Volume: 100
Sat Jun 15 02:30:18 2024 [178292]: HVolume: 100.000000
Sat Jun 15 02:30:18 2024 [178304]: In stripping ssml: |Hello,this is a test,does it work?|
Sat Jun 15 02:30:18 2024 [178316]: Requested data (0): |Hello,this is a test,does it work?|

Sat Jun 15 02:30:18 2024 [178327]: Generic: leaving write() normally

Sat Jun 15 02:30:18 2024 [178333]: Semaphore on

Sat Jun 15 02:30:18 2024 [178536]: Entering parent process, closing pipes
Sat Jun 15 02:30:18 2024 [178578]: Looping...

Sat Jun 15 02:30:18 2024 [178588]: Returned 34 bytes from get_part

Sat Jun 15 02:30:18 2024 [178597]: Sending buf to child:|Hello,this is a test,does it work?| 34

Sat Jun 15 02:30:18 2024 [178606]: going to write 34 bytes
Sat Jun 15 02:30:18 2024 [178618]: written 34 bytes
Sat Jun 15 02:30:18 2024 [178627]: Waiting for response from child...

Sat Jun 15 02:30:18 2024 [178713]: Starting child...

Sat Jun 15 02:30:18 2024 [178740]: UnBlocking user signal
Sat Jun 15 02:30:18 2024 [178752]: Entering child loop

Sat Jun 15 02:30:18 2024 [178762]: read 34 bytes in child
Sat Jun 15 02:30:18 2024 [178770]: text read is: |Hello,this is a test,does it work?|

Sat Jun 15 02:30:18 2024 [178807]: child: escaped text is |Hello,this is a test,does it work?|
Sat Jun 15 02:30:18 2024 [178815]: child: synth command = |set -o pipefail ; printf %s 'Hello,this is a test,does it work?' | edge-tts --voice zh-CN-XiaoxiaoNeural --file /dev/stdin 2>/dev/null | play -t mp3 -q --ignore-length - |
Sat Jun 15 02:30:18 2024 [178822]: Speaking in child...
Sat Jun 15 02:30:18 2024 [178828]: Blocking user signal
Sat Jun 15 02:30:24 2024 [694842]: subchild terminated -: exit?:1 status:0 signal?:0 signal number:0.

Sat Jun 15 02:30:24 2024 [694873]: UnBlocking user signal
Sat Jun 15 02:30:24 2024 [694893]: child->parent: ok, send more data
Sat Jun 15 02:30:24 2024 [694921]: Ok, received report to continue...

Sat Jun 15 02:30:24 2024 [694934]: Looping...

Sat Jun 15 02:30:24 2024 [694944]: Returned -1 bytes from get_part

Sat Jun 15 02:30:24 2024 [694952]: End of data in parent, closing pipes
Sat Jun 15 02:30:24 2024 [694966]: Sat Jun 15 02:30:24 2024 [694969]Waiting for child...:
read 0 bytes in child
Sat Jun 15 02:30:24 2024 [694982]: child: Pipe closed, exiting, closing pipes..

Sat Jun 15 02:30:24 2024 [695008]: Child ended...

Sat Jun 15 02:30:24 2024 [696141]: child terminated -: exit?:1 status:0 signal?:0 signal number:0.

Sat Jun 15 02:30:29 2024 [872848]: generic: stop()

Sat Jun 15 02:30:29 2024 [872863]: generic: close()

@sthibaul
Collaborator

This seems to be coming from 22b3cdb

@sthibaul
Collaborator

Thanks for the precise report. AIUI we do want to keep the space requirement for the . case, otherwise we'd spuriously split sentences in e.g. numbers.

I'd say we want to add to module_get_message_part a dividers_nospace parameter whose processing does not require a subsequent space. And then the corresponding configuration options in the few modules that are using it, and a useful default value.
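To make the proposal concrete, here is a rough sketch of the splitting rule it describes. This is not the real module_get_message_part: the function name get_part_sketch, its signature, and the choice to keep the delimiter at the end of each chunk are all assumptions made for illustration. Only the idea of a separate dividers_nospace set comes from the comment above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch, NOT the real module_get_message_part().
 * Copies the next chunk starting at *pos into out (out must hold
 * maxlen + 1 bytes) and advances *pos past it.
 *  - a byte from `dividers` ends the chunk only when it is followed
 *    by whitespace or the end of the string (so "3.14" stays whole);
 *  - a byte from `dividers_nospace` ends the chunk unconditionally.
 * Returns the chunk length in bytes, or -1 when no input is left.
 * ASCII only, byte by byte.
 */
static int get_part_sketch(const char **pos, char *out, size_t maxlen,
                           const char *dividers,
                           const char *dividers_nospace)
{
    assert(pos != NULL && *pos != NULL && out != NULL);

    const char *p = *pos;
    size_t n = 0;

    if (*p == '\0')
        return -1;

    while (*p != '\0' && n < maxlen) {
        char c = *p++;
        out[n++] = c;   /* assumption: the delimiter stays in the chunk */

        if (strchr(dividers_nospace, c) != NULL)
            break;      /* no trailing space required */
        if (strchr(dividers, c) != NULL &&
            (*p == '\0' || isspace((unsigned char)*p)))
            break;      /* trailing space (or end of text) required */
    }

    out[n] = '\0';
    *pos = p;
    return (int)n;
}
```

Called repeatedly on "hello,this,is,a,test" with dividers "." and dividers_nospace ",?!;", this yields "hello,", "this,", "is,", "a," and "test", while "pi is 3.14. ok" still splits only after the second dot.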

Note that module_get_message_part currently only processes ASCII, not UTF-8. That's a separate concern that should also be easy to fix thanks to g_utf8_get_char and g_utf8_next_char.
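For illustration, the sketch below shows what character-wise (rather than byte-wise) delimiter matching could look like. To stay dependency-free it hand-rolls the UTF-8 stepping; in the real code the GLib helpers mentioned above would presumably take that role. The names utf8_char_len, is_delim and first_chunk_len are hypothetical, and input is assumed to be valid UTF-8.

```c
#include <stddef.h>
#include <string.h>

/* Byte length of the UTF-8 character starting at p (p must not point
 * at the terminating NUL). Stops early on truncated sequences. */
static size_t utf8_char_len(const char *p)
{
    unsigned char c = (unsigned char)*p;
    size_t want = c >= 0xF0 ? 4 : c >= 0xE0 ? 3 : c >= 0xC0 ? 2 : 1;
    size_t n = 1;

    while (n < want && ((unsigned char)p[n] & 0xC0) == 0x80)
        n++;
    return n;
}

/* Returns 1 if the len-byte character at p occurs in delims,
 * where delims is itself a UTF-8 string of delimiter characters. */
static int is_delim(const char *p, size_t len, const char *delims)
{
    const char *d = delims;

    while (*d != '\0') {
        size_t dlen = utf8_char_len(d);
        if (dlen == len && memcmp(d, p, len) == 0)
            return 1;
        d += dlen;
    }
    return 0;
}

/* Byte length of the first chunk of text: up to and including the
 * first delimiter character, or the whole string if there is none. */
static size_t first_chunk_len(const char *text, const char *delims)
{
    const char *p = text;

    while (*p != '\0') {
        size_t len = utf8_char_len(p);
        p += len;
        if (is_delim(p - len, len, delims))
            break;
    }
    return (size_t)(p - text);
}
```

Comparing whole characters matters with multi-byte delimiters: the fullwidth comma "，" is encoded as EF BC 8C, and "界" (E7 95 8C) ends in the same byte, so a per-byte strchr lookup against the delimiter string would report a false match inside "界".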

@shenghuang147
Author

shenghuang147 commented Jun 16, 2024

Thank you for your work. I'm not sure whether I should raise this in this issue, but GenericMaxChunkLength seems to measure length in bytes rather than characters, and I think characters would be more appropriate.
I also found that with GenericMaxChunkLength enabled, the last character of the text is sometimes lost when reading non-ASCII text aloud.

I'll try to trigger this later and submit the logs
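To illustrate the bytes-versus-characters difference, here is a small sketch that counts UTF-8 characters by skipping continuation bytes. The helper name utf8_char_count is hypothetical and is not part of speech-dispatcher. It assumes valid UTF-8 input.

```c
#include <stddef.h>
#include <string.h>

/* Number of UTF-8 characters in s: every byte that is not a
 * continuation byte (10xxxxxx) starts a new character. */
static size_t utf8_char_count(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}
```

For "你好，世界", strlen reports 15 bytes while utf8_char_count reports 5 characters, so a byte-based limit of 300 corresponds to roughly 100 characters of Chinese text.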

@sthibaul
Collaborator

> GenericMaxChunkLength seems to judge the length in bytes rather than characters, and I think it would be more appropriate to use characters

I don't think it's worth changing it: it's a very rough guess anyway.

> in some cases the last character of the text that should be read is lost

Which version did you test with? Note that I fixed #806 recently

@shenghuang147
Author

> Which version did you test with? Note that I fixed #806 recently

I am very sorry. I have confirmed that this issue is not related to speechd; the problem comes from Okular.
