Talon microphone check GUI #690
Comments
Interesting idea, tho if I've understood correctly, I'd think you'd back-anchor rather than front-anchor your phrase, eg
Maybe it should even be unanchored; I've not entirely thought it through. My thinking was that the VAD produces a list of segments, and the ^ anchor matches the start of such a segment. So the pipeline might be "this is ... a microphone check" -> VAD -> ["this is", "a microphone check"]. That in turn would result in user.microphone_check_register_breakup() and then user.microphone_check_register_misrecognition(). I'm not sure the logic I wrote in the OP was correct, but I think we'd want it to behave like this (where an ellipsis is a period of not speaking which causes a VAD segmentation):
Perhaps I want to have a 'second half' for each of my partial phrase matchers, so ("this", "is a microphone check"), ("this is", "a microphone check") etc. I'd say there'd be a bit of fiddling during implementation. At the moment I'm more interested in whether the idea seems plausible and worth doing.
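One possible reading of the 'second half' idea, as a minimal sketch in plain Python rather than Talon's capture syntax. The names `split_pairs` and `classify` and the three outcome labels are hypothetical, and the sketch only recognises a breakup into exactly two VAD segments.

```python
from __future__ import annotations

PHRASE = "this is a microphone check"


def split_pairs(phrase: str = PHRASE) -> list[tuple[str, str]]:
    """Every (first half, second half) split of the phrase on a word boundary."""
    words = phrase.split()
    return [
        (" ".join(words[:i]), " ".join(words[i:]))
        for i in range(1, len(words))
    ]


def classify(segments: list[str], phrase: str = PHRASE) -> str:
    """Classify a list of VAD segments (hypothetical outcome labels)."""
    # Normalise case and whitespace so matching is on words only.
    cleaned = [" ".join(s.lower().split()) for s in segments]
    if cleaned == [phrase]:
        return "success"
    # Two segments that together form the phrase: the VAD split one utterance.
    if len(cleaned) == 2 and tuple(cleaned) in split_pairs(phrase):
        return "breakup"
    return "misrecognition"


if __name__ == "__main__":
    print(split_pairs())
    print(classify(["this is", "a microphone check"]))
```

Pairing the halves means the second segment of a broken-up utterance is not counted separately as a misrecognition, which is the double-counting the front-anchored version would produce.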
This will be extremely helpful. I work out of the office on certain days and the environment is definitely noisier. I would likely make use of this sanity check at the beginning of each in-office workday. Sometimes I just find myself second-guessing why Talon does not understand me instead of actually recording myself and listening back.
See also TalonCommunity/Wiki#147
Possibly out of scope for this issue, but it would also potentially be useful to have a visual indication if your microphone volume is too low. This might be a better fit for the Talon HUD, cc/ @chaosparrot. We'd probably also need some support from Talon to figure out whether the microphone level is too low, cc/ @lunixbochs
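As a rough illustration of what a 'volume too low' check could compute, here is a sketch that measures the RMS level of a block of samples in dBFS. It assumes samples are floats in the range [-1.0, 1.0]; the function names and the -40 dBFS threshold are hypothetical placeholders, not values taken from Talon, and how to obtain the samples from Talon is left open.

```python
import math
from typing import Sequence

# Hypothetical threshold: levels below this are flagged as "too low".
LOW_LEVEL_DBFS = -40.0


def rms_dbfs(samples: Sequence[float]) -> float:
    """RMS level of float samples in [-1.0, 1.0], in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # pure silence
    return 10.0 * math.log10(mean_square)  # same as 20 * log10(rms)


def is_too_quiet(samples: Sequence[float], threshold: float = LOW_LEVEL_DBFS) -> bool:
    """True when the block's RMS level falls below the threshold."""
    return rms_dbfs(samples) < threshold
```

A real check would probably measure only the segments the VAD marks as speech, since silence between utterances would otherwise drag the level down.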
Users (especially new users) often have issues with recognition accuracy caused by the configuration of their microphone. For example:
I suggest we build a UI into Talon to help users self-diagnose these issues. It would probably make sense to build this into Talon core in the long term, but in the meantime we should be able to get a similar effect with the 'userspace' APIs. My proposal is as follows.
The mode would use a .talon file like this:
It would also have a .py file registered for the 'pre:phrase' callback so we can get the 'audio_ms' statistic (the length of the audio segmented by the VAD, I think). I think we can also extract the raw audio from this (for playback to the user).
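A minimal sketch of the collecting side of that .py file, written so it runs without Talon. It assumes the callback receives a dict-like phrase payload and that 'audio_ms' sits either at the top level or under a '_metadata' key; both locations are assumptions, and the `UtteranceLog` class is hypothetical. Inside Talon the handler would presumably be hooked up with something like `speech_system.register("pre:phrase", log.on_pre_phrase)`, which is not exercised here.

```python
import time
from typing import Callable, List, Optional

WINDOW_SECONDS = 10.0  # length of the check window from the proposal


class UtteranceLog:
    """Collects audio_ms values for phrases seen during the check window."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._started: Optional[float] = None
        self.lengths_ms: List[float] = []

    def start(self) -> None:
        """Begin a new check window and discard earlier results."""
        self._started = self._clock()
        self.lengths_ms = []

    def on_pre_phrase(self, phrase: dict) -> None:
        """Handler intended for the 'pre:phrase' callback."""
        if self._started is None:
            return  # no check in progress
        if self._clock() - self._started > WINDOW_SECONDS:
            return  # window has elapsed
        # Assumption: audio_ms is under "_metadata" or at the top level.
        meta = phrase.get("_metadata") or {}
        audio_ms = meta.get("audio_ms", phrase.get("audio_ms"))
        if audio_ms is not None:
            self.lengths_ms.append(float(audio_ms))
```

Injecting the clock keeps the window logic testable outside Talon; extracting the raw audio for playback is not covered by this sketch.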
The results would be calculated as follows:
@lunixbochs Does this sound like a worthwhile idea? Also, regarding APIs, is audio_ms the right statistic to use for utterance length, and is there a way of getting all audio recognised by Talon within the 10 second window for playback?