This repository has been archived by the owner on Jul 5, 2018. It is now read-only.
Currently, the library we are using is CMU Sphinx: http://cmusphinx.sourceforge.net/
The problem is that unlimited-vocabulary recognition is extremely difficult to do locally. Even Google sends speech to its servers for more accurate speech recognition. http://9to5google.com/2016/03/11/google-accurate-offline-voice-recognition/
The good news is that highly accurate local speech recognition is probably possible. The bad news is that it will take significant processing power and memory. If we choose to build our own speech recognition system, we may have to stop using CMU Sphinx or modify its code.
Well, some of us (like me) are running on pretty decent hardware and can support the amount of processing required for accurate speech recognition.
Also, I think Google provides an API for speech recognition, so that may be an option for others.
That said, how about changing the webcam select window into a "Machine boot properties" window and adding another drop-down, "Speech recognition mode", with three options: "Local (fast)", "Local (advanced)" and "Online (send speech to Google)"? Or something similar.
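To make the proposed drop-down concrete, here is a minimal sketch of how the three modes could be modeled. The names `SpeechMode`, `dropdown_labels`, `mode_from_label` and `sends_audio_off_device` are hypothetical and not part of the project; only the three labels come from the suggestion above.

```python
from enum import Enum


class SpeechMode(Enum):
    """Hypothetical enum mirroring the three proposed drop-down options."""
    LOCAL_FAST = "Local (fast)"
    LOCAL_ADVANCED = "Local (advanced)"
    ONLINE = "Online (send speech to Google)"


def dropdown_labels():
    """Return the labels in the order they would appear in the drop-down."""
    return [mode.value for mode in SpeechMode]


def mode_from_label(label):
    """Map a selected drop-down label back to its mode (raises ValueError if unknown)."""
    return SpeechMode(label)


def sends_audio_off_device(mode):
    """True only for the mode that would upload speech to a remote service."""
    return mode is SpeechMode.ONLINE
```

Keeping the label text on the enum means the window and the recognizer selection share one source of truth, and the UI can warn the user before enabling the only mode that sends audio off the machine.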
The problem with a service like Google is that you can only make a limited number of requests. Also, their desktop Speech API is currently in preview and may cost money for Google Cloud instances. Additionally, according to this link: http://stackoverflow.com/questions/12721436/google-speech-api, it may not be wise to just latch onto a Google service. If this is the only other option, I am willing to look into it, but it is far from ideal. At least with CMU Sphinx we have unlimited requests and no sudden API changes.
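One way to live with a request limit would be to treat the online service as optional and fall back to the local recognizer whenever the quota is used up or the call fails. The sketch below is only an illustration under that assumption: `FallbackRecognizer` is a hypothetical class, and the `online` and `local` arguments are placeholder callables, not real Google or CMU Sphinx APIs.

```python
class FallbackRecognizer:
    """Hypothetical wrapper: use the online recognizer until a request budget
    is spent, then (or on any online failure) use the local recognizer."""

    def __init__(self, online, local, max_requests):
        self.online = online              # placeholder callable: audio -> text
        self.local = local                # placeholder callable: audio -> text
        self.max_requests = max_requests  # assumed per-session request budget
        self.used = 0

    def recognize(self, audio):
        if self.used < self.max_requests:
            self.used += 1  # count the attempt even if the call fails
            try:
                return self.online(audio)
            except Exception:
                pass  # network error, API change, etc.: fall through to local
        return self.local(audio)
```

With this shape the local path always works, so a quota change or API change on the remote side degrades accuracy rather than breaking recognition outright.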
Yeah, I expect this issue to be open for a while. I don't see any near-term solution for this problem. Additionally, to train a system like this, you need thousands of hours of audio with various noise levels, modulations, and accents. Open training sets like these are hard to come by.