Audio file
demo.mov
Recognized speech text
He hoped there would be stew for dinner, turnips and carrots and bruised potatoes and fat mutton pieces to be ladled out in thick, peppered, flour-fattened sauce.
This model requires additional modules.
pip3 install librosa
pip3 install pyaudio # for microphone input mode
If you use the --disable_ailia_tokenizer option, this model requires an additional module.
pip3 install transformers
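Before the first run it can help to check which of these modules are already installed. The following is a minimal sketch and is not part of whisper.py; the helper name `missing_modules` and the grouping into required and optional modules are assumptions made for illustration, based on the pip commands above.

```python
import importlib.util

# Module names taken from the pip commands above. Which optional ones you
# need depends on the options you plan to use (assumed grouping).
REQUIRED = ["librosa"]
OPTIONAL = {
    "pyaudio": "microphone input mode (-V)",
    "transformers": "--disable_ailia_tokenizer",
}


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for name in missing_modules(REQUIRED):
        print(f"missing required module: {name} (pip3 install {name})")
    for name in missing_modules(OPTIONAL):
        print(f"missing optional module: {name}, needed for {OPTIONAL[name]}")
```

The check only looks up whether each module can be located; it does not import it, so it stays fast even for large packages.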
The ONNX and prototxt files are downloaded automatically on the first run. An Internet connection is required while they download.
To run recognition on the sample wav file:
$ python3 whisper.py
To specify an audio file, put the file path after the --input option.
$ python3 whisper.py --input AUDIO_FILE
Use the --model_type option to select the model type, one of "tiny", "base", "small", or "medium" (the default is "base").
$ python3 whisper.py --model_type small
Add the --task translate option to translate the recognized speech into English.
$ python3 whisper.py --task translate
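When the script is driven from other Python code, the options above can be assembled programmatically. The following is a minimal sketch, assuming the --input, --model_type, and --task options can be combined in one call; `build_command` is a hypothetical helper, not something whisper.py provides.

```python
import shlex

# Model types listed in this README.
MODEL_TYPES = ("tiny", "base", "small", "medium")


def build_command(audio_path=None, model_type="base", translate=False):
    """Build the argument list for whisper.py (hypothetical helper)."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"model_type must be one of {MODEL_TYPES}")
    cmd = ["python3", "whisper.py", "--model_type", model_type]
    if audio_path is not None:
        # Without --input the script runs on its sample wav.
        cmd += ["--input", audio_path]
    if translate:
        cmd += ["--task", "translate"]
    return cmd


if __name__ == "__main__":
    # Print the command line; pass the list to subprocess.run to execute it.
    print(shlex.join(build_command("AUDIO_FILE", model_type="small", translate=True)))
```

Returning a list rather than a single string avoids quoting problems when the audio path contains spaces.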
If you specify the -V option, the script runs in microphone input mode.
$ python3 whisper.py -V
- speak into the microphone when "Please speak something." is displayed
- recording ends after about 0.5 seconds of silence, and speech recognition runs
- after the recognition results are displayed, the script returns to the first step
- type Ctrl+C if you want to exit
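The "stop after about 0.5 seconds of silence" behavior can be illustrated with a small, self-contained sketch. This is not the implementation used by whisper.py: the function names, the RMS-based silence test, and the `threshold` value are all assumptions, and the audio arrives as plain lists of float samples instead of a real microphone stream.

```python
import math


def rms(samples):
    """Root-mean-square level of one chunk of float samples in [-1, 1]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def record_until_silence(chunks, chunk_seconds, threshold=0.01, silence_seconds=0.5):
    """Collect chunks until about `silence_seconds` of quiet follows speech.

    `chunks` is any iterable of sample lists, each `chunk_seconds` long.
    `threshold` is a hypothetical RMS level below which a chunk is silent.
    """
    # Number of consecutive silent chunks that ends the recording.
    needed = max(1, math.ceil(silence_seconds / chunk_seconds))
    recorded = []
    quiet_chunks = 0
    heard_speech = False
    for chunk in chunks:
        if rms(chunk) >= threshold:
            heard_speech = True
            quiet_chunks = 0
        elif heard_speech:
            quiet_chunks += 1
        if heard_speech:
            recorded.extend(chunk)  # leading silence is dropped
        if heard_speech and quiet_chunks >= needed:
            break
    return recorded
```

In a real microphone loop the chunks would come from an audio stream, and the returned samples would be handed to the recognizer before waiting for the next utterance.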
PyTorch
ONNX opset=11
encoder_tiny.onnx.prototxt
encoder_base.onnx.prototxt
encoder_small.onnx.prototxt
encoder_medium.onnx.prototxt
decoder_tiny_fix_kv_cache.onnx.prototxt
decoder_base_fix_kv_cache.onnx.prototxt
decoder_small_fix_kv_cache.onnx.prototxt
decoder_medium_fix_kv_cache.onnx.prototxt
encoder_tiny.opt.onnx.prototxt
encoder_base.opt.onnx.prototxt
encoder_small.opt.onnx.prototxt
encoder_medium.opt.onnx.prototxt
decoder_tiny_fix_kv_cache.opt.onnx.prototxt
decoder_base_fix_kv_cache.opt.onnx.prototxt
decoder_small_fix_kv_cache.opt.onnx.prototxt
decoder_tiny.onnx.prototxt
decoder_base.onnx.prototxt
decoder_small.onnx.prototxt
decoder_medium.onnx.prototxt