This project runs the visual wake words (VWW) task on the GAP9 chip. The neural networks used for this task are taken from open source projects:
- MobileNet: the model used in the TinyML benchmark, for best speed
- MIT Proxyless NAS: for best accuracy
The model can be chosen via `Kconfig`. NNTool takes the corresponding model path and generates the Autotiler model code using the `nntool_generate_model.py` script. Once the Autotiler model is generated, it is compiled and run to generate the final NN code for GAP.
The same script can be used to test the deployable model by executing inference in NNTool on the provided image.
The application has two operating modes, which can be chosen via `Kconfig` and are described in the following:
In the first mode, the application simply runs the NN on samples read from files. This mode can emulate the DEMO mode by enabling `INFERENCE_RESIZER` in `Kconfig`: the input image is then resized from the CAMERA size, emulating the images coming from a similar camera. Otherwise no resize is applied, and the images read from files are expected to already have the correct size.
You can check the expected results using the `test_nntool_inference` target: it takes the same image used by the C code on GAP and runs the NNTool inference using `nntool_generate_model.py --mode=test`.
The second mode is the DEMO mode: the application runs on GAP9, taking its inputs from the camera and running inference with the selected NN.
NOTE: this mode is only usable on a board equipped with the OV5647 camera.
To test the accuracy of the models, you can use the scripts in the `accuracy` folder:
- Download the COCO dataset using `download_coco.sh`.
- Create the VWW annotations from the COCO dataset with the visualwakewords package using `create_vww_dataset.sh` (it automatically clones the repo and creates the annotations).
- Run the accuracy script `test_accuracy.py`: by default it runs the original tflite model; if you provide the `--test_nntool` flag, it runs the model in NNTool.
NOTE: these accuracy figures were computed with the scripts above. The image preprocessing is just a bilinear resize of the original COCO image, without cropping. Do not compare these numbers with publicly available accuracy metrics, since those may preprocess the images differently.
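
To make the "bilinear resize without cropping" preprocessing concrete, here is a minimal pure-Python sketch. It is not the code used by `test_accuracy.py`: the function name `bilinear_resize`, the single-channel list-of-rows image layout, and the half-pixel-centre coordinate mapping are all assumptions made for illustration. Image libraries differ in their coordinate conventions, so their results may not match this sketch bit for bit.

```python
def bilinear_resize(image, out_h, out_w):
    """Resize a 2-D single-channel image (list of rows) to out_h x out_w.

    The whole source image is stretched onto the target grid, so nothing
    is cropped and the aspect ratio is not preserved.
    """
    in_h, in_w = len(image), len(image[0])
    scale_y, scale_x = in_h / out_h, in_w / out_w
    out = []
    for oy in range(out_h):
        # Map the output pixel centre to a source coordinate
        # (half-pixel-centre convention, an assumption of this sketch).
        sy = min(max((oy + 0.5) * scale_y - 0.5, 0.0), in_h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, in_h - 1)
        wy = sy - y0
        row = []
        for ox in range(out_w):
            sx = min(max((ox + 0.5) * scale_x - 0.5, 0.0), in_w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, in_w - 1)
            wx = sx - x0
            # Interpolate horizontally on the two rows, then vertically.
            top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
            bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


# Shrinking a 2x2 image to a single pixel averages its four values.
print(bilinear_resize([[0, 10], [20, 30]], 1, 1))  # [[15.0]]
```

Because the full image is always mapped onto the network input, objects are distorted when the source aspect ratio differs from the target one, which is one reason results can differ from pipelines that crop first.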
Model | TFLite Acc | NNTool Acc |
---|---|---|
visual_wake_quant.tflite | 89.45% (0: 97.13%, 1: 82.75%) | 88.57% (0: 97.71%, 1: 80.59%) |
vww_96_int8.tflite | 79.56% (0: 94.04%, 1: 66.91%) | 79.41% (0: 94.01%, 1: 66.71%) |
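
The table entries can be read as an overall accuracy followed by one figure per label. The sketch below shows how such a report could be computed, assuming the values in parentheses are per-class accuracies for labels 0 and 1. The helper names `accuracy_report` and `format_report` are hypothetical and are not taken from `test_accuracy.py`.

```python
from collections import Counter


def accuracy_report(labels, predictions):
    """Return (overall, per_class) accuracy as fractions in [0, 1]."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    # Number of samples per ground-truth class.
    total = Counter(labels)
    # Number of correctly classified samples per ground-truth class.
    correct = Counter(l for l, p in zip(labels, predictions) if l == p)
    overall = sum(correct.values()) / len(labels)
    per_class = {c: correct[c] / n for c, n in total.items()}
    return overall, per_class


def format_report(labels, predictions):
    """Format the report in the same style as the table cells."""
    overall, per_class = accuracy_report(labels, predictions)
    details = ", ".join(f"{c}: {per_class[c]:.2%}" for c in sorted(per_class))
    return f"{overall:.2%} ({details})"


# Toy example with five samples (made-up data, not dataset results).
print(format_report([0, 0, 0, 1, 1], [0, 0, 1, 1, 0]))
# 60.00% (0: 66.67%, 1: 50.00%)
```

Under this reading, the overall figure is the per-class figures weighted by how many samples each class has, so it always lies between the two per-class values.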