Intel® Neural Compressor v1.11 Release
Features
- Quantization
  - Supported QDQ as an experimental quantization format for ONNX Runtime
  - Improved FX symbolic tracing for PyTorch
  - Supported multi-metrics for quantization tuning
- Knowledge distillation
  - Improved distillation algorithm for intermediate layer knowledge transfer
- Productivity
  - Improved quantization productivity for ONNX Runtime through GUI
  - Improved PyTorch INT8 model save/load methods
- Ecosystem
  - Upstreamed INC quantized YOLOv3, DenseNet, Mask R-CNN, and YOLOv4 models to the ONNX Model Zoo
  - Became a PyTorch ecosystem tool shortly after publishing the PyTorch INC tutorial
- Examples
  - Added INC quantized ResNet50 v1.5 and BERT-Large models for IPEX
  - Supported dynamic quantization and weight sharing on bare metal reference engine
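
For readers unfamiliar with the QDQ format listed under Quantization: a QDQ model keeps tensors in floating point but wraps them in quantize/dequantize node pairs (`QuantizeLinear` followed by `DequantizeLinear` in ONNX), so the graph itself records where the quantization error enters. The sketch below reproduces only the arithmetic of such a pair in plain Python. It is not INC or ONNX Runtime code; the function names, the int8 range, and the scale values are assumptions chosen for illustration.

```python
def quantize_linear(values, scale, zero_point=0, qmin=-128, qmax=127):
    """Map floats to integers: saturate(round(x / scale) + zero_point)."""
    quantized = []
    for x in values:
        # Python's round() rounds halves to the nearest even integer,
        # which is the rounding mode the ONNX QuantizeLinear operator specifies.
        q = round(x / scale) + zero_point
        quantized.append(max(qmin, min(qmax, q)))  # saturate to the int8 range
    return quantized


def dequantize_linear(quantized, scale, zero_point=0):
    """Map integers back to floats: (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in quantized]


def qdq(values, scale, zero_point=0):
    """Apply a quantize/dequantize pair, as a QDQ node pair would."""
    q = quantize_linear(values, scale, zero_point)
    return dequantize_linear(q, scale, zero_point)


if __name__ == "__main__":
    # Hypothetical activations and scale, for illustration only.
    print(qdq([0.5, -1.0, 100.0], scale=0.5))
```

With a scale of 0.5, the values 0.5 and -1.0 survive the round trip unchanged, while 100.0 saturates at the int8 maximum of 127 and comes back as 63.5. That clipping is the kind of error a tuning loop has to weigh against accuracy when it picks scales.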