Release Intel® Neural Compressor v1.11 Release · intel/neural-compressor

Features

Quantization
- Supported QDQ as experimental quantization format for ONNX Runtime
- Improved FX symbolic tracing for PyTorch
- Supported multi-metrics for quantization tuning
Knowledge distillation
- Improved distillation algorithm for intermediate layer knowledge transfer
Productivity
- Improved quantization productivity for ONNX Runtime through GUI
- Improved PyTorch INT8 model save/load methods
Ecosystem
- Upstreamed INC quantized Yolov3, DenseNet, Mask-Rcnn, Yolov4 models to ONNX Model Zoo
- Became PyTorch ecosystem tool shortly after published PyTorch INC tutorial
Examples
- Added INC quantized ResNet50 v1.5 and BERT-Large model for IPEX
- Supported dynamic quantization & weight sharing on bare metal reference engine

Provide feedback