Intel® Neural Compressor v2.3.2 Release
- Features
- Bug Fixes
- Validated Configurations
Features
- Reduce memory consumption in ONNXRT adaptor (f64833)
- Support MatMulFpQ4 for onnxruntime 1.16 (1beb43)
- Support MatMulNBits for onnxruntime 1.17 (67a31b)
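The two operator entries above imply a version-dependent choice of weight-only quantization operator in the ONNXRT adaptor. The following is a minimal sketch of that idea, based only on the wording of these notes: the function names `parse_major_minor` and `select_woq_matmul_op` are hypothetical and not part of the Neural Compressor API, and the exact version thresholds used inside the library may differ.

```python
from typing import Optional, Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '1.17.0'.

    Assumes plain numeric major and minor components.
    """
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def select_woq_matmul_op(ort_version: str) -> Optional[str]:
    """Hypothetical helper: pick an operator name from the onnxruntime version.

    Assumption taken from the release notes: MatMulNBits goes with
    onnxruntime 1.17 (treated here as 1.17 and later), MatMulFpQ4 with 1.16.
    """
    major_minor = parse_major_minor(ort_version)
    if major_minor >= (1, 17):
        return "MatMulNBits"
    if major_minor == (1, 16):
        return "MatMulFpQ4"
    # Older runtimes: neither operator is assumed to be available.
    return None


if __name__ == "__main__":
    for v in ("1.15.1", "1.16.3", "1.17.0"):
        print(v, "->", select_woq_matmul_op(v))
```

In practice the version string would come from `onnxruntime.__version__`; it is passed in as an argument here so the sketch runs without onnxruntime installed.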
Bug Fixes
- Update ITREX version in ONNXRT WOQ example and fix bugs in Hugging Face models (0ca51a)
- Update ONNXRT WOQ example to use llama-2-7b (7f2063)
- Fix ONNXRT WOQ failure when model_path is None (cbd0a4)
Validated Configurations
- CentOS 8.4 & Ubuntu 22.04
- Python 3.10
- TensorFlow 2.13
- ITEX 2.13
- PyTorch/IPEX 2.0.1+cpu
- ONNX Runtime 1.15.1
- MXNet 1.9.1