Intel® Neural Compressor v2.6 Release
- Highlights
- Features
- Improvements
- Examples
- Bug Fixes
- External Contributions
- Validated Configurations
Highlights
- Integrated a recent AutoRound release with lm-head quantization support and calibration-process optimizations
- Migrated ONNX model quantization capability into the ONNX project Neural Compressor
Features
- [Quantization] Integrate a recent AutoRound release with lm-head quantization support and calibration-process optimizations (4728fd)
- [Quantization] Support the true-sequential option in GPTQ (92c942)
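The true-sequential option changes which activations each layer is calibrated on: layers in a block are quantized in order, and each later layer is calibrated on the outputs of the already-quantized earlier layers rather than on the original full-precision activations, so accumulated quantization error is taken into account. A minimal NumPy sketch of that data flow (the round-to-nearest step is a hypothetical stand-in for the actual GPTQ solve; function names are illustrative, not Neural Compressor's API):

```python
import numpy as np

def rtn_quantize(w: np.ndarray, n_bits: int = 4) -> np.ndarray:
    """Toy symmetric round-to-nearest quantizer standing in for the GPTQ solve."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.round(w / scale).clip(-qmax - 1, qmax) * scale

def quantize_block(weights, x, true_sequential=True):
    """Quantize a chain of linear layers (y = x @ W) one after another.

    With true_sequential=True, each layer's calibration input is the
    activation produced by the already-quantized preceding layers; with
    False, every layer sees the original full-precision activations.
    Returns the quantized weights and, per layer, the reconstruction
    error ||X W - X Q|| that GPTQ minimizes on its calibration inputs.
    """
    quantized, errors = [], []
    act = x       # path through quantized layers
    fp_act = x    # full-precision reference path
    for w in weights:
        calib_in = act if true_sequential else fp_act
        qw = rtn_quantize(w)  # a real GPTQ solve would use calib_in here
        errors.append(float(np.linalg.norm(calib_in @ w - calib_in @ qw)))
        quantized.append(qw)
        act = act @ qw
        fp_act = fp_act @ w
    return quantized, errors
```

The option trades a little extra calibration compute for error estimates that match how activations actually flow through the quantized model.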
Improvements
- [Quantization] Improve WOQ Linear pack/unpack speed with a NumPy implementation (daa143)
- [Quantization] Auto-detect the available device when exporting (7be355)
- [Quantization] Refine AutoRound export to support Intel GPU (409231)
- [Benchmarking] Detect the number of sockets when needed (e54b93)
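The WOQ pack/unpack speedup comes from replacing per-element Python loops with vectorized NumPy bitwise operations. A hedged sketch of the idea, packing two signed int4 weights into each uint8 byte (the function names and byte layout here are illustrative, not Neural Compressor's actual implementation):

```python
import numpy as np

def pack_int4(weight: np.ndarray) -> np.ndarray:
    """Pack signed int4 values (range [-8, 7]) pairwise into uint8.

    Each output byte holds two 4-bit two's-complement values: the
    even-indexed element in the low nibble, the odd-indexed element
    in the high nibble. Fully vectorized, no Python-level loop.
    """
    assert weight.size % 2 == 0
    nibbles = (weight.astype(np.int8) & 0x0F).astype(np.uint8)
    pairs = nibbles.reshape(-1, 2)
    return (pairs[:, 0] | (pairs[:, 1] << 4)).astype(np.uint8)

def unpack_int4(packed: np.ndarray, shape) -> np.ndarray:
    """Inverse of pack_int4: recover signed int4 values as int8."""
    low = (packed & 0x0F).astype(np.int8)
    high = ((packed >> 4) & 0x0F).astype(np.int8)
    # sign-extend 4-bit two's complement: nibbles >= 8 are negative
    low = np.where(low >= 8, low - 16, low).astype(np.int8)
    high = np.where(high >= 8, high - 16, high).astype(np.int8)
    out = np.empty(packed.size * 2, dtype=np.int8)
    out[0::2] = low
    out[1::2] = high
    return out.reshape(shape)
```

Because every step is a whole-array bitwise op, the cost is a handful of vectorized passes over the buffer instead of one Python iteration per weight.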
Examples
- Upgrade lm_eval to 0.4.2 in PT and ORT LLM examples (fdb509) (54f039)
- Add diffusers/dreambooth example with IPEX (ba4798)
Bug Fixes
- Fix incorrect dtype of unpacked tensor in PT (29fdec)
- Fix TF LLM SQ legacy Keras environment variable issue (276449)
- Fix TF Estimator issue by adding a version check for TF 2.16 (855b98)
- Fix missing tokenizer issue in run_clm_no_trainer.py after upgrading to lm-eval 0.4.2 (d64029)
- Fix AWQ padding issue in ORT (903da4)
- Fix recover function issue in ORT (ee24db)
- Update model ckpt download URL in prepare_model.py (0ba573)
- Fix case where pad_max_length is set to None (960bd2)
- Fix a failure in the GPU backend (71a9f3)
- Fix NumPy versions for rnnt and 3d-unet examples (12b8f4)
- Fix CVEs (5b5579) (25c71a) (47d73b) (41da74)
External Contributions
- Update model ckpt download URL in prepare_model.py (0ba573)
- Fix case where pad_max_length is set to None (960bd2)
- Add diffusers/dreambooth example with IPEX (ba4798)
Validated Configurations
- CentOS 8.4 & Ubuntu 22.04 & Windows 11 & macOS Ventura 13.5
- Python 3.8, 3.9, 3.10, 3.11
- PyTorch/IPEX 2.1, 2.2, 2.3
- TensorFlow 2.14, 2.15, 2.16
- ITEX 2.13.0, 2.14.0, 2.15.0
- ONNX Runtime 1.16, 1.17, 1.18