Intel® Neural Compressor v2.6 Release

@chensuyue chensuyue released this 14 Jun 13:55
2928d85
  • Highlights
  • Features
  • Improvements
  • Examples
  • Bug Fixes
  • External Contributions
  • Validated Configurations

Highlights

  • Integrated recent AutoRound with lm-head quantization support and calibration process optimizations
  • Migrated ONNX model quantization capability into the ONNX project's Neural Compressor

Features

  • [Quantization] Integrate recent AutoRound with lm-head quantization support and calibration process optimizations (4728fd)
  • [Quantization] Support true sequential options in GPTQ (92c942)

Improvements

  • [Quantization] Improve WOQ Linear pack/unpack speed with numpy implementation (daa143)
  • [Quantization] Auto detect available device when exporting (7be355)
  • [Quantization] Refine AutoRound export to support Intel GPU (409231)
  • [Benchmarking] Detect the number of sockets when needed (e54b93)
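
To illustrate what packing and unpacking weight-only-quantized (WOQ) values with numpy can look like, here is a minimal sketch. It is not the library's implementation: the function names `pack_int4` and `unpack_int4`, the 4-bit width, and the eight-values-per-int32 layout are all assumptions chosen for the example. It only shows how vectorized shifts can replace a per-element loop.

```python
import numpy as np

# Hypothetical helpers for illustration only; not Neural Compressor APIs.
_SHIFTS = np.arange(8, dtype=np.uint32) * 4  # bit offset of each 4-bit slot


def pack_int4(values) -> np.ndarray:
    """Pack unsigned 4-bit integers (0..15) into int32 words, eight per word."""
    values = np.asarray(values).ravel()
    if values.size == 0 or values.size % 8 != 0:
        raise ValueError("expected a non-empty input whose length is a multiple of 8")
    if values.min() < 0 or values.max() > 15:
        raise ValueError("values must fit in 4 bits (0..15)")
    nibbles = values.astype(np.uint32).reshape(-1, 8)
    # Shift every value into its slot, then OR the eight slots of each row together.
    packed = np.bitwise_or.reduce(nibbles << _SHIFTS, axis=1)
    return packed.view(np.int32)


def unpack_int4(packed, dtype=np.uint8) -> np.ndarray:
    """Recover the 4-bit values from int32 words, returned with the requested dtype."""
    words = np.asarray(packed, dtype=np.int32).view(np.uint32)
    nibbles = (words[:, None] >> _SHIFTS) & np.uint32(0xF)
    return nibbles.reshape(-1).astype(dtype)
```

A round trip such as `unpack_int4(pack_int4(w))` should return the original values. The explicit `dtype` argument makes the type of the unpacked array a deliberate choice rather than a side effect of the bit operations.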

Examples

  • Upgrade lm_eval to 0.4.2 in PT and ORT LLM examples (fdb509) (54f039)
  • Add diffusers/dreambooth example with IPEX (ba4798)

Bug Fixes

  • Fix incorrect dtype of unpacked tensor issue in PT (29fdec)
  • Fix TF LLM SQ legacy Keras environment variable issue (276449)
  • Fix TF estimator issue by adding version check on TF2.16 (855b98)
  • Fix missing tokenizer issue in run_clm_no_trainer.py after using lm-eval 0.4.2 (d64029)
  • Fix AWQ padding issue in ORT (903da4)
  • Fix recover function issue in ORT (ee24db)
  • Update model ckpt download url in prepare_model.py (0ba573)
  • Fix case where pad_max_length is set to None (960bd2)
  • Fix a failure for GPU backend (71a9f3)
  • Fix numpy versions for rnnt and 3d-unet examples (12b8f4)
  • Fix CVEs (5b5579) (25c71a) (47d73b) (41da74)

External Contributions

  • Update model ckpt download url in prepare_model.py (0ba573)
  • Fix case where pad_max_length is set to None (960bd2)
  • Add diffusers/dreambooth example with IPEX (ba4798)

Validated Configurations

  • CentOS 8.4, Ubuntu 22.04, Windows 11, macOS Ventura 13.5
  • Python 3.8, 3.9, 3.10, 3.11
  • PyTorch/IPEX 2.1, 2.2, 2.3
  • TensorFlow 2.14, 2.15, 2.16
  • ITEX 2.13.0, 2.14.0, 2.15.0
  • ONNX Runtime 1.16, 1.17, 1.18