Intel® Neural Compressor v2.3.2 Release

chensuyue released this 23 Nov 15:30
  • Features
  • Bug Fixes

Features

  • Reduce memory consumption in ONNXRT adaptor (f64833)
  • Support MatMulFpQ4 for onnxruntime 1.16 (1beb43)
  • Support MatMulNBits for onnxruntime 1.17 (67a31b)

Bug Fixes

  • Update ITREX version in ONNXRT WOQ example and fix bugs in hf models (0ca51a)
  • Update ONNXRT WOQ example to use llama-2-7b (7f2063)
  • Fix ONNXRT WOQ failure when model_path is None (cbd0a4)

Validated Configurations

  • CentOS 8.4 & Ubuntu 22.04
  • Python 3.10
  • TensorFlow 2.13
  • ITEX 2.13
  • PyTorch/IPEX 2.0.1+cpu
  • ONNX Runtime 1.15.1
  • MXNet 1.9.1
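
To compare a local environment against the validated configurations listed above, a small check along these lines may help. This is only a sketch: the PyPI distribution names in `VALIDATED` (for example `intel-extension-for-tensorflow` for ITEX) and the helpers `matches` and `check_environment` are assumptions made for illustration and are not part of Intel® Neural Compressor. IPEX is left out because the list gives only a combined PyTorch/IPEX version.

```python
from importlib import metadata

# Versions copied from the validated configurations list.
# The distribution names (keys) are assumptions, not taken from the release notes.
VALIDATED = {
    "tensorflow": "2.13",
    "intel-extension-for-tensorflow": "2.13",
    "torch": "2.0.1",
    "onnxruntime": "1.15.1",
    "mxnet": "1.9.1",
}


def matches(installed: str, validated: str) -> bool:
    """Return True if `installed` equals `validated` or only adds further components."""
    base = installed.split("+", 1)[0]  # drop a local tag such as "+cpu"
    return base == validated or base.startswith(validated + ".")


def check_environment(validated=VALIDATED, get_version=metadata.version):
    """Map each package name to "ok", "not installed", or a mismatch message."""
    report = {}
    for name, wanted in validated.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        if matches(found, wanted):
            report[name] = "ok"
        else:
            report[name] = f"found {found}, validated {wanted}"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

A mismatch reported here does not mean the release will fail with that version; it only means the combination is not among those listed as validated.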