GPTQModel v1.7.2

@Qubitium released this 19 Jan 03:52
d762379

What's Changed

⚡ Effective BPW (bits per weight) is now logged during load().
⚡ Loading on Intel Arc A770/B580 XPU is now 3.3x faster.
⚡ Reduced memory usage during MLX conversion.
🐛 Fixed Marlin kernel auto-selection not checking the CUDA compute capability.
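
For readers unfamiliar with the term, effective BPW is usually higher than the nominal bit width because group-wise quantization stores a scale and a zero point for every group of weights. The release notes do not say how GPTQModel computes the logged value, so the sketch below is only a common back-of-the-envelope estimate. The function name `effective_bpw` and the defaults (16-bit scales, zero points stored at the quantized bit width) are assumptions made for illustration, not part of the GPTQModel API.

```python
def effective_bpw(bits, group_size, scale_bits=16, zero_bits=None):
    """Estimate effective bits per weight for group-wise quantization.

    Hypothetical helper, not GPTQModel code. Assumes each group of
    `group_size` weights shares one scale (`scale_bits` wide) and one
    zero point (`zero_bits` wide, defaulting to `bits`).
    """
    if group_size <= 0:
        raise ValueError("group_size must be a positive integer")
    if zero_bits is None:
        zero_bits = bits
    # Per-group metadata is amortized over every weight in the group.
    overhead = (scale_bits + zero_bits) / group_size
    return bits + overhead


if __name__ == "__main__":
    for group_size in (32, 64, 128):
        print(f"4-bit, group_size={group_size}: "
              f"{effective_bpw(4, group_size):.4f} BPW")
```

Under these assumptions, smaller groups raise the effective BPW because the per-group metadata is spread over fewer weights. The value logged by load() may differ, since it can also account for layers left unquantized or for other storage details.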

Full Changelog: v1.7.0...v1.7.2