GPTQModel v1.4.5
What's Changed
⚡ Windows 11 support added/validated with DynamicCuda and Torch kernels.
⚡ Ovis 1.6 VL model support with image data calibration.
⚡ Reduced quantization VRAM usage.
🐛 Fixed dynamic controlled layer loading logic
- Refactor by @Qubitium in #895
- Add platform check by @LRL-ModelCloud in #899
- Exclude marlin & exllama on windows by @CSY-ModelCloud in #898
- Remove unnecessary backslash in the expression & typehint by @CSY-ModelCloud in #903
- Add DEVICE.ALL by @LRL-ModelCloud in #901
- [FIX] the error of loading quantized model with dynamic by @ZX-ModelCloud in #907
- [FIX] gpt2 quantize error by @ZX-ModelCloud in #912
- Simplify checking generated str for vllm test & fix transformers version for cohere2 by @CSY-ModelCloud in #914
- [MODEL] add OVIS support by @ZX-ModelCloud in #685
- Fix IDE warning marlin not in all by @CSY-ModelCloud in #920
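
To illustrate what the platform check (#899) and the Windows kernel exclusion (#898) amount to, here is a minimal sketch. It is not GPTQModel's actual code: the kernel names, `ALL_KERNELS`, `WINDOWS_EXCLUDED`, and `available_kernels` are hypothetical, chosen only to show the idea of filtering out Marlin and ExLlama when running on Windows.

```python
import platform

# Hypothetical kernel identifiers for illustration only;
# these are not GPTQModel's real backend names.
ALL_KERNELS = ["marlin", "exllama", "dynamic_cuda", "torch"]

# Kernels assumed to be unavailable on Windows, per the release notes.
WINDOWS_EXCLUDED = {"marlin", "exllama"}


def available_kernels(system=None):
    """Return the kernel names usable on the given OS.

    `system` defaults to the current platform as reported by
    platform.system() ("Windows", "Linux", "Darwin", ...).
    """
    system = system or platform.system()
    if system == "Windows":
        return [k for k in ALL_KERNELS if k not in WINDOWS_EXCLUDED]
    return list(ALL_KERNELS)


print(available_kernels("Windows"))  # ['dynamic_cuda', 'torch']
```

Passing the OS name explicitly, as in the last line, makes the filter easy to exercise on any machine; calling `available_kernels()` with no argument uses the host platform.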
Full Changelog: v1.4.4...v1.4.5