
# Neural Coder for Quantization

This feature automatically enables quantization on deep learning models and evaluates the resulting variants to find the best-performing one. It is a code-free solution: users can enable quantization algorithms on a model without writing any code by hand. Supported features include Post-Training Static Quantization, Post-Training Dynamic Quantization, and Mixed Precision.
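When a single specific algorithm is wanted rather than the full automatic search, one feature can in principle be enabled directly. The sketch below is an assumption-laden illustration: the `enable` entry point and the feature identifier `pytorch_inc_dynamic_quant` are taken from the broader Neural Coder API and may differ across versions, and the script path is hypothetical.

```python
from neural_coder import enable  # `enable` is an assumed entry point; verify against your version

# Enable one specific feature (here, assumed Post-Training Dynamic
# Quantization) on a model script instead of searching over all features.
enable(
    code="run_my_model.py",                  # hypothetical local script
    features=["pytorch_inc_dynamic_quant"],  # assumed feature identifier
)
```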

## Features Supported

## Models Supported

## Usage
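A minimal sketch of the typical call pattern, assuming installation from PyPI (the package name `neural-coder` is an assumption) and that `auto_quant` also accepts a local script path; the examples below pass remote URLs instead:

```python
# pip install neural-coder   (assumed PyPI package name)
from neural_coder import auto_quant

# Hand auto_quant an executable model script plus its CLI arguments;
# it enables the supported quantization features, benchmarks each
# variant, and reports the best-performing one.
auto_quant(
    code="path/to/your_script.py",  # hypothetical local script
    args="--do_eval",               # hypothetical arguments forwarded to the script
)
```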

## Example

The following examples use the PyPI distribution:

HuggingFace Transformers models: `text-classification/run_glue.py`

```python
from neural_coder import auto_quant

# Auto-quantize the GLUE (SST-2) text-classification example and
# benchmark the quantized variants for the best performance.
auto_quant(
    code="https://github.com/huggingface/transformers/blob/v4.21-release/examples/pytorch/text-classification/run_glue.py",
    args="--model_name_or_path albert-base-v2 --task_name sst2 --do_eval --output_dir result",
)
```

torchvision models: `imagenet/main.py`

```python
from neural_coder import auto_quant

# Auto-quantize the torchvision ImageNet example (AlexNet, pretrained,
# evaluation mode) and benchmark for the best-performing variant.
auto_quant(
    code="https://github.com/pytorch/examples/blob/main/imagenet/main.py",
    args="-a alexnet --pretrained -e /path/to/imagenet/",
)
```
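In both examples Neural Coder operates on the unmodified upstream script: it injects the code needed to enable each supported quantization feature, benchmarks the resulting variants, and reports the best-performing one, as described above.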