
Problematic difference between FakeQuantize for activations and weights #651

Veccoy opened this issue Aug 29, 2024 · 1 comment

Veccoy commented Aug 29, 2024

Checklist

  • I have searched related issues but cannot get the expected help.
  • I have read related documents and don't know what to do.

Describe the question you meet

Hi, I'm trying to export a quantized model after a QAT experiment. I have implemented the following hook to export it to ONNX format using the get_deploy_model method of MMArchitectureQuant.

def after_train(self, runner) -> None:
    """Export the quantized model to ONNX format.

    Args:
        runner (Runner): The runner of the training, validation or testing
            process.
    """
    try:
        check_torch_version()
    except AssertionError as err:
        print(repr(err))
        return

    # Unwrap the DDP module before calling the deploy-model helper.
    if runner.distributed:
        quantized_model = runner.model.module.get_deploy_model(self.mode)
    else:
        quantized_model = runner.model.get_deploy_model(self.mode)
    quantized_model.eval()

    # In my configs, `dataset_type` is stored as a string, so compare
    # against the class names rather than the dataset classes themselves.
    dataset_type = runner.cfg.get("dataset_type")
    if dataset_type == 'CityscapesDataset':
        dummy_input = torch.randn(1, 3, 512, 1024)
    elif dataset_type == 'CocoDataset':
        dummy_input = torch.randn(1, 3, 800, 1333)
    elif dataset_type == 'ImageNet':
        dummy_input = torch.randn(1, 3, 224, 224)
    else:
        raise TypeError(f"Dataset type {dataset_type} is not supported yet. "
                        "You can add it in the above code lines.")

    onnx_path = runner.work_dir + '/quantized_model.onnx'
    torch.onnx.export(quantized_model, dummy_input, onnx_path,
                      input_names=["images"], output_names=["output"],
                      operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK,
                      verbose=True,
                      do_constant_folding=True,
                      opset_version=17,
                      dynamic_axes=None)

    print(f"Export of quantized model to onnx completed, save to {onnx_path}")

In this built-in method, post_process_for_deploy from the NativeQuantizer is applied and appears to process only the weight FakeQuantize modules. Moreover, the end of the get_deploy_model method of MMArchitectureQuant seems to post-process the activation FakeQuantize modules separately, by copying each one into a new module of a different Torch class.
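My reading of that last step is roughly the following sketch. This is not the actual MMRazor source: the helper name, the child traversal, and the way qparams are carried over are my assumptions based on the behaviour I observe.

import torch
from torch.ao.quantization import FakeQuantize, MovingAverageMinMaxObserver

def swap_activation_fakequants(observed_model):
    # Hypothetical sketch: activation fake-quants live as standalone
    # `activation_post_process_*` submodules of the prepared fx graph.
    # Each one seems to be replaced by a fresh torch.ao FakeQuantize of a
    # plain Torch class, carrying over the learned qparams.
    for name, module in observed_model.named_children():
        if 'activation_post_process' in name:
            new_fq = FakeQuantize(observer=MovingAverageMinMaxObserver,
                                  quant_min=0, quant_max=255,
                                  dtype=torch.quint8)
            new_fq.scale.data.copy_(module.scale)
            new_fq.zero_point.data.copy_(module.zero_point)
            setattr(observed_model, name, new_fq)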

Then, when I visualize my ONNX model in Netron, the translations produced for the activation and weight FakeQuantize modules are different. This is problematic for execution on specific hardware, as the translation used for the weight FakeQuantize modules is not recognized.

[Screenshot: Netron view of the exported ONNX, showing different node patterns for activation and weight fake-quantization]

How can I get the same ONNX QuantizeLinear+DequantizeLinear layers for both activation and weight FakeQuantize modules?
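For comparison, a standalone fake-quantize op does export as the QDQ pair I am after. Here is a minimal repro I would expect to produce QuantizeLinear+DequantizeLinear (the module and file names are arbitrary):

import torch

class StandaloneFQ(torch.nn.Module):
    def forward(self, x):
        # A standalone fake-quant: the ONNX exporter maps this op to a
        # QuantizeLinear + DequantizeLinear pair.
        return torch.fake_quantize_per_tensor_affine(
            x, scale=0.1, zero_point=0, quant_min=-128, quant_max=127)

torch.onnx.export(StandaloneFQ(), torch.randn(1, 3, 8, 8),
                  'standalone_fq.onnx', opset_version=13)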

Post related information

Here is my quantization configuration used with OpenVINOQuantizer:

global_qconfig = dict(
    w_observer=dict(type='PerChannelMinMaxObserver'),
    a_observer=dict(type='MovingAverageMinMaxObserver'),
    w_fake_quant=dict(type='FakeQuantize'),
    a_fake_quant=dict(type='FakeQuantize'),
    w_qscheme=dict(
        qdtype='qint8', bit=8, is_symmetry=True, is_symmetric_range=True),
    a_qscheme=dict(qdtype='quint8', bit=8, is_symmetry=True),
)
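
If it helps, my understanding is that this config corresponds roughly to the following torch.ao qconfig. This is a sketch of the mapping as I read it, not the exact objects mmrazor builds:

import torch
from torch.ao.quantization import (FakeQuantize, MovingAverageMinMaxObserver,
                                   PerChannelMinMaxObserver, QConfig)

qconfig = QConfig(
    activation=FakeQuantize.with_args(
        observer=MovingAverageMinMaxObserver,
        quant_min=0, quant_max=255,
        dtype=torch.quint8, qscheme=torch.per_tensor_affine),
    weight=FakeQuantize.with_args(
        observer=PerChannelMinMaxObserver,
        # is_symmetric_range=True should give the restricted [-127, 127] range
        quant_min=-127, quant_max=127,
        dtype=torch.qint8, qscheme=torch.per_channel_symmetric))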

Veccoy commented Sep 13, 2024

I think this comes from the use of specific PyTorch QAT modules that wrap the weight FakeQuantize modules as attributes, while activation FakeQuantize modules appear as standalone nodes in the Torch fx graph.

[Screenshots: prepared fx graph with activation FakeQuantize modules as standalone call_module nodes, and a QAT module holding its weight FakeQuantize as an attribute]

This structure is created when the Torch fx graph is prepared and is kept until the model is exported, which generates ONNX files containing ATen operations.
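This is easy to reproduce with plain torch.ao on a toy model (a self-contained sketch; the toy model and names are mine, not from MMRazor):

import torch
from torch.ao.quantization import get_default_qat_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_qat_fx

model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3)).train()
prepared = prepare_qat_fx(model,
                          get_default_qat_qconfig_mapping('fbgemm'),
                          (torch.randn(1, 3, 32, 32),))

# Activation fake-quants show up as standalone call_module nodes...
print([n.target for n in prepared.graph.nodes
       if 'activation_post_process' in str(n.target)])

# ...while the weight fake-quant is hidden inside the QAT module as an
# attribute, so it never appears as a node of its own in the graph.
conv = prepared.get_submodule('0')
print(type(conv))              # a QAT Conv2d subclass, not nn.Conv2d
print(conv.weight_fake_quant)  # FakeQuantize wrapped as an attribute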
