
Discrepancy between TRT engines for same models - TensorRT Issue cross-reference #113

Open
bernardrb opened this issue May 24, 2024 · 1 comment

Comments

@bernardrb
bernardrb commented May 24, 2024

We are trying to reproduce the results for EfficientViT-SAM, but are running into problems with certain models. In this case, the issue is with l2_encoder.onnx. In summary, these are the results:

L2 - FP16
{"all": 0.0, "large": 0.0, "medium": 0.0, "small": 0.0}

L2 - FP32
{"all": 79.12385607181146, "large": 83.05853600575689, "medium": 81.50597370444349, "small": 74.8830670481846}

All details about the setup are available through the link: Accuracy failure of TensorRT 8.6.3 when running trtexec built engine on GPU RTX4090
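For reference, a minimal sketch of the trtexec invocations that would produce the two engines compared above, assuming TensorRT 8.6 and a local ONNX export; file names are placeholders, not the exact paths from our setup:

```shell
# FP32 baseline engine (the variant that scores ~79 mIoU "all")
trtexec --onnx=l2_encoder.onnx --saveEngine=l2_encoder_fp32.engine

# FP16 engine -- the variant that collapses to all-zero metrics above
trtexec --onnx=l2_encoder.onnx --saveEngine=l2_encoder_fp16.engine --fp16
```

The only difference between the two builds is the `--fp16` flag, which lets TensorRT lower any layer to half precision; numerically sensitive layers (e.g. attention softmax) can overflow or underflow in FP16 and wipe out the output.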

@ovunctuzel-bc

ovunctuzel-bc commented May 29, 2024

I was able to resolve a similar issue by setting some layers in the attention block to FP32 precision; it might help in this case as well. With that change I still retained a 2× speedup compared to the full-FP32 model.
See: #116 (comment)
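A sketch of what that per-layer precision override can look like with the TensorRT 8.6 Python builder API. The layer-name substrings (`"attn"`, `"softmax"`) are illustrative guesses, not the actual layer names in l2_encoder.onnx; inspect your parsed network to find which layers to pin:

```python
# Build a mostly-FP16 engine while pinning attention layers to FP32.
import tensorrt as trt

LOGGER = trt.Logger(trt.Logger.WARNING)

def build_mixed_precision_engine(onnx_path: str, engine_path: str) -> None:
    builder = trt.Builder(LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, LOGGER)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)
    # Without this flag TensorRT may treat per-layer precision as a hint only.
    config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)

    for i in range(network.num_layers):
        layer = network.get_layer(i)
        # Hypothetical name patterns -- replace with your network's layer names.
        if "attn" in layer.name or "softmax" in layer.name.lower():
            layer.precision = trt.float32
            for j in range(layer.num_outputs):
                layer.set_output_type(j, trt.DataType.FLOAT)

    engine_bytes = builder.build_serialized_network(network, config)
    with open(engine_path, "wb") as f:
        f.write(engine_bytes)
```

The same effect is available from the command line via trtexec's `--layerPrecisions` and `--precisionConstraints=obey` options, which avoids writing a builder script at all.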
