[ONNX] Fix sporadic results in BC #3081

Merged: 18 commits into openvinotoolkit:develop on Nov 25, 2024

Conversation

@kshpv (Collaborator) commented Nov 13, 2024

Changes

  1. This PR addresses an issue with ONNXRuntime==1.19.2 where a tensor used as both a model input and a model output shares the same memory. This causes unexpected behavior: updating the input tensor inadvertently modifies the collected statistics because of the memory overlap.
    The issue was confirmed by calling np.shares_memory(input_data['image'], outputs['image']), which returned True, indicating that the input and output tensors share memory. After the proposed changes, the same check returns False, confirming that the memory sharing is resolved.
    To fix this, the ONNXEngine logic has been updated to create a copy of any output tensor that is also used as a model input, so the input tensor and the statistics data remain independent and unintended side effects are avoided (see the sketch after this list).

  2. Merge RawReducer and NoopReducer

  3. Minor fixes (remove warnings + fix bug in BC)
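
A minimal sketch of the idea behind the ONNXEngine change, written against the public onnxruntime API; the function name and the dict-based wrapper are illustrative assumptions, not the actual NNCF code.

import onnxruntime as ort

def infer_with_isolated_outputs(session: ort.InferenceSession, feed: dict) -> dict:
    # Run the session and copy any output whose name is also a model input.
    # With ONNXRuntime 1.19.2 such outputs can be views over the input buffer
    # (np.shares_memory(feed[name], output) returned True in the reported case),
    # so the collected statistics would silently change when the input is reused.
    input_names = {inp.name for inp in session.get_inputs()}
    output_names = [out.name for out in session.get_outputs()]
    raw_outputs = session.run(output_names, feed)
    outputs = {}
    for name, value in zip(output_names, raw_outputs):
        # Copy only when the tensor is both an input and an output of the model.
        outputs[name] = value.copy() if name in input_names else value
    return outputs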

Reason for changes

Regression

Related tickets

156025

Tests

PTQ run 549

github-actions bot added the NNCF ONNX and NNCF PTQ labels on Nov 13, 2024
@kshpv (Collaborator, Author) commented Nov 13, 2024

547 job - ONNX
548 job - all

kshpv changed the title from "[ONNX] Fix ModelTransformer" to "[ONNX] Fix sporadic results in BC" on Nov 14, 2024
@kshpv (Collaborator, Author) commented Nov 14, 2024

549 ptq - passed
post_training_quantization_performance 80 - passed
post_training_weight_compression 251 - passed
post_training_weight_compression_performance 19 - passed

github-actions bot added the NNCF PT, NNCF Common, and NNCF OpenVINO labels on Nov 14, 2024
kshpv marked this pull request as ready for review on November 14, 2024, 14:05
kshpv requested a review from a team as a code owner on November 14, 2024, 14:05
Comment on lines -125 to 132:

-    ra = fns.where(qval < level_high, qval / (qval - level_high) * right_border, left_border)
+    with warnings.catch_warnings():
+        # If `qval` is 0 `rb` will equal `right_border`, and we don't want to show an unnecessary division by 0 warning
+        # The same for (qval - level_high)
+        warnings.simplefilter("ignore")
+        ra_then_result = qval / (qval - level_high) * right_border
+        rb_then_result = (qval - level_high) / qval * left_border
+        ra = fns.where(qval < level_high, ra_then_result, left_border)
+        rb = fns.where(qval > 0.0, rb_then_result, right_border)
Collaborator commented:
Could you please expand def test_tune_range_zero_division_warning() to cover the new case?

@kshpv (Collaborator, Author) replied Nov 14, 2024:
Done. The test passes on this PR and fails on develop.
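
A sketch of what such an expanded test could look like; the import paths, the tune_range signature, and the chosen border values are assumptions inferred from the snippet above, not the verified NNCF test.

import warnings

import numpy as np

# Assumed locations; adjust to wherever tune_range and the tensor wrapper actually live in NNCF.
from nncf.quantization.fake_quantize import tune_range
from nncf.tensor import Tensor


def test_tune_range_zero_division_warning():
    # Border pairs intended to drive the rounded quantized value to the edge cases
    # discussed above (qval == 0 and qval == level_high), each of which previously
    # produced a division-by-zero RuntimeWarning; adjust if tune_range computes qval differently.
    for left, right in [(0.0, 1.0), (-1.0, 0.0)]:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            tune_range(Tensor(np.array([left])), Tensor(np.array([right])), num_bits=8)
        assert not caught, f"Unexpected warnings: {[str(w.message) for w in caught]}"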

@KodiaqQ (Collaborator) commented Nov 15, 2024

@kshpv, feel free to merge this PR as you need.

KodiaqQ requested a review from KodiaqQ and removed the request for KodiaqQ on November 15, 2024, 12:03
kshpv marked this pull request as draft on November 15, 2024, 12:04
kshpv requested a review from alexsu52 on November 18, 2024, 14:02
kshpv marked this pull request as ready for review on November 18, 2024, 14:02
-        warnings.simplefilter("ignore")
-        rb_then_result = (qval - level_high) / qval * left_border
+    # Avoid division by zero
+    qval_nonzero = fns.where(qval == 0, fns.ones_like(qval), qval)
Contributor commented:

Please check the performance of the function after your changes. @nikita-savelyevv, it looks like we discussed this already. Could you remind us of the solution?

@nikita-savelyevv (Collaborator) replied Nov 18, 2024:
As I remember, ignoring the warning was the solution with the least impact on performance.

kshpv (Author) replied:
So, should I roll back or keep this one?

Collaborator replied:
I'm inclined towards moving the line under the catch_warnings context manager.

kshpv (Author) replied:
Performance measurement:
with the warnings context manager: ~17.648 sec
without the warnings context manager, but with the following: ~20.809 sec

qval_nonzero = fns.where(qval == 0, fns.ones_like(qval), qval)
qval_not_high = fns.where(qval - level_high == 0, fns.ones_like(qval), qval - level_high)
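
A minimal sketch of how such a comparison could be timed; the array size, iteration count, and the use of plain NumPy in place of NNCF's fns wrapper are assumptions for illustration.

import timeit
import warnings

import numpy as np

# Synthetic data standing in for the real quantized values.
rng = np.random.default_rng(0)
qval = rng.integers(0, 256, size=10_000_000).astype(np.float64)
left_border, right_border, level_high = -1.0, 1.0, 255.0


def with_catch_warnings():
    # Variant 1: divide by qval directly and suppress the division-by-zero warning.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        rb_then_result = (qval - level_high) / qval * left_border
    return np.where(qval > 0.0, rb_then_result, right_border)


def with_masked_denominator():
    # Variant 2: mask the zero denominator before dividing, no warning suppression needed.
    qval_nonzero = np.where(qval == 0, 1.0, qval)
    rb_then_result = (qval - level_high) / qval_nonzero * left_border
    return np.where(qval > 0.0, rb_then_result, right_border)


print("catch_warnings:", timeit.timeit(with_catch_warnings, number=20))
print("masked denominator:", timeit.timeit(with_masked_denominator, number=20))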

Collaborator commented:
@kshpv you can use just 1.0 instead of fns.ones_like(qval):

qval_nonzero = fns.where(qval == 0, 1.0, qval)
qval_not_high = fns.where(qval - level_high == 0, 1.0, qval - level_high)

kshpv (Author) replied:

I checked your proposed version and it has the same performance as with fns.ones_like(qval) :(

Collaborator replied:
It's more about avoiding the creation of extra tensor instances than about performance.
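
A small illustration (in plain NumPy, standing in for NNCF's fns wrapper) of that point: the scalar form avoids materializing a temporary array of ones while producing the same result.

import numpy as np

qval = np.array([0.0, 1.0, 254.0, 255.0])

# fns.ones_like(qval) would materialize a full temporary array of ones ...
a = np.where(qval == 0, np.ones_like(qval), qval)
# ... while a Python scalar is broadcast without allocating that temporary.
b = np.where(qval == 0, 1.0, qval)

assert np.array_equal(a, b)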

Contributor replied:
I suggest using the implementation with the best performance.

kshpv (Author) replied:
Rolled back.

@kshpv (Collaborator, Author) commented Nov 19, 2024

Found an open issue with the same problem in ONNXRuntime: microsoft/onnxruntime#21922

kshpv requested a review from alexsu52 on November 20, 2024, 10:51
@alexsu52 (Contributor) left a comment:
LGTM

AlexanderDokuchaev merged commit 2284df5 into openvinotoolkit:develop on Nov 25, 2024
14 checks passed
Labels
experimental, NNCF Common, NNCF ONNX, NNCF OpenVINO, NNCF PT, NNCF PTQ
6 participants