Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate allocation sizes and strides more widely. #3375

Merged
merged 3 commits into from
Nov 13, 2024
Merged

Conversation

wujingyue
Copy link
Collaborator

@wujingyue wujingyue commented Nov 8, 2024

Currently, GetMetaData::evaluate validates allocation sizes and strides
only when the TV has allocation that's different from logical. With this
PR, it will validate allocation even when it's the same as logical.

This extra validation captured many bugs including #3386 and #3194.
These bugs are either fixed in this PR or in follow-ups.

Prepares for #3282

@wujingyue
Copy link
Collaborator Author

!test

@wujingyue
Copy link
Collaborator Author

!test

@wujingyue wujingyue changed the base branch from main to bug3386 November 13, 2024 00:03
@wujingyue wujingyue changed the title Fix validateAllocationSizesAndStrides and apply it more broadly. Validate allocation sizes and strides more often. Nov 13, 2024
@wujingyue wujingyue changed the title Validate allocation sizes and strides more often. Validate allocation sizes and strides more widely. Nov 13, 2024
@wujingyue wujingyue marked this pull request as ready for review November 13, 2024 00:05
@@ -209,65 +209,66 @@ class BackwardTraverseFromLogicalToAlloc {
};

void validateAllocationSizesAndStrides(
const std::vector<IterDomain*>& alloc_dom_no_reductions,
const std::vector<IterDomain*>& alloc_dom,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly a clean-up. I may have fixed a few bugs and/or hardened the validation, but I don't quite remember what was broken exactly...

} else {
metadata->alloc_size = metadata->logical_size;
metadata->alloc_stride = metadata->logical_stride;
// TODO: validateAllocationSizesAndStrides
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now done unconditionally in inferAndValidateAllocationSizesAndStrides.

metadata->alloc_stride = metadata->logical_stride;
// TODO: validateAllocationSizesAndStrides
}
auto [sizes, strides] =
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a micro optimization to reuse input.sizes() and input.strides(). I didn't do that because I chose to optimize for simplicity for now. Let me know if there's a compelling reason otherwise.

@wujingyue wujingyue requested a review from jjsjann123 November 13, 2024 00:09
@wujingyue
Copy link
Collaborator Author

!test

Copy link
Collaborator

@jjsjann123 jjsjann123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a couple nitpick/questions.

benchmarks/cpp/lstm_cell.cpp Outdated Show resolved Hide resolved
benchmarks/cpp/lstm_cell.cpp Show resolved Hide resolved
tests/cpp/test_gpu1.cpp Show resolved Hide resolved
csrc/tensor_metadata.cpp Show resolved Hide resolved
csrc/tensor_metadata.cpp Show resolved Hide resolved
Copy link
Collaborator

@jjsjann123 jjsjann123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

Base automatically changed from bug3386 to main November 13, 2024 15:03
Currently, GetMetaData::evaluate validates allocation sizes and strides
only when the TV has allocation that's different from logical. With this
PR, it will validate allocation even when it's the same as logical.

This extra validation captured many bugs including #3386 and #3194.
These bugs are either fixed in this PR or in follow-ups.

Fixes #3194.
@wujingyue
Copy link
Collaborator Author

!test

@wujingyue wujingyue merged commit 1a94e02 into main Nov 13, 2024
42 of 43 checks passed
@wujingyue wujingyue deleted the wjy/validate branch November 13, 2024 18:56
@wujingyue wujingyue added bug Something isn't working enhancement New feature or request and removed enhancement New feature or request labels Nov 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants