[TOPI][CUDA] Fix Winograd Kernel Size Support #4276
Merged
The merged PR #4260 fixes the padding issue when building Winograd conv2d for CUDA, but we found the kernel size is still a problem.
PR #3553 already relaxed the constraints on using Winograd on CUDA. Originally, Winograd was limited to 3x3 kernels, 1x1 padding, and 1x1 strides. PR #3553 relaxed these to square kernels (e.g., 3x3, 5x5, 7x7), 1x1 strides, and arbitrary padding.
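To make the two sets of constraints concrete, here is a minimal sketch of them as a predicate. The function name `winograd_cuda_applicable` and its `relaxed` flag are hypothetical and exist only for illustration; they are not part of TOPI.

```python
def winograd_cuda_applicable(kernel_hw, strides_hw, padding_hw, relaxed=True):
    """Hypothetical helper: does a conv2d workload satisfy the Winograd constraints?

    relaxed=False models the original limits (3x3 kernel, 1x1 padding, 1x1 strides).
    relaxed=True models the limits described for PR #3553
    (square kernel, 1x1 strides, arbitrary padding).
    """
    kh, kw = kernel_hw
    # Strides must be 1x1 under both sets of constraints.
    if tuple(strides_hw) != (1, 1):
        return False
    if relaxed:
        # Any square kernel is accepted; padding is unconstrained.
        return kh == kw
    # Original constraints: exactly 3x3 kernel with 1x1 padding.
    return (kh, kw) == (3, 3) and tuple(padding_hw) == (1, 1)
```

For example, a 5x5 kernel with 2x2 padding passes the relaxed check but not the original one.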
However, even though PR #4260 fixes the miscalculation caused by padding size, the miscalculation caused by kernel size is still there. Here is a code snippet in `conv2d_winograd.py` after PR #4260:

As can be seen, the kernel size is forced to 3x3 in the pre-computed case, but it could be any square size according to PR #3553. This PR recovers the kernel size as follows:
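The arithmetic behind the recovery can be sketched in plain Python. This is not the code from the PR; it assumes the pre-transformed kernel has a leading `alpha x alpha` shape with `alpha = tile_size + kernel_size - 1`, which is the usual Winograd relation. The function name `recover_kernel_size` is hypothetical.

```python
def recover_kernel_size(transformed_kernel_shape, tile_size):
    """Recover the square kernel size from a pre-transformed Winograd kernel.

    Assumes the first two dimensions of the transformed kernel are
    (alpha, alpha) with alpha = tile_size + kernel_size - 1.
    """
    alpha_h, alpha_w = transformed_kernel_shape[0], transformed_kernel_shape[1]
    if alpha_h != alpha_w:
        raise ValueError("expected a square transformed kernel")
    kernel_size = alpha_h + 1 - tile_size
    if kernel_size < 1:
        raise ValueError("tile_size is too large for this transformed kernel")
    return kernel_size
```

With `tile_size=2`, a transformed kernel whose leading shape is 4x4 maps back to a 3x3 kernel, and 6x6 maps back to 5x5, so hard-coding 3x3 is only right for the first case.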
One issue remains unresolved in this PR: the errors fixed in #4260 and in this PR should have been caught by the unit tests, but they weren't. The reason is that the unit tests don't cover the `pre_computed=True` path, and I have no idea how to cover it.

@cbalint13 @vinx13 could you review and suggest how to improve the unit test? Thanks.
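One possible direction for such a test is to run the same workload through both paths and compare the results against a direct convolution. The sketch below shows the idea in plain NumPy on a 1D F(2,3) Winograd tile rather than on the TOPI operator. The `winograd_f23` function and its `pre_computed` argument are illustrative stand-ins, not the TVM API.

```python
import numpy as np

# Standard F(2,3) Winograd transform matrices (output tile 2, kernel 3, alpha 4).
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])
BT = np.array([[1.0, 0.0, -1.0, 0.0],
               [0.0, 1.0, 1.0, 0.0],
               [0.0, -1.0, 1.0, 0.0],
               [0.0, 1.0, 0.0, -1.0]])
AT = np.array([[1.0, 1.0, 1.0, 0.0],
               [0.0, 1.0, -1.0, -1.0]])
TILE_SIZE = 2


def winograd_f23(data, kernel, pre_computed=False):
    """Compute one 1D Winograd tile; `kernel` is already transformed if pre_computed."""
    data = np.asarray(data, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    if pre_computed:
        # Kernel size is recovered from the transformed shape, not hard-coded.
        alpha = kernel.shape[0]
        kernel_size = alpha + 1 - TILE_SIZE
        transformed = kernel
    else:
        kernel_size = kernel.shape[0]
        transformed = G @ kernel
    assert kernel_size == 3, "this sketch only implements F(2,3)"
    return AT @ (transformed * (BT @ data))


def check_both_paths(data, kernel):
    """Return True if both Winograd paths match a direct correlation."""
    expected = np.correlate(np.asarray(data, dtype=float),
                            np.asarray(kernel, dtype=float), mode="valid")
    normal = winograd_f23(data, kernel)
    pre = winograd_f23(data, G @ np.asarray(kernel, dtype=float), pre_computed=True)
    return bool(np.allclose(normal, expected) and np.allclose(pre, expected))
```

A test along these lines would fail if the pre-computed path assumed the wrong kernel size, which is the class of error that went undetected here.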