The OpenCL implementation is faster in processing neighbouring pixels (with a window of 31x19) #7907
Unanswered
immortalsalomon asked this question in Q&A
Replies: 0 comments
Hello, everyone.
The general concept of the program is simple. For each pixel, I want to find the maximum and minimum values among its neighbouring pixels, using a window of 31x19 pixels. Before considering a pixel in the window, I also want to check that it is valid. Let us say that a pixel is valid if its value lies within a certain range.
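As a plain reference for the operation described above, a minimal CPU sketch in C++ might look like the following. It is not taken from the Halide or OpenCL code in this post. The names (`windowMinMax`, `MinMax`, `lo`, `hi`), the row-major `float` layout, the inclusive validity range, the reading of 31x19 as 31 wide by 19 high, and the choice to skip neighbours outside the image are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Hypothetical reference: per-pixel min/max over the valid neighbours in a
// 31x19 window (assumed 31 wide, 19 high) of a row-major float image.
struct MinMax {
    std::vector<float> minImg;
    std::vector<float> maxImg;
};

MinMax windowMinMax(const std::vector<float>& img, int width, int height,
                    float lo, float hi) {
    constexpr int kHalfW = 15;  // (31 - 1) / 2
    constexpr int kHalfH = 9;   // (19 - 1) / 2
    const float inf = std::numeric_limits<float>::infinity();

    MinMax out{std::vector<float>(img.size(), inf),
               std::vector<float>(img.size(), -inf)};

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // Clamp the window to the image so out-of-bounds neighbours are skipped.
            const int y0 = std::max(0, y - kHalfH);
            const int y1 = std::min(height - 1, y + kHalfH);
            const int x0 = std::max(0, x - kHalfW);
            const int x1 = std::min(width - 1, x + kHalfW);

            float mn = inf;
            float mx = -inf;
            for (int wy = y0; wy <= y1; ++wy) {
                for (int wx = x0; wx <= x1; ++wx) {
                    const float v = img[wy * width + wx];
                    if (v < lo || v > hi) continue;  // assumed validity test: lo <= v <= hi
                    mn = std::min(mn, v);
                    mx = std::max(mx, v);
                }
            }
            out.minImg[y * width + x] = mn;
            out.maxImg[y * width + x] = mx;
        }
    }
    return out;
}
```

In this sketch, a pixel whose window contains no valid neighbour keeps the sentinel values (+infinity for the minimum, -infinity for the maximum), so the caller can detect that case.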
To simplify things, I am leaving out the search for the maximum and minimum pixel for now.
I don't understand why the Halide implementation takes around 70 ms (without the min/max search) while the OpenCL one takes only 3-4 ms (with the min/max search).
Halide Algorithm:
Halide Scheduling:
OpenCL implementation:
I am sure that what slows things down is the validity check for each point and the double for loop that processes all the neighbouring pixels. Are there parameters I could use to optimise the implementation for OpenCL?
I found issue #7302, which is related to the median filter. Do you think the answer given there is also applicable in this situation? If so, could you give me some tips?
Thank you in advance for your time :).