Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug on Arm when using simde/x86/sse.h #897

Closed
phridley opened this issue Aug 9, 2021 · 6 comments
Closed

Bug on Arm when using simde/x86/sse.h #897

phridley opened this issue Aug 9, 2021 · 6 comments

Comments

@phridley
Copy link

phridley commented Aug 9, 2021

Hi,

in simde/x86/sse.h the replacement for mm_prefetch is defined as

simde_mm_prefetch (char const* p, int i)

This falls back to the compiler's buildin_prefetch which is defined as __builtin_prefetch (const void *addr[, rw[, locality]])

(https://developer.arm.com/documentation/101458/2100/Optimize/Prefetching-with---builtin-prefetch)

So in simde/x86/sse.h, it should really be the following

simde_mm_prefetch (const void *p, int i)

Please can someone take a look.

Thanks

@aqrit
Copy link
Contributor

aqrit commented Aug 9, 2021

I don't understand...

simde_mm_prefetch emulates _mm_prefetch (char const* p, int i).
const char* and char const* are the same thing.
__builtin_prefetch should accept char pointers without warnings.

Perhaps you're looking for a generic prefetcher:
nemequ/hedley#31 ?

@phridley
Copy link
Author

phridley commented Aug 9, 2021

Please have a look at line 2244 in https://github.com/DLTcollab/sse2neon/blob/master/sse2neon.h

That implementation works fine with GCC on Arm. In this case, the compilers don't accept char const* p, they need const void *p to match what __builtin_prefetch has.

@aqrit
Copy link
Contributor

aqrit commented Aug 9, 2021

The function signature of _mm_prefetch is char const* and should not be changed (this is a bug in sse2neon).
One could cast to a void pointer within simde_mm_prefetch (eg: __builtin_prefetch((const void *)p)).

As far as I can tell, such a cast should not be needed (godbolt).

But don't be discouraged, I'm not a part of this project.

@nemequ
Copy link
Member

nemequ commented Aug 9, 2021

@aqrit is right, SSE2NEON is using the wrong type. So are GCC and clang, though. MSVC and ICC use the correct type (char const*). Intel's documentation is the authoritative source. I agree that the type of the parameter to _mm_prefetch should have been const void* not const char*, but that's not up to us (and it's not even close to the top of the list of really bad decisions Intel made with their SIMD APIs).

I've gone back and forth a bit on whether to change it. Using char const* is pretty clearly the most correct option, but since GCC and clang are already more permissive using const void* makes for less friction when coming from those compilers, and we would still correctly handle all valid input (we would just also accept some invalid input, the same set that GCC and clang already do).

I think I'll go ahead and change SIMDe to accept const void*, but to be clear all this will really do is hide a bug in your code… if you try to compile your code on MSVC, for example, you're likely to end up with a warning or error.

@phridley
Copy link
Author

phridley commented Aug 9, 2021

Thanks @aqrit and @nemequ for the explanation here - understood.

@nemequ nemequ closed this as completed in 26d515f Aug 10, 2021
@nemequ
Copy link
Member

nemequ commented Aug 10, 2021

This should work without the cast now. This also reminded me that the implementation had a lot of room for improvement, so I added a much more complete implementation which should work pretty reliably on most compilers and architectures.

Thanks for reporting this, please let me know if you notice any other issues :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants