This repository has been archived by the owner on May 6, 2024. It is now read-only.

SIMD detection doesn't work on MSVC x64 targets #25

Open
cdwfs opened this issue Aug 16, 2017 · 5 comments

Comments

@cdwfs

cdwfs commented Aug 16, 2017

The checks in mathfu/utilities.h that enable MATHFU_COMPILE_WITH_SIMD don't cover Visual Studio x64 targets. The code currently checks for the existence (and value) of _M_IX86_FP, which is only defined for x86 (32-bit) targets. SSE and SSE2 support is implicit on x64 targets, which can be handled with the following addition (or similar):

#elif defined(_M_IX86_FP)
#if _M_IX86_FP >= 1        // SSE enabled
#define MATHFU_COMPILE_WITH_SIMD
#endif  // _M_IX86_FP >= 1
+ #elif (defined(_M_AMD64) || defined(_M_X64))
+ #define MATHFU_COMPILE_WITH_SIMD // MSVC targeting X64 implies SSE+SSE2
#endif
#endif  // !defined(MATHFU_COMPILE_WITHOUT_SIMD_SUPPORT)
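
The check above can also be tried in isolation. The following is a minimal, standalone sketch, not MathFu's actual header: `EXAMPLE_HAS_SSE` and `example_has_sse()` are hypothetical names made up for illustration, and the `__SSE__` branch for GCC/Clang is an assumption added so the sketch also compiles outside MSVC.

```cpp
// Standalone sketch of the detection logic proposed above.
// EXAMPLE_HAS_SSE and example_has_sse() are hypothetical names for this
// illustration; MathFu's real macro is MATHFU_COMPILE_WITH_SIMD.

#if defined(_M_IX86_FP)
// MSVC targeting 32-bit x86: the value reflects the /arch option
// (1 = SSE, 2 = SSE2 or higher).
#  if _M_IX86_FP >= 1
#    define EXAMPLE_HAS_SSE 1
#  endif
#elif defined(_M_AMD64) || defined(_M_X64)
// MSVC targeting x64: SSE and SSE2 are always available.
#  define EXAMPLE_HAS_SSE 1
#elif defined(__SSE__)
// Assumption for this sketch: GCC/Clang define __SSE__ when SSE is enabled.
#  define EXAMPLE_HAS_SSE 1
#endif

// Fall back to "no SSE" when none of the branches above matched.
#ifndef EXAMPLE_HAS_SSE
#  define EXAMPLE_HAS_SSE 0
#endif

// Compile-time query usable from ordinary code or static_assert.
constexpr bool example_has_sse() { return EXAMPLE_HAS_SSE != 0; }
```

Calling `example_has_sse()` from a small test program built for x64 should report `true` once the x64 branch is present; without that branch, an MSVC x64 build would fall through to the "no SSE" default.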
@stewartmiles
Contributor

Wanna send us a pull request?

@cdwfs
Author

cdwfs commented Aug 16, 2017

Sure, will do

cdwfs added a commit to cdwfs/mathfu that referenced this issue Aug 16, 2017
@ghost

ghost commented Jan 13, 2018

@cdwfs do you see a performance improvement when using SIMD on an MSVC target?
For me the Vector Benchmark is 50% slower using SIMD...

@cdwfs
Author

cdwfs commented Jan 17, 2018

I see the same results: a ~50% slowdown when enabling SIMD. See the comments on my pull request #26.

@ghost

ghost commented Jan 17, 2018

Just did, thank you. Indeed, x64 is irrelevant to the issue you mentioned; I was running my bench in x86.
For me, the way the lib was designed and coded makes it hard to really benefit from SSE2 instructions, as there are still conversions, copy constructors, temp objects...
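
To illustrate the kind of overhead described above, here is a hypothetical sketch, not MathFu's code. `ExampleVec3`, `example_add`, `example_sum_wrapped`, and `example_sum_register` are invented names. The first summation converts between a plain struct and an SSE register on every operation, while the second keeps the accumulator in a register and converts once at the end. It makes no claim about measured speed; it only shows where the extra conversions and temporaries come from.

```cpp
#include <cstddef>

// Use SSE intrinsics only when the target supports them; otherwise fall back
// to scalar code so the sketch still compiles.
#if defined(__SSE__) || defined(_M_X64) || defined(_M_AMD64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#  include <xmmintrin.h>
#  define EXAMPLE_USE_SSE 1
#else
#  define EXAMPLE_USE_SSE 0
#endif

// Hypothetical 3-float vector stored as plain floats (not MathFu's type).
struct ExampleVec3 { float x, y, z; };

// Per-call wrapper: packs both inputs into registers, adds, then unpacks
// into a new temporary object on every call.
inline ExampleVec3 example_add(const ExampleVec3& a, const ExampleVec3& b) {
#if EXAMPLE_USE_SSE
  __m128 va = _mm_set_ps(0.0f, a.z, a.y, a.x);  // lane 0 = x
  __m128 vb = _mm_set_ps(0.0f, b.z, b.y, b.x);
  float out[4];
  _mm_storeu_ps(out, _mm_add_ps(va, vb));
  return ExampleVec3{out[0], out[1], out[2]};
#else
  return ExampleVec3{a.x + b.x, a.y + b.y, a.z + b.z};
#endif
}

// Sums n vectors through the wrapper: one round trip between struct and
// register per element.
inline ExampleVec3 example_sum_wrapped(const ExampleVec3* v, std::size_t n) {
  ExampleVec3 acc{0.0f, 0.0f, 0.0f};
  for (std::size_t i = 0; i < n; ++i) acc = example_add(acc, v[i]);
  return acc;
}

// Sums n vectors with the accumulator kept in a register: inputs are still
// packed, but the result is unpacked only once.
inline ExampleVec3 example_sum_register(const ExampleVec3* v, std::size_t n) {
#if EXAMPLE_USE_SSE
  __m128 acc = _mm_setzero_ps();
  for (std::size_t i = 0; i < n; ++i)
    acc = _mm_add_ps(acc, _mm_set_ps(0.0f, v[i].z, v[i].y, v[i].x));
  float out[4];
  _mm_storeu_ps(out, acc);
  return ExampleVec3{out[0], out[1], out[2]};
#else
  ExampleVec3 acc{0.0f, 0.0f, 0.0f};
  for (std::size_t i = 0; i < n; ++i) {
    acc.x += v[i].x; acc.y += v[i].y; acc.z += v[i].z;
  }
  return acc;
#endif
}
```

Both functions return the same sum; the difference is only in how often data moves between the struct and the register, which is the cost a benchmark of small vector operations would tend to expose.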
