_mm_rsqrt_ss not matching simde_mm_rsqrt_ss fail #1222
Comments
This code gives different results when run on Intel and when run with the simd-everywhere headers on a Cortex-A72:
void ldump_debug (char *t, void *_d, int len)
__m128 t = { 0x00002041, 00, 00, 00 } ;
On Cortex-A72:
LOCAL: 00 00 34 3C 00 00 00 00 00 00 00 00 00 00 00 00
|
Hello @YileKu and thank you for your report.
Did you try compiling with -DSIMDE_ACCURACY_PREFERENCE=2, or adding #define SIMDE_ACCURACY_PREFERENCE 2 before including the SIMDe header in your application? |
I will try that, thank you.
|
So isn’t the precision implicit in the API?
Are there other AVX APIs that need clarification when being mapped to NEON?
|
That's a good question. I didn't write this code. I think https://github.com/simd-everywhere/simde?tab=readme-ov-file#caveats should be updated with this information |
Tried with the #define above and it still didn't work. |
The rsqrt instructions are interesting. They're not actually specified to require bit-accurate implementations, but are instead specified as being mathematically accurate to a given precision. See the Intel API docs <https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm_rsqrt_ss&ig_expand=5647>:
The maximum relative error for this approximation is less than 1.5*2^-12.
The instructions aren't even bit-compatible across CPU manufacturers… Intel and AMD return different values <https://robert.ocallahan.org/2021/09/rr-trace-portability-diverging-behavior.html>.
I'm not saying the implementation is perfect, only that bit-accurate results are not expected. It's possible some implementations have a higher maximum relative error than specified, but they *should* be pretty comparable, at least with a higher accuracy preference selected. |
Thanks for the explanation.
|
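The point about comparing against an error bound rather than bit-for-bit can be sketched as a small check. This is only an illustration: `rel_err` and `rsqrt_within_spec` are hypothetical helper names, not part of SIMDe or any Intel API, and the caller is assumed to supply a higher-precision reference value for 1/sqrt(x).

```c
#include <stdbool.h>

/* Bound quoted from the Intel documentation in this thread: 1.5 * 2^-12. */
#define RSQRT_MAX_REL_ERR (1.5 / 4096.0)

/* Hypothetical helper (not part of SIMDe): relative error of `approx`
 * against a higher-precision reference `exact`, which must be nonzero. */
static double rel_err(double exact, double approx) {
    double diff = approx - exact;
    if (diff < 0.0) diff = -diff;               /* |approx - exact| */
    double mag = exact < 0.0 ? -exact : exact;  /* |exact| */
    return diff / mag;
}

/* True when `approx` is within the documented rsqrt error bound. */
static bool rsqrt_within_spec(double exact, double approx) {
    return rel_err(exact, approx) < RSQRT_MAX_REL_ERR;
}
```

A test written this way accepts any result inside the documented tolerance, so it does not fail merely because two implementations round their approximations differently.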
A0: 00 00 40 40 00 00 00 00 00 00 00 00 00 00 00 00
B0: 00 00 80 3F 00 00 00 00 00 00 00 00 00 00 00 00
mul_a0: 00 00 10 41 00 00 00 00 00 00 00 00 00 00 00 00
mul_b0: 00 00 80 3F 00 00 00 00 00 00 00 00 00 00 00 00
add_ss: 00 00 20 41 00 00 00 00 00 00 00 00 00 00 00 00
root: 00 E0 A1 2E 00 00 00 00 00 00 00 00 00 00 00 00
On a Cortex-A72 using simde_mm_rsqrt_ss:
A0: 00 00 40 40 00 00 00 00 00 00 00 00 00 00 00 00
B0: 00 00 80 3F 00 00 00 00 00 00 00 00 00 00 00 00
add_ss: 00 00 20 41 00 00 00 00 00 00 00 00 00 00 00 00
root: 00 80 A1 3E 00 00 00 00 00 00 00 00 00 00 00 00
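To read the dumps above as numbers, the first four bytes of each line can be decoded as a little-endian IEEE-754 single-precision value (lane 0 of the `__m128`). The sketch below assumes that byte order and layout; `f32_from_le_bytes` is a hypothetical helper name introduced here for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret four dump bytes (lowest address first)
 * as a little-endian IEEE-754 binary32 value. */
static float f32_from_le_bytes(const uint8_t b[4]) {
    uint32_t bits = (uint32_t)b[0]
                  | ((uint32_t)b[1] << 8)
                  | ((uint32_t)b[2] << 16)
                  | ((uint32_t)b[3] << 24);
    float f;
    memcpy(&f, &bits, sizeof f);  /* bit copy avoids aliasing problems */
    return f;
}
```

Decoded this way, A0 is 3.0, mul_a0 is 9.0, add_ss is 10.0, and the Cortex-A72 root is 0.3154296875, against 1/sqrt(10) ≈ 0.3162278. That is a relative error of roughly 2.5e-3, which is larger than the 1.5*2^-12 ≈ 3.7e-4 bound quoted in the thread.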