Hi, I am currently implementing some of the SVE intrinsics in SIMDe (primarily the RISC-V and emulated versions).
However, there are some problems regarding integer division by zero and floating-point NaN/INF division and comparison.
On AArch64, integer division by zero simply results in zero without raising any hardware exception, which is different from RISC-V and x86.
On RISC-V, integer division by zero results in an all-bits-set value, and on x86 it raises SIGFPE and crashes the program.
My naive implementation would check whether the divisor is zero to decide whether to perform the division.
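The original snippet is not preserved in this thread, so the following is only a minimal sketch of the check described above. The helper names `div_s32_lane` and `div_s32_lanes` are hypothetical and not part of SIMDe, and the vector is emulated as a plain array. It also guards `INT32_MIN / -1`, which is undefined in C as well; the sketch assumes the AArch64 result for that case is `INT32_MIN`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a SIMDe function): one lane of a signed
 * divide that mimics the AArch64 behavior described above, where
 * x / 0 yields 0 instead of trapping. */
static int32_t div_s32_lane(int32_t a, int32_t b) {
  if (b == 0) {
    return 0; /* avoid undefined behavior / SIGFPE on the host */
  }
  if (a == INT32_MIN && b == -1) {
    return INT32_MIN; /* overflow case, also undefined in C (assumed AArch64 result) */
  }
  return a / b;
}

/* Apply the lane helper across n lanes; r, a and b must each hold n elements. */
static void div_s32_lanes(int32_t *r, const int32_t *a, const int32_t *b, size_t n) {
  for (size_t i = 0; i < n; i++) {
    r[i] = div_s32_lane(a[i], b[i]);
  }
}
```

The cost is one or two extra comparisons per lane, which is the overhead the question below is really about.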
There are similar problems in floating-point comparison functions (max, min) when NaN is involved.
My question is: do we need to consider such situations when implementing SIMDe functions? Or should we just go with the simplest implementation, without checking the divisor? In the current NEON implementation in SIMDe, I didn't see anything that tries to address such problems. Thanks in advance for any advice.
SIMDe has a series of macros in simde-common.h for dealing with problems like these. Unfortunately, it means some ifdefs and alternate implementations, which can definitely be annoying to implement :(
For the floating-point min/max operations, the right one would be SIMDE_FAST_NANS (which will be defined by default if the -ffinite-math-only flag is passed).
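As a rough illustration of how such a guard might look, here is a sketch of a single-lane minimum. `SIMDE_FAST_NANS` is the macro named above; the helper `min_f32_lane` is hypothetical, and the sketch assumes the target semantics are "a NaN in either operand produces NaN", which should be checked against the intrinsic being emulated.

```c
#include <math.h>

/* Hypothetical helper (not a SIMDe function): scalar minimum for one lane. */
static float min_f32_lane(float a, float b) {
#if defined(SIMDE_FAST_NANS)
  /* Fast path: the user has promised there are no NaNs, so a plain
   * comparison is enough. With a NaN input this would return b. */
  return (a < b) ? a : b;
#else
  /* Portable path: propagate NaN explicitly so that the result does
   * not depend on operand order (assumed target semantics). */
  if (isnan(a)) return a;
  if (isnan(b)) return b;
  return (a < b) ? a : b;
#endif
}
```

Without the explicit checks, `(a < b) ? a : b` returns `b` whenever either input is NaN, so the result depends on which operand the NaN is in.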
The division by zero thing is more interesting. This can be very tricky with floating-point operations; it's actually quite a bit worse than just different behavior on different platforms. On some platforms you can use CPU flags to control whether division by zero generates an exception. Some platforms share a register for FP env flags between SIMD and non-SIMD code, some don't.
Luckily, the issue here is integer division by zero. In C, integer division by zero is undefined behavior. I'd be interested to know how you checked what x86 did; the only integer division functions (_mm_div_epi32, etc.) are in SVML and don't actually correspond to individual instructions in hardware, but if you just did a / on a couple of vectors (or scalars) then the operation would be undefined.
The question here is whether division by 0 is defined in the RISC-V ISA, or if it's undefined and the implementation you're seeing happens to yield ~0. I'm honestly not sure where exactly the formal specification for the instructions is, but https://github.com/riscv-software-src/riscv-isa-sim/blob/master/riscv/insns/vdiv_vv.h seems to point to it being defined (which, honestly, is what I would expect).
Given that, I think the right solution here would probably be to add a SIMDE_FAST_IDIV0 macro (or something similar). If it is not defined, do the check; if it is defined, skip the check.
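A minimal sketch of what that could look like for one unsigned lane is below. `SIMDE_FAST_IDIV0` is only the name proposed here and does not exist in SIMDe at the time of writing; `div_u32_lane` is a hypothetical helper used for illustration.

```c
#include <stdint.h>

/* Hypothetical helper (not a SIMDe function): one lane of an unsigned
 * divide, gated on the proposed SIMDE_FAST_IDIV0 macro. */
static uint32_t div_u32_lane(uint32_t a, uint32_t b) {
#if defined(SIMDE_FAST_IDIV0)
  /* Fast path: the user has opted out of the check and promises that
   * b is never 0; dividing by 0 here is undefined behavior in C. */
  return a / b;
#else
  /* Default path: match the AArch64 result of 0 for division by 0. */
  return (b != 0) ? (a / b) : UINT32_C(0);
#endif
}
```

This would follow the same pattern as the existing fast-math style macros: the portable, checked behavior is the default, and users who know their divisors are never zero can define the macro to drop the branch.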