Code:
#include <stddef.h>
#include <riscv_vector.h>

void f (void * restrict in, void * restrict out, int n, int cond)
{
  size_t vl = 101;
  /* Loop 1: e8/mf8 unit-stride copy. */
  for (size_t i = 0; i < n; i++)
    {
      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
      __riscv_vse8_v_i8mf8 (out + i, v, vl);
    }
  /* Loop 2: e32/mf2 loads, the second with the tail-undisturbed (_tu) policy,
     followed by an indexed store.  */
  for (size_t i = 0; i < n; i++)
    {
      vuint8mf8_t index = __riscv_vle8_v_u8mf8 (in + i + 300, vl);
      vfloat32mf2_t v = __riscv_vle32_v_f32mf2 (in + i + 600, vl);
      v = __riscv_vle32_v_f32mf2_tu (v, in + i + 800, vl);
      __riscv_vsoxei8_v_f32mf2 (out + i + 200, index, v, vl);
    }
}
By default, GCC fuses the VTYPE and policy settings of vsetvli instructions whenever they are compatible.
I believe that in most cases the GCC codegen is better.
However, for some vendor RVV CPUs that have vector register renaming
and a special vsetvli optimization (vsetvli execution costs almost 0 cycles most of the time),
I believe the following codegen is better:
void f (void * restrict in, void * restrict out, int n, int cond)
{
  size_t vl = 101;
  for (size_t i = 0; i < n; i++)
    {
      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
      __riscv_vse8_v_i8mf8 (out + i, v, vl);
    }
  for (size_t i = 0; i < n; i++)
    {
      vuint8mf8_t index = __riscv_vle8_v_u8mf8 (in + i + 300, vl);
      vfloat32mf2_t v = __riscv_vle32_v_f32mf2 (in + i + 600, vl);
      __riscv_vsoxei8_v_f32mf2 (out + i + 200, index, v, vl);
    }
}
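To illustrate what "compatible" VTYPE means for the two loops above, here is a minimal sketch in plain C. It is not GCC's implementation; the struct and function names are made up for this example. It assumes the usual rule that loads and stores with an explicit element width (vle8, vle32, ...) only depend on the SEW/LMUL ratio of the current vtype, so e8/mf8 and e32/mf2 (both ratio 64) can share one vsetvli.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the SEW/LMUL part of a vtype demand.
   LMUL is stored as a fraction: mf8 = 1/8, m2 = 2/1.  */
struct vtype_demand
{
  unsigned sew;      /* element width in bits: 8, 16, 32 or 64 */
  unsigned lmul_num; /* LMUL numerator */
  unsigned lmul_den; /* LMUL denominator */
};

/* SEW/LMUL ratio.  Two vtypes with the same ratio give the same VLMAX
   for a given VLEN.  */
static unsigned
sew_lmul_ratio (struct vtype_demand d)
{
  assert (d.lmul_num != 0);
  return d.sew * d.lmul_den / d.lmul_num;
}

/* Assumption: both instructions only care about the ratio (true for
   loads/stores that encode their own element width).  */
static bool
ratio_compatible (struct vtype_demand a, struct vtype_demand b)
{
  return sew_lmul_ratio (a) == sew_lmul_ratio (b);
}
```

With these definitions, `{8, 1, 8}` (e8/mf8) and `{32, 1, 2}` (e32/mf2) both have ratio 64 and are reported as compatible, while `{32, 1, 1}` (e32/m1) has ratio 32 and is not.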
However, policy fusion is not always optimal. Is it reasonable to add a compile option (-mprefer-agnostic) to disable tail policy and mask policy
fusion in vsetvli?
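To make the question concrete, here is a rough sketch in plain C of the decision such an option would change. It is only a model, not GCC's vsetvli pass: the enums, function names and the `prefer_agnostic` parameter (standing in for the proposed -mprefer-agnostic) are all hypothetical. It assumes that an instruction without a policy suffix accepts either setting, because "agnostic" also permits leaving the elements undisturbed, while a _tu or _mu instruction requires undisturbed.

```c
#include <assert.h>
#include <stdbool.h>

/* What an instruction requires of the tail (or mask) policy.  */
enum policy_demand
{
  DEMAND_ANY,        /* no policy suffix: agnostic or undisturbed both work */
  DEMAND_UNDISTURBED /* _tu / _mu: the old elements must be preserved */
};

/* What a vsetvli actually sets.  */
enum policy_setting { SET_AGNOSTIC, SET_UNDISTURBED };

/* A setting is correct for a demand unless the demand needs undisturbed
   and the setting is agnostic.  */
static bool
policy_satisfies (enum policy_setting s, enum policy_demand d)
{
  return d == DEMAND_ANY || s == SET_UNDISTURBED;
}

/* Policy a single fused vsetvli would have to use for both instructions.  */
static enum policy_setting
fused_setting (enum policy_demand a, enum policy_demand b)
{
  enum policy_setting s
    = (a == DEMAND_UNDISTURBED || b == DEMAND_UNDISTURBED)
      ? SET_UNDISTURBED : SET_AGNOSTIC;
  assert (policy_satisfies (s, a) && policy_satisfies (s, b));
  return s;
}

/* May two neighbouring instructions share one vsetvli as far as the
   policy is concerned?  Without the flag, any pair can be fused.  With
   the flag, a pair with different demands is kept apart, so the
   agnostic instruction is not forced to run as undisturbed.  */
static bool
can_fuse_policy (enum policy_demand a, enum policy_demand b,
                 bool prefer_agnostic)
{
  return a == b || !prefer_agnostic;
}
```

For the plain vle32 followed by vle32 with _tu in the first example, the default behaviour modelled here fuses them under an undisturbed setting; with `prefer_agnostic` set, the plain load keeps its own agnostic vsetvli.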
Thanks
That is highly dependent on the uarch, so I would prefer to tie it to -mtune, like GCC's other cost models. Still, I think it's harmless to add the option to GCC first to see whether it's useful, then implement it in LLVM, and then document the option here.
Personally, I would prefer not to document those optimization options in this repo, since those flags are compiler-dependent, and to document only the necessary common interface here, such as -march, -mabi and -mcmodel.
Agreed. This is going to be dependent on multiple features of the uarch.
So I think the question is whether or not any such implementations exist or will exist in the near future. If not, then let's not complicate things right now. If it looks like such architectures are on the horizon, then we might as well be prepared for them.
Consider the following case:
https://godbolt.org/z/oTWvrsGhE
I think fusing VTYPE is always optimal, for example:
https://godbolt.org/z/dfx93jzrv
code:
optimal codegen: