I tested the latest code on an MI210 and got 0.36 TOPS.
If I remove #pragma unroll, I get 0.88 TOPS.
Compared to the 4.3 TOPS on an NVIDIA A100, that is only about 20%.
Considering that the A100 and MI210 have similar FP32 capability, is this reasonable, or is there still room to improve MI210 performance?
Hi Alice, thanks for looking into this, and sorry about the extremely late response.
We have been noticing poor performance on MI250X compared to A100. We are not yet sure what is causing this... We will let you know if we figure out what is causing our performance loss.
configs:
Lattice = 64 64 64 20
SIMULATeQCD code: main branch, git commit 2022 Aug 2, 551ccca
-DARCHITECTURE="gfx90a" // for MI210
mpiexec -np 1 --allow-run-as-root ./MultiRHSProf (8 RHS)
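For reference, the ratios behind the "only 20%" figure can be checked with a few lines of Python. This is just a sketch of the arithmetic using the three throughput numbers quoted in this thread; the variable and function names are made up for illustration and are not part of SIMULATeQCD.

```python
# Throughput figures quoted in this thread, in TOPS.
MI210_WITH_UNROLL = 0.36     # MI210, code as-is
MI210_WITHOUT_UNROLL = 0.88  # MI210, unroll pragma removed
A100_REFERENCE = 4.3         # NVIDIA A100


def relative_throughput(measured: float, reference: float) -> float:
    """Return measured throughput as a fraction of the reference."""
    if reference <= 0:
        raise ValueError("reference throughput must be positive")
    return measured / reference


# Fraction of A100 throughput reached in each MI210 configuration.
ratio_with_unroll = relative_throughput(MI210_WITH_UNROLL, A100_REFERENCE)
ratio_without_unroll = relative_throughput(MI210_WITHOUT_UNROLL, A100_REFERENCE)

# Speedup on the MI210 itself from removing the unroll pragma.
speedup_from_removing_unroll = MI210_WITHOUT_UNROLL / MI210_WITH_UNROLL

print(f"MI210 with unroll vs A100:    {ratio_with_unroll:.1%}")
print(f"MI210 without unroll vs A100: {ratio_without_unroll:.1%}")
print(f"Speedup from removing unroll: {speedup_from_removing_unroll:.2f}x")
```

So the 20% refers to the faster MI210 run (without the unroll pragma); with the pragma in place the gap to the A100 is larger still.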