Running t_mesplq with the default local volume (4x2x2x4) produces non-constant results when run repeatedly. The problem goes away if the number of threads is less than the number of sites.
Several fixes are possible:
Simplest: fix qdp_dispatch so that only threads with ID < numSites dispatch.
Longer term: since OpenMP is now the norm, the dispatch function could be recoded:
instead of passing low/high site indices, the loops could be hoisted out of the dispatched functions into the dispatcher with an OMP for.
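A minimal sketch of the longer-term option, assuming a dispatcher that owns the site loop. The names `dispatch_sites` and `scale_field` are hypothetical and are not taken from QDP++; the real `qdp_dispatch` interface may look quite different.

```cpp
#include <vector>

// Hypothetical dispatcher: the site loop lives here rather than inside
// the dispatched function, so OpenMP divides the iteration space itself.
// When there are more threads than sites, the surplus threads simply
// receive no iterations.
template <class SiteFn>
void dispatch_sites(int numSites, SiteFn fn) {
    #pragma omp parallel for
    for (int site = 0; site < numSites; ++site) {
        fn(site);  // per-site work, no low/high bounds to compute
    }
}

// Illustrative caller: scales every site of a toy "field" by a factor.
std::vector<double> scale_field(std::vector<double> field, double factor) {
    double* data = field.data();
    dispatch_sites(static_cast<int>(field.size()),
                   [data, factor](int site) { data[site] *= factor; });
    return field;
}
```

With this shape the per-site function never sees a thread ID, so there is no low/high range that could come out wrong for an idle thread.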
}
construct. Thus, if there were more threads than sites, some would have tried to update the seed with their original wrong value. I've now guarded this with
int myId = omp_get_thread_num();
if ( myId < nodeSites ) {
  // only active threads update seed
  #pragma omp critical
  {
    // update seed
  }
}
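The same guard applied to a sum might look like the sketch below. It is an illustration only: `guarded_sum` and the block partition are assumptions, not code from qdp_parscalar_specific.h. The `_OPENMP` checks let it build without OpenMP as well.

```cpp
#include <algorithm>
#include <complex>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: sums per-site values using a guarded critical section.
std::complex<double> guarded_sum(const std::vector<std::complex<double>>& sites) {
    const int nodeSites = static_cast<int>(sites.size());
    std::complex<double> total(0.0, 0.0);

    #pragma omp parallel
    {
#ifdef _OPENMP
        const int myId = omp_get_thread_num();
        const int nThreads = omp_get_num_threads();
#else
        const int myId = 0;      // serial fallback when built without OpenMP
        const int nThreads = 1;
#endif
        // Assumed block partition; threads past the last site get lo == hi.
        const int chunk = (nodeSites + nThreads - 1) / nThreads;
        const int lo = std::min(myId * chunk, nodeSites);
        const int hi = std::min(lo + chunk, nodeSites);

        std::complex<double> partial(0.0, 0.0);
        for (int s = lo; s < hi; ++s) {
            partial += sites[s];
        }

        if (myId < nodeSites) {
            // only active threads contribute to the shared total
            #pragma omp critical
            {
                total += partial;
            }
        }
    }
    return total;
}
```

In this sketch the partial is zero-initialized, so the guard only spares idle threads a trip through the critical section. The case described above, where an idle thread still holds a stale value, is the one where the guard matters for correctness.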
I have replicated this pattern throughout qdp_parscalar_specific.h wherever critical occurs in a similar situation (mostly on sums/inner products).
NB: In case the question comes up: why not use OMP reductions for the sums? The answer is that we are summing into a complex type. I can experiment with using OMP reductions for that later.
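For reference, one way such an experiment could start is a user-defined reduction (available since OpenMP 4.0). This is a sketch under assumptions: it uses `std::complex<double>` as a stand-in for the QDP++ complex type, and `cplx_add` and `reduce_sum` are made-up names.

```cpp
#include <complex>
#include <vector>

using Cplx = std::complex<double>;

// User-defined reduction: tells OpenMP how to combine two partial sums
// and how to initialize each thread's private copy.
#pragma omp declare reduction(cplx_add : Cplx : omp_out += omp_in) \
    initializer(omp_priv = Cplx(0.0, 0.0))

// Hypothetical helper: sums per-site values with the reduction above.
Cplx reduce_sum(const std::vector<Cplx>& sites) {
    Cplx total(0.0, 0.0);
    const int n = static_cast<int>(sites.size());
    #pragma omp parallel for reduction(cplx_add : total)
    for (int s = 0; s < n; ++s) {
        total += sites[s];
    }
    return total;
}
```

Because each thread's private copy starts from the initializer, threads that receive no iterations contribute zero, so no thread-ID guard is needed. The order in which partial sums are combined is still unspecified, so floating-point results may differ in the last bits from run to run.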
Since I am doing this in a devel branch, I won't mark it closed just yet.