Sets, subsets in parscalarvec and masked operations #2

bjoo · 2013-09-10T18:26:21Z

The way unordered subsets are managed in QDP++ is cumbersome for parscalarvec.

The subset is represented by a site table of 'linear site' indices. This is cumbersome to thread and vectorize. As an example, consider how we would currently do a sum over an unordered subset as of commit:

We do a loop over sites in the subset (this can be parallelized over threads, but see the caveats below)
We must find the block for the site
We do redundant operations (we compute the whole block)
We sum only one site from the block (actually I've generalized this to summation under a mask, but the mask has only 1 true element)
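
To make the steps above concrete, here is a minimal sketch of the site-by-site sum. It is not the QDP++ code: `W`, `Block`, `Mask`, `evalBlock`, `sumUnderMask` and `sumSiteTable` are hypothetical stand-ins, and the layout (linear site `s` lives in outer block `s / W`, inner lane `s % W`) is an assumption for illustration only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins, not QDP++ types: W inner sites per outer block.
constexpr std::size_t W = 4;
using Block = std::array<double, W>;
using Mask  = std::array<bool, W>;

// Assumed layout: linear site s lives in outer block s / W, inner lane s % W.
inline std::size_t outerOf(std::size_t s) { return s / W; }
inline std::size_t innerOf(std::size_t s) { return s % W; }

// Stand-in for evaluating an expression on a whole block (all W lanes).
inline Block evalBlock(const std::vector<Block>& f, std::size_t o) {
    Block r;
    for (std::size_t i = 0; i < W; ++i) r[i] = 2.0 * f[o][i];
    return r;
}

// Sum the lanes of one block that are switched on in the mask.
inline double sumUnderMask(const Block& b, const Mask& m) {
    double s = 0.0;
    for (std::size_t i = 0; i < W; ++i) s += m[i] ? b[i] : 0.0;
    return s;
}

// Site-by-site sum: one full block evaluation per entry in the site table.
double sumSiteTable(const std::vector<Block>& f,
                    const std::vector<std::size_t>& sites,
                    std::size_t* blockEvals = nullptr) {
    double total = 0.0;
    std::size_t evals = 0;
    for (std::size_t s : sites) {
        const std::size_t o = outerOf(s);   // find the block for the site
        assert(o < f.size());
        const Block b = evalBlock(f, o);    // computes all W lanes
        ++evals;
        Mask m{};                           // all lanes false
        m[innerOf(s)] = true;               // exactly one true element
        total += sumUnderMask(b, m);
    }
    if (blockEvals) *blockEvals = evals;
    return total;
}
```

With sites 0 and 2 in the table, outer block 0 is evaluated twice in this sketch even though a single evaluation would cover both lanes.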

This has several inefficiencies:
i) redundant computation within a thread if more than one site in the same outer block belongs to that thread. This also brings some additional memory traffic, though it may be OK (ugh) if the repeatedly accessed memory stays in cache.

ii) potentially redundant computation carried out in several threads, if sites in the same outer block are scheduled to different threads. This will also duplicate memory traffic and may cause memory ping-ponging.

A natural table for parscalarvec would split into two tables:

a table of 'outer blocks' in the subset
for each 'outer block' a table of inner sites in the subset, or a mask

This latter approach would allow multi-threading over the outer blocks, and vectorization (under mask) for the ILattice bits.
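
A rough sketch of what the two-table form and a sum over it might look like. `BlockedSubset` and `sumBlocked` are hypothetical names, not existing QDP++ types, and the per-block table is stored here as a mask (the inner-site-table variant would look similar).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins, not QDP++ types: W inner sites per outer block.
constexpr std::size_t W = 4;
using Block = std::array<double, W>;
using Mask  = std::array<bool, W>;

// Hypothetical two-level subset table.
struct BlockedSubset {
    std::vector<std::size_t> outer;  // outer blocks with at least one site in the subset
    std::vector<Mask> mask;          // mask[k][i]: is inner site i of outer[k] in the subset?
};

// Sum over the subset: each outer block is visited exactly once.
double sumBlocked(const std::vector<Block>& f, const BlockedSubset& sub) {
    assert(sub.outer.size() == sub.mask.size());
    double total = 0.0;
    // Iterations over k are independent, so this loop is the candidate for
    // threading (e.g. an OpenMP parallel for with a + reduction on total).
    for (std::size_t k = 0; k < sub.outer.size(); ++k) {
        assert(sub.outer[k] < f.size());
        const Block& b = f[sub.outer[k]];
        const Mask&  m = sub.mask[k];
        double partial = 0.0;
        // Fixed trip count with a select instead of a branch: the shape a
        // compiler can turn into a masked vector operation.
        for (std::size_t i = 0; i < W; ++i) partial += m[i] ? b[i] : 0.0;
        total += partial;
    }
    return total;
}
```

Because every outer block appears once in `outer`, no block is computed twice and no two threads touch the same block, which addresses both (i) and (ii) in this sketch.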

However, creating the tables from the 'site' table is like histogramming (go through the sites and 'bin' them into 'outer blocks'), which can be hard to parallelize because of write contention on the bins. For sets like rb, all, etc. this is not a biggie, as it can be done at startup and amortized. However, it can be a real cost for SftMom in chroma, which creates sets 'on the fly', and for user-defined sets/subsets that are created on the fly.
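
For reference, a serial version of that binning could look roughly like the sketch below. All names (`BlockedSubset`, `buildBlocked`) are hypothetical, and the layout is again assumed to be outer block `s / W`, inner lane `s % W`. The comment marks the write that a threaded version would have to protect.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins, not QDP++ types: W inner sites per outer block.
constexpr std::size_t W = 4;
using Mask = std::array<bool, W>;

// Hypothetical two-level subset table.
struct BlockedSubset {
    std::vector<std::size_t> outer;  // outer blocks touched by the subset
    std::vector<Mask> mask;          // mask[k]: inner sites of outer[k] in the subset
};

// Serial binning of a linear site table into outer blocks plus masks.
// Assumed layout: site s lives in outer block s / W, inner lane s % W.
BlockedSubset buildBlocked(const std::vector<std::size_t>& sites,
                           std::size_t nOuter) {
    const std::size_t none = static_cast<std::size_t>(-1);
    std::vector<std::size_t> slot(nOuter, none);  // outer block -> index into sub.outer
    BlockedSubset sub;
    for (std::size_t s : sites) {
        const std::size_t o = s / W;
        assert(o < nOuter);
        if (slot[o] == none) {            // first site seen in this outer block
            slot[o] = sub.outer.size();
            sub.outer.push_back(o);
            sub.mask.push_back(Mask{});   // all lanes false
        }
        // This is the 'bin' update: sites from different threads landing in
        // the same outer block would contend on this write.
        sub.mask[slot[o]][s % W] = true;
    }
    return sub;
}
```

The outer blocks come out in order of first appearance in the site table. One alternative that sidesteps the contention is to sort the site table first, so that sites of the same outer block are adjacent and can be grouped in a single pass, at the cost of the sort.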

Thoughts anyone?


bjoo pushed a commit that referenced this issue Feb 6, 2015

Merge pull request #2 from azrael417/debug

cb93e00

Debug
