Commit
Updated description in section 6 to remove ambiguity. Closes HPCE#6.
m8pple committed Feb 6, 2015
1 parent d4f9ee9 commit c4c2eba
Showing 1 changed file with 15 additions and 2 deletions.
17 changes: 15 additions & 2 deletions readme.md
@@ -732,8 +732,8 @@ task based version, so there is only one kind of parallelism.

### Apply the loop transformation

-Apply the loop transformation described above,
-without introducing any parallelism, and check it works
+First apply the loop transformation described above,
+_without_ introducing any parallelism, and check it works
with various values of K, via the environment variable
`HPCE_FFT_LOOP_K`.
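
As an illustration of the environment-variable step, here is a minimal sketch of reading `HPCE_FFT_LOOP_K` with a fallback. The helper name `loop_k_from_env` and the fallback handling are assumptions made for this example and are not part of the coursework code.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper (not from the coursework): returns the chunk size K
// given by HPCE_FFT_LOOP_K, or `fallback` if the variable is unset, empty,
// not a plain decimal number, or zero.
std::size_t loop_k_from_env(std::size_t fallback)
{
    const char *raw = std::getenv("HPCE_FFT_LOOP_K");
    if (raw == nullptr || *raw == '\0') {
        return fallback; // variable not set, or set to an empty string
    }
    if (*raw < '0' || *raw > '9') {
        return fallback; // reject signs, whitespace, and other prefixes
    }
    char *end = nullptr;
    unsigned long long value = std::strtoull(raw, &end, 10);
    if (*end != '\0' || value == 0) {
        return fallback; // trailing junk, or K == 0 which cannot chunk a loop
    }
    return static_cast<std::size_t>(value);
}
```

Running the same binary with, say, `HPCE_FFT_LOOP_K=1`, `HPCE_FFT_LOOP_K=16`, and the variable unset is one way to check that the transformed loop behaves identically across chunk sizes.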

@@ -743,6 +743,19 @@ recursion. A simple solution is to use a guarded
version, such that if m <= K the original code is used,
and if m > K the new code is used.
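
The guard can be sketched as follows. This is not the FFT loop itself: the element-wise scaling body and the names `scale_flat`, `scale_chunked`, and `scale_guarded` are stand-ins invented for the example, and it assumes K >= 1 and that K divides m exactly whenever m > K (as would hold if both are powers of two).

```cpp
#include <cstddef>
#include <vector>

// Stand-in for the original code: one flat loop over [0, m).
void scale_flat(std::vector<double> &x, std::size_t m, double w)
{
    for (std::size_t j = 0; j < m; j++) {
        x[j] *= w;
    }
}

// Chunked form: the outer loop steps over chunks of K iterations and the
// inner loop walks within one chunk. Assumes K >= 1 and m % K == 0.
void scale_chunked(std::vector<double> &x, std::size_t m, double w, std::size_t K)
{
    for (std::size_t j0 = 0; j0 < m; j0 += K) {
        for (std::size_t j = j0; j < j0 + K; j++) {
            x[j] *= w;
        }
    }
}

// Guarded version: small problems (m <= K) keep the original loop,
// larger ones (m > K) use the chunked loop.
void scale_guarded(std::vector<double> &x, std::size_t m, double w, std::size_t K)
{
    if (m <= K) {
        scale_flat(x, m, w);
    } else {
        scale_chunked(x, m, w, K);
    }
}
```

Because the guard sends every m <= K case to the untouched code, the chunked path never sees a range shorter than one chunk, which is the situation that arises as the recursion shrinks m.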

+Once you have got it working with a non parallel chunked
+loop, replace the outer loop with a `parallel_for` loop
+using `simple_partitioner`, and check that it still works
+for different values of `HPCE_FFT_LOOP_K`. You will
+probably not see much speed-up here, as the dominant
+cost tends to be the recursive part.
+
+_Note: edited to make the instructions clearer, as
+@bwh10 correctly pointed out it [was ambiguous](https://github.com/HPCE/hpce_2014_cw3/issues/4).
+The intent is for people to get the chunking working first
+in a sequential context, then to add the parallelism (the
+first part is more complex, the second part is easy)._
+
As before, if `HPCE_FFT_LOOP_K` is not set, choose a sensible
default based on your analysis of the scaling with n, and/or
experiments. Though remember, it should be a sensible default
