Commit

build based on 7fc5ad5
Documenter.jl committed Feb 2, 2024
1 parent 14b1d47 commit 025b452
Showing 9 changed files with 50 additions and 23 deletions.
2 changes: 1 addition & 1 deletion dev/.documenter-siteinfo.json
Original file line number Diff line number Diff line change
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.10.0","generation_timestamp":"2024-02-01T19:36:19","documenter_version":"1.2.1"}}
+{"documenter":{"julia_version":"1.10.0","generation_timestamp":"2024-02-02T10:33:03","documenter_version":"1.2.1"}}
2 changes: 1 addition & 1 deletion dev/examples/juliaset/juliaset/index.html
@@ -52,4 +52,4 @@
63.707 ms (39 allocations: 3.30 KiB)
</code></pre><p>As hoped, the parallel implementation is faster. But can we improve the performance further?</p><h3 id="Tuning-nchunks"><a class="docs-heading-anchor" href="#Tuning-nchunks">Tuning <code>nchunks</code></a><a id="Tuning-nchunks-1"></a><a class="docs-heading-anchor-permalink" href="#Tuning-nchunks" title="Permalink"></a></h3><p>As stated above, the per-pixel computation is non-uniform. Hence, we might benefit from load balancing. The simplest way to get it is to increase <code>nchunks</code> to a value larger than <code>nthreads</code>. This divides the overall workload into smaller tasks that can be dynamically distributed among threads (by Julia&#39;s scheduler) to balance the per-thread load.</p><pre><code class="language-julia hljs">@btime compute_juliaset_parallel!($img; schedule=:dynamic, nchunks=N) samples=10 evals=3;</code></pre><pre><code class="nohighlight hljs"> 32.000 ms (12013 allocations: 1.14 MiB)
</code></pre><p>Note that if we opt out of dynamic scheduling and set <code>schedule=:static</code>, this strategy doesn&#39;t help anymore (because chunks are naively distributed up front).</p><pre><code class="language-julia hljs">@btime compute_juliaset_parallel!($img; schedule=:static, nchunks=N) samples=10 evals=3;</code></pre><pre><code class="nohighlight hljs"> 63.439 ms (42 allocations: 3.37 KiB)
</code></pre><hr/><p><em>This page was generated using <a href="https://github.com/fredrikekre/Literate.jl">Literate.jl</a>.</em></p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../../mc/mc/">« Parallel Monte Carlo</a><a class="docs-footer-nextpage" href="../../../refs/api/">Public API »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="auto">Automatic (OS)</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.2.1 on <span class="colophon-date" title="Thursday 1 February 2024 19:36">Thursday 1 February 2024</span>. Using Julia version 1.10.0.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
</code></pre><hr/><p><em>This page was generated using <a href="https://github.com/fredrikekre/Literate.jl">Literate.jl</a>.</em></p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../../mc/mc/">« Parallel Monte Carlo</a><a class="docs-footer-nextpage" href="../../../refs/api/">Public API »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="auto">Automatic (OS)</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.2.1 on <span class="colophon-date" title="Friday 2 February 2024 10:33">Friday 2 February 2024</span>. Using Julia version 1.10.0.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
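The load-balancing idea described in the "Tuning nchunks" text of this hunk can be sketched without OhMyThreads. This is a minimal illustration only, assuming `Base.Threads.@spawn` and `Iterators.partition` as stand-ins for the package's chunking and its `schedule=:dynamic` option; `work` and `chunked_sum` are hypothetical names that do not appear in the commit.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical non-uniform workload: the cost of work(i) grows with i.
work(i) = sum(sqrt(k) for k in 1:i)

# Split 1:n into roughly `nchunks` pieces and spawn one task per piece.
# With more chunks than threads, Julia's scheduler can hand the remaining
# chunks to whichever thread becomes free first.
function chunked_sum(n; nchunks = 4 * nthreads())
    chunksize = cld(n, nchunks)          # ceiling division, at least 1 for n >= 1
    tasks = map(Iterators.partition(1:n, chunksize)) do idcs
        @spawn sum(work, idcs)
    end
    return sum(fetch, tasks)
end

total = chunked_sum(100)
```

With `nchunks` equal to `nthreads()`, a single expensive chunk can leave the other threads idle. More, smaller chunks give the scheduler room to even that out, at the cost of extra task overhead (compare the higher allocation count in the dynamic benchmark output in this hunk).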
14 changes: 13 additions & 1 deletion dev/examples/mc/mc.jl
@@ -65,7 +65,7 @@ using Base.Threads: nthreads

 using OhMyThreads: @spawn

-function mc_parallel_manual(N; nchunks=nthreads())
+function mc_parallel_manual(N; nchunks = nthreads())
     tasks = map(chunks(1:N; n = nchunks)) do idcs # TODO: replace by `tmap` once ready
         @spawn mc(length(idcs))
     end
@@ -78,3 +78,15 @@ mc_parallel_manual(N)
 # And this is the performance:

 @btime mc_parallel_manual($N) samples=10 evals=3;
+
+# It is faster than `mc_parallel` above because the task-local computation
+# `mc(length(idcs))` is faster than the implicit task-local computation within
+# `tmapreduce` (which itself is a `mapreduce`).
+
+idcs = first(chunks(1:N; n = nthreads()))
+
+@btime mapreduce($+, $idcs) do i
+    rand()^2 + rand()^2 < 1.0
+end samples=10 evals=3;
+
+@btime mc($(length(idcs))) samples=10 evals=3;
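
The `mc` function itself lies outside the hunks shown here. As a rough sketch of the comparison being benchmarked, assuming `mc` is a plain counting loop that returns an estimate of π, the two task-local variants might look like this. The names `mc_loop_sketch` and `mc_mapreduce_sketch` are hypothetical and not taken from the repository.

```julia
# Hypothetical stand-in for `mc`: throw N darts and count hits in an explicit loop.
function mc_loop_sketch(N)
    M = 0
    for _ in 1:N
        if rand()^2 + rand()^2 < 1.0
            M += 1
        end
    end
    return 4 * M / N
end

# The same per-chunk work written as a `mapreduce`, mirroring the kind of
# computation the added comment attributes to `tmapreduce`.
function mc_mapreduce_sketch(N)
    M = mapreduce(+, 1:N) do i
        rand()^2 + rand()^2 < 1.0   # Bool; `+` sums the hits as integers
    end
    return 4 * M / N
end

estimate_loop = mc_loop_sketch(100_000)
estimate_mapreduce = mc_mapreduce_sketch(100_000)
```

Both functions estimate the same quantity, so any timing difference between them comes from how the loop is expressed rather than from the amount of work.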
22 changes: 15 additions & 7 deletions dev/examples/mc/mc/index.html
@@ -12,7 +12,7 @@

N = 100_000_000

mc(N)</code></pre><pre><code class="nohighlight hljs">3.14169568</code></pre><h2 id="Parallelization-with-tmapreduce"><a class="docs-heading-anchor" href="#Parallelization-with-tmapreduce">Parallelization with <code>tmapreduce</code></a><a id="Parallelization-with-tmapreduce-1"></a><a class="docs-heading-anchor-permalink" href="#Parallelization-with-tmapreduce" title="Permalink"></a></h2><p>To parallelize the Monte Carlo simulation, we use <a href="../../../refs/api/#OhMyThreads.tmapreduce"><code>tmapreduce</code></a> with <code>+</code> as the reduction operator. For the map part, we take <code>1:N</code> as our input collection and &quot;throw one dart&quot; per element.</p><pre><code class="language-julia hljs">using OhMyThreads
mc(N)</code></pre><pre><code class="nohighlight hljs">3.141517</code></pre><h2 id="Parallelization-with-tmapreduce"><a class="docs-heading-anchor" href="#Parallelization-with-tmapreduce">Parallelization with <code>tmapreduce</code></a><a id="Parallelization-with-tmapreduce-1"></a><a class="docs-heading-anchor-permalink" href="#Parallelization-with-tmapreduce" title="Permalink"></a></h2><p>To parallelize the Monte Carlo simulation, we use <a href="../../../refs/api/#OhMyThreads.tmapreduce"><code>tmapreduce</code></a> with <code>+</code> as the reduction operator. For the map part, we take <code>1:N</code> as our input collection and &quot;throw one dart&quot; per element.</p><pre><code class="language-julia hljs">using OhMyThreads

function mc_parallel(N)
M = tmapreduce(+, 1:N) do i
@@ -22,25 +22,33 @@
return pi
end

mc_parallel(N)</code></pre><pre><code class="nohighlight hljs">3.14169096</code></pre><p>Let&#39;s run a quick benchmark.</p><pre><code class="language-julia hljs">using BenchmarkTools
mc_parallel(N)</code></pre><pre><code class="nohighlight hljs">3.14159924</code></pre><p>Let&#39;s run a quick benchmark.</p><pre><code class="language-julia hljs">using BenchmarkTools
using Base.Threads: nthreads

@assert nthreads() &gt; 1 # make sure we have multiple Julia threads
@show nthreads() # print out the number of threads

@btime mc($N) samples=10 evals=3;
@btime mc_parallel($N) samples=10 evals=3;</code></pre><pre><code class="nohighlight hljs">nthreads() = 5
317.421 ms (0 allocations: 0 bytes)
88.188 ms (37 allocations: 3.02 KiB)
318.467 ms (0 allocations: 0 bytes)
88.553 ms (37 allocations: 3.02 KiB)
</code></pre><h2 id="Manual-parallelization"><a class="docs-heading-anchor" href="#Manual-parallelization">Manual parallelization</a><a id="Manual-parallelization-1"></a><a class="docs-heading-anchor-permalink" href="#Manual-parallelization" title="Permalink"></a></h2><p>First, using the <code>chunks</code> function, we divide the iteration interval <code>1:N</code> into <code>nthreads()</code> parts. Then, we apply a regular (sequential) <code>map</code> to spawn a Julia task per chunk. Each task will locally and independently perform a sequential Monte Carlo simulation. Finally, we fetch the results and compute the average estimate for <span>$\pi$</span>.</p><pre><code class="language-julia hljs">using OhMyThreads: @spawn

function mc_parallel_manual(N; nchunks=nthreads())
function mc_parallel_manual(N; nchunks = nthreads())
tasks = map(chunks(1:N; n = nchunks)) do idcs # TODO: replace by `tmap` once ready
@spawn mc(length(idcs))
end
pi = sum(fetch, tasks) / nchunks
return pi
end

mc_parallel_manual(N)</code></pre><pre><code class="nohighlight hljs">3.1414561999999995</code></pre><p>And this is the performance:</p><pre><code class="language-julia hljs">@btime mc_parallel_manual($N) samples=10 evals=3;</code></pre><pre><code class="nohighlight hljs"> 63.512 ms (31 allocations: 2.80 KiB)
</code></pre><hr/><p><em>This page was generated using <a href="https://github.com/fredrikekre/Literate.jl">Literate.jl</a>.</em></p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../../../">« OhMyThreads</a><a class="docs-footer-nextpage" href="../../juliaset/juliaset/">Julia Set »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="auto">Automatic (OS)</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.2.1 on <span class="colophon-date" title="Thursday 1 February 2024 19:36">Thursday 1 February 2024</span>. Using Julia version 1.10.0.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
mc_parallel_manual(N)</code></pre><pre><code class="nohighlight hljs">3.1415844</code></pre><p>And this is the performance:</p><pre><code class="language-julia hljs">@btime mc_parallel_manual($N) samples=10 evals=3;</code></pre><pre><code class="nohighlight hljs"> 63.825 ms (31 allocations: 2.80 KiB)
</code></pre><p>It is faster than <code>mc_parallel</code> above because the task-local computation <code>mc(length(idcs))</code> is faster than the implicit task-local computation within <code>tmapreduce</code> (which itself is a <code>mapreduce</code>).</p><pre><code class="language-julia hljs">idcs = first(chunks(1:N; n = nthreads()))

@btime mapreduce($+, $idcs) do i
rand()^2 + rand()^2 &lt; 1.0
end samples=10 evals=3;

@btime mc($(length(idcs))) samples=10 evals=3;</code></pre><pre><code class="nohighlight hljs"> 87.617 ms (0 allocations: 0 bytes)
63.398 ms (0 allocations: 0 bytes)
</code></pre><hr/><p><em>This page was generated using <a href="https://github.com/fredrikekre/Literate.jl">Literate.jl</a>.</em></p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../../../">« OhMyThreads</a><a class="docs-footer-nextpage" href="../../juliaset/juliaset/">Julia Set »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="auto">Automatic (OS)</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.2.1 on <span class="colophon-date" title="Friday 2 February 2024 10:33">Friday 2 February 2024</span>. Using Julia version 1.10.0.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
13 changes: 10 additions & 3 deletions dev/examples/tomarkdown.sh
@@ -10,9 +10,16 @@ const repourl = "https://github.com/JuliaFolds2/OhMyThreads.jl/blob/main/docs"
 using Literate
 using Pkg

-dirs = filter(isdir, readdir())
-if length(ARGS) > 0
-    dirs = ARGS
+if length(ARGS) == 0
+    println("Error: Please provide the folder names of the examples you want to compile to markdown. " *
+            "Alternatively, you can pass \"all\" as the first argument to compile them all.")
+    exit()
+else
+    if first(ARGS) == "all"
+        dirs = filter(isdir, readdir())
+    else
+        dirs = ARGS
+    end
 end
 @show dirs
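
The new argument handling can also be expressed as a small function, which makes the three cases easy to check in isolation. This is only a sketch: `select_dirs` is a hypothetical helper, not part of `tomarkdown.sh`, and returning `nothing` for the error case is an assumption made for illustration.

```julia
# Hypothetical helper mirroring the three cases in the hunk above:
#   no arguments        -> nothing (caller should report an error)
#   first arg is "all"  -> every available directory
#   otherwise           -> exactly the directories that were passed
function select_dirs(args, available)
    if isempty(args)
        return nothing
    elseif first(args) == "all"
        return available
    else
        return args
    end
end

# Possible use inside a script (left as comments so the sketch has no side effects):
#   dirs = select_dirs(ARGS, filter(isdir, readdir()))
#   if dirs === nothing
#       println(stderr, "Error: no example folders given.")
#       exit(1)
#   end
```

One design point worth noting: the script in this commit calls `exit()`, which ends the process with status 0. A caller that wants the shell to see a failure could use `exit(1)` instead, as in the commented usage.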
