From 97b98047ea074ad04e8a098003d99b865e38125d Mon Sep 17 00:00:00 2001 From: George Bisbas Date: Fri, 30 Jun 2023 15:54:07 +0100 Subject: [PATCH] docs: Update for Custom topology --- FAQ.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/FAQ.md b/FAQ.md index ace1b72b400..258d45a5050 100644 --- a/FAQ.md +++ b/FAQ.md @@ -567,7 +567,7 @@ By default, Devito compiles the generated code using flags that maximize the run ## Can I control the MPI domain decomposition -Until Devito v3.5 included, domain decomposition occurs along the fastest axis. As of later versions, domain decomposition occurs along the slowest axis, for performance reasons. And yes, it is possible to control the domain decomposition in user code, but this is not neatly documented. Take a look at `test_custom_topology` in [this file](https://github.com/devitocodes/devito/blob/master/tests/test_mpi.py). In essence, `Grid` accepts the optional argument `topology`, which allows the user to pass a custom topology as an n-tuple, where `n` is the number of distributed dimensions. For example, for a two-dimensional grid, the topology `(4, 1)` will decompose the slowest axis into four partitions, one partition per MPI rank, while the fastest axis will be replicated over all MPI ranks. +Until Devito v3.5 included, domain decomposition occurs along the fastest axis. As of later versions, domain decomposition occurs along the slowest axis, for performance reasons. And yes, it is possible to control the domain decomposition in user code, but this is not neatly documented. Take a look at `class CustomTopology` in [distributed.py](https://github.com/devitocodes/devito/blob/master/devito/mpi/distributed.py) and `test_custom_topology` in [this file](https://github.com/devitocodes/devito/blob/master/tests/test_mpi.py). In essence, `Grid` accepts the optional argument `topology`, which allows the user to pass a custom topology as an n-tuple, where `n` is the number of distributed dimensions. For example, for a two-dimensional grid, the topology `(4, 1)` will decompose the slowest axis into four partitions, one partition per MPI rank, while the fastest axis will be replicated over all MPI ranks. [top](#Frequently-Asked-Questions)