Model initialisation for MPAS can take a long time for large meshes and large numbers of MPI tasks. A detailed profiling exercise has shown that most of this time is spent reading the METIS graph decomposition file on the master MPI task and scattering this information to all MPI tasks; that issue will be dealt with in a separate PR. The second-largest cost is the setup of the blocks and halos, more precisely the calls to mpas_dmpar_get_exch_list, which perform a large number of calls to mpas_binary_search.
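For reference, the sketch below shows the kind of pure, iterative binary search this hot path relies on: a stateless routine over a sorted integer array, which can safely be called concurrently from multiple threads. The interface and names are illustrative only, not those of the actual mpas_binary_search routine.

```fortran
module binary_search_mod
   implicit none
contains
   ! Pure, iterative binary search over a sorted (ascending) integer
   ! array; returns the index of key, or 0 if key is not present.
   ! Being pure and free of saved state is what makes concurrent
   ! calls from multiple OpenMP threads safe.
   pure function binary_search(array, n, key) result(indx)
      integer, intent(in) :: n
      integer, intent(in) :: array(n)   ! assumed sorted ascending
      integer, intent(in) :: key
      integer :: indx
      integer :: lo, hi, mid

      lo = 1
      hi = n
      indx = 0                          ! 0 signals "not found"
      do while (lo <= hi)
         mid = lo + (hi - lo) / 2
         if (array(mid) == key) then
            indx = mid
            return
         else if (array(mid) < key) then
            lo = mid + 1
         else
            hi = mid - 1
         end if
      end do
   end function binary_search
end module binary_search_mod
```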
Adding threading support to the loops that call mpas_binary_search, together with a simple modification of mpas_binary_search itself, can greatly reduce model initialisation times. This is what this PR addresses; a sketch of the threaded loop is given below. For full details, see the attached PDF document: report_mpas_heinzeller.pdf
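A minimal sketch of the threading approach, assuming the exchange-list setup reduces to matching each needed global index against a sorted list of locally owned indices. The subroutine and variable names (build_exchange_matches, ownedIds, neededIds, matchIndex) are hypothetical, not the identifiers used in mpas_dmpar_get_exch_list:

```fortran
subroutine build_exchange_matches(nOwned, ownedIds, nNeeded, neededIds, matchIndex)
   use binary_search_mod, only : binary_search
   implicit none
   integer, intent(in)  :: nOwned, nNeeded
   integer, intent(in)  :: ownedIds(nOwned)    ! sorted global indices owned locally
   integer, intent(in)  :: neededIds(nNeeded)  ! global indices required from other tasks
   integer, intent(out) :: matchIndex(nNeeded) ! local position of each needed index, or 0
   integer :: i

   ! Each iteration performs an independent, read-only search, so the
   ! loop parallelises trivially with no synchronisation required.
   !$omp parallel do default(shared) private(i) schedule(static)
   do i = 1, nNeeded
      matchIndex(i) = binary_search(ownedIds, nOwned, neededIds(i))
   end do
   !$omp end parallel do
end subroutine build_exchange_matches
```

Because the searches are read-only and independent, the attainable speedup is bounded mainly by memory bandwidth and the number of OpenMP threads available per MPI task.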
Since threading is handled differently in the individual MPAS cores, it would be great if the maintainers of the different cores could check whether this PR breaks any of their functionality or has adverse impacts on runtimes.