Merge branch 'main' into subcell-limiting-outsource-saving-errors
bennibolm authored Feb 5, 2024
2 parents bf9fb62 + 14151e6 commit cad2c0a
Showing 2 changed files with 20 additions and 3 deletions.
12 changes: 9 additions & 3 deletions docs/literate/src/files/index.jl
@@ -108,20 +108,26 @@
# software in the Trixi.jl ecosystem, and then run a simulation using Trixi.jl on said mesh.
# In the end, the tutorial briefly explains how to simulate an example using AMR via `P4estMesh`.

# ### [15 Explicit time stepping](@ref time_stepping)
# ### [15 P4est mesh from gmsh](@ref p4est_from_gmsh)
#-
# This tutorial describes how to obtain a [`P4estMesh`](@ref) from an existing mesh generated
# by [`gmsh`](https://gmsh.info/) or any other meshing software that can export to the Abaqus
# input `.inp` format. The tutorial demonstrates how edges/faces can be associated with boundary conditions based on the physical nodesets.

# ### [16 Explicit time stepping](@ref time_stepping)
#-
# This tutorial is about time integration using [OrdinaryDiffEq.jl](https://github.com/SciML/OrdinaryDiffEq.jl).
# It explains how to use their algorithms and presents two types of time step choices - with error-based
# and CFL-based adaptive step size control.

# ### [16 Differentiable programming](@ref differentiable_programming)
# ### [17 Differentiable programming](@ref differentiable_programming)
#-
# This part deals with some basic differentiable programming topics. For example, a Jacobian, its
# eigenvalues and a curve of total energy (through the simulation) are calculated and plotted for
# a few semidiscretizations. Moreover, we calculate an example for propagating errors with Measurements.jl
# at the end.

# ### [17 Custom semidiscretization](@ref custom_semidiscretization)
# ### [18 Custom semidiscretization](@ref custom_semidiscretization)
#-
# This tutorial describes the [semidiscretizations](@ref overview-semidiscretizations) of Trixi.jl
# and explains how to extend them for custom tasks.
11 changes: 11 additions & 0 deletions docs/src/performance.md
@@ -267,3 +267,14 @@ requires. It can thus be seen as a proxy for "energy used" and, as an extension,
timing result, you need to set the analysis interval such that the
`AnalysisCallback` is invoked at least once during the course of the simulation and
discard the first PID value.

## Performance issues with multi-threaded reductions
[False sharing](https://en.wikipedia.org/wiki/False_sharing) is a known performance issue
on systems with distributed caches. It occurred in the implementation of a thread-parallel
bounds checking routine for the subcell IDP limiting
in [PR #1736](https://github.com/trixi-framework/Trixi.jl/pull/1736).
After some [testing and discussion](https://github.com/trixi-framework/Trixi.jl/pull/1736#discussion_r1423881895),
it turned out that the problem is fixed by initializing a vector of length `n * Threads.nthreads()`
and using only every `n`-th entry, instead of a vector of length `Threads.nthreads()`.
Assuming that no processor has cache lines larger than 128 bytes, we use `n = 128B / sizeof(uEltype)`,
so that the entries written by different threads are at least 128 bytes apart.
With this change, the bounds checking routine of the IDP limiting scales as hoped.
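
The padding idea can be illustrated with a minimal, self-contained sketch. It is not the
implementation from the PR: the names `padding_stride` and `padded_threaded_maximum` are
hypothetical, the reduction is a plain maximum over a vector, and the accumulator slots are
indexed by chunk number rather than by thread id. The 128-byte bound on the cache line size
is an assumption, as in the text.

```julia
using Base.Threads

# Number of entries of type `T` that fit into 128 bytes.
# Assumption: 128 bytes is an upper bound for the cache line size.
# `padding_stride` is a hypothetical helper name used for illustration only.
padding_stride(::Type{T}) where {T} = max(1, 128 ÷ sizeof(T))

# Maximum of `data` (assumed to use 1-based indexing), computed in parallel
# with one padded accumulator slot per chunk of the input.
function padded_threaded_maximum(data::AbstractVector{T}) where {T}
    n = padding_stride(T)
    nchunks = Threads.nthreads()

    # Vector of length `n * nchunks`; only every n-th entry is used, so that
    # two accumulator slots are at least 128 bytes apart in memory.
    partial = fill(typemin(T), n * nchunks)

    chunk_length = cld(length(data), nchunks)
    Threads.@threads for chunk in 1:nchunks
        slot = n * (chunk - 1) + 1
        first_index = (chunk - 1) * chunk_length + 1
        last_index = min(chunk * chunk_length, length(data))
        for i in first_index:last_index
            partial[slot] = max(partial[slot], data[i])
        end
    end

    # Serial reduction over the used slots only
    return maximum(@view partial[1:n:end])
end
```

For `Float64` data, `padding_stride(Float64)` evaluates to `128 ÷ 8 = 16`, i.e., 15 unused
entries separate two consecutive accumulator slots. The price for avoiding false sharing in
this way is a slightly larger temporary vector.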
