Merge pull request #9 from CEED/setup-vulcan
Improved robustness for high number refinements and DOFs [setup-vulcan]
Showing 19 changed files with 955 additions and 869 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,9 +17,9 @@ discretization and explicit high-order time-stepping. | |
|
||
Laghos is based on the discretization method described in the following article: | ||
|
||
> V. Dobrev, Tz. Kolev and R. Rieben,<br> | ||
> [High-order curvilinear finite element methods for Lagrangian hydrodynamics](https://doi.org/10.1137/120864672), <br> | ||
> *SIAM Journal on Scientific Computing*, (34) 2012, pp.B606–B641. | ||
> V. Dobrev, Tz. Kolev and R. Rieben <br> | ||
> [High-order curvilinear finite element methods for Lagrangian hydrodynamics](https://doi.org/10.1137/120864672) <br> | ||
> *SIAM Journal on Scientific Computing*, (34) 2012, pp. B606–B641. | ||
Laghos captures the basic structure of many compressible shock hydrocodes, | ||
including the [BLAST code](http://llnl.gov/casc/blast) at [Lawrence Livermore | ||
|
@@ -54,10 +54,9 @@ Laghos supports two options for deriving and solving the ODE system, namely the | |
algorithm of interest for high orders. For low orders (e.g. 2nd order in 3D), | ||
both algorithms are of interest. | ||
|
||
The full assembly options relies on constructing and utilizing global mass and | ||
force matrices stored in compressed sparse row (CSR) format. | ||
|
||
The [partial assembly](http://ceed.exascaleproject.org/ceed-code) option defines | ||
The full assembly option relies on constructing and utilizing global mass and | ||
force matrices stored in compressed sparse row (CSR) format. In contrast, the | ||
[partial assembly](http://ceed.exascaleproject.org/ceed-code) option defines | ||
only the local action of those matrices, which is then used to perform all | ||
necessary operations. As the local action is defined by utilizing the tensor | ||
structure of the finite element spaces, the amount of data storage, memory | ||
|
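As a generic illustration of the compressed sparse row (CSR) layout mentioned in this hunk, the sketch below stores a small matrix as three arrays and applies it to a vector. It is not taken from Laghos or MFEM; the function name `csr_matvec` and the array names `indptr`, `indices`, and `data` are assumptions chosen for this example.

```python
def csr_matvec(indptr, indices, data, x):
    """Multiply a CSR matrix by a dense vector x (illustrative sketch only)."""
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        # Nonzeros of this row occupy positions indptr[row] .. indptr[row + 1] - 1.
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


# The 3x3 matrix [[2, 0, 1], [0, 3, 0], [4, 0, 5]] in CSR form.
indptr = [0, 2, 3, 5]              # row start offsets into indices/data
indices = [0, 2, 1, 0, 2]          # column index of each stored nonzero
data = [2.0, 1.0, 3.0, 4.0, 5.0]   # value of each stored nonzero

y = csr_matvec(indptr, indices, data, [1.0, 1.0, 1.0])
print(y)  # [3.0, 3.0, 9.0]
```

Full assembly stores every nonzero explicitly, as in `data` above, whereas partial assembly keeps only what is needed to apply the operator locally.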
@@ -86,14 +85,14 @@ Other computational motives in Laghos include the following: | |
preparation and the application costs are important for this operator. | ||
- Domain-decomposed MPI parallelism. | ||
- Optional in-situ visualization with [GLVis](http://glvis.org) and data output | ||
for visualization / data analysis with [VisIt](http://visit.llnl.gov). | ||
for visualization and data analysis with [VisIt](http://visit.llnl.gov). | ||
|
||
## Code Structure | ||
|
||
- The file `laghos.cpp` contains the main driver with the time integration loop | ||
starting around line 370. | ||
starting around line 431. | ||
- In each time step, the ODE system of interest is constructed and solved by | ||
the class `LagrangianHydroOperator`, defined around line 312 of `laghos.cpp` | ||
the class `LagrangianHydroOperator`, defined around line 375 of `laghos.cpp` | ||
and implemented in files `laghos_solver.hpp` and `laghos_solver.cpp`. | ||
- All quadrature-based computations are performed in the function | ||
`LagrangianHydroOperator::UpdateQuadratureData` in `laghos_solver.cpp`. | ||
|
@@ -119,7 +118,7 @@ Other computational motives in Laghos include the following: | |
Laghos has the following external dependencies: | ||
|
||
- *hypre*, used for parallel linear algebra, we recommend version 2.10.0b<br> | ||
https://computation.llnl.gov/casc/hypre/software.html, | ||
https://computation.llnl.gov/casc/hypre/software.html | ||
|
||
- METIS, used for parallel domain decomposition (optional), we recommend [version 4.0.3](http://glaros.dtc.umn.edu/gkhome/fetch/sw/metis/OLD/metis-4.0.3.tar.gz) <br> | ||
http://glaros.dtc.umn.edu/gkhome/metis/metis/download | ||
|
@@ -128,10 +127,10 @@ Laghos has the following external dependencies: | |
https://github.com/mfem/mfem | ||
|
||
To build the miniapp, first download *hypre* and METIS from the links above | ||
and put everything on the same level as Laghos: | ||
and put everything on the same level as the `Laghos` directory: | ||
```sh | ||
~> ls | ||
Laghos/ hypre-2.10.0b.tar.gz metis-4.0.tar.gz | ||
Laghos/ hypre-2.10.0b.tar.gz metis-4.0.tar.gz | ||
``` | ||
|
||
Build *hypre*: | ||
|
@@ -142,6 +141,8 @@ Build *hypre*: | |
~/hypre-2.10.0b/src> make -j | ||
~/hypre-2.10.0b/src> cd ../.. | ||
``` | ||
For large runs (problem size above 2 billion unknowns), add the | ||
`--enable-bigint` option to the above `configure` line. | ||
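The 2 billion threshold most likely reflects the range of a signed 32-bit integer index, which tops out at 2^31 - 1. That reading is an assumption about the default index type, not something stated in this README. A minimal sketch of the check, with the hypothetical helper name `needs_bigint`:

```python
INT32_MAX = 2**31 - 1  # largest value representable in a signed 32-bit integer


def needs_bigint(global_unknowns: int) -> bool:
    """Return True when the global unknown count no longer fits in 32 bits.

    Hypothetical helper for illustration; it is not part of hypre, MFEM, or Laghos.
    """
    return global_unknowns > INT32_MAX


print(INT32_MAX)                    # 2147483647
print(needs_bigint(3_000_000_000))  # True
print(needs_bigint(100_000_000))    # False
```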
|
||
Build METIS: | ||
```sh | ||
|
@@ -151,22 +152,29 @@ Build METIS: | |
~/metis-4.0.3> cd .. | ||
~> ln -s metis-4.0.3 metis-4.0 | ||
``` | ||
This build is optional, as MFEM can be built without METIS by specifying | ||
`MFEM_USE_METIS = NO` below. | ||
|
||
Clone and build the parallel version of MFEM: | ||
```sh | ||
~> git clone [email protected]:mfem/mfem.git ./mfem | ||
~> cd mfem/ | ||
~/mfem> git checkout laghos-v1.0 | ||
~/mfem> make parallel -j | ||
~/mfem> cd .. | ||
``` | ||
The above uses the `laghos-v1.0` tag of MFEM, which is guaranteed to work with | ||
Laghos v1.0. Alternatively, one can use the latest versions of the MFEM and | ||
Laghos `master` branches (provided there are no conflicts). See the [MFEM | ||
building page](http://mfem.org/building/) for additional details. | ||
|
||
Build Laghos: | ||
```sh | ||
~> cd Laghos/ | ||
~> make | ||
~/Laghos> make | ||
``` | ||
|
||
For more details, see the [MFEM building page](http://mfem.org/building/). | ||
This can be followed by `make test` and `make install` to check and install the | ||
build respectively. See `make help` for additional options. | ||
|
||
## Running | ||
|
||
|
@@ -181,7 +189,8 @@ mpirun -np 8 laghos -p 1 -m data/square01_quad.mesh -rs 3 -tf 0.8 -no-vis -pa | |
mpirun -np 8 laghos -p 1 -m data/cube01_hex.mesh -rs 2 -tf 0.6 -no-vis -pa | ||
``` | ||
|
||
The latter produces the following density plot (when run with `-vis` instead of `-no-vis`) | ||
The latter produces the following density plot (when run with the `-vis` instead | ||
of the `-no-vis` option) | ||
|
||
![Sedov blast image](data/sedov.png) | ||
|
||
|
@@ -197,7 +206,8 @@ mpirun -np 8 laghos -p 0 -m data/square01_quad.mesh -rs 3 -tf 0.5 -no-vis -pa | |
mpirun -np 8 laghos -p 0 -m data/cube01_hex.mesh -rs 1 -cfl 0.1 -tf 0.25 -no-vis -pa | ||
``` | ||
|
||
The latter produces the following velocity magnitude plot (when run with `-vis` instead of `-no-vis`) | ||
The latter produces the following velocity magnitude plot (when run with the | ||
`-vis` instead of the `-no-vis` option) | ||
|
||
![Taylor-Green image](data/tg.png) | ||
|
||
|
@@ -212,7 +222,8 @@ mpirun -np 8 laghos -p 3 -m data/rectangle01_quad.mesh -rs 2 -tf 2.5 -cfl 0.025 | |
mpirun -np 8 laghos -p 3 -m data/box01_hex.mesh -rs 1 -tf 2.5 -cfl 0.05 -no-vis -pa | ||
``` | ||
|
||
The latter produces the following specific internal energy plot (when run with `-vis` instead of `-no-vis`) | ||
The latter produces the following specific internal energy plot (when run with | ||
the `-vis` instead of the `-no-vis` option) | ||
|
||
![Triple-point image](data/tp.png) | ||
|
||
|
@@ -245,30 +256,53 @@ round-off distance from the above reference values. | |
|
||
## Performance Timing and FOM | ||
|
||
Each time step in Laghos contains 4 major distinct computations: | ||
Each time step in Laghos contains 3 major distinct computations: | ||
|
||
1. The inversion of the global kinematic mass matrix (CG H1). | ||
2. The inversion of the local thermodynamic mass matrices (CG L2). | ||
3. The force operator evaluation from degrees of freedom to quadrature points (Forces). | ||
4. The physics kernel in quadrature points (UpdateQuadData). | ||
2. The force operator evaluation from degrees of freedom to quadrature points (Forces). | ||
3. The physics kernel in quadrature points (UpdateQuadData). | ||
|
||
By default Laghos is instrumented to report the total execution times and rates, | ||
in terms of millions of degrees of freedom (megadofs), for each of these | ||
computational phases. | ||
in terms of millions of degrees of freedom per second (megadofs), for each of | ||
these computational phases. (The time for inversion of the local thermodynamic | ||
mass matrices (CG L2) is also reported, but that takes a small part of the | ||
overall computation.) | ||
|
||
Laghos also reports the total rate for these major kernels, which is a proposed | ||
**Figure of Merit (FOM)** for benchmarking purposes. Given a computational | ||
allocation, the FOM should be reported for different problem sizes and finite | ||
element orders, as illustrated in the sample scripts in the [timing](./timing) | ||
directory. | ||
|
||
A sample run on the [Vulcan](https://computation.llnl.gov/computers/vulcan) BG/Q | ||
machine at LLNL is: | ||
|
||
``` | ||
srun -n 393216 laghos -pa -p 1 -tf 0.6 -no-vis | ||
-pt 322 -m data/cube_12_hex.mesh | ||
--cg-tol 0 --cg-max-iter 50 --max-steps 2 | ||
-ok 3 -ot 2 -rs 5 -rp 3 | ||
``` | ||
This is a Q3-Q2 3D computation on 393,216 MPI ranks (24,576 nodes) that produces | ||
rates of approximately 168497, 74221, and 16696 megadofs, and a total FOM of | ||
about 2073 megadofs. | ||
|
||
To make the above run 8 times bigger, one can either weak scale by using 8 times | ||
as many MPI tasks and increasing the number of serial refinements: `srun -n | ||
3145728 ... -rs 6 -rp 3`, or use the same number of MPI tasks but increase the | ||
local problem on each of them by doing more parallel refinements: `srun -n | ||
393216 ... -rs 5 -rp 4`. | ||
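The factor of 8 in both options can be checked with a short calculation. The sketch assumes that each `-rs` or `-rp` level is a uniform refinement that splits every 3D element into 2^3 = 8 children; the helper name `element_growth` is hypothetical and not part of Laghos.

```python
def element_growth(rs: int, rp: int, dim: int = 3) -> int:
    """Factor by which rs + rp uniform refinements multiply the base element count.

    Assumes every refinement level splits an element into 2**dim children.
    """
    return (2**dim) ** (rs + rp)


baseline = element_growth(rs=5, rp=3)         # the sample Vulcan run
more_serial = element_growth(rs=6, rp=3)      # option 1: 8x ranks, one more -rs
more_parallel = element_growth(rs=5, rp=4)    # option 2: same ranks, one more -rp

print(more_serial // baseline)    # 8
print(more_parallel // baseline)  # 8
print(393216 * 8)                 # 3145728 MPI tasks in the weak-scaled run
print(393216 // 24576)            # 16 MPI ranks per node in the sample run
```

Under this assumption, the first option keeps the work per MPI task fixed, while the second makes the local problem on each task 8 times larger.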
|
||
## Versions | ||
|
||
In addition to the main MPI-based CPU implementation in https://github.com/CEED/Laghos, | ||
the following versions of Laghos have been developed: | ||
|
||
- A serial version in the [serial](./serial) directory. | ||
- [GPU version](https://github.com/dmed256/Laghos/tree/occa-dev) based on [OCCA](http://libocca.org/). | ||
- [GPU version](https://github.com/dmed256/Laghos/tree/occa-dev) based on | ||
[OCCA](http://libocca.org/). | ||
- A [RAJA](https://software.llnl.gov/RAJA/)-based version in the | ||
[raja-dev](https://github.com/CEED/Laghos/tree/raja-dev) branch. | ||
|
||
## Contact | ||
|
||
|