Casper Standalone Workflow

gdicker1 edited this page Jun 12, 2020 · 7 revisions

This guide demonstrates how to build and run standalone MOM6 and some of the test cases provided in MOM6-examples. Sample build and submission scripts are given below under Sample Scripts. They are also available in our MOM6_WorldShared repository (which you may need to request access to), along with the template Makefile for PGI on Casper.

Guide

After creating your own fork of MOM6-examples, clone it into a directory of your choice:

git clone {LINK_TO_YOUR_REPO} --recursive
cd MOM6-examples

Once there, modify .gitmodules so that the URL for the MOM6 submodule points to your repo (skip this if your fork already contains that change). Then synchronize the change with:

git submodule sync --recursive
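
As a minimal sketch of the two steps above, the URL can also be changed from the command line with git config instead of a text editor. The submodule name src/MOM6 and the helper name point_mom6_at_fork are assumptions made for illustration; check the section name in your own .gitmodules before relying on it.

```shell
# Sketch: point the MOM6 submodule at your fork without opening an editor.
# ASSUMPTION: the submodule is registered as "src/MOM6" in .gitmodules.
# point_mom6_at_fork is a hypothetical helper, not part of MOM6-examples.
point_mom6_at_fork() {
    local fork_url="$1"
    # Rewrite the url entry of the [submodule "src/MOM6"] section in place
    git config -f .gitmodules submodule.src/MOM6.url "${fork_url}"
}

# Usage, from the top of your MOM6-examples clone:
#   point_mom6_at_fork {LINK_TO_YOUR_MOM6_FORK}
#   git submodule sync --recursive
```

Only .gitmodules is edited here; the sync command is still needed so that the already-cloned submodule picks up the new URL.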

Make sure your copy of MOM6_WorldShared is up to date (git fetch --all && git pull will do it), then copy over the Makefile template, build script, and run script:

cp {PATH_TO_MOM6_WORLDSHARED}/StandAlone/Casper/casper-pgi.mk ./
cp {PATH_TO_MOM6_WORLDSHARED}/StandAlone/Casper/casper_create_build_make.sh ./
cp {PATH_TO_MOM6_WORLDSHARED}/StandAlone/Casper/casper_submit.sh ./

You can set the TEST variable in the run script to any of the test directories under MOM6-examples/ocean_only.
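
To see which values TEST can take, a short sketch like the one below lists the candidate directories. It assumes that a runnable test is any directory holding a MOM_input file, which may not cover every case; the helper name list_tests is hypothetical.

```shell
# Sketch: print every directory under ocean_only that holds a MOM_input file.
# ASSUMPTION: a runnable test is identified by the presence of MOM_input.
# list_tests is a hypothetical helper name.
list_tests() {
    local root="${1:-ocean_only}"
    root="${root%/}"    # drop a trailing slash so the prefix strip matches
    find "${root}" -name MOM_input -type f \
        | sed -e "s|^${root}/||" -e 's|/MOM_input$||' \
        | sort
}

# Usage, from the top of your MOM6-examples clone:
#   list_tests
```

Each printed line is a path relative to ocean_only, so it can be used directly as the value of TEST.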

NOTE: Make sure you modify the BASE_DIR variable in the build and run scripts to point to the correct path for your case. The build script can be run directly or submitted to the scheduler with sbatch casper_create_build_make.sh. The run script can only be submitted to the scheduler, with sbatch casper_submit.sh. MOM6 writes some useful information to logfile.000000.out in the corresponding test directory, while stdout and stderr from the job are written to MOM6.log.{JOBNUM}.out in the directory you submit from.
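
Because BASE_DIR has to match in both scripts, it can help to set it in one step. The following is a minimal sketch, assuming each script assigns BASE_DIR on a line of its own as the sample scripts on this page do; the helper name set_base_dir is hypothetical.

```shell
# Sketch: set BASE_DIR in the build and run scripts in one step.
# ASSUMPTION: each script has a line that starts with "BASE_DIR=".
# set_base_dir is a hypothetical helper name.
# Paths containing "|" or "&" would need escaping before use here.
set_base_dir() {
    local new_dir="$1"
    shift
    local script
    for script in "$@"; do
        # -i.bak edits in place and keeps the original as <script>.bak
        sed -i.bak "s|^BASE_DIR=.*|BASE_DIR=${new_dir}|" "${script}"
    done
}

# Usage:
#   set_base_dir /path/to/MOM6-examples casper_create_build_make.sh casper_submit.sh
```

The .bak copies make it easy to compare against or restore the original scripts.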

Sample Scripts

Below is a sample build script that can also be submitted to the queue on Casper. It creates a MOM6 executable called mom.exe in your BLD_OCN directory. You can soft-link this executable into the directory of the test you want to run, or you can give its full path in the mpirun command of your submit script.

#!/bin/bash -l
# Batch directives
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --account NTDD0002
#SBATCH --partition=dav
#SBATCH --time=00:15:00
#SBATCH --output=MOM6.bldlog.%j
#SBATCH --job-name=bld_MOM6

# Load the modules needed
module purge
module load pgi/20.4
module load openmpi/4.0.3
module load netcdf/4.7.3
module list

# Edit these paths for your specific run
BASE_DIR=/glade/scratch/${USER}/ocn/Summer2020/StandAlone/MOM6-examples
BLD_SHR=${BASE_DIR}/build/pgi/shared/repro/
BLD_OCN=${BASE_DIR}/build/pgi/ocean_only/repro/
MK_TEMP=${BASE_DIR}/casper-pgi.mk
EXE=mom.exe

# Remove any existing builds
rm -rf ${BLD_SHR}
rm -rf ${BLD_OCN}

# Make build dir and setup the make using mkmf
mkdir -p ${BLD_SHR}
cd ${BLD_SHR}
(cd ${BLD_SHR}; rm -f path_names; \
../../../../src/mkmf/bin/list_paths -l ../../../../src/FMS; \
../../../../src/mkmf/bin/mkmf -t ${MK_TEMP} -p libfms.a -c "-Duse_libMPI -Duse_netCDF -DSPMD -I/glade/u/apps/dav/opt/netcdf/4.7.3/pgi/20.4/include/ -I/glade/u/apps/dav/opt/openmpi/4.0.3/pgi/20.4/include/ -L/glade/u/apps/dav/opt/netcdf/4.7.3/pgi/20.4/lib/ -lnetcdf -L/glade/u/apps/dav/opt/openmpi/4.0.3/pgi/20.4/lib/ -lmpi" path_names)

# Make libfms.a
(cd ${BLD_SHR}; source ../../env; make NETCDF=3 REPRO=1 libfms.a -j)


# Make ocean build dir and setup the make using mkmf
mkdir -p ${BLD_OCN}
(cd ${BLD_OCN}; rm -f path_names; \
../../../../src/mkmf/bin/list_paths -l ./ ../../../../src/MOM6/{config_src/dynamic,config_src/solo_driver,src/{*,*/*}}/ ; \
../../../../src/mkmf/bin/mkmf -t ${MK_TEMP} -o '-I../../shared/repro -I/glade/u/apps/dav/opt/netcdf/4.7.3/pgi/20.4/include/ -I/glade/u/apps/dav/opt/openmpi/4.0.3/pgi/20.4/include/' -p ${EXE} -l '-L../../shared/repro -lfms -L/glade/u/apps/dav/opt/netcdf/4.7.3/pgi/20.4/lib/ -lnetcdf -L/glade/u/apps/dav/opt/openmpi/4.0.3/pgi/20.4/lib/ -lmpi -lmpi_mpifh' -c '-Duse_libMPI -Duse_netCDF -DSPMD' path_names)

# Make the MOM6 executable
(cd ${BLD_OCN}; source ../../env; make NETCDF=3 REPRO=1 ${EXE} -j)
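
Once the build finishes, the soft-link option mentioned before the build script can be sketched as follows. The helper name link_exe is hypothetical, and the check for an executable file is an added safeguard rather than something the original scripts do.

```shell
# Sketch: link the built executable into a test directory.
# link_exe is a hypothetical helper; pass it the full path to mom.exe
# (BLD_OCN/mom.exe in the build script) and the test directory.
link_exe() {
    local exe_path="$1"
    local test_dir="$2"
    if [ ! -x "${exe_path}" ]; then
        echo "No executable at ${exe_path}; did the build succeed?" >&2
        return 1
    fi
    # -s makes a symbolic link, -f replaces a stale link from an older build
    ln -sf "${exe_path}" "${test_dir}/$(basename "${exe_path}")"
}

# Usage:
#   link_exe ${BLD_OCN}/mom.exe ${BASE_DIR}/ocean_only/benchmark
```

With the link in place, the mpirun line of a submit script could refer to ./mom.exe from inside the test directory instead of the full build path.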

A sample submission script is shown below. Stdout and stderr from the job are written to a MOM6.log.{JOBNUM}.out file, which is created in the directory you submit the job from.

#!/bin/bash -l
# Batch directives
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=36
#SBATCH --account NTDD0002
#SBATCH --partition=dav
###SBATCH --reservation=TDD_4xV100
#SBATCH --time=00:15:00
#SBATCH --output=MOM6.log.%j.out
#SBATCH --job-name=bnchmk_MOM6


# Adjust these paths and values depending on your test
TEST=benchmark
BASE_DIR=/glade/scratch/${USER}/ocn/Summer2020/StandAlone/MOM6-examples
TEST_DIR=${BASE_DIR}/ocean_only/${TEST}
EXE_NAME=mom.exe
EXE_PATH=${BASE_DIR}/build/pgi/ocean_only/repro/${EXE_NAME}

echo "MOM6 running test \"${TEST}\" from directory ${TEST_DIR} with executable coming from ${EXE_PATH}"

# Load the modules needed
module purge
module load pgi/20.4
module load openmpi/4.0.3
module load netcdf/4.7.3
module list

# Add NetCDF to LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/glade/u/apps/dav/opt/netcdf/4.7.3/pgi/20.4/lib:${LD_LIBRARY_PATH}
echo -e "PATH=${PATH}\n"
echo -e "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}\n"

# Go to the test directory and start the job
cd ${TEST_DIR}
# Have to make sure that a RESTART directory exists for each test you run or it fails early
mkdir -p ${TEST_DIR}/RESTART
# IMPORTANT: Make sure the number here does not exceed ntasks-per-node*nodes at the top
mpirun -np 36 ${EXE_PATH}
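
To keep the rank count from drifting out of step with the batch directives, it can be computed from the Slurm environment instead of hard-coded. This is a minimal sketch, not part of the original script: it relies on SLURM_JOB_NUM_NODES and SLURM_NTASKS_PER_NODE, which Slurm sets inside a job when --nodes and --ntasks-per-node are given, and the helper name rank_count is hypothetical.

```shell
# Sketch: derive the mpirun rank count from the Slurm job environment.
# ASSUMPTION: fall back to 1 node x 1 task when run outside a Slurm job.
# rank_count is a hypothetical helper name.
rank_count() {
    local nodes="${SLURM_JOB_NUM_NODES:-1}"
    local per_node="${SLURM_NTASKS_PER_NODE:-1}"
    echo $(( nodes * per_node ))
}

# Usage in the submit script:
#   mpirun -np "$(rank_count)" ${EXE_PATH}
```

With the directives in the sample script (1 node, 36 tasks per node) this evaluates to 36, the same value as the hard-coded call.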

Other Useful Info

  • A guide to Git submodules can be found here