Save emulation training data using call_py_fort (#212)
This PR adds initial infrastructure to this repository for using call_py_fort to save out training data and run prognostic emulation experiments. Currently it adds hooks into the emulation python code from the GFS_physics_driver.f90 module, which are conditionally enabled during compilation (configure.fv3.gnu_docker). The call_py_fort environment is part of the prepared build dependencies, and the functional compiled image can be generated using make build_emulation from this repo. This version saves microphysics parameterization data to the working directory in a zarr file (state_output.zarr) and in individual netcdf files by tile and time (under netcdf_output).

* Add call-py-fort
* Add initial training stubs
* Switch 'lev' to 'z' in dimension names
* Add parameter metadata for microphysics
* Add initial microphysics state saving
* Add temporary variables for state setting
* remove extra level addition
* Add requirements to setup.py
* Add callpy compiler flags for fv3
* Fix state reference to Stateout for intermediate vars
* Successful compilation with training saving
* Correct module call for store
* Remove problematic function call
* Working ZC scheme run
* Fix rain output name
* Fix data output shape and parameter standard names
* Update names to human readable
* Update names to human readable
* Fix key -> attr mapping function
* Add direct to netcdf saving from the monitor
* Add time filtering and direct netcdf saving
* Switch to cwd usage for some monitor options
* Add conditional call-py-fort statements
* Make specific call_py_fort configuration
* Start call py for dockerfile install
* Add callpyfort build targets
* Add make target for emulation
* Fix numpy/tflow version conflicts
* Fix ifdef statements to start of line
* Fix callpyfort lib paths for compilation
* Use call_py_fort environment as fv3gfs-environment for emulation target
* Adjust makefile targets for emulation build
* Remove whitespace changes
* Remove unused image alias
* Fix serialbox image typo
* Update the emulation package readme
* Move compile configuration into emulation environment
* Remove whitespace
* Add back in the correct configuration file handling
* Revert to original lib/include paths
* Fix emulation function names in GFS_physics_driver.f90
* Fix requirements and python installation for callpyfort
* Undo duplicated image name for callpyfort dep image
* Update docker/Dockerfile Co-authored-by: Noah D. Brenowitz <[email protected]>
* Remove pfunit from callpyfort install
* Remove call_py_fort submodule and use clone of v0.2.0 instead
* Add emulation build to ci
* Remove unsaved conflict leftovers
* Remove callpyfort specific configuration in place of conditional flags
* Adjust attribute name away from special attr 'dims'
* Fix cmake conditional setting in makefile
* Requirements pinned in setup file
* Fix build step linking to callpyfort and python install
* Change dataset attributes to serialized data
* Add emulation build test
* Remove code root
* Remove unused package, glob

Co-authored-by: Noah D. Brenowitz <[email protected]>
Showing 14 changed files with 1,169 additions and 7 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
emulation | ||
========= | ||
|
||
This is a stripped-down set of modules for adding into a prognostic run using `call_py_fort`. It's currently used to create training datasets directly from the output of the microphysics parameterization. | ||
|
||
The interaction points with python are located in `GFS_physics_driver.f90` as `set_state` and `get_state` calls. These are enabled for the emulation image (compiled using `make build_emulation`) when `-DENABLE_CALLPYFORT` is passed to the compiler. | ||
|
||
### Example snippet | ||
|
||
``` | ||
#ifdef ENABLE_CALLPYFORT | ||
  do k=1,levs | ||
    do i=1,im | ||
      qv_post_precpd(i,k) = Stateout%gq0(i,k,1) | ||
      qc_post_precpd(i,k) = Stateout%gq0(i,k,ntcw) | ||
    enddo | ||
  enddo | ||
  call set_state("air_temperature_output", Stateout%gt0) | ||
  call set_state("specific_humidity_output", qv_post_precpd) | ||
  call set_state("cloud_water_mixing_ratio_output", qc_post_precpd) | ||
#endif | ||
``` | ||
|
||
## Training Data | ||
By default, the training data are saved to the current working directory: a zarr monitor writes to `state_output.zarr` (dimensions time, tile, x, y), and individual netCDF files for each time and tile are written under `$(cwd)/netcdf_output`. | ||
|
||
To change the interval at which data are saved (defaults to 5 hours [18,000 s]), set the `OUTPUT_FREQ_SEC` environment variable in the runtime image. |
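
A minimal sketch of how an output interval like `OUTPUT_FREQ_SEC` could be read and applied as a time filter. Only the variable name and the 18,000 s default come from the README above; the function names `get_output_freq_sec` and `should_save`, and the modulo-based filter, are assumptions made for illustration and are not taken from the emulation package.

```python
import os
from datetime import datetime, timedelta

# Default stated in the README: 5 hours = 18,000 s.
DEFAULT_OUTPUT_FREQ_SEC = 18_000


def get_output_freq_sec(environ=os.environ) -> int:
    """Read the output interval in seconds (hypothetical helper)."""
    return int(environ.get("OUTPUT_FREQ_SEC", DEFAULT_OUTPUT_FREQ_SEC))


def should_save(current: datetime, initial: datetime, freq_sec: int) -> bool:
    """Return True when the elapsed model time is a multiple of freq_sec.

    This is one plausible filtering rule, assumed here for illustration.
    """
    elapsed = int((current - initial).total_seconds())
    return elapsed % freq_sec == 0


if __name__ == "__main__":
    start = datetime(2016, 8, 1)
    freq = get_output_freq_sec()
    for step in range(0, 4):
        now = start + timedelta(seconds=step * 9_000)
        print(now.isoformat(), should_save(now, start, freq))
```

Passing the environment as an argument keeps the helper easy to test without modifying the real process environment.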
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
# keras has several routines which interact with file paths directly as opposed to | ||
# filesystem objects, which means we need these wrappers so we can allow remote paths | ||
|
||
import contextlib | ||
import tempfile | ||
import fsspec | ||
import os | ||
|
||
|
||
@contextlib.contextmanager | ||
def put_dir(path: str): | ||
    with tempfile.TemporaryDirectory() as tmpdir: | ||
        yield tmpdir | ||
        fs, _, _ = fsspec.get_fs_token_paths(path) | ||
        fs.makedirs(os.path.dirname(path), exist_ok=True) | ||
        # cannot use fs.put as it cannot merge directories | ||
        _put_directory(tmpdir, path) | ||
|
||
|
||
@contextlib.contextmanager | ||
def get_dir(path: str): | ||
    with tempfile.TemporaryDirectory() as tmpdir: | ||
        fs, _, _ = fsspec.get_fs_token_paths(path) | ||
        # fsspec places the directory inside the tmpdir, as a subdirectory | ||
        fs.get(path, tmpdir, recursive=True) | ||
        yield tmpdir | ||
|
||
|
||
def _put_directory( | ||
    local_source_dir: str, dest_dir: str, fs: fsspec.AbstractFileSystem = None, | ||
): | ||
    """Copy the contents of a local directory to a local or remote directory. | ||
    """ | ||
    if fs is None: | ||
        fs, _, _ = fsspec.get_fs_token_paths(dest_dir) | ||
    fs.makedirs(dest_dir, exist_ok=True) | ||
    for token in os.listdir(local_source_dir): | ||
        source = os.path.join(os.path.abspath(local_source_dir), token) | ||
        dest = os.path.join(dest_dir, token) | ||
        if os.path.isdir(source): | ||
            _put_directory(source, dest, fs=fs) | ||
        else: | ||
            fs.put(source, dest) |
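
The `put_dir` wrapper above follows a "write locally, then copy on exit" pattern. The sketch below reproduces that pattern with the standard library only, so it runs without `fsspec`; `put_dir_local` and `demo` are hypothetical names, and `shutil.copytree` stands in for the recursive `_put_directory` upload. It is an illustration of the pattern, not the module's implementation.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def put_dir_local(path: str):
    """Yield a temporary directory; copy its contents to `path` on clean exit.

    Local-only stand-in for the fsspec-based wrapper (assumption).
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        yield tmpdir
        # Reached only if the with-body did not raise, so a failed write
        # leaves nothing at the destination.
        shutil.copytree(tmpdir, path, dirs_exist_ok=True)  # Python 3.8+


def demo(dest: str):
    """Write a small directory tree through the wrapper and list the result."""
    with put_dir_local(dest) as tmpdir:
        os.makedirs(os.path.join(tmpdir, "variables"))
        with open(os.path.join(tmpdir, "variables", "weights.txt"), "w") as f:
            f.write("0.5")
    return sorted(os.listdir(dest))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(demo(os.path.join(scratch, "model")))
```

Code that only accepts plain file paths (such as the keras routines mentioned in the module comment) can write into the yielded temporary directory, and the copy to the final location happens once, after the block completes.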
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
""" | ||
A module for testing/debugging call routines | ||
""" | ||
import logging | ||
import os | ||
import traceback | ||
import numpy as np | ||
from datetime import datetime | ||
|
||
logger = logging.getLogger(__name__) | ||
|
||
|
||
def print_errors(func): | ||
    def new_func(*args, **kwargs): | ||
        try: | ||
            return func(*args, **kwargs) | ||
        except Exception: | ||
            # format_exc() returns the traceback as a string; print_exc() returns None | ||
            logger.error(traceback.format_exc()) | ||
            raise | ||
|
||
    return new_func | ||
|
||
|
||
def print_arr_info(state): | ||
|
||
    logger = logging.getLogger(__name__) | ||
    for varname, arr in state.items(): | ||
        logger.info(f"{varname}: shape[{arr.shape}] isfortran[{np.isfortran(arr)}]") | ||
|
||
|
||
def print_location_ping(state): | ||
|
||
    logger = logging.getLogger(__name__) | ||
    logger.info("Ping reached!") | ||
|
||
|
||
def dump_state(state): | ||
|
||
    DUMP_PATH = str(os.environ.get("STATE_DUMP_PATH")) | ||
|
||
    logger = logging.getLogger(__name__) | ||
|
||
    try: | ||
        # index directly: dict.get() returns None instead of raising KeyError | ||
        rank = state["rank"] | ||
    except KeyError: | ||
        logger.info("Could not save state. No rank included in state.") | ||
        return | ||
|
||
    time_str = datetime.now().strftime("%Y%m%d.%H%M%S") | ||
    filename = f"state_dump.{time_str}.tile{int(rank.squeeze()[0])}.npz" | ||
    outfile = os.path.join(DUMP_PATH, filename) | ||
    logger.info(f"Dumping state to {outfile}") | ||
    np.savez(outfile, **state) |
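
A short sketch of how an error-logging decorator in the style of `print_errors` might be applied to a routine that receives the state dictionary. The names `log_errors` and `store`, and the logger name, are hypothetical; the use of `functools.wraps` and `logger.exception` is a suggested variation, not what the module above does.

```python
import functools
import logging

logger = logging.getLogger("emulation_debug_sketch")  # hypothetical logger name


def log_errors(func):
    """Log the full traceback of any exception, then re-raise it."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # logger.exception logs at ERROR level and appends the traceback
            logger.exception("Error in %s", func.__name__)
            raise

    return wrapper


@log_errors
def store(state):
    """Hypothetical routine that expects a 'rank' entry in the state dict."""
    return state["rank"]


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(store({"rank": 3}))
    try:
        store({})
    except KeyError:
        print("KeyError was logged and re-raised")
```

Re-raising after logging keeps the failure visible to the caller while still recording the traceback in the log.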