Merge pull request #156 from TeaganKing/cleanup_2_v2
Cleanup PR #2 (v2)
mnlevy1981 authored Nov 22, 2024
2 parents 71c4ef8 + 6aba58d commit ee4ac42
Showing 14 changed files with 239 additions and 120 deletions.
14 changes: 7 additions & 7 deletions NCARtips.md
@@ -8,25 +8,25 @@ There are two ways to request multiple cores on either casper or derecho.
Both cases are requesting 12 cores and 120 GB of memory.


-The recommended approach releases the cores immediately after `cupid-run` finishes:
+The recommended approach releases the cores immediately after `cupid-diagnostics` finishes:

```
-[login-node] $ conda activate cupid-dev
-(cupid-dev) [login-node] $ qcmd -l select=1:ncpus=12:mem=120GB -- cupid-run
+[login-node] $ conda activate cupid-infrastructure
+(cupid-infrastructure) [login-node] $ qcmd -l select=1:ncpus=12:mem=120GB -- cupid-diagnostics
```

-Alternatively, you can start an interactive session and remain on the compute nodes after `cupid-run` completes:
+Alternatively, you can start an interactive session and remain on the compute nodes after `cupid-diagnostics` completes:

```
[login-node] $ qinteractive -l select=1:ncpus=12:mem=120GB
-[compute-node] $ conda activate cupid-dev
-(cupid-dev) [compute-node] $ cupid-run
+[compute-node] $ conda activate cupid-infrastructure
+(cupid-infrastructure) [compute-node] $ cupid-diagnostics
```

Notes:
1. If you choose to run on derecho, specify the `develop` queue by adding the option `-q develop` to either `qcmd` or `qinteractive`
(the `develop` queue is a shared resource and you are charged by the core hour rather than the node hour).
-1. `cupid-build` is not computationally expensive, and can be run on a login node for either machine.
+1. `cupid-webpage` is not computationally expensive, and can be run on a login node for either machine.

## Looking at Output

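For the derecho note above, a minimal sketch of the recommended invocation with the `develop` queue selected, assuming the same 12-core / 120 GB request (the added `-q develop` flag is the only change from the casper command):

```
[login-node] $ conda activate cupid-infrastructure
(cupid-infrastructure) [login-node] $ qcmd -q develop -l select=1:ncpus=12:mem=120GB -- cupid-diagnostics
```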
28 changes: 14 additions & 14 deletions README.md
@@ -24,9 +24,9 @@ Then `cd` into the `CUPiD` directory and build the necessary conda environments

``` bash
$ cd CUPiD
-$ mamba env create -f environments/dev-environment.yml
-$ conda activate cupid-dev
-$ which cupid-run
+$ mamba env create -f environments/cupid-infrastructure.yml
+$ conda activate cupid-infrastructure
+$ which cupid-diagnostics
$ mamba env create -f environments/cupid-analysis.yml
```

@@ -38,14 +38,14 @@ If you do not have `mamba` installed, you can still use `conda`... it will just
(To see what version of conda you have installed, run `conda --version`.)
1. If the subdirectories in `externals/` are all empty, run `git submodule update --init` to clone the submodules.
1. For existing users who cloned `CUPiD` prior to the switch from manage externals to git submodule, we recommend removing `externals/` before checking out main, running `git submodule update --init`, and removing `manage_externals` (if it is still present after `git submodule update --init`).
-1. If `which cupid-run` returned the error `which: no cupid-run in ($PATH)`, then please run the following:
+1. If `which cupid-diagnostics` returned the error `which: no cupid-diagnostics in ($PATH)`, then please run the following:

``` bash
-$ conda activate cupid-dev
+$ conda activate cupid-infrastructure
$ pip install -e . # installs cupid
```

-1. In the `cupid-dev` environment, run `pre-commit install` to configure `git` to automatically run `pre-commit` checks when you try to commit changes from the `cupid-dev` environment; the commit will only proceed if all checks pass. Note that CUPiD uses `pre-commit` to ensure code formatting guidelines are followed, and pull requests will not be accepted if they fail the `pre-commit`-based Github Action.
+1. In the `cupid-infrastructure` environment, run `pre-commit install` to configure `git` to automatically run `pre-commit` checks when you try to commit changes from the `cupid-infrastructure` environment; the commit will only proceed if all checks pass. Note that CUPiD uses `pre-commit` to ensure code formatting guidelines are followed, and pull requests will not be accepted if they fail the `pre-commit`-based Github Action.
1. If you plan on contributing code to CUPiD,
whether developing CUPiD itself or providing notebooks for CUPiD to run,
please see the [Contributor's Guide](https://ncar.github.io/CUPiD/contributors_guide.html).
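As a sketch of the `pre-commit` setup described in the note above (both subcommands are standard `pre-commit` CLI; running against all files is optional and shown only for illustration):

``` bash
(cupid-infrastructure) $ pre-commit install            # one-time: register the git hook
(cupid-infrastructure) $ pre-commit run --all-files    # optional: run every check by hand before committing
```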
@@ -56,11 +56,11 @@ CUPiD currently provides an example for generating diagnostics.
To test the package out, try to run `examples/key_metrics`:

``` bash
-$ conda activate cupid-dev
+$ conda activate cupid-infrastructure
$ cd examples/key_metrics
$ # machine-dependent: request multiple compute cores
-$ cupid-run
-$ cupid-build # Will build HTML from Jupyter Book
+$ cupid-diagnostics
+$ cupid-webpage # Will build HTML from Jupyter Book
```

After the last step is finished, you can use Jupyter to view generated notebooks in `${CUPID_ROOT}/examples/key_metrics/computed_notebooks`
@@ -74,7 +74,7 @@ Notes:
(cupid-analysis) $ python -m ipykernel install --user --name=cupid-analysis
```

-Furthermore, to clear the `computed_notebooks` folder which was generated by the `cupid-run` and `cupid-build` commands, you can run the following command:
+Furthermore, to clear the `computed_notebooks` folder which was generated by the `cupid-diagnostics` and `cupid-webpage` commands, you can run the following command:

``` bash
$ cupid-clear
@@ -87,8 +87,8 @@ This will clear the `computed_notebooks` folder which is at the location pointed
Most of CUPiD's configuration is done via the `config.yml` file, but there are a few command line options as well:

```bash
-(cupid-dev) $ cupid-run -h
-Usage: cupid-run [OPTIONS] CONFIG_PATH
+(cupid-infrastructure) $ cupid-diagnostics -h
+Usage: cupid-diagnostics [OPTIONS] CONFIG_PATH

Main engine to set up running all the notebooks.

@@ -122,8 +122,8 @@ client

#### Specifying components

-If no component flags are provided, all component diagnostics listed in `config.yml` will be executed by default. Multiple flags can be used together to select a group of components, for example: `cupid-run -ocn -ice`.
+If no component flags are provided, all component diagnostics listed in `config.yml` will be executed by default. Multiple flags can be used together to select a group of components, for example: `cupid-diagnostics -ocn -ice`.


### Timeseries File Generation
-CUPiD also has the capability to generate single variable timeseries files from history files for all components. To run timeseries, edit the `config.yml` file's timeseries section to fit your preferences, and then run `cupid-run -ts`.
+CUPiD also has the capability to generate single variable timeseries files from history files for all components. To run timeseries, edit the `config.yml` file's timeseries section to fit your preferences, and then run `cupid-timeseries`.
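Taken together, these renames split the old `cupid-run -ts` workflow into separate commands. A sketch of the full sequence under the new names, assuming a `config.yml` in the current directory per the defaults above:

``` bash
$ conda activate cupid-infrastructure
(cupid-infrastructure) $ cupid-timeseries    # generate single-variable timeseries files (was cupid-run -ts)
(cupid-infrastructure) $ cupid-diagnostics   # run the diagnostics notebooks (was cupid-run)
(cupid-infrastructure) $ cupid-webpage       # build the Jupyter Book HTML (was cupid-build)
```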
4 changes: 4 additions & 0 deletions cupid/clear.py
@@ -57,3 +57,7 @@ def clear(config_path):
# Delete the "computed_notebooks" folder and all the contents inside of it
shutil.rmtree(run_dir)
logger.info(f"All contents in {run_dir} have been cleared.")


+if __name__ == "__main__":
+    clear()
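The new guard lets the module be invoked directly with the interpreter as well as through the `cupid-clear` entry point, with click parsing `sys.argv` either way. A sketch, assuming `clear` takes the same `config_path` argument (default `config.yml`) as the other commands:

``` bash
(cupid-infrastructure) $ python cupid/clear.py                  # clears the run_dir from ./config.yml
(cupid-infrastructure) $ python cupid/clear.py my_config.yml    # hypothetical alternate config
```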
2 changes: 1 addition & 1 deletion cupid/build.py → cupid/cupid_webpage.py
@@ -27,7 +27,7 @@
@click.argument("config_path", default="config.yml")
def build(config_path):
"""
-    Build a Jupyter book based on the TOC in CONFIG_PATH. Called by `cupid-build`.
+    Build a Jupyter book based on the TOC in CONFIG_PATH. Called by `cupid-webpage`.
Args:
CONFIG_PATH: str, path to configuration file (default config.yml)
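Usage is unchanged apart from the rename; given the `config_path` argument shown above, a sketch:

``` bash
(cupid-infrastructure) $ cupid-webpage                     # builds the book from ./config.yml
(cupid-infrastructure) $ cupid-webpage path/to/config.yml  # hypothetical explicit config path
```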
95 changes: 6 additions & 89 deletions cupid/run.py → cupid/run_diagnostics.py
@@ -5,13 +5,12 @@
This script sets up and runs all the specified notebooks and scripts according to the configurations
provided in the specified YAML configuration file.
-Usage: cupid-run [OPTIONS]
+Usage: cupid-diagnostics [OPTIONS]
Main engine to set up running all the notebooks.
Options:
-s, --serial Do not use LocalCluster objects
-  -ts, --time-series  Run time series generation scripts prior to diagnostics
-atm, --atmosphere Run atmosphere component diagnostics
-ocn, --ocean Run ocean component diagnostics
-lnd, --land Run land component diagnostics
@@ -29,7 +28,6 @@
import intake
import ploomber

-import cupid.timeseries
import cupid.util

CONTEXT_SETTINGS = dict(help_option_names=["-h", "--help"])
@@ -40,7 +38,6 @@

@click.command(context_settings=CONTEXT_SETTINGS)
@click.option("--serial", "-s", is_flag=True, help="Do not use LocalCluster objects")
@click.option("--time-series", "-ts", is_flag=True, help="Run time series generation scripts prior to diagnostics")
# Options to turn components on or off
@click.option("--atmosphere", "-atm", is_flag=True, help="Run atmosphere component diagnostics")
@click.option("--ocean", "-ocn", is_flag=True, help="Run ocean component diagnostics")
@@ -49,10 +46,9 @@
@click.option("--landice", "-glc", is_flag=True, help="Run land ice component diagnostics")
@click.option("--river-runoff", "-rof", is_flag=True, help="Run river runoff component diagnostics")
@click.argument("config_path", default="config.yml")
-def run(
+def run_diagnostics(
config_path,
serial=False,
-    time_series=False,
all=False,
atmosphere=False,
ocean=False,
@@ -106,89 +102,6 @@ def run(

####################################################################

-    if time_series:
-        timeseries_params = control["timeseries"]
-
-        # general timeseries arguments for all components
-        num_procs = timeseries_params["num_procs"]
-
-        for component, comp_bool in component_options.items():
-            if comp_bool:
-
-                # set time series input and output directory:
-                # -----
-                if isinstance(timeseries_params["case_name"], list):
-                    ts_input_dirs = []
-                    for cname in timeseries_params["case_name"]:
-                        ts_input_dirs.append(global_params["CESM_output_dir"]+"/"+cname+f"/{component}/hist/")
-                else:
-                    ts_input_dirs = [
-                        global_params["CESM_output_dir"] + "/" +
-                        timeseries_params["case_name"] + f"/{component}/hist/",
-                    ]
-
-                if "ts_output_dir" in timeseries_params:
-                    if isinstance(timeseries_params["ts_output_dir"], list):
-                        ts_output_dirs = []
-                        for ts_outdir in timeseries_params["ts_output_dir"]:
-                            ts_output_dirs.append([
-                                os.path.join(
-                                    ts_outdir,
-                                    f"{component}", "proc", "tseries",
-                                ),
-                            ])
-                    else:
-                        ts_output_dirs = [
-                            os.path.join(
-                                timeseries_params["ts_output_dir"],
-                                f"{component}", "proc", "tseries",
-                            ),
-                        ]
-                else:
-                    if isinstance(timeseries_params["case_name"], list):
-                        ts_output_dirs = []
-                        for cname in timeseries_params["case_name"]:
-                            ts_output_dirs.append(
-                                os.path.join(
-                                    global_params["CESM_output_dir"],
-                                    cname,
-                                    f"{component}", "proc", "tseries",
-                                ),
-                            )
-                    else:
-                        ts_output_dirs = [
-                            os.path.join(
-                                global_params["CESM_output_dir"],
-                                timeseries_params["case_name"],
-                                f"{component}", "proc", "tseries",
-                            ),
-                        ]
-                # -----
-
-                # fmt: off
-                # pylint: disable=line-too-long
-                cupid.timeseries.create_time_series(
-                    component,
-                    timeseries_params[component]["vars"],
-                    timeseries_params[component]["derive_vars"],
-                    timeseries_params["case_name"],
-                    timeseries_params[component]["hist_str"],
-                    ts_input_dirs,
-                    ts_output_dirs,
-                    # Note that timeseries output will eventually go in
-                    # /glade/derecho/scratch/${USER}/archive/${CASE}/${component}/proc/tseries/
-                    timeseries_params["ts_done"],
-                    timeseries_params["overwrite_ts"],
-                    timeseries_params[component]["start_years"],
-                    timeseries_params[component]["end_years"],
-                    timeseries_params[component]["level"],
-                    num_procs,
-                    serial,
-                    logger,
-                )
-                # fmt: on
-                # pylint: enable=line-too-long

# Grab paths

run_dir = os.path.realpath(os.path.expanduser(control["data_sources"]["run_dir"]))
@@ -326,3 +239,7 @@ def run(
dag.build()

return None


+if __name__ == "__main__":
+    run_diagnostics()
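With the timeseries logic removed (it is now driven by `cupid-timeseries`, per the README change above), `run_diagnostics` keeps only the serial and per-component flags. A sketch of invocations using just the options shown in the docstring:

``` bash
(cupid-infrastructure) $ cupid-diagnostics                # all components listed in config.yml
(cupid-infrastructure) $ cupid-diagnostics -ocn -ice      # ocean and sea ice diagnostics only
(cupid-infrastructure) $ cupid-diagnostics -s -atm        # atmosphere only, without LocalCluster objects
```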