fix merge conflicts
laestrada committed Aug 18, 2023
2 parents c615b80 + a00b906 commit b504ee9
Showing 16 changed files with 333 additions and 83 deletions.
6 changes: 5 additions & 1 deletion config.yml
@@ -54,6 +54,10 @@ BufferDeg: 5
LandThreshold: 0.25
OffshoreEmisThreshold: 0

## Point source datasets
## Used for visualizations and state vector clustering
PointSourceDatasets: ["SRON"]

## Clustering Options
ReducedDimensionStateVector: false
DynamicKFClustering: false
@@ -180,4 +184,4 @@ PreviewDryRun: true
SpinupDryrun: true
ProductionDryRun: true
PosteriorDryRun: true
-BCdryrun: true
\ No newline at end of file
+BCdryrun: true
48 changes: 20 additions & 28 deletions docs/source/advanced/imi-docker-container.rst
@@ -1,6 +1,6 @@
-=================
-The IMI Container
-=================
+==============================
+Using the IMI Docker container
+==============================

What is a container?
====================
@@ -40,9 +40,7 @@ the section on `Using Singularity instead of Docker <#using-singularity-instead-
-----------------
Pulling the image
-----------------
-To run the container you will first need to pull the image from our cloud repository
-
-::
+To run the container you will first need to pull the image from our cloud repository::

$ docker pull public.ecr.aws/w1q7j9l2/imi-docker-image:latest

@@ -62,21 +60,18 @@ The IMI needs input data in order to run the inversion. If you do not have the n
locally then you will need to give the IMI container access to S3 on AWS, where the input data is available. This
can be done by specifying your
`aws credentials <https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html#envvars-set>`__ in
-the ``environment`` section of the compose.yml file. Eg:
-
-::
+the ``environment`` section of the compose.yml file. Eg:::

environment:
  - AWS_ACCESS_KEY_ID=your_access_key_id
  - AWS_SECRET_ACCESS_KEY=your_secret_access_key
  - AWS_DEFAULT_REGION=us-east-1


Note: these credentials are sensitive, so do not post them publicly in any repository.
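
With these variables set, boto3 inside the container picks up the credentials automatically. As a quick sanity check (a sketch only; the bucket name is a placeholder, not the actual IMI input bucket)::

    import boto3

    # List one object to confirm the credentials grant S3 read access.
    s3 = boto3.client("s3")
    print(s3.list_objects_v2(Bucket="example-input-bucket", MaxKeys=1))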

If you already have the necessary input data available locally, then you can mount it to the IMI container in the
-`volumes` section of the compose.yml file without setting your aws credentials. Eg:
+`volumes` section of the compose.yml file without setting your aws credentials. Eg:::

-::
volumes:
  - /local/input/data:/home/al2/ExtData # mount input data directory

@@ -85,9 +80,8 @@ Storing the output data
-----------------------
In order to access the files from the inversion it is best to mount a volume from your local system onto the docker
container. This allows the results of the inversion to persist after the container exits. We recommend making a
-dedicated IMI output directory using `mkdir`.
+dedicated IMI output directory using `mkdir`.::

-::
volumes:
  - /local/output/dir/imi_output:/home/al2/imi_output_dir # mount output directory
  - /local/container/config.yml:/home/al2/integrated_methane_inversion/config.yml # mount desired config file
@@ -101,9 +95,8 @@ mechanisms to update the config.yml file:
1. If you would only like to update specific variables you can pass them in as environment variables:

All environment variables matching the pattern ``IMI_<config-variable-name>`` will update their corresponding config.yml
-variable. For example:
+variable. For example:::

-::
environment:
  - IMI_StartDate=20200501
  - IMI_EndDate=20200601
@@ -113,9 +106,8 @@ will replace the ``StartDate`` and ``EndDate`` in the IMI config.yml file.
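
For intuition, the substitution could be implemented along these lines (a sketch assuming flat, top-level config keys; the container's actual entrypoint code is not shown in this commit)::

    import os
    import yaml

    # Load the existing config, overlay any IMI_* environment variables,
    # and write the result back out.
    with open("config.yml") as f:
        config = yaml.safe_load(f)

    for name, value in os.environ.items():
        if name.startswith("IMI_"):
            key = name[len("IMI_"):]             # IMI_StartDate -> StartDate
            config[key] = yaml.safe_load(value)  # parses ints/bools/lists

    with open("config.yml", "w") as f:
        yaml.dump(config, f, sort_keys=False)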
2. Replace the entire config.yml file with one from the host system:

To apply a config.yml file from your local system to the docker container, specify it in your compose.yml file as a
-volume. Then set the ``IMI_CONFIG_PATH`` environment variable to point to that path. Eg:
+volume. Then set the ``IMI_CONFIG_PATH`` environment variable to point to that path. Eg:::

-::
volumes:
  - /local/path/to/config.yml:/home/al2/integrated_methane_inversion/config.yml # mount desired config file
environment:
@@ -127,9 +119,8 @@ Note: any env variables matching the pattern specified in option 1 will overwrit

Example compose.yml file
------------------------
-This is an example of what a fully filled out compose.yml file looks like:
+This is an example of what a fully filled out compose.yml file looks like:::

-::
# IMI Docker Compose File
# This file is used to run the IMI Docker image
# and define important parameters for the container
@@ -149,12 +140,14 @@ This is an example of what a fully filled out compose.yml file looks like:
- AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
- AWS_DEFAULT_REGION=us-east-1

-## Running the IMI
-Once you have configured the compose.yml file, you can run the IMI by running:
-
-::
+Running the IMI
+---------------
+Once you have configured the compose.yml file, you can run the IMI by running:::

$ docker compose up


from the same directory as your ``compose.yml`` file. This will start the IMI container and run the inversion.
The output will be saved to the directory you specified in the compose.yml file.

@@ -170,11 +163,10 @@ Singularity is a container engine designed to run on HPC systems and local clust
Docker to be installed.
Note: using Singularity to run the IMI is untested and may not work as expected.

-First pull the image:
-::
+First pull the image:::

$ singularity pull public.ecr.aws/w1q7j9l2/imi-docker-image:latest

-Then run the image:
+Then run the image:::

-::
-$ singularity run imi-docker-repository_latest.sif
\ No newline at end of file
+$ singularity run imi-docker-repository_latest.sif
6 changes: 6 additions & 0 deletions docs/source/advanced/using-clustering-options.rst
@@ -67,16 +67,22 @@ native resolution element is preserved during the aggregation. In order for the
preserve the element, you must have enough ``NumberOfElements`` specified to accommodate the
number of gridcells you would like to force to be native resolution.

Additionally, the ``PointSourceDatasets`` config variable can be used to automatically scrape emission
hotspots from external point source datasets. Currently, the only supported dataset is the ``"SRON"``
`weekly plumes dataset <https://earth.sron.nl/methane-emissions/>`_.

yaml list example:
::

    PointSourceDatasets: ["SRON"]
    ForcedNativeResolutionElements:
      - [31.5, -104]
      - [32.5, -103.5]

csv file example:
::

    PointSourceDatasets: ["SRON"]
    ForcedNativeResolutionElements: "/path/to/point_source_locations.csv"

The csv file should have a header row with the column names ``lat`` and ``lon`` using lowercase letters.
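
For illustration, a matching ``point_source_locations.csv`` could look like this (hypothetical file contents, using the coordinates from the yaml example above)::

    lat,lon
    31.5,-104
    32.5,-103.5
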
2 changes: 2 additions & 0 deletions docs/source/getting-started/imi-config-file.rst
@@ -24,6 +24,8 @@ General
- S3 path to upload files to (eg. ``s3://imi-output-dir/example-output/``). Only used if ``S3Upload`` is ``true``.
* - ``S3UploadFiles``
- Files to upload from the IMI Output directory (eg. ``[*]`` will upload everything). Only used if ``S3Upload`` is ``true``.
* - ``PointSourceDatasets``
- External point source datasets used for visualizations and state vector clustering (eg. ``["SRON"]``).

Period of interest
~~~~~~~~~~~~~~~~~~
4 changes: 4 additions & 0 deletions envs/Harvard-Cannon/config.harvard-cannon.yml
@@ -54,6 +54,10 @@ BufferDeg: 5
LandThreshold: 0.25
OffshoreEmisThreshold: 0

## Point source datasets
## Used for visualizations and state vector clustering
PointSourceDatasets: ["SRON"]

## Clustering Options
ReducedDimensionStateVector: false
DynamicKFClustering: false
3 changes: 2 additions & 1 deletion envs/Harvard-Cannon/imi_env.yml
@@ -22,4 +22,5 @@ dependencies:
  - ipykernel=6.15.0
  - jupyter=1.0.0
  - bottleneck=1.3.5
-  - boto3=1.26.161
\ No newline at end of file
+  - bs4=4.12.2
+  - boto3=1.26.161
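
(The new ``bs4`` dependency plausibly supports scraping the SRON weekly plumes dataset referenced in the clustering docs above; this is an inference, not something the commit states.)
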
4 changes: 4 additions & 0 deletions resources/containers/container_config.yml
@@ -54,6 +54,10 @@ BufferDeg: 5
LandThreshold: 0.25
OffshoreEmisThreshold: 0

## Point source datasets
## Used for visualizations and state vector clustering
PointSourceDatasets: ["SRON"]

## Clustering Options
ReducedDimensionStateVector: false
DynamicKFClustering: false
2 changes: 2 additions & 0 deletions run_imi.sh
@@ -4,6 +4,7 @@
#SBATCH -n 1
#SBATCH -o "imi_output.log"


# This script will run the Integrated Methane Inversion (IMI) with GEOS-Chem.
# For documentation, see https://imi.readthedocs.io.
#
@@ -196,6 +197,7 @@ cd $InversionPath
cp $ConfigFile "${RunDirs}/config_${RunName}.yml"

# Upload output to S3 if specified
cd $InversionPath
python src/utilities/s3_upload.py $ConfigFile

exit 0
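
For context, a minimal sketch of what ``src/utilities/s3_upload.py`` might do with the ``S3Upload``, ``S3UploadPath``, and ``S3UploadFiles`` config keys (hypothetical; the actual script is not shown in this diff)::

    import glob
    import os
    import sys

    import boto3
    import yaml

    config = yaml.safe_load(open(sys.argv[1]))

    if config.get("S3Upload", False):
        # S3UploadPath looks like s3://imi-output-dir/example-output/
        bucket, _, prefix = config["S3UploadPath"].removeprefix("s3://").partition("/")
        s3 = boto3.client("s3")
        for pattern in config.get("S3UploadFiles", ["*"]):  # eg. ["*"] uploads everything
            for path in glob.glob(pattern):
                if os.path.isfile(path):
                    s3.upload_file(path, bucket, prefix + os.path.basename(path))
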
4 changes: 2 additions & 2 deletions src/components/shell_variable_manifest.md
@@ -239,5 +239,5 @@ Note: This does not include variables defined in python scripts.
| x | ['src/components/kalman_component/kalman.sh', 'src/components/jacobian_component/jacobian.sh', 'src/geoschem_run_scripts/run_jacobian_simulations.sh'] | ['src/components/jacobian_component/jacobian.sh', 'src/geoschem_run_scripts/run_jacobian_simulations.sh'] | ['setup_jacobian'] | ['run_period', 'setup_jacobian', 'run_jacobian'] |
| xstr | ['src/components/kalman_component/kalman.sh', 'src/components/jacobian_component/jacobian.sh', 'src/geoschem_run_scripts/run_prior_simulation.sh', 'src/geoschem_run_scripts/run_jacobian_simulations.sh'] | ['src/components/kalman_component/kalman.sh', 'src/components/jacobian_component/jacobian.sh', 'src/geoschem_run_scripts/run_prior_simulation.sh', 'src/geoschem_run_scripts/run_jacobian_simulations.sh'] | ['run_period', 'setup_jacobian'] | ['setup_kf', 'run_kf', 'run_period', 'setup_jacobian', 'run_jacobian'] |
| xUSE | ['src/components/jacobian_component/jacobian.sh'] | ['src/components/jacobian_component/jacobian.sh'] | ['setup_jacobian'] | ['setup_jacobian', 'run_jacobian'] |
-| YEAR | ['src/utilities/find_corrupt_files.sh'] | ['src/utilities/find_corrupt_files.sh'] | [] | ['download_aws_files', 'report'] |
-| Year | ['src/utilities/crop_met.sh'] | ['src/utilities/crop_met.sh'] | [] | [] |
\ No newline at end of file
+| Year | ['src/utilities/crop_met.sh'] | ['src/utilities/crop_met.sh'] | [] | [] |
+| YEAR | ['src/utilities/find_corrupt_files.sh'] | ['src/utilities/find_corrupt_files.sh'] | [] | ['download_aws_files', 'report'] |
58 changes: 18 additions & 40 deletions src/components/statevector_component/aggregation.py
@@ -5,9 +5,8 @@
import yaml
import xarray as xr
import numpy as np
-import pandas as pd
-import yaml
import sys

+from src.inversion_scripts.point_sources import get_point_source_coordinates
from src.inversion_scripts.imi_preview import (
    estimate_averaging_kernel,
    map_sensitivities_to_sv,
@@ -296,41 +295,6 @@ def generate_cluster_pairs(config, sensitivities):
    return sorted(cluster_pairs, key=lambda x: x[0])


def read_coordinates(coord_var):
    """
    Description:
        Read coordinates either from a list of lists or a csv file
    arguments:
        coord_var   [] or String : either a list of coordinates or a csv file
    Returns:        [[]] : list of [lat, lon] coordinates of floats
    """

    # handle path to csv file containing coordinates
    if isinstance(coord_var, str):
        if not coord_var.endswith(".csv"):
            raise Exception(
                "ForcedNativeResolutionElements expects either a .csv file or a list of lists."
            )
        coords_df = pd.read_csv(coord_var)

        # check if lat and lon columns are present
        if not ("lat" in coords_df.columns and "lon" in coords_df.columns):
            raise Exception(
                "lat or lon columns are not present in the csv file."
                + " csv file must have lat and lon in header using lowercase."
            )
        # select lat and lon columns and convert to list of lists
        return coords_df[["lat", "lon"]].values.tolist()

    # handle list of lists
    elif isinstance(coord_var, list):
        return coord_var
    else:
        # Variable is neither a string nor a list
        print("Warning: No ForcedNativeResolutionElements specified or invalid format.")
        return None
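
The new ``get_point_source_coordinates()`` helper imported above is not shown in this diff. Judging from the removed ``read_coordinates()`` and the new ``PointSourceDatasets`` config key, it plausibly combines both coordinate sources along these lines (a sketch; ``fetch_sron_plumes`` is hypothetical)::

    import pandas as pd

    def fetch_sron_plumes():
        # Hypothetical scraper for https://earth.sron.nl/methane-emissions/
        # (this commit adds bs4, presumably for such scraping);
        # returns a list of [lat, lon] pairs.
        return []

    def get_point_source_coordinates(config):
        coords = []
        forced = config.get("ForcedNativeResolutionElements")
        if isinstance(forced, str) and forced.endswith(".csv"):
            # csv must have lowercase lat and lon columns in its header
            df = pd.read_csv(forced)
            coords.extend(df[["lat", "lon"]].values.tolist())
        elif isinstance(forced, list):
            coords.extend(forced)
        if "SRON" in config.get("PointSourceDatasets", []):
            coords.extend(fetch_sron_plumes())
        return coords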


def force_native_res_pixels(config, clusters, sensitivities):
    """
    Description:
@@ -343,10 +307,13 @@ def force_native_res_pixels(config, clusters, sensitivities):
        cluster_pairs  [(tuple)]: cluster pairings
    Returns:           [double] : updated sensitivities
    """
-    coords = read_coordinates(config["ForcedNativeResolutionElements"])
+    coords = get_point_source_coordinates(config)

-    if coords is None:
+    if len(coords) == 0:
        # No forced pixels inputted
+        print(
+            f"No forced native pixels specified or in {config['PointSourceDatasets']} dataset."
+        )
        return sensitivities

    if config["Res"] == "0.25x0.3125":
@@ -356,6 +323,17 @@ def force_native_res_pixels(config, clusters, sensitivities):
        lat_step = 0.5
        lon_step = 0.625

+    for lat, lon in coords:
+        lon = np.floor(lon / lon_step) * lon_step
+        lat = np.floor(lat / lat_step) * lat_step
+
+    # Remove any duplicate coordinates within the same gridcell.
+    coords = sorted(set(map(tuple, coords)), reverse=True)
+    coords = [list(coordinate) for coordinate in coords]
+
+    if len(coords) > config["NumberOfElements"]:
+        coords = coords[0 : config["NumberOfElements"] - 1]
+
    for lat, lon in coords:
        binned_lon = np.floor(lon / lon_step) * lon_step
        binned_lat = np.floor(lat / lat_step) * lat_step
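
For intuition, the snap-to-grid and de-duplication steps can be exercised standalone. Note that the first snapping loop in the hunk above reassigns only its loop variables, so its results are discarded; the sketch below stores the snapped values explicitly (a sketch, using the 0.25 x 0.3125 degree grid from the config above)::

    import numpy as np

    lat_step, lon_step = 0.25, 0.3125

    coords = [[31.5, -104.0], [31.6, -103.9], [32.5, -103.5]]

    # Snap each point to the lower-left corner of its gridcell so that two
    # point sources in the same cell become identical pairs.
    binned = [
        [float(np.floor(lat / lat_step) * lat_step),
         float(np.floor(lon / lon_step) * lon_step)]
        for lat, lon in coords
    ]

    # Drop duplicates within a gridcell, keeping the northernmost first,
    # as the committed code does with sorted(set(...), reverse=True).
    binned = sorted(set(map(tuple, binned)), reverse=True)
    binned = [list(c) for c in binned]
    print(binned)  # [[32.5, -103.75], [31.5, -104.0625]]
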
1 change: 0 additions & 1 deletion src/geoschem_run_scripts/ch4_run.template
@@ -1,5 +1,4 @@
#!/bin/bash

##SBATCH -N 1
##SBATCH --mail-type=END

1 change: 0 additions & 1 deletion src/geoschem_run_scripts/run_jacobian_simulations.sh
@@ -1,5 +1,4 @@
#!/bin/bash

#SBATCH -J {RunName}
#SBATCH -N 1
