Skip to content

Commit

Permalink
Feature/optimize domain edges (#169)
Browse files Browse the repository at this point in the history
This update implements boundary condition optimization as part of the inversion, making use of the new boundary condition perturbation options from @nicholasbalasus in geoschem 14.2.1.

This is done by effectively increasing the number of state vector elements by 4, and performing 4 additional perturbation simulations. Notably, we do the inversion calculations for these final 4 elements in concentration space instead of calculating a correction factor. The calculated increase/decrease in ppb is then applied in the posterior simulation.

These changes have been tested using our standard 1-week test case, but the differences are minimal when using buffer cells and BC optimization. Additional testing was done for 1-month over Tasmania to see the effect of having no buffer cells for the domain with and without BC optimization.
  • Loading branch information
laestrada authored Oct 4, 2023
1 parent c7cd77f commit e236518
Show file tree
Hide file tree
Showing 15 changed files with 194 additions and 86 deletions.
7 changes: 7 additions & 0 deletions config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,7 @@ nBufferClusters: 8
BufferDeg: 5
LandThreshold: 0.25
OffshoreEmisThreshold: 0
OptimizeBCs: false

## Point source datasets
## Used for visualizations and state vector clustering
Expand All @@ -71,7 +72,10 @@ StateVectorFile: "/home/ubuntu/integrated_methane_inversion/resources/statevecto
ShapeFile: "/home/ubuntu/integrated_methane_inversion/resources/shapefiles/PermianBasin_Extent_201712.shp"

## Inversion
## Note PriorError is a relative percentage
## and PriorErrorBCs is in ppb
PriorError: 0.5
PriorErrorBCs: 10.0
ObsError: 15
Gamma: 1.0
PrecomputedJacobian: false
Expand Down Expand Up @@ -125,7 +129,10 @@ SchedulerPartition: "debug"
##--------------------------------------------------------------------

## Jacobian settings
## Note PerturbValue is a relative scalefactor and
## PerturbValueBCs is in ppb
PerturbValue: 1.5
PerturbValueBCs: 10.0

## Apply scale factors from a previous inversion?
UseEmisSF: false
Expand Down
6 changes: 6 additions & 0 deletions docs/source/getting-started/imi-config-file.rst
Original file line number Diff line number Diff line change
Expand Up @@ -85,6 +85,8 @@ State vector
- Land-cover fraction below which to exclude GEOS-Chem grid cells from the state vector when creating the state vector file. Default value is ``0.25``.
* - ``OffshoreEmisThreshold``
- Offshore GEOS-Chem grid cells with oil/gas emissions above this threshold will be included in the state vector. Default value is ``0``.
* - ``OptimizeBCs``
- Boolean to optimize boundary conditions during the inversion. Must also include ``PerturbValueBCs`` and ``PriorErrorBCs`` Default value is ``false``.

Clustering Options
^^^^^^^^^^^^^^^^^^
Expand Down Expand Up @@ -130,6 +132,8 @@ Inversion

* - ``PriorError``
- Error in the prior estimates (1-sigma; relative). Default is ``0.5`` (50%) error.
* - ``PriorErrorBCs``
- Error in the prior estimates (using ppb). Default is ``10`` ppb error.
* - ``ObsError``
- Observational error (1-sigma; absolute; ppb). Default value is ``15`` ppb error.
* - ``Gamma``
Expand Down Expand Up @@ -232,6 +236,8 @@ These settings are intended for advanced users who wish to modify additional GEO

* - ``PerturbValue``
- Value to perturb emissions by in each sensitivity simulation. Default value is ``1.5``.
* - ``PerturbValueBCs``
- Number of ppb to perturb emissions by for domain edges (North, South, East, West) if using `OptimizeBCs`. Default value is ``10.0`` ppb.
* - ``UseEmisSF``
- Boolean to apply emissions scale factors derived from a previous inversion. This file should be provided as a netCDF file and specified in HEMCO_Config.rc. Default value is ``false``.
* - ``UseOHSF``
Expand Down
7 changes: 7 additions & 0 deletions envs/Harvard-Cannon/config.harvard-cannon.yml
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,7 @@ nBufferClusters: 8
BufferDeg: 5
LandThreshold: 0.25
OffshoreEmisThreshold: 0
OptimizeBCs: false

## Point source datasets
## Used for visualizations and state vector clustering
Expand All @@ -71,7 +72,10 @@ StateVectorFile: "/n/holyscratch01/jacob_lab/msulprizio/Test_IMI/StateVector.nc"
ShapeFile: "resources/shapefiles/PermianBasin_Extent_201712.shp"

## Inversion
## Note PriorError is a relative percentage
## and PriorErrorBCs is in ppb
PriorError: 0.5
PriorErrorBCs: 10.0
ObsError: 15
Gamma: 1.0
PrecomputedJacobian: false
Expand Down Expand Up @@ -114,7 +118,10 @@ SchedulerPartition: "seas_compute,huce_cascade,huce_intel"
##--------------------------------------------------------------------

## Jacobian settings
## Note PerturbValue is a relative scalefactor and
## PerturbValueBCs is in ppb
PerturbValue: 1.5
PerturbValueBCs: 10.0

## Apply scale factors from a previous inversion?
UseEmisSF: false
Expand Down
7 changes: 7 additions & 0 deletions resources/containers/container_config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,7 @@ nBufferClusters: 8
BufferDeg: 5
LandThreshold: 0.25
OffshoreEmisThreshold: 0
OptimizeBCs: false

## Point source datasets
## Used for visualizations and state vector clustering
Expand All @@ -71,7 +72,10 @@ StateVectorFile: "/home/al2/integrated_methane_inversion/resources/statevectors/
ShapeFile: "/home/al2/integrated_methane_inversion/resources/shapefiles/PermianBasin_Extent_201712.shp"

## Inversion
## Note PriorError is a relative percentage
## and PriorErrorBCs is in ppb
PriorError: 0.5
PriorErrorBCs: 10.0
ObsError: 15
Gamma: 1.0
PrecomputedJacobian: false
Expand Down Expand Up @@ -125,7 +129,10 @@ SchedulerPartition: "debug"
##--------------------------------------------------------------------

## Jacobian settings
## Note PerturbValue is a relative scalefactor and
## PerturbValueBCs is in ppb
PerturbValue: 1.5
PerturbValueBCs: 10.0

## Apply scale factors from a previous inversion?
UseEmisSF: false
Expand Down
39 changes: 36 additions & 3 deletions src/components/jacobian_component/jacobian.sh
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
# Functions available in this file include:
# - setup_jacobian
# - run_jacobian
# - generate_BC_perturb_values

# Description: Setup jacobian run directory
# Usage:
Expand Down Expand Up @@ -81,9 +82,26 @@ setup_jacobian() {
fi
fi

# Update settings in geoschem_config.yml
sed -i -e "s|emission_perturbation: 1.0|emission_perturbation: ${PerturbValue}|g" \
-e "s|state_vector_element_number: 0|state_vector_element_number: ${xUSE}|g" geoschem_config.yml
# Update settings in geoschem_config.yml except for the base run
if [ $x -ne 0 ]; then
sed -i -e "s|emission_perturbation: 1.0|emission_perturbation: ${PerturbValue}|g" \
-e "s|state_vector_element_number: 0|state_vector_element_number: ${xUSE}|g" geoschem_config.yml
fi

# BC optimization setup
if "$OptimizeBCs"; then
bcThreshold=$(($nElements - 4))
# The last four state vector elements are reserved for BC optimization of NSEW
# domain edges. If the current state vector element is one of these, then
# turn on BC optimization for the corresponding edge and revert emission perturbation
if [ $x -gt $bcThreshold ]; then
PerturbBCValues=$(generate_BC_perturb_values $bcThreshold $x $PerturbValueBCs)
sed -i -e "s|CH4_boundary_condition_ppb_increase_NSEW:.*|CH4_boundary_condition_ppb_increase_NSEW: ${PerturbBCValues}|g" \
-e "s|perturb_CH4_boundary_conditions: false|perturb_CH4_boundary_conditions: true|g" \
-e "s|emission_perturbation: ${PerturbValue}|emission_perturbation: 1.0|g" \
-e "s|state_vector_element_number: ${xUSE}|state_vector_element_number: 0|g" geoschem_config.yml
fi
fi

# Update settings in HISTORY.rc
# Only save out hourly pressure fields to daily files for base run
Expand Down Expand Up @@ -185,3 +203,18 @@ run_jacobian() {
printf "Got Jacobian scale factors\n"
fi
}

# Description: Print perturbation string for BC optimization
# based on the current state vector element
# Returns [float, float, float, float]
# Usage:
# generate_BC_perturb_values <bcThreshold> <element-number> <pert-value>
generate_BC_perturb_values() {
python -c "import sys;\
bc_perturb = [0.0, 0.0, 0.0, 0.0];\
bcThreshold = int(sys.argv[1]) + 1;\
element = int(sys.argv[2]);\
pert_index = element % bcThreshold;\
bc_perturb[pert_index] = float(sys.argv[3]);\
print(bc_perturb)" $1 $2 $3
}
2 changes: 1 addition & 1 deletion src/components/kalman_component/kalman.sh
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ setup_kf() {
mkdir -p ${RunDirs}/archive_sf

# Number of state vector elements
nElements=$(ncmax StateVector ${StateVectorFile})
nElements=$(ncmax StateVector ${StateVectorFile} ${OptimizeBCs})
}

# Description: Run Kalman filter inversions
Expand Down
26 changes: 25 additions & 1 deletion src/components/posterior_component/posterior.sh
Original file line number Diff line number Diff line change
Expand Up @@ -100,6 +100,21 @@ run_posterior() {
source ${GEOSChemEnv}
fi

if "$OptimizeBCs"; then
if "$KalmanMode"; then
inv_result_path="${RunDirs}/kf_inversions/period${period_i}/inversion_result.nc"
else
inv_result_path="${RunDirs}/inversion/inversion_result.nc"
fi
# set BC optimal delta values
PerturbBCValues=$(generate_optimized_BC_values $inv_result_path)
# add BC optimization delta to boundary condition edges
sed -i -e "s|CH4_boundary_condition_ppb_increase_NSEW:.*|CH4_boundary_condition_ppb_increase_NSEW: ${PerturbBCValues}|g" \
-e "s|perturb_CH4_boundary_conditions: false|perturb_CH4_boundary_conditions: true|g" geoschem_config.yml

printf "\n=== BC OPTIMIZATION: BC optimized perturbation values for NSEW set to: ${PerturbBCValues} ===\n"
fi

# Submit job to job scheduler
printf "\n=== SUBMITTING POSTERIOR SIMULATION ===\n"
sbatch --mem $SimulationMemory \
Expand Down Expand Up @@ -147,7 +162,7 @@ run_posterior() {
LonMaxInvDomain=$(ncmax lon ${RunDirs}/StateVector.nc)
LatMinInvDomain=$(ncmin lat ${RunDirs}/StateVector.nc)
LatMaxInvDomain=$(ncmax lat ${RunDirs}/StateVector.nc)
nElements=$(ncmax StateVector ${RunDirs}/StateVector.nc)
nElements=$(ncmax StateVector ${RunDirs}/StateVector.nc ${OptimizeBCs})
FetchTROPOMI="False"
isPost="True"
buildJacobian="False"
Expand All @@ -160,3 +175,12 @@ run_posterior() {
# convert vizualization notebooks to html
run_notebooks
}

# Description: Generates the updated NSEW perturbation to apply to domain edge BCs
# Usage:
# generate_optimized_BC_values <path-to-inversion-result> <bc-pert-value>
generate_optimized_BC_values() {
python -c "import sys; import xarray;\
xhat = xarray.open_dataset(sys.argv[1])['xhat'].values[-4:];\
print(xhat.tolist())" $1
}
2 changes: 1 addition & 1 deletion src/components/setup_component/setup.sh
Original file line number Diff line number Diff line change
Expand Up @@ -161,7 +161,7 @@ setup_imi() {
fi

# Determine number of elements in state vector file
nElements=$(ncmax StateVector ${RunDirs}/StateVector.nc)
nElements=$(ncmax StateVector ${RunDirs}/StateVector.nc ${OptimizeBCs})
printf "\nNumber of state vector elements in this inversion = ${nElements}\n\n"

# Define inversion domain lat/lon bounds
Expand Down
2 changes: 1 addition & 1 deletion src/components/statevector_component/statevector.sh
Original file line number Diff line number Diff line change
Expand Up @@ -99,7 +99,7 @@ reduce_dimension() {
mkdir -p ${RunDirs}/archive_sv
cp $state_vector_path ${RunDirs}/archive_sv/StateVector_${period_i}.nc
fi
nElements=$(ncmax StateVector ${RunDirs}/StateVector.nc)
nElements=$(ncmax StateVector ${RunDirs}/StateVector.nc ${OptimizeBCs})
printf "\nNumber of state vector elements in this inversion = ${nElements}\n\n"
printf "\n=== DONE REDUCING DIMENSION OF STATE VECTOR FILE ===\n"
}
36 changes: 33 additions & 3 deletions src/inversion_scripts/calc_sensi.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,22 +15,43 @@ def zero_pad_num(n):
nstr = "0" + nstr
return nstr

def test_GC_output_for_BC_perturbations(e, nelements, sensitivities):

"""
Ensures that CH4 boundary condition perturbation in GEOS-Chem is working as intended
sensitivities = (pert-base)/perturbationBC which should equal 1e-9 inside the perturbed borders
example: the north boundary is perturbed by 10 ppb
pert-base=10e-9 mol/mol in the 3 grid cells that have been perturbed
perturbationBC=10 ppb
sensitivities = (pert-base)/perturbationBC = 1e-9
"""

if e == (nelements - 4): # North boundary
check = np.mean(sensitivities[:,-3:,3:-3])
elif e == (nelements - 3): # South boundary
check = np.mean(sensitivities[:,0:3,3:-3])
elif e == (nelements - 2): # East boundary
check = np.mean(sensitivities[:,:,-3:])
elif e == (nelements - 1): # West boundary
check = np.mean(sensitivities[:,:,0:3])
assert abs(check - 1e-9) < 1e-11, f"GC CH4 perturb not working... perturbation is off by {abs(check - 1e-9)} mol/mol/ppb"

def calc_sensi(
nelements, perturbation, startday, endday, run_dirs_pth, run_name, sensi_save_pth
nelements, perturbation, startday, endday, run_dirs_pth, run_name, sensi_save_pth, perturbationBC
):
"""
Loops over output data from GEOS-Chem perturbation simulations to compute sensitivities
for the Jacobian matrix.
Arguments
nelements [int] : Number of state vector elements
perturbation [float] : Size of perturbation (e.g., 0.5)
perturbation [float] : Size of perturbation (e.g., 1.5)
startday [str] : First day of inversion period; formatted YYYYMMDD
endday [str] : Last day of inversion period; formatted YYYYMMDD
run_dirs_pth [str] : Path to directory containing GC Jacobian run directories
run_name [str] : Simulation run name; e.g. 'CH4_Jacobian'
sensi_save_pth [str] : Path to save the sensitivity data
perturbationBC [float] : Size of BC perturbation in ppb (eg. 10.0)
Resulting 'sensi' files look like:
Expand Down Expand Up @@ -62,6 +83,8 @@ def calc_sensi(
sensi[element,:,:,:] = sens
save sensi as netcdf with appropriate coordinate variables
"""
# subtract by 1 because here we assume .5 is a +50% perturbation
perturbation = perturbation - 1

# Make date range
days = []
Expand Down Expand Up @@ -106,7 +129,12 @@ def process(h):
# Get the data for the current hour
pert = pert_data["SpeciesConcVV_CH4"][h, :, :, :]
# Compute and store the sensitivities
sensitivities = (pert.values - base.values) / perturbation
if (perturbationBC is not None) and (e >= (nelements-4)):
sensitivities = (pert.values - base.values) / perturbationBC
if h != 0: # because we take the first hour on the first day from spinup
test_GC_output_for_BC_perturbations(e, nelements, sensitivities)
else:
sensitivities = (pert.values - base.values) / perturbation
sensi[e, :, :, :] = sensitivities
# Save sensi as netcdf with appropriate coordinate variables
sensi = xr.DataArray(
Expand Down Expand Up @@ -140,6 +168,7 @@ def process(h):
run_dirs_pth = sys.argv[5]
run_name = sys.argv[6]
sensi_save_pth = sys.argv[7]
perturbationBC = float(sys.argv[8]) if len(sys.argv) > 8 else None

calc_sensi(
nelements,
Expand All @@ -149,4 +178,5 @@ def process(h):
run_dirs_pth,
run_name,
sensi_save_pth,
perturbationBC,
)
Loading

0 comments on commit e236518

Please sign in to comment.