Feature/spatial temporal stats tool updates #150

Merged
125 changes: 50 additions & 75 deletions ush/SpatialTemporalStatsTool/README.md
@@ -1,18 +1,23 @@
### April 2024
### November 2024
### Azadeh Gholoubi
# Python Tool for Time/Space (2D) Evaluation
# Spatial and Temporal Analysis Tool for Satellite Observation Data

## Overview
This tool provides functionalities for processing and analyzing data over time and space.
**Purpose**: This tool performs spatial and temporal analysis for satellite observation data, allowing users to create customizable grids, filter data by time and region, and generate statistical and summary plots.

The `SpatialTemporalStats` class is designed to perform spatial and temporal statistics of data stored in NetCDF files. It includes features for generating grids, reading observational values, filtering data, plotting observations, and creating summary plots based on user settings.
### Key Functionalities:
- Grid-based Data Summaries: Creates spatial grids for data aggregation.
- Data Filtering: Processes data across specified time frames and geographical regions.
- Visualization: Generates evaluation plots for different data attributes and regions.

The `SpatialTemporalStats` class is central to this tool, with methods for creating grids, reading observational data, filtering, plotting, and producing summary statistics.

### Important Methods of the SpatialTemporalStats Class
- `generate_grid(resolution=1)`: Generates a grid for spatial averaging based on the specified resolution. (default resolution is 1X1)
- `read_obs_values()`: Reads observational values from NetCDF files, filters them based on various criteria, performs spatial averaging, and returns the averaged values.
- `plot_obs()`: Plots observational data on a map, showing different regions and their corresponding data values.
- `list_variable_names(file_path)`: Lists variable names from a NetCDF file.
- `make_summary_plots()`: Generates summary plots of observational data, including scatter plots of counts, means, and standard deviations.
- `generate_grid(resolution=1)`: Generates a spatial grid with specified resolution (default: 1x1 degree).
- `read_obs_values()`: Reads and filters observational data from NetCDF files, performs spatial averaging, and returns averaged values.
- `plot_obs()`: Plots observation data on a map, with options for different regions and grid sizes.
- `list_variable_names(file_path)`: Lists variable names available in a specified NetCDF file.
- `make_summary_plots()`: Generates scatter plots for counts, means, and standard deviations of observational data.
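
The grid-based averaging that `generate_grid` and `read_obs_values` describe can be illustrated with a minimal NumPy sketch. This is not the tool's implementation: the function name `grid_stats`, its arguments, and the assumption that longitudes lie in [-180, 180] are all hypothetical and chosen only for illustration.

```python
import numpy as np


def grid_stats(lats, lons, values, resolution=1.0):
    """Bin points into resolution x resolution degree cells.

    Returns per-cell count, mean, and RMS as 2D arrays of shape
    (n_lat, n_lon). Mean and RMS are NaN where a cell holds no points.
    Hypothetical helper: assumes lat in [-90, 90] and lon in [-180, 180].
    """
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    values = np.asarray(values, dtype=float)

    n_lat = int(round(180 / resolution))
    n_lon = int(round(360 / resolution))

    # Cell index per point; clip so lat=90 and lon=180 land in the last cell.
    i = np.clip(((lats + 90.0) / resolution).astype(int), 0, n_lat - 1)
    j = np.clip(((lons + 180.0) / resolution).astype(int), 0, n_lon - 1)
    flat = i * n_lon + j

    size = n_lat * n_lon
    count = np.bincount(flat, minlength=size).astype(float)
    total = np.bincount(flat, weights=values, minlength=size)
    total_sq = np.bincount(flat, weights=values**2, minlength=size)

    # Empty cells divide 0 by 0; silence the warning and keep the NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = total / count
        rms = np.sqrt(total_sq / count)

    shape = (n_lat, n_lon)
    return count.reshape(shape), mean.reshape(shape), rms.reshape(shape)


# Two points share the cell covering lat 10..11, lon 20..21; one point is alone.
count, mean, rms = grid_stats(
    lats=[10.2, 10.7, -45.5],
    lons=[20.1, 20.9, 100.3],
    values=[1.0, 3.0, -2.0],
    resolution=1.0,
)
print(count[100, 200], mean[100, 200], rms[100, 200])
```

Using `np.bincount` keeps the aggregation vectorized; a GeoDataFrame-based spatial join would be another way to get the same per-cell statistics.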

## Requirements
Users need to load the EVA environment when working on Hera. Use the following commands:
@@ -23,91 +28,61 @@ module load EVA/hera
```

## Usage
`user_Analysis.py` contains the `SpatialTemporalStats` class, which encapsulates the functionalities of the tool. Here's how to use it:

1. Import the `SpatialTemporalStats` class:

```python
from SpatialTemporalStats import SpatialTemporalStats
2. Create an instance of the SpatialTemporalStats class:

```python
my_tool = SpatialTemporalStats()

3. Specify the parameters based on the type of plots that you want:

- `input_path`: Directory for input .nc files
- `output_path`: Path to output plots
- `sensor`: Sensor name
- `channel_no`: Channel number (e.g., 1, 2, 3, 5)
- `var_name`: Variable name
- `start_date, end_date`: Start and end dates of the input files for evaluation
- `region`: A number selecting global or regional output plots (1: global (default), 2: polar region, 3: mid-latitudes region, 4: tropics region, 5: southern mid-latitudes region, 6: southern polar region)
- `resolution`: Resolution for grid generation (1: 1x1 degree (default), 2: 2x2 degree, 3: 3x3 degree)
- `filter_by_vars`: Variables to filter by, used to generate plots based on surface type (land, water, snow, seaice); pass an empty list for no filtering.

4. Call `read_obs_values` to read observational values and perform analysis:

```python
o_minus_f_gdf = my_tool.read_obs_values(
input_path,
sensor,
var_name,
channel_no,
start_date,
end_date,
filter_by_vars,
QC_filter)
```
5. Call `plot_obs` to plot evaluation plots based on your setting for grid size, channel, region, surface type, and filtering values:

To get started, run the following command to see all available options and argument formats:
```python
my_tool.plot_obs(o_minus_f_gdf, var_name, region, resolution, output_path)
python SpatialTemporalStats.py -h
```
6. Call `make_summary_plots` to generate summary plots:
This command displays detailed information on how to enter your settings. Key parameters include:

- `-input`: Path to the input data files
- `-output`: Path for saving the results
- `-sensor`: Satellite sensor name (e.g., "atms_n20")
- `-var`: Variable to analyze (e.g., "Obs_Minus_Forecast_adjusted")
- `-ch`: Channel number for the analysis (e.g., 1)
- `-grid`: Grid resolution in degrees for spatial analysis (choices: 0.5, 1, 2; default: 1)
- `-region`: Region code for the map plot:
  - 1 – Global
  - 2 – Northern polar region (60° latitude and above)
  - 3 – Northern mid-latitudes (20° to 60° latitude)
  - 4 – Tropics (-20° to 20° latitude)
  - 5 – Southern mid-latitudes (-60° to -20° latitude)
  - 6 – Southern polar region (below -60° latitude)
- `-sdate` / `-edate`: Start and end dates for the time period (e.g., "2023-01-27" to "2023-01-28")

These parameters allow you to customize the spatial and temporal analysis to suit specific data and regions.
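
To make the region codes concrete, here is a small sketch that maps each code to its latitude band and tests whether a latitude falls inside it. The names `REGION_BOUNDS` and `in_region` are hypothetical, and treating both band edges as inclusive is an assumption; the tool itself may handle the boundaries at ±20° and ±60° differently.

```python
# Latitude bounds in degrees for each region code, as listed in this README.
# Hypothetical lookup table; inclusive edges are an assumption.
REGION_BOUNDS = {
    1: (-90.0, 90.0),   # Global
    2: (60.0, 90.0),    # Polar region (60 and above)
    3: (20.0, 60.0),    # Northern mid-latitudes
    4: (-20.0, 20.0),   # Tropics
    5: (-60.0, -20.0),  # Southern mid-latitudes
    6: (-90.0, -60.0),  # Southern polar region
}


def in_region(lat, region=1):
    """Return True if latitude `lat` lies inside the band for `region`."""
    try:
        lower, upper = REGION_BOUNDS[region]
    except KeyError:
        raise ValueError(f"unknown region code: {region!r}") from None
    return lower <= lat <= upper


# Keep only the observations that fall in the tropics (region 4).
latitudes = [75.0, 30.0, 0.0, -45.0, -70.0]
tropics = [lat for lat in latitudes if in_region(lat, region=4)]
print(tropics)
```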



```python
summary_results = my_tool.make_summary_plots(
input_path, sensor, var_name, start_date, end_date, QC_filter, output_path
)
```
## Notes
Ensure that the `obs_files_path` and `output_path` variables are correctly set to the paths of observational files and output directory, respectively.
Adjust method parameters and plotting settings as needed for your specific use case.
Make sure to define the `filter_by_variable` method as needed for filtering observational data based on variable values.

To run the tool:

```
python user_Analysis.py
## Example Usage

```python
python SpatialTemporalStats.py -input /PATH/TO/INPUT/DIAG/FILES -output ./Results -sensor "atms_n20" -var "Obs_Minus_Forecast_adjusted" -ch 1 -grid 2 -region 1 -sdate "2023-01-27" -edate "2023-01-28"
```
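
For readers who want to see how flags like these could be declared, the following `argparse` sketch mirrors the options documented in this README. It is an illustration only, not the parser in `SpatialTemporalStats.py`: which options are required, the `./Results` default, the `YYYY-MM-DD` date parsing, and the helper names `build_parser` and `parse_date` are all assumptions.

```python
import argparse
from datetime import datetime


def parse_date(text):
    """Parse a YYYY-MM-DD string into a date (format assumed from the example)."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def build_parser():
    """Hypothetical parser mirroring the flags documented in the README."""
    parser = argparse.ArgumentParser(
        description="Spatial and temporal statistics for satellite observations"
    )
    parser.add_argument("-input", required=True, help="path to input data files")
    parser.add_argument("-output", default="./Results", help="path for results")
    parser.add_argument("-sensor", required=True, help="sensor name, e.g. atms_n20")
    parser.add_argument("-var", required=True, help="variable to analyze")
    parser.add_argument("-ch", type=int, required=True, help="channel number")
    parser.add_argument("-grid", type=float, choices=[0.5, 1.0, 2.0], default=1.0,
                        help="grid resolution in degrees")
    parser.add_argument("-region", type=int, choices=range(1, 7), default=1,
                        help="region code, 1 (global) to 6 (southern polar)")
    parser.add_argument("-sdate", type=parse_date, required=True, help="start date")
    parser.add_argument("-edate", type=parse_date, required=True, help="end date")
    return parser


# The example command, split into an argument list.
EXAMPLE_ARGV = [
    "-input", "/PATH/TO/INPUT/DIAG/FILES", "-output", "./Results",
    "-sensor", "atms_n20", "-var", "Obs_Minus_Forecast_adjusted",
    "-ch", "1", "-grid", "2", "-region", "1",
    "-sdate", "2023-01-27", "-edate", "2023-01-28",
]
args = build_parser().parse_args(EXAMPLE_ARGV)
print(args.sensor, args.ch, args.grid, args.region, args.sdate, args.edate)
```

Declaring `choices` for `-grid` and `-region` lets `argparse` reject unsupported values before any files are read.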

## Example Usage
Here's a sample script demonstrating how to use the `SpatialTemporalStats` tool:
![image](https://github.com/NOAA-EMC/PyGSI/assets/51101867/4379cb6e-e1a7-4167-8859-ae881f2c61c1)

## Example output plots using different settings
```python
var_name = "Obs_Minus_Forecast_adjusted"
region = 1
resolution = 2
filter_by_vars=[]
-sensor "atms_n20" -var "Obs_Minus_Forecast_adjusted" -ch 1 -grid 2 -region 1 -sdate "2023-01-27" -edate "2023-01-28"
```
Calling `read_obs_values` and then `my_tool.plot_obs()` produces three plots (average, count, and RMS), as shown below:
![atms_n20_ch1_Obs_Minus_Forecast_adjusted_Average_region_1](https://github.com/NOAA-EMC/PyGSI/assets/51101867/b838ae92-3303-45ca-b7ba-35b11c01213c)
![atms_n20_ch1_Obs_Minus_Forecast_adjusted_Count_region_1](https://github.com/NOAA-EMC/PyGSI/assets/51101867/113ef427-9771-462a-b543-f36166ed978e)
![atms_n20_ch1_Obs_Minus_Forecast_adjusted_RMS_region_1](https://github.com/NOAA-EMC/PyGSI/assets/51101867/ed4bc44c-6364-451b-811e-b2c8a0ce5d2a)
![atms_n20_ch1_Obs_Minus_Forecast_adjusted_Average_region_1](https://github.com/user-attachments/assets/e0ddcf64-8ce1-4175-b646-71d1d38ec3d4)
![atms_n20_ch1_Obs_Minus_Forecast_adjusted_Count_region_1](https://github.com/user-attachments/assets/a33dd6c4-bfb0-4ae9-a46d-02086f7dc960)
![atms_n20_ch1_Obs_Minus_Forecast_adjusted_RMS_region_1](https://github.com/user-attachments/assets/f9b34e74-7511-464d-a27d-82f08cfa5c6b)


Example plot for filtering out the locations where the land fraction is less than 0.9:
```python
filter_by_vars = [("Land_Fraction", "lt", 0.9),]
-filter_by_vars Land_Fraction,lt,0.9
```
![atms_n20_ch1_Obs_Minus_Forecast_adjusted_Average_region_1](https://github.com/NOAA-EMC/PyGSI/assets/51101867/978e2677-4a7b-45b3-a2e2-67674bf0803e)
![atms_n20_ch1_Obs_Minus_Forecast_adjusted_Average_region_1](https://github.com/user-attachments/assets/bc6b7215-9d26-41c8-b51d-0f51d42238c3)
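
A filter written as `Land_Fraction,lt,0.9` can be read as a (variable, operator, threshold) triple. The sketch below shows one way to parse and evaluate such a triple. Only `lt` appears in this README; the other operator names, the helpers `parse_filter` and `matches`, and the dictionary record format are assumptions. Whether matching observations are kept or dropped is decided by the tool and is not asserted here.

```python
import operator

# Operator names mapped to comparison functions.
# Only "lt" is documented; the rest are assumed for illustration.
OPS = {
    "lt": operator.lt,
    "le": operator.le,
    "gt": operator.gt,
    "ge": operator.ge,
    "eq": operator.eq,
    "ne": operator.ne,
}


def parse_filter(spec):
    """Split 'Name,op,value' into a (name, op, float(value)) tuple."""
    name, op, value = (part.strip() for part in spec.split(","))
    if op not in OPS:
        raise ValueError(f"unsupported operator: {op!r}")
    return name, op, float(value)


def matches(record, filters):
    """Return True if the record satisfies every (name, op, value) filter."""
    return all(OPS[op](record[name], value) for name, op, value in filters)


filters = [parse_filter("Land_Fraction,lt,0.9")]
observations = [
    {"Land_Fraction": 0.50, "omf": 0.3},
    {"Land_Fraction": 0.95, "omf": -0.1},
]
flags = [matches(obs, filters) for obs in observations]
print(filters, flags)
```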

Example of the summary plots:
![atms_n20_Obs_Minus_Forecast_adjusted_mean_std](https://github.com/user-attachments/assets/99b09315-1faa-4fd1-9c26-e7b591dba2fc)
![atms_n20_Obs_Minus_Forecast_adjusted_sumamryCounts](https://github.com/user-attachments/assets/449cd174-f50d-4521-ab9f-e0d4b6f5ad9b)


Calling `read_obs_values` and then `my_tool.make_summary_plots()` generates two summary plots:
![atms_n20_Obs_Minus_Forecast_adjusted_mean_std](https://github.com/NOAA-EMC/PyGSI/assets/51101867/28cc26f4-c024-4713-82e1-b9a7ed5f5d1b)
![atms_n20_Obs_Minus_Forecast_adjusted_sumamryCounts](https://github.com/NOAA-EMC/PyGSI/assets/51101867/fd835f41-5b9c-4a14-be85-4c74d49571f6)



Binary file not shown.
Binary file not shown.
Binary file not shown.
23 changes: 23 additions & 0 deletions ush/SpatialTemporalStatsTool/Results/atms_n20_summary.csv
@@ -0,0 +1,23 @@
channel,count,std,mean,rms
1,45262,2.6988273643037384,0.19821635086344594,2.7060966102607162
2,44462,2.885330599655865,0.44061785540288817,2.918780012918158
3,45276,2.3529656988789394,0.13292025534906543,2.3567170755911815
4,45672,1.4469203953076863,0.0448353262127089,1.4476148786310565
5,45700,0.4795219456469957,0.017871514061625335,0.4798548607359764
6,45744,0.15752856235399282,0.0006352198961456739,0.15752984308261478
7,52929,0.11603420279762479,0.0021500649409356577,0.11605412098728322
8,84112,0.11362915543792347,0.007189899951947341,0.1138563991475909
9,84122,0.1242098780941416,0.005015079193266936,0.12431108090382417
10,84119,0.1678040669750801,0.0012762595843600758,0.1678089202989674
11,84063,0.20609831023729452,0.0049344418110122325,0.20615737240917775
12,83916,0.25259764753168185,0.0011790454284033073,0.25260039922111327
13,82352,0.358631419062375,-0.01701693121436197,0.35903491569296797
14,83415,0.6214590289702845,-0.08582188340839891,0.6273569321849265
15,83772,1.0779297883385832,-0.9148381735994018,1.4138109889452872
16,40843,3.4015874021960975,0.12291585633857981,3.4038074508583764
17,42105,2.0949495712374997,-0.0925232459163654,2.0969917160216016
18,42368,1.3821771839342785,-0.010803419354404722,1.38221940431261
19,45173,1.368377491943836,0.0239483131097599,1.3685870385764136
20,45107,1.4126293284373275,0.09488016782235158,1.4158120870395723
21,44935,1.4810928233580214,0.134929813972193,1.4872262793876698
22,44728,1.5653715176881486,0.1405545452597037,1.5716690391372394
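
The `std`, `mean`, and `rms` columns are related: when `std` is the population standard deviation, rms² = mean² + std². The sketch below checks that identity on three rows copied from the CSV above. The helper name `rms_from_mean_std` is hypothetical, and the population (rather than sample) standard deviation is an assumption that these rows happen to be consistent with.

```python
import csv
import io
import math

# Three rows copied verbatim from atms_n20_summary.csv.
SAMPLE = """channel,count,std,mean,rms
1,45262,2.6988273643037384,0.19821635086344594,2.7060966102607162
6,45744,0.15752856235399282,0.0006352198961456739,0.15752984308261478
15,83772,1.0779297883385832,-0.9148381735994018,1.4138109889452872
"""


def rms_from_mean_std(mean, std):
    """RMS implied by a mean and a population standard deviation."""
    return math.sqrt(mean**2 + std**2)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
checks = []
for row in rows:
    derived = rms_from_mean_std(float(row["mean"]), float(row["std"]))
    reported = float(row["rms"])
    checks.append(math.isclose(derived, reported, rel_tol=1e-6))
    print(row["channel"], derived, reported)
```

Channel 15 shows why RMS is reported alongside the standard deviation: its mean of about -0.91 raises the RMS well above the `std` value, which signals a bias rather than extra scatter.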