Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New reader for Himawari L2 NOAA enterprise cloud products. #2558

Merged
merged 19 commits into from
Oct 11, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
19 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
175 changes: 175 additions & 0 deletions satpy/etc/readers/ahi_l2_nc.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,175 @@
reader:
name: ahi_l2_nc
short_name: AHI L2 NetCDF4
long_name: Himawari-8/9 AHI Level 2 products in netCDF4 format from NOAA enterprise
status: Beta
supports_fsspec: true
sensors: ['ahi']
reader: !!python/name:satpy.readers.yaml_reader.FileYAMLReader


file_types:
ahi_l2_mask:
file_reader: !!python/name:satpy.readers.ahi_l2_nc.HIML2NCFileHandler
file_patterns:
- '{sensor:3s}-CMSK_{version:4s}_{platform:3s}_s{start_time:%Y%m%d%H%M%S%f}_e{end_time:%Y%m%d%H%M%S%f}_c{creation_time:%Y%m%d%H%M%S%f}.nc'

ahi_l2_type:
file_reader: !!python/name:satpy.readers.ahi_l2_nc.HIML2NCFileHandler
file_patterns:
- '{sensor:3s}-CPHS_{version:4s}_{platform:3s}_s{start_time:%Y%m%d%H%M%S%f}_e{end_time:%Y%m%d%H%M%S%f}_c{creation_time:%Y%m%d%H%M%S%f}.nc'

ahi_l2_height:
file_reader: !!python/name:satpy.readers.ahi_l2_nc.HIML2NCFileHandler
file_patterns:
- '{sensor:3s}-CHGT_{version:4s}_{platform:3s}_s{start_time:%Y%m%d%H%M%S%f}_e{end_time:%Y%m%d%H%M%S%f}_c{creation_time:%Y%m%d%H%M%S%f}.nc'

datasets:
# Products from the cloud mask files
cloud_mask:
name: cloud_mask
file_key: CloudMask
file_type: [ ahi_l2_mask ]

cloud_mask_binary:
name: cloud_mask_binary
file_key: CloudMaskBinary
file_type: [ ahi_l2_mask ]

cloud_probability:
name: cloud_probability
file_key: CloudProbability
file_type: [ ahi_l2_mask ]

ice_cloud_probability:
name: ice_cloud_probability
file_key: IceCloudProbability
file_type: [ ahi_l2_mask ]

phase_uncertainty:
name: phase_uncertainty
file_key: PhaseUncertainty
file_type: [ ahi_l2_mask ]

dust_mask:
name: dust_mask
file_key: Dust_Mask
file_type: [ ahi_l2_mask ]

fire_mask:
name: fire_mask
file_key: Fire_Mask
file_type: [ ahi_l2_mask ]

smoke_mask:
name: smoke_mask
file_key: Smoke_Mask
file_type: [ ahi_l2_mask ]

# Products from the cloud phase / type files
cloud_phase:
name: cloud_phase
file_key: CloudPhase
file_type: [ ahi_l2_type ]

cloud_phase_flag:
name: cloud_phase_flag
file_key: CloudPhaseFlag
file_type: [ ahi_l2_type ]

cloud_type:
name: cloud_type
file_key: CloudType
file_type: [ ahi_l2_type ]

# Products from the cloud height files
cloud_optical_depth:
name: cloud_optical_depth
file_key: CldOptDpth
file_type: [ ahi_l2_height ]

cloud_top_emissivity:
name: cloud_top_emissivity
file_key: CldTopEmss
file_type: [ ahi_l2_height ]

cloud_top_pressure:
name: cloud_top_pressure
file_key: CldTopPres
file_type: [ ahi_l2_height ]

cloud_top_pressure_low:
name: cloud_top_pressure_low
file_key: CldTopPresLow
file_type: [ ahi_l2_height ]

cloud_top_temperature:
name: cloud_top_temperature
file_key: CldTopTemp
file_type: [ ahi_l2_height ]

cloud_top_temperature_low:
name: cloud_top_temperature_low
file_key: CldTopTempLow
file_type: [ ahi_l2_height ]

cloud_height_quality:
name: cloud_height_quality
file_key: CloudHgtQF
file_type: [ ahi_l2_height ]

retrieval_cost:
name: retrieval_cost
file_key: Cost
file_type: [ ahi_l2_height ]

inversion_flag:
name: inversion_flag
file_key: InverFlag
file_type: [ ahi_l2_height ]

latitude_parallax_corrected:
name: latitude_parallax_corrected
file_key: Latitude_Pc
file_type: [ ahi_l2_height ]

longitude_parallax_corrected:
name: longitude_parallax_corrected
file_key: Longitude_Pc
file_type: [ ahi_l2_height ]

cloud_top_pressure_error:
name: cloud_top_pressure_error
file_key: PcError
file_type: [ ahi_l2_height ]

processing_order:
name: processing_order
file_key: ProcOrder
file_type: [ ahi_l2_height ]

shadow_mask:
name: shadow_mask
file_key: Shadow_Mask
file_type: [ ahi_l2_height ]

cloud_top_temperature_error:
name: cloud_top_temperature_error
file_key: TcError
file_type: [ ahi_l2_height ]

cloud_top_height_error:
name: cloud_top_height_error
file_key: ZcError
file_type: [ ahi_l2_height ]

# Datasets in all three file types
latitude:
name: latitude
file_key: Latitude
file_type: [ ahi_l2_height, ahi_l2_type, ahi_l2_mask ]

longitude:
name: longitude
file_key: Longitude
file_type: [ ahi_l2_height, ahi_l2_type, ahi_l2_mask ]
136 changes: 136 additions & 0 deletions satpy/readers/ahi_l2_nc.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,136 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Copyright (c) 2023 Satpy developers
#
# This file is part of satpy.
#
# satpy is free software: you can redistribute it and/or modify it under the
# terms of the GNU General Public License as published by the Free Software
# Foundation, either version 3 of the License, or (at your option) any later
# version.
#
# satpy is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
# A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along with
# satpy. If not, see <http://www.gnu.org/licenses/>.
"""Reader for Himawari L2 cloud products from NOAA's big data programme.
simonrp84 marked this conversation as resolved.
Show resolved Hide resolved

For more information about the data, see: <https://registry.opendata.aws/noaa-himawari/>.

These products are generated by the NOAA enterprise cloud suite and have filenames like:
AHI-CMSK_v1r1_h09_s202308240540213_e202308240549407_c202308240557548.nc

The second letter grouping (CMSK above) indicates the product type:
CMSK - Cloud mask

CHGT - Cloud height

CPHS - Cloud type and phase

These products are generated from the AHI sensor on Himawari-8 and Himawari-9, and are
produced at the native instrument resolution for the IR channels (2km at nadir).

NOTE: This reader is currently only compatible with full disk scenes. Unlike level 1 himawari
data, the netCDF files do not contain the required metadata to produce an appropriate area
definition for the data contents, and hence the area definition is hardcoded into the reader.

A warning is displayed to the user highlighting this. The assumed area definition is a full
disk image at the nominal subsatellite longitude of 140.7 degrees East.

All the simple data products are supported here, but multidimensional products are not yet
supported. These include the CldHgtFlag and the CloudMaskPacked variables.
"""

import logging
from datetime import datetime

import xarray as xr

from satpy._compat import cached_property
from satpy.readers._geos_area import get_area_definition, get_area_extent
from satpy.readers.file_handlers import BaseFileHandler

logger = logging.getLogger(__name__)

EXPECTED_DATA_AREA = 'Full Disk'


class HIML2NCFileHandler(BaseFileHandler):
"""File handler for Himawari L2 NOAA enterprise data in netCDF format."""

def __init__(self, filename, filename_info, filetype_info):
"""Initialize the reader."""
super().__init__(filename, filename_info, filetype_info)
self.nc = xr.open_dataset(self.filename,
decode_cf=True,
mask_and_scale=False,
chunks={"xc": "auto", "yc": "auto"})

# Check that file is a full disk scene, we don't know the area for anything else
if self.nc.attrs['cdm_data_type'] != EXPECTED_DATA_AREA:
raise ValueError('File is not a full disk scene')

self.sensor = self.nc.attrs['instrument_name'].lower()
self.nlines = self.nc.dims['Columns']
self.ncols = self.nc.dims['Rows']
self.platform_name = self.nc.attrs['satellite_name']
self.platform_shortname = filename_info['platform']
self._meta = None

@property
def start_time(self):
"""Start timestamp of the dataset."""
dt = self.nc.attrs['time_coverage_start']
return datetime.strptime(dt, '%Y-%m-%dT%H:%M:%SZ')

@property
def end_time(self):
"""End timestamp of the dataset."""
dt = self.nc.attrs['time_coverage_end']
return datetime.strptime(dt, '%Y-%m-%dT%H:%M:%SZ')

def get_dataset(self, key, info):
"""Load a dataset."""
var = info['file_key']
logger.debug('Reading in get_dataset %s.', var)
variable = self.nc[var]

# Data has 'Latitude' and 'Longitude' coords, these must be replaced.
variable = variable.rename({'Rows': 'y', 'Columns': 'x'})

variable = variable.drop('Latitude')
variable = variable.drop('Longitude')

variable.attrs.update(key.to_dict())
return variable

@cached_property
def area(self):
"""Get AreaDefinition representing this file's data."""
return self._get_area_def()

def get_area_def(self, dsid):
"""Get the area definition."""
del dsid
return self.area

def _get_area_def(self):
logger.info('The AHI L2 cloud products do not have the metadata required to produce an area definition.'
' Assuming standard Himawari-8/9 full disk projection.')

# Basic check to ensure we're processing a full disk (2km) scene.n
if self.nlines != 5500 or self.ncols != 5500:
raise ValueError("Input L2 file is not a full disk Himawari scene. Only full disk data is supported.")

pdict = {'cfac': 20466275, 'lfac': 20466275, 'coff': 2750.5, 'loff': 2750.5, 'a': 6378137.0, 'h': 35785863.0,
'b': 6356752.3, 'ssp_lon': 140.7, 'nlines': self.nlines, 'ncols': self.ncols, 'scandir': 'N2S'}

aex = get_area_extent(pdict)

pdict['a_name'] = 'Himawari_Area'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the HSD reader we use self.basic_info["observation_area"] for the name of the area. Is there anything like that in these files? Or if not in these files, what does observation area equal in the HSD files?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not really. The self.nc.attrs['cdm_data_type'] attrs stores an area name but it's always Full Disk, which isn't very meaningful. I can set pdict['a_name'] to that if you prefer though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's not an in-file attribute for this, then can we guess a decent abbreviation/region name based on the filename?

pdict['a_desc'] = "AHI Full Disk area"
pdict['p_id'] = f'geos{self.platform_shortname}'

return get_area_definition(pdict, aex)
Loading
Loading