
Commit

Merge pull request #5 from ARM-DOE/master
pull request
ajsockol authored May 11, 2021
2 parents 776c89b + a7b4c67 commit 976002b
Showing 96 changed files with 3,039 additions and 614 deletions.
2 changes: 2 additions & 0 deletions .coveragerc
@@ -0,0 +1,2 @@
[run]
omit =./act/tests/*, ./act/*version*py
11 changes: 6 additions & 5 deletions .travis.yml
@@ -8,22 +8,23 @@ env:

matrix:
include:
- python: 3.6
- python: 3.7
env:
- PYTHON_VERSION="3.6"
- PYTHON_VERSION="3.7"
- DOC_BUILD="true"
- python: 3.7
- python: 3.8
sudo: yes
dist: xenial
env:
- PYTHON_VERSION="3.7"
- PYTHON_VERSION="3.8"
- DOC_BUILD="true"
install:
- source continuous_integration/install.sh
- pip install pytest-cov
- pip install coveralls
- pip install metpy
script:
- eval xvfb-run pytest --cov=act/
- eval xvfb-run pytest --mpl --cov=act/ --cov-config=.coveragerc
- flake8 --max-line-length=115 --ignore=F401,E402,W504,W605
after_success:
- coveralls
2 changes: 1 addition & 1 deletion CREATING_ENVIRONMENTS.rst
@@ -74,7 +74,7 @@ do this step while the environment is activated::
Another way to create a conda environment is by doing it from scratch using
the conda create command. An example of this::

conda create -n act_env -c conda-forge python=3.7 numpy pandas astral
conda create -n act_env -c conda-forge python=3.7 numpy pandas
scipy matplotlib dask xarray

After activating the environment with::
5 changes: 4 additions & 1 deletion MANIFEST.in
@@ -10,10 +10,13 @@ recursive-exclude * *.py[co]
recursive-include act/plotting *.txt

recursive-include docs *.rst conf.py Makefile make.bat
recursive-include act/tests/data *.cdf *.nc *.data
recursive-include act/tests/data *


include versioneer.py
include act/_version.py

include act/utils/conf/de421.bsp

# If including data files in the package, add them like:
# include path/to/data_file
27 changes: 18 additions & 9 deletions README.rst
@@ -4,7 +4,7 @@ Atmospheric data Community Toolkit (ACT)

|AnacondaCloud| |Travis| |Coveralls|

|CondaDownloads| |Zenodo|
|CondaDownloads| |Zenodo| |ARM|

.. |AnacondaCloud| image:: https://anaconda.org/conda-forge/act-atmos/badges/version.svg
:target: https://anaconda.org/conda-forge/act-atmos
@@ -21,11 +21,15 @@ Atmospheric data Community Toolkit (ACT)
.. |Coveralls| image:: https://coveralls.io/repos/github/ARM-DOE/ACT/badge.svg
:target: https://coveralls.io/github/ARM-DOE/ACT

.. |ARM| image:: https://img.shields.io/badge/Sponsor-ARM-blue.svg?colorA=00c1de&colorB=00539c
:target: https://www.arm.gov/


Python toolkit for working with atmospheric time-series datasets of varying dimensions. The toolkit is meant to have functions for every part of the scientific process; discovery, IO, quality control, corrections, retrievals, visualization, and analysis. This toolkit is meant to be a community platform for sharing code with the goal of reducing duplication of effort and better connecting the science community with programs such as the `Atmospheric Radiation Measurement (ARM) User Facility <http://www.arm.gov>`_. Overarching development goals will be updated on a regular basis as part of the `Roadmap <https://github.com/AdamTheisen/ACT/blob/master/guides/ACT_Roadmap.pdf>`_.
The Atmospheric data Community Toolkit (ACT) is an open source Python toolkit for working with atmospheric time-series datasets of varying dimensions. The toolkit is meant to have functions for every part of the scientific process: discovery, IO, quality control, corrections, retrievals, visualization, and analysis. It is meant to be a community platform for sharing code with the goal of reducing duplication of effort and better connecting the science community with programs such as the `Atmospheric Radiation Measurement (ARM) User Facility <http://www.arm.gov>`_. Overarching development goals will be updated on a regular basis as part of the `Roadmap <https://github.com/AdamTheisen/ACT/blob/master/guides/ACT_Roadmap.pdf>`_.

* Free software: 3-clause BSD license
|act|

.. |act| image:: ./docs/source/act_plots.png

Important Links
~~~~~~~~~~~~~~~
@@ -34,19 +38,22 @@ Important Links
* Examples: https://arm-doe.github.io/ACT/source/auto_examples/index.html
* Issue Tracker: https://github.com/ARM-DOE/ACT/issues

Citing
~~~~~~

If you use ACT to prepare a publication, please cite the DOI listed in the badge above, which is updated with every version release to ensure that contributors get appropriate credit. The DOI is provided through Zenodo.

Dependencies
~~~~~~~~~~~~

* `xarray <https://xarray.pydata.org/en/stable/>`_
* `NumPy <https://www.numpy.org/>`_
* `SciPy <https://www.scipy.org/>`_
* `matplotlib <https://matplotlib.org/>`_
* `xarray <https://xarray.pydata.org/en/stable/>`_
* `astral <https://astral.readthedocs.io/en/latest/>`_
* `skyfield <https://rhodesmill.org/skyfield/>`_
* `pandas <https://pandas.pydata.org/>`_
* `dask <https://dask.org/>`_
* `Pint <https://pint.readthedocs.io/en/0.9/>`_
* `Cartopy <https://scitools.org.uk/cartopy/docs/latest/>`_
* `Boto3 <https://aws.amazon.com/sdk-for-python/>`_
* `PyProj <https://pyproj4.github.io/pyproj/stable/>`_
* `Proj <https://proj.org/>`_
* `Six <https://pypi.org/project/six/>`_
@@ -55,7 +62,9 @@ Dependencies
Optional Dependencies
~~~~~~~~~~~~~~~~~~~~~

* `MPL2NC <https://github.com/peterkuma/mpl2nc>`_ For reading binary MPL data.
* `MPL2NC <https://github.com/peterkuma/mpl2nc>`_ Reading binary MPL data.
* `Cartopy <https://scitools.org.uk/cartopy/docs/latest/>`_ Mapping and geoplots.
* `MetPy <https://unidata.github.io/MetPy/latest/index.html>`_ >= V1.0 Skew-T plotting and some stability index calculations.

Installation
~~~~~~~~~~~~
@@ -138,7 +147,7 @@ Testing
After installation, you can launch the test suite from outside the
source directory (you will need to have pytest installed)::

$ pytest --pyargs act
$ pytest --mpl --pyargs act

In-place installs can be tested using the `pytest` command from within
the source directory.
13 changes: 13 additions & 0 deletions act/discovery/get_armfiles.py
@@ -36,6 +36,11 @@ def download_data(username, token, datastream,
current working directory with the same name as *datastream* to place
the files in.
Returns
-------
files : list
Returns list of files retrieved
Notes
-----
This programmatic interface allows users to query and automate
@@ -107,7 +112,12 @@ def download_data(username, token, datastream,
output_dir = os.path.join(os.getcwd(), datastream)

# not testing, response is successful and files were returned
if response_body_json is None:
print("ARM Data Live Webservice does not appear to be functioning")
return []

num_files = len(response_body_json["files"])
file_names = []
if response_body_json["status"] == "success" and num_files > 0:
for fname in response_body_json['files']:
if time is not None:
@@ -125,6 +135,9 @@ def download_data(username, token, datastream,
# create file and write bytes to file
with open(output_file, 'wb') as open_bytes_file:
open_bytes_file.write(urlopen(save_data_url).read())
file_names.append(output_file)
else:
print("No files returned or url status error.\n"
"Check datastream name, start, and end date.")

return file_names
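
With this change download_data returns the list of files it saved (or an empty list when the web service reports an error or returns nothing), so the result can be passed straight to a reader. A minimal sketch of that workflow, with placeholder credentials, an illustrative sgpmetE13.b1 datastream, and the start/end date arguments the function already accepts::

    import act

    # Placeholder ARM Live credentials; replace with real values.
    username = 'userName'
    token = 'XXXXXXXXXXXXXXXX'

    # download_data now returns the list of files written to disk,
    # or [] if the web service reports an error or finds no files.
    files = act.discovery.get_armfiles.download_data(
        username, token, 'sgpmetE13.b1', '2020-01-01', '2020-01-07')

    if files:
        ds = act.io.armfiles.read_netcdf(files)
        print(ds)
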
1 change: 1 addition & 0 deletions act/io/__init__.py
@@ -20,3 +20,4 @@
from . import armfiles
from . import csvfiles
from . import mpl
from . import noaagml
139 changes: 128 additions & 11 deletions act/io/armfiles.py
@@ -14,8 +14,9 @@
import numpy as np
import urllib
import json
from enum import Flag, auto
import copy
import act.utils as utils
import warnings


def read_netcdf(filenames, concat_dim='time', return_None=False,
@@ -32,7 +33,7 @@ def read_netcdf(filenames, concat_dim='time', return_None=False,
Name of file(s) to read.
concat_dim : str
Dimension to concatenate files along. Default value is 'time.'
return_none : bool, optional
return_None : bool, optional
Catch IOError exception when file not found and return None.
Default is False.
combine : str
@@ -134,6 +135,7 @@ def read_netcdf(filenames, concat_dim='time', return_None=False,
arm_ds[var_name].astype(desired_time_precision),
arm_ds[var_name].attrs)})
arm_ds[var_name] = temp_ds[var_name]
temp_ds.close()

# If time_offset is in file try to convert base_time as well
if var_name == 'time_offset':
@@ -160,13 +162,17 @@ def read_netcdf(filenames, concat_dim='time', return_None=False,
not np.issubdtype(arm_ds['time'].values.dtype, np.datetime64) and
not type(arm_ds['time'].values[0]).__module__.startswith('cftime.')):
# Use microsecond precision to create time since epoch. Then convert to datetime64
time = (arm_ds['base_time'].values * 1000000 +
arm_ds['time'].values * 1000000.).astype('datetime64[us]')
if arm_ds['base_time'].values == arm_ds['time_offset'].values[0]:
time = arm_ds['time_offset'].values
else:
time = (arm_ds['base_time'].values +
arm_ds['time_offset'].values * 1000000.).astype('datetime64[us]')
# Need to use a new Dataset creation to correctly index time for use with
# .group and .resample methods in Xarray Datasets.
temp_ds = xr.Dataset({'time': (arm_ds['time'].dims, time, arm_ds['time'].attrs)})

arm_ds['time'] = temp_ds['time']
temp_ds.close()
for att_name in ['units', 'ancillary_variables']:
try:
del arm_ds['time'].attrs[att_name]
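
For context on the conversion above: by ARM convention base_time is seconds since 1970-01-01 00:00:00 UTC and time_offset is seconds from base_time, and the reader builds a microsecond-precision datetime64 time axis from the pair. A standalone sketch of that arithmetic with synthetic values (not taken from a real file, and assuming both variables are stored in seconds)::

    import numpy as np

    # Synthetic ARM-style values: base_time is seconds since the epoch,
    # time_offset is seconds relative to base_time.
    base_time = np.int64(1589155200)          # 2020-05-11 00:00:00 UTC
    time_offset = np.array([0., 60., 120.])   # one sample per minute

    # Scale both to microseconds before casting so sub-second offsets survive.
    time = (base_time * 1000000 +
            time_offset * 1000000.).astype('datetime64[us]')
    print(time)  # three datetime64 values on 2020-05-11, one minute apart
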
@@ -180,8 +186,17 @@ def read_netcdf(filenames, concat_dim='time', return_None=False,
# Get file dates and times that were read in to the object
filenames.sort()
for f in filenames:
file_dates.append(f.split('.')[-3])
file_times.append(f.split('.')[-2])
# If not in ARM file-naming format, use the first time value for the date/time info
if len(f.split('/')[-1].split('.')) == 5:
file_dates.append(f.split('.')[-3])
file_times.append(f.split('.')[-2])
else:
if arm_ds['time'].size > 1:
dummy = arm_ds['time'].values[0]
else:
dummy = arm_ds['time'].values
file_dates.append(utils.numpy_to_arm_date(dummy))
file_times.append(utils.numpy_to_arm_date(dummy, returnTime=True))

# Add attributes
arm_ds.attrs['_file_dates'] = file_dates
@@ -266,7 +281,7 @@ def create_obj_from_arm_dod(proc, set_dims, version='', fill_value=-9999.,
"""
# Set base url to get DOD information
base_url = 'https://pcm.arm.gov/pcmserver/dods/'
base_url = 'https://pcm.arm.gov/pcm/api/dods/'

# Get data from DOD api
with urllib.request.urlopen(base_url + proc) as url:
@@ -275,7 +290,9 @@ def create_obj_from_arm_dod(proc, set_dims, version='', fill_value=-9999.,
# Check version numbers and alert if requested version in not available
keys = list(data['versions'].keys())
if version not in keys:
print(' '.join(['Version:', version, 'not available or not specified. Using Version:', keys[-1]]))
warnings.warn(' '.join(['Version:', version,
'not available or not specified. Using Version:', keys[-1]]),
UserWarning)
version = keys[-1]

# Create empty xarray dataset
@@ -351,9 +368,9 @@ def __init__(self, xarray_obj):
self._obj = xarray_obj

def write_netcdf(self, cleanup_global_atts=True, cleanup_qc_atts=True,
join_char='__', make_copy=True,
join_char='__', make_copy=True, cf_compliant=False,
delete_global_attrs=['qc_standards_version', 'qc_method', 'qc_comment'],
FillValue=-9999, **kwargs):
FillValue=-9999, cf_convention='CF-1.8', **kwargs):
"""
This is a wrapper around Dataset.to_netcdf to clean up the Dataset before
writing to disk. Some things are added to global attributes during ACT reading
Expand All @@ -372,11 +389,17 @@ def write_netcdf(self, cleanup_global_atts=True, cleanup_qc_atts=True,
Will use a single space as a delimiter between values and join_char to replace
white space between words.
join_char : str
The character sting to use for replacing white spaces between words.
The character string to use for replacing white spaces between words when converting
a list of strings to single character string attributes.
make_copy : boolean
Make a copy before modifying Dataset to write. For large Datasets this
may add processing time and memory. If modifying the Dataset is OK
try setting to False.
cf_compliant : boolean
Option to output the file with additional attributes to make it Climate & Forecast
compliant. May require running the .clean.cleanup() method on the object to fix other
issues first. This does the best it can, but the result may not be truly compliant. You
should read the CF documents and try to make the Dataset compliant before writing to file.
delete_global_attrs : list
Optional global attributes to be deleted. Defaults to some standard
QC attributes that are not needed. Can add more or set to None to not
Expand All @@ -387,6 +410,8 @@ def write_netcdf(self, cleanup_global_atts=True, cleanup_qc_atts=True,
so not a perfect fix. Set to None to leave Xarray to do what it wants.
Set to a value to be the value used as _FillValue in the file and data
array. This should then remove missing_value attribute from the file as well.
cf_convention : str
The Climate and Forecast convention string to add to Conventions attribute.
**kwargs : keywords
Keywords to pass through to Dataset.to_netcdf()
@@ -447,4 +472,96 @@ def write_netcdf(self, cleanup_global_atts=True, cleanup_qc_atts=True,
except KeyError:
pass

# If requested update global attributes and variables attributes for required
# CF attributes.
if cf_compliant:
# Get variable names and standard name for each variable
var_names = list(write_obj.keys())
standard_names = []
for var_name in var_names:
try:
standard_names.append(write_obj[var_name].attrs['standard_name'])
except KeyError:
standard_names.append(None)

# Check if time variable has axis and standard_name attributes
coord_name = 'time'
try:
write_obj[coord_name].attrs['axis']
except KeyError:
try:
write_obj[coord_name].attrs['axis'] = 'T'
except KeyError:
pass

try:
write_obj[coord_name].attrs['standard_name']
except KeyError:
try:
write_obj[coord_name].attrs['standard_name'] = 'time'
except KeyError:
pass

# Try to determine the type of dataset by the coordinate dimension named time
# and other factors
try:
write_obj.attrs['FeatureType']
except KeyError:
dim_names = list(write_obj.dims)
FeatureType = None
if dim_names == ['time']:
FeatureType = "timeSeries"
elif len(dim_names) == 2 and 'time' in dim_names and 'bound' in dim_names:
FeatureType = "timeSeries"
elif len(dim_names) >= 2 and 'time' in dim_names:
for var_name in var_names:
dims = list(write_obj[var_name].dims)
if len(dims) == 2 and 'time' in dims:
prof_dim = list(set(dims) - set(['time']))[0]
if write_obj[prof_dim].values.size > 2:
FeatureType = "timeSeriesProfile"
break

if FeatureType is not None:
write_obj.attrs['FeatureType'] = FeatureType

# Add axis and positive attributes to variables with standard_name
# equal to 'altitude'
alt_variables = [var_names[ii] for ii, sn in enumerate(standard_names) if sn == 'altitude']
for var_name in alt_variables:
try:
write_obj[var_name].attrs['axis']
except KeyError:
write_obj[var_name].attrs['axis'] = 'Z'

try:
write_obj[var_name].attrs['positive']
except KeyError:
write_obj[var_name].attrs['positive'] = 'up'

# Check if the Conventions global attribute lists the CF convention
try:
Conventions = write_obj.attrs['Conventions']
Conventions = Conventions.split()
cf_listed = False
for ii in Conventions:
if ii.startswith('CF-'):
cf_listed = True
break
if not cf_listed:
Conventions.append(cf_convention)
write_obj.attrs['Conventions'] = ' '.join(Conventions)

except KeyError:
write_obj.attrs['Conventions'] = str(cf_convention)

# Reorder global attributes to ensure history is last
try:
global_attrs = write_obj.attrs
history = copy.copy(global_attrs['history'])
del global_attrs['history']
global_attrs['history'] = history
except KeyError:
pass

write_obj.to_netcdf(encoding=encoding, **kwargs)
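
A minimal sketch of exercising the new cf_compliant option, assuming the sample-file constant act.tests.EXAMPLE_MET1 and that the method is reached through the Dataset accessor ACT registers for this class (shown here as .write); the .clean.cleanup() call is the QC fix-up the docstring above recommends running first::

    import act

    # Read a sample ARM file (illustrative test-data constant).
    ds = act.io.armfiles.read_netcdf(act.tests.EXAMPLE_MET1)

    # Clean up ARM QC conventions, then write with the extra CF attributes
    # (axis, standard_name, FeatureType, Conventions) added on the way out.
    ds.clean.cleanup()
    ds.write.write_netcdf(path='sgpmetE13.b1.cf.nc', cf_compliant=True)
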