diff --git a/notebooks/example-workflows/esgf2-arm-comparison.ipynb b/notebooks/example-workflows/esgf2-arm-comparison.ipynb deleted file mode 100644 index 520a028..0000000 --- a/notebooks/example-workflows/esgf2-arm-comparison.ipynb +++ /dev/null @@ -1,475 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "26550474-a788-4c90-a6fc-fb520acc8d41", - "metadata": {}, - "source": [ - "\"ESGF\n", - "\"ARM" - ] - }, - { - "cell_type": "markdown", - "id": "b26398c8-01f4-40e7-bf6b-db9e4c466eac", - "metadata": {}, - "source": [ - "# Compare Data from ESGF and ARM" - ] - }, - { - "cell_type": "markdown", - "id": "dd0adfff-43d4-4874-9d6d-511693ce8613", - "metadata": {}, - "source": [ - "## Overview\n", - "\n", - "This notebook details how to compare CMIP6 data hosted through the Earth System Grid Federation (ESGF) to observations collected and hosted through the Department of Energy's Atmospheric Radiation Measurement (ARM) user facility.\n", - "\n", - "The measurement of focus is 2 meter air temperature, collected at the Southern Great Plains (SGP) site in Northern Oklahoma. This climate observatory has collected state-of-the-art observations since 1993." - ] - }, - { - "cell_type": "markdown", - "id": "5ffd1461-464c-4e04-ad8f-e2aa1a710d59", - "metadata": {}, - "source": [ - "## Prerequisites\n", - "\n", - "| Concepts | Importance | Notes |\n", - "| --- | --- | --- |\n", - "| [Intro to Xarray](https://foundations.projectpythia.org/core/xarray/xarray-intro.html) | Necessary | |\n", - "| [Search and Load CMIP6 Data via ESGF/OPeNDAP](https://projectpythia.org/cmip6-cookbook/notebooks/foundations/esgf-opendap.html) | Necessary | Familiarity with data access patterns |\n", - "| [Understanding of NetCDF](https://foundations.projectpythia.org/core/data-formats/netcdf-cf.html) | Helpful | Familiarity with metadata structure |\n", - "| [Dask Arrays with Xarray](https://foundations.projectpythia.org/core/xarray/dask-arrays-xarray.html) | Helpful | Familiarity with lazy-loading |\n", - "\n", - "- **Time to learn**: 25 minutes" - ] - }, - { - "cell_type": "markdown", - "id": "a40b8cb6-ba5a-470e-90ef-3e21e0a0dc97", - "metadata": {}, - "source": [ - "## Imports" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ace2396c-6182-4488-8f4c-d4993fffbe00", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "import os\n", - "import warnings\n", - "\n", - "import act\n", - "from distributed import Client\n", - "import holoviews as hv\n", - "import hvplot.xarray\n", - "from matplotlib import pyplot as plt\n", - "import numpy as np\n", - "import pandas as pd\n", - "import cf_xarray\n", - "import metpy\n", - "from pyesgf.search import SearchConnection\n", - "import xarray as xr\n", - "\n", - "xr.set_options(display_style='html')\n", - "warnings.filterwarnings(\"ignore\")\n", - "hv.extension('bokeh')" - ] - }, - { - "cell_type": "markdown", - "id": "516b996f-7c8a-4264-a86c-410f06e63452", - "metadata": {}, - "source": [ - "## Spin up a Dask Cluster\n", - "We will use a Dask Local Cluster to compute in parellel and distribute our data, enabling us to work with these large datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7ad22062-6b6d-44d9-859a-f9ed7546dc9e", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "client = Client()\n", - "client" - ] - }, - { - "cell_type": "markdown", - "id": "1bdcd8de-b9b3-4b65-924d-3292fa4d1e63", - "metadata": {}, - "source": [ - "## Access Data\n", - "Our first step is to access data from the ESGF data servers, and the Atmospheric Radiation Measurement (ARM) user facility, which has a long term site in Northern Oklahoma." - ] - }, - { - "cell_type": "markdown", - "id": "cf527623-1d78-4f62-8d52-427e4ecbd67d", - "metadata": {}, - "source": [ - "### Access ESGF Data\n", - "A tutorial on how to access ESGF-hosted CMIP6 data is included in the Foundations section of this cookbook:\n", - "- [ESGF OpenDAP Tutorial](https://projectpythia.org/cmip6-cookbook/notebooks/foundations/esgf-opendap.html)\n", - "\n", - "We use the following block of code to search for a single earth system model simulation, the Energe Exascale Earth System Model (E3SM), which is the Department of Energy's flagship coupled Earth System Model." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a1721519-5b6f-465b-9089-bf9a8f3ce13c", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "conn = SearchConnection('https://esgf-node.llnl.gov/esg-search',\n", - " distrib=False)\n", - "ctx = conn.new_context(\n", - " facets='project,experiment_id',\n", - " project='CMIP6',\n", - " table_id='Amon',\n", - " institution_id = 'E3SM-Project',\n", - " experiment_id='historical',\n", - " source_id='E3SM-1-0',\n", - " variable='tas',\n", - " variant_label='r1i1p1f1',\n", - ")\n", - "result = ctx.search()[1]\n", - "files = result.file_context().search()\n", - "opendap_urls = [file.opendap_url for file in files]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f60b409e-cbea-43ac-907f-e31f03cfab95", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "esgf_ds = xr.open_mfdataset(opendap_urls,\n", - " combine='by_coords',\n", - " chunks={'time':480})\n", - "esgf_ds" - ] - }, - { - "cell_type": "markdown", - "id": "f5a83d3a-938d-44d8-9350-5a9cdf218c99", - "metadata": {}, - "source": [ - "### Clean up the dataset\n", - "We need to adjust the 0 to 360 degree longitude to be -180 to 180 - we can do this generically using the climate forecast (CF) conventions." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "92ccf9da-5f3c-40e4-875c-49c662b4a8e1", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "lon_coord = esgf_ds.cf['X'].name\n", - "esgf_ds[lon_coord] = (esgf_ds[lon_coord] + 180) % 360 - 180\n", - "esgf_ds = esgf_ds.sortby(lon_coord)" - ] - }, - { - "cell_type": "markdown", - "id": "44bfced7-1886-4fd5-bb2f-1810709c95e5", - "metadata": {}, - "source": [ - "## Access ARM Data\n", - "We use the ARM data API, which is included in the Atmospheric Data Community Toolkit (ACT) to access the data.\n", - "\n", - "### Setup the Search\n", - "\n", - "Before downloading our data, we need to make sure we have an ARM Data Account, and ARM Live token. Both of these can be found using this link:\n", - "- [ARM Live Signup](https://adc.arm.gov/armlive/livedata/home)\n", - "\n", - "Once you sign up, you will see your token. Copy and replace that where we have `arm_username` and `arm_password` below." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e3d24fdf-517f-4f32-966b-e9e8f9e52174", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "arm_username = os.getenv(\"ARM_USERNAME\")\n", - "arm_password = os.getenv(\"ARM_PASSWORD\")\n", - "\n", - "# Meteorological observations at the Southern Great Plains site\n", - "datastream = \"sgpmetE13.b1\"\n", - "\n", - "start_date = \"2013-01-01\"\n", - "end_date = \"2013-02-28\"\n", - "files = act.discovery.download_data(arm_username,\n", - " arm_password,\n", - " datastream,\n", - " start_date,\n", - " end_date\n", - " )" - ] - }, - { - "cell_type": "markdown", - "id": "7d66ed98-e96d-44bb-a558-4a9bf46f7db5", - "metadata": {}, - "source": [ - "### Load the Data Using Xarray" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "69ab2faa-3890-4e6d-b0fc-3f0486a6c79e", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "arm_ds = xr.open_mfdataset(files,\n", - " combine='nested',\n", - " concat_dim='time',\n", - " chunks={'time':86400})" - ] - }, - { - "cell_type": "markdown", - "id": "60dc491d-ff51-4dc0-bfd4-695b4884722d", - "metadata": {}, - "source": [ - "## Subset and Prepare Data to be Compared\n", - "We need to subset the climate model output for the nearest grid point, over the SGP site." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "079b0699-89cc-4bdb-a7a1-5f96859c13ac", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "lat = arm_ds.lat.values[0]\n", - "lon = arm_ds.lon.values[0]\n", - "lat, lon" - ] - }, - { - "cell_type": "markdown", - "id": "a6346398-7dbe-4507-8e18-5c703960ecb7", - "metadata": {}, - "source": [ - "Xarray offers this subsetting functionality, and we specify we want the **nearest** gird point to the site." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7c54eb44-29e5-4f33-ab01-b438bf0c77a9", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "cmip6_nearest = esgf_ds.cf.sel(lat=lat,\n", - " lon=lon,\n", - " method='nearest')\n", - "cmip6_nearest" - ] - }, - { - "cell_type": "markdown", - "id": "8d1ca239-a5cb-40e8-8704-9024f8600774", - "metadata": {}, - "source": [ - "We need to convert our time to datetime to make it easier to compare." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "fb0e3c86-7c39-4248-8c8e-43b7b50e9e00", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "cmip6_nearest['time'] = cmip6_nearest.indexes['time'].to_datetimeindex()" - ] - }, - { - "cell_type": "markdown", - "id": "1581930c-df52-4954-b52b-c01da13cf72b", - "metadata": {}, - "source": [ - "Next, we select the times we have data from the SGP site, specified earlier in the notebook." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b48285d8-7e65-4a4e-9df5-e0cdc1d4d4c9", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "cmip6_nearest = cmip6_nearest.sel(time=slice(start_date,\n", - " end_date)).resample(time='1M').mean()" - ] - }, - { - "cell_type": "markdown", - "id": "0eae53d6-ef15-4d5f-adda-ec3eb5012328", - "metadata": {}, - "source": [ - "### Calculate Monthly Mean Temperature at SGP\n", - "We can calculate the monthly average temperature at the SGP site using the `resample` method in `Xarray`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d16d08c4-e317-4590-a119-9f7171fd8793", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "arm_ds = arm_ds.sortby('time')\n", - "sgp_monthly_mean_temperature = arm_ds.temp_mean.resample(time='1M').mean().compute().rename('tas (ARM)')" - ] - }, - { - "cell_type": "markdown", - "id": "716b8556-46be-4f32-a440-9e38f78bdea5", - "metadata": {}, - "source": [ - "We need to apply some data cleaning here too - converting our units of temperature to degrees Celsius for the CMIP6 data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2d89cc9a-3160-4549-8221-e3f4e7f49dfe", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "cmip6_monthly_mean_temperature = cmip6_nearest.tas.compute().metpy.quantify()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e6e66da0-4d4c-4dcc-85ab-55d9036f9fa5", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "cmip6_monthly_mean_temperature = cmip6_monthly_mean_temperature.metpy.convert_units('degC').rename(\"tas (CMIP6)\")" - ] - }, - { - "cell_type": "markdown", - "id": "7dca2cfb-c76c-4899-9e4d-f67515299f3e", - "metadata": {}, - "source": [ - "## Visaulize the Output\n", - "Once we have our comparisons ready, we can visualize using `hvPlot`, which produces an interactive visualization!" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a4f46b86-e7ed-441f-9411-e7112702ed12", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "esgf_plot = cmip6_monthly_mean_temperature.hvplot.bar(title='Average Surface Temperature \\n near the Southern Great Plains Field Site',\n", - " xlabel='Time')\n", - "arm_plot = sgp_monthly_mean_temperature.hvplot.bar(ylabel='Average Temperature (degC)',\n", - " xlabel='Time')\n", - "\n", - "esgf_plot * arm_plot" - ] - }, - { - "cell_type": "markdown", - "id": "d4b169dd-67b5-45c0-9768-ea50c663f927", - "metadata": {}, - "source": [ - "## Summary\n", - "In this notebook, we searched for and opened a CMIP6 E3SM dataset using the ESGF API and OPeNDAP, and compared to an ARM dataset collected at the Southern Great Plains climate observatory.\n", - "\n", - "### What's next?\n", - "We will see some more advanced examples of using the CMIP6 and obsverational data." - ] - }, - { - "cell_type": "markdown", - "id": "71dbbf10-c12b-459f-ac02-fc9015905ff9", - "metadata": {}, - "source": [ - "## Resources and references\n", - "- [ARM Surface Meteorological Handbook](https://www.arm.gov/publications/tech_reports/handbooks/met_handbook.pdf)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "27b4be06-a088-452b-8633-febce92610c6", - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.4" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -}