Configuration for crop cli #169

Draft
wants to merge 35 commits into
base: main
35 commits
c988efa
feat: add `crop_start_index` and `crop_stop_index` config
spool Sep 13, 2024
a031f13
fix: update `run-pipeline-iteratively.sh`
spool Sep 13, 2024
7b09a9a
refactor: `crop` methods in `resample.py` to classes in `crop.py`
spool Sep 16, 2024
2dbda92
fix: add this `crop-cli` branch for `CI`
spool Sep 16, 2024
c7d3100
Merge remote-tracking branch 'origin/main' into crop-cli
spool Sep 16, 2024
1efcefb
fix: remove from `ci.yaml` (pull request opened)
spool Sep 16, 2024
3d3edf3
refactor: `resample.py`, simplifying processing steps
spool Sep 19, 2024
e42231c
refactor: fix `config` `crop` path fails and `progress_bar`
spool Sep 25, 2024
a839735
fix: `multiprocess` `doctest` typo
spool Sep 25, 2024
765bed0
chore: remove commented lines in `resample.py` and `console.log` -> `…
spool Sep 25, 2024
e8b6daf
refactor: `crop_projection` method
spool Sep 25, 2024
611c25c
fix: `crop_projection` and `range_crop_projection`
spool Sep 25, 2024
35468da
fix: `pipeline.main()` and remove commented lines
spool Sep 25, 2024
0dd26f5
fix: update CLI, work in progress
spool Sep 26, 2024
cf78c8c
fix: start refactor to fix generator
spool Sep 27, 2024
8c15f8f
fix: `cli` iteration bug and harmonise `resample` and `crop`
spool Sep 29, 2024
fcccb68
refactor: rename `resampler.py` -> `convert.py` and `ResamplerBase` -…
spool Sep 30, 2024
2259602
fix: incorrect `docstring` in `crop.py`
spool Oct 4, 2024
78e0450
fix: `_quarto.yml` config for `resample.py` -> `convert.py` refactor
spool Oct 5, 2024
68887f3
fix: refactor `progress_wrapper` and fix docstrings
spool Oct 6, 2024
96e3f23
fix: update `rasterio -> 1.4.1` to support `conda`
spool Oct 6, 2024
2f8d408
refactor: rename `CPMResampler` -> `CPMConvert` etc.
spool Oct 6, 2024
67502a3
refactor: alpha simultaneous progress bars in `cli`
spool Oct 14, 2024
bb8b01b
refactor: shift `cropt` to use `progress_wrapper`
spool Oct 15, 2024
a0767be
feat: save and load `ClimRecalConfig` instances via `json` serialisation
spool Oct 18, 2024
5f4f28a
feat: add `start_date` and `end_date` for conversion and crop
spool Oct 22, 2024
3f8970d
fix: enable `GitHubActions` for `branch` `crop-cli`
spool Oct 22, 2024
b14a0ec
merge: branch 'plot-cpm-time-series' into crop-cli
spool Oct 23, 2024
acdc019
fix: `cli` parameter checking and default `crop` config
spool Oct 25, 2024
2fae03b
fix: crop `cli` and improve testing
spool Oct 28, 2024
0b87a5a
feat: add classes and tests for downloading `hads` and `cpm`
spool Oct 29, 2024
aacca6a
feat: add `ceda` command to `cli`
spool Oct 30, 2024
9c81e5d
merge: address `_quarto.yml`, `run-pipeline-interactive.sh` and `conf…
spool Oct 30, 2024
c419da6
doc: fix `quartodoc` function signature errors
spool Oct 30, 2024
853af93
fix: merge error rewriting `cpm` variables
spool Oct 30, 2024
4 changes: 2 additions & 2 deletions .github/workflows/ci.yaml
@@ -17,11 +17,11 @@ env:

 on:
   pull_request:
-    branches: ['main', 'cache-cpm-for-hads', ]
+    branches: ['main',]
     paths-ignore: ['docs/**']

   push:
-    branches: ['main', 'cache-cpm-for-hads', ]
+    branches: ['main', 'crop-cli']
     paths-ignore: ['docs/**']

 concurrency:
42 changes: 28 additions & 14 deletions _quarto.yml
@@ -73,22 +73,36 @@ website:
text: "CEDA Data Access"
- href: "docs/reference/clim_recal.data_loader.qmd"
text: "Data Loading"
- href: "docs/reference/clim_recal.resample.qmd"
- href: "docs/reference/clim_recal.convert.qmd"
text: "Data Resampling"
- section: "Debiasing"
contents:
- href: "docs/reference/clim_recal.debiasing.debias_wrapper.qmd"
text: "Wrapper"
- section: "Utilities"
contents:
- href: "docs/reference/clim_recal.utils.core.qmd"
text: "core"
- href: "docs/reference/clim_recal.utils.server.qmd"
text: "server"
- href: "docs/reference/clim_recal.utils.xarray.qmd"
text: "xarray"
- href: "docs/reference/clim_recal.utils.data.qmd"
text: "data"
- href: "docs/reference/clim_recal.pipeline.qmd"
text: "Pipeline"
- href: "docs/reference/clim_recal.config.qmd"
text: "Configure"
- href: "docs/reference/clim_recal.ceda_ftp_download.qmd"
text: "CEDA Data Access"
- href: "docs/reference/clim_recal.data_loader.qmd"
text: "Data Loading"
- href: "docs/reference/clim_recal.convert.qmd"
text: "Data Resampling"
- section: "Debiasing"
contents:
- href: "docs/reference/clim_recal.debiasing.debias_wrapper.qmd"
text: "Wrapper"
- section: "Utilities"
contents:
- href: "docs/reference/clim_recal.utils.core.qmd"
text: "core"
- href: "docs/reference/clim_recal.utils.server.qmd"
text: "server"
- href: "docs/reference/clim_recal.utils.xarray.qmd"
text: "xarray"
- href: "docs/reference/clim_recal.utils.data.qmd"
text: "data"
- text: "Docker"
href: "docs/docker-configurations.qmd"
- text: "Contributing"
href: "docs/contributing.md"

@@ -130,7 +144,7 @@ quartodoc:
 - clim_recal.ceda_ftp_download
 - clim_recal.data_loader
 - clim_recal.config
-- clim_recal.resample
+- clim_recal.convert
 - clim_recal.debiasing.debias_wrapper
 - clim_recal.utils.core
 - clim_recal.utils.server
6 changes: 4 additions & 2 deletions bash/run-pipeline-iteratively.sh
@@ -68,8 +68,10 @@ for year in $(seq $cpm_start_year $cpm_end_year); do

     {
         clim-recal \
-            --hads-input-path $hads_working_dir \
-            --cpm-input-path $cpm_working_dir \
+            --resample-start-index $i \
+            --total-from-index 1 \
+            --hads-input-path $hads_input_path \
+            --cpm-input-path $cpm_input_path \
             --output-path $output_path \
             --all-variables \
             --all-regions \
94 changes: 92 additions & 2 deletions python/clim_recal/ceda_ftp_download.py
@@ -3,8 +3,26 @@
import ftplib
import os
import random
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path
from typing import Final, Sequence

HADS_FTP_PATH: Final[str] = (
    "/badc/ukmo-hadobs/data/insitu/MOHC/HadOBS/HadUK-Grid/v1.2.0.ceda/1km/"
)
CPM_FTP_PATH: Final[str] = "/badc/ukcp18/data/land-cpm/uk/2.2km/rcp85/"

DEFAULT_SAVE_PATH: Final[Path] = Path("ceda")
CEDA_ENV_USER_NAME_KEY: Final[str] = "CLIM_RECAL_CEDA_USER_NAME"
CEDA_ENV_PASSWORD_KEY: Final[str] = "CLIM_RECAL_CEDA_PASSWORD"


def check_env_auth() -> bool:
    """Test if CEDA `user_name` and `password` are available."""
    user_name: str | None = os.getenv(CEDA_ENV_USER_NAME_KEY)
    password: str | None = os.getenv(CEDA_ENV_PASSWORD_KEY)
    return bool(user_name and password)
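
A minimal sketch of how the environment check above behaves, assuming only the two key names shown in the diff. The function is re-declared so the block runs on its own, and the credential values are placeholders invented for illustration.

```python
import os
from typing import Final

# Key names copied from the diff above.
CEDA_ENV_USER_NAME_KEY: Final[str] = "CLIM_RECAL_CEDA_USER_NAME"
CEDA_ENV_PASSWORD_KEY: Final[str] = "CLIM_RECAL_CEDA_PASSWORD"


def check_env_auth() -> bool:
    """Return True only if both CEDA credentials are set and non-empty."""
    user_name = os.getenv(CEDA_ENV_USER_NAME_KEY)
    password = os.getenv(CEDA_ENV_PASSWORD_KEY)
    return bool(user_name and password)


# Start from a clean environment so the first check sees no credentials.
os.environ.pop(CEDA_ENV_USER_NAME_KEY, None)
os.environ.pop(CEDA_ENV_PASSWORD_KEY, None)
missing = check_env_auth()

# Placeholder credentials, for illustration only.
os.environ[CEDA_ENV_USER_NAME_KEY] = "example-user"
os.environ[CEDA_ENV_PASSWORD_KEY] = "example-password"
present = check_env_auth()

print(missing, present)
```

An empty string counts as missing here, because `bool("")` is `False`.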


def download_ftp(
@@ -77,19 +95,91 @@ def download_ftp(

             if size_ftp == size_local:
                 download = False
-                print("File exist, will not dowload")
+                print("File exists, will not download")

         if download:
             f.retrbinary("RETR %s" % file, open(file, "wb").write)

         counter += 1
         print(counter, "file downloaded out of", len(filelist))

-    print("Finished: ", counter, " files dowloaded from ", input)
+    print("Finished: ", counter, " files downloaded from ", input)
     # Close FTP connection
     f.close()


@dataclass(kw_only=True)
class HADsCEDADownloadManager:
    """Manage downloading raw HADs data."""

    user_name: str | None
    password: str | None
    variables: Sequence[str] | None = None
    save_path: os.PathLike = DEFAULT_SAVE_PATH
    reverse: bool = False
    shuffle: bool = False
    change_hierarchy: bool = False
    ftp_path: str = HADS_FTP_PATH
    order: int = 0

    def __post_init__(self) -> None:
        # Fall back to environment variables when credentials are not passed.
        self.user_name = self.user_name or os.getenv(CEDA_ENV_USER_NAME_KEY)
        self.password = self.password or os.getenv(CEDA_ENV_PASSWORD_KEY)
        if self.reverse:
            self.order = 1
        # reverse precedes shuffle
        elif self.shuffle:
            self.order = 2
        if not self.user_name or not self.password:
            raise ValueError("Both 'user_name' and 'password' needed.")

    def download(self) -> None:
        if self.change_hierarchy:
            for v in self.variables:
                download_ftp(
                    # The original referenced an undefined `n` (a CPM run) here;
                    # assumed HADs layout is <variable>/day/latest with no run level.
                    os.path.join(self.ftp_path, v, "day", "latest"),
                    os.path.join(self.save_path, v, "latest"),
                    username=self.user_name,
                    password=self.password,
                    order=self.order,
                )
        else:
            download_ftp(
                self.ftp_path,
                str(self.save_path),
                username=self.user_name,
                password=self.password,
                order=self.order,
            )


@dataclass(kw_only=True, repr=False)
class CPMCEDADownloadManager(HADsCEDADownloadManager):
    """Manage downloading raw CPM data."""

    runs: Sequence[str] | None = None
    ftp_path: str = CPM_FTP_PATH

    def download(self) -> None:
        if self.change_hierarchy:
            for n in self.runs:
                for v in self.variables:
                    download_ftp(
                        os.path.join(self.ftp_path, n, v, "day", "latest"),
                        os.path.join(self.save_path, v, n, "latest"),
                        username=self.user_name,
                        password=self.password,
                        order=self.order,
                    )
        else:
            download_ftp(
                self.ftp_path,
                str(self.save_path),
                username=self.user_name,
                password=self.password,
                order=self.order,
            )
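
To make the `change_hierarchy` branch above concrete, here is a sketch of the remote-to-local path mapping it performs, without any FTP access. `cpm_path_pairs` is a hypothetical helper, not part of the PR, and the run and variable names are illustrative values only.

```python
import posixpath
from pathlib import Path
from typing import Iterator, Sequence

# Constant copied from the diff above.
CPM_FTP_PATH = "/badc/ukcp18/data/land-cpm/uk/2.2km/rcp85/"


def cpm_path_pairs(
    runs: Sequence[str],
    variables: Sequence[str],
    ftp_path: str = CPM_FTP_PATH,
    save_path: Path = Path("ceda"),
) -> Iterator[tuple[str, Path]]:
    """Yield (remote, local) pairs: remote is run/variable, local is variable/run."""
    for run in runs:
        for variable in variables:
            # posixpath keeps forward slashes for the remote side on any OS.
            remote = posixpath.join(ftp_path, run, variable, "day", "latest")
            local = Path(save_path) / variable / run / "latest"
            yield remote, local


# Illustrative run and variable names.
pairs = list(cpm_path_pairs(runs=["01", "04"], variables=["tasmax"]))
for remote, local in pairs:
    print(remote, "->", local)
```

The sketch uses `posixpath.join` for the remote side where the diff uses `os.path.join`. That is a deliberate difference: `os.path.join` would insert backslashes on Windows, which an FTP server would not accept as separators.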


if __name__ == "__main__":
"""
Script to download CEDA data from the command line.