Feature 18 #26

Open
wants to merge 122 commits into
base: release_0.1.1
Commits
935ff15
Add CITATION.cff
Jan 31, 2024
b7019a4
Add sample catalog
Jan 31, 2024
5c8391f
Update catalog
Jan 31, 2024
acd2f1a
Add sample netcdf for catalog
Jan 31, 2024
59a39c5
Update api version
Jan 31, 2024
a0a8f5f
Update README
Jan 31, 2024
59a9b69
Test api to add filters for map
vale95-eng Feb 1, 2024
9ec303f
Add THI dataset in catalog
vale95-eng Feb 1, 2024
5e08db9
Fix api map endpoint with filters
vale95-eng Feb 1, 2024
dbc49f7
Add filters for items endpoint in api
vale95-eng Feb 2, 2024
944270f
Add filters for items endpoint in api
vale95-eng Feb 2, 2024
7d38011
Add filters for items endpoint in api
vale95-eng Feb 2, 2024
b7a7252
Merge pull request #25 from vale95-eng/feature_18
vale95-eng Feb 2, 2024
b59703e
Fix map and items for dataset rs-indices
vale95-eng Feb 8, 2024
ad95dd9
Merge branch 'CMCC-Foundation:feature_18' into feature_18
vale95-eng Feb 8, 2024
295a5cc
Merge pull request #27 from vale95-eng/feature_18
vale95-eng Feb 8, 2024
be914c5
Add RS_indices to catalog
vale95-eng Feb 15, 2024
129801c
Fix dpi for map
vale95-eng Feb 27, 2024
b47a662
Merge pull request #28 from vale95-eng/feature_18
vale95-eng Feb 27, 2024
93b6cd9
Fix dpi for map as parameter
vale95-eng Feb 27, 2024
5ce30c4
Merge pull request #29 from vale95-eng/feature_18
vale95-eng Feb 27, 2024
ce8c3dd
Fix dpi equal to 100 for map as parameter
vale95-eng Feb 27, 2024
d6f097a
Merge pull request #30 from vale95-eng/feature_18
vale95-eng Feb 27, 2024
2cfd3c1
Change geokube image version
vale95-eng Feb 28, 2024
6dd4138
Add cmap as parameter for map
vale95-eng Feb 28, 2024
8bc92d8
Merge pull request #32 from vale95-eng/feature_18
vale95-eng Feb 28, 2024
8d0a4be
add optional bbox for item and map
vale95-eng Mar 20, 2024
81fbd65
update dockerfile
gtramonte May 2, 2024
367f759
adding registry for Dockerfiles base
gtramonte May 2, 2024
bbd3cf7
adding tag echo on workflow
gtramonte May 3, 2024
ab57ede
fix python version
gtramonte May 3, 2024
d32a447
adding cache on workflow
gtramonte May 3, 2024
0fb4abb
fix to python version and catalog workdir
gtramonte May 3, 2024
010608d
fix python version
gtramonte May 3, 2024
1459dd0
adding projection to api get map
gtramonte May 3, 2024
bc54df8
moving projection inside geokube
gtramonte May 3, 2024
7a060c6
Fix query parameter in estimate for datastore
vale95-eng May 8, 2024
95d7ad5
update to docker files and workflow
gtramonte May 8, 2024
9049c51
update workflow for release
gtramonte May 8, 2024
fc5231a
Fix bug for pasture
vale95-eng May 8, 2024
5eef173
Merge pull request #35 from CMCC-Foundation/feature_34
gtramonte May 13, 2024
4de5d7c
fix time should be optional
gtramonte May 13, 2024
1084516
Merge branch 'main' into feature_18
gtramonte May 13, 2024
6de979b
adding vmin and vmax option to get_map and get_map_with_filters API
gtramonte Jun 3, 2024
0ba23eb
Add oai_cat
vale95-eng Jun 17, 2024
e09dffe
Fix utils import
vale95-eng Jun 17, 2024
936f3bf
Fix utils import
vale95-eng Jun 17, 2024
7d41050
Add rdflib in requirements
vale95-eng Jun 17, 2024
f75e01c
Fix metadata_provider import
vale95-eng Jun 17, 2024
aa10db5
Remoove scopes as argument
vale95-eng Jun 17, 2024
19a4527
Remove json dumps
vale95-eng Jun 17, 2024
5b346c5
change example url in oai_utils
vale95-eng Jun 17, 2024
3e924b5
adding csv format to persist datacube
gtramonte Jun 20, 2024
faf79ec
adding csv format into persist dataset
gtramonte Jun 20, 2024
0c5bbe9
fix slice for time and vertical axis
gtramonte Jun 27, 2024
7783589
fix query
gtramonte Jun 27, 2024
b12e27d
update oai dcat api for italian portal
gtramonte Jul 12, 2024
aaabd1c
fix import
gtramonte Jul 12, 2024
fc33278
Fixed Geokube Version
gtramonte Jul 15, 2024
af6884c
fix bug on dataset with no cache
gtramonte Jul 15, 2024
8e10e75
adding new driver for Active Fire Monitoring dataset
gtramonte Jul 15, 2024
1d54021
adding driver to setup.py
gtramonte Jul 15, 2024
946e0f2
changing from preprocess to post process data
gtramonte Jul 15, 2024
7145d8b
fix typo in DataCube
gtramonte Jul 15, 2024
f89ccbc
adding sort into driver
gtramonte Jul 17, 2024
eab48e8
dropping certainty variable from dataset
gtramonte Jul 17, 2024
a528242
removing postprocess call
gtramonte Jul 17, 2024
6e14a4a
doing only sort in post process
gtramonte Jul 17, 2024
15772cd
check if values explode memory
gtramonte Jul 17, 2024
5c4cea8
expanding only lat
gtramonte Jul 17, 2024
25d07d1
expanding longitude dim
gtramonte Jul 17, 2024
0402069
breaking the expand dim operation into two steps
gtramonte Jul 17, 2024
b16b97c
ignoring spotlight files
gtramonte Jul 19, 2024
d503dbb
applying chunking after expand dim
gtramonte Jul 19, 2024
40710f2
resetting indexes to remove duplicate
gtramonte Jul 19, 2024
80c42b9
adding a post process function to remove duplicate coordinate and app…
gtramonte Jul 19, 2024
5a2f15f
removing duplicated indexes by sel
gtramonte Jul 19, 2024
360a37e
moving reshape in post process
gtramonte Jul 19, 2024
7eac0de
changing chunks size
gtramonte Jul 19, 2024
210c3f9
adding post process chunk parameter
gtramonte Jul 19, 2024
56fd4bb
adding crs projection
gtramonte Jul 19, 2024
04cc64b
fix post process function was using wrong dataset dimensions
gtramonte Jul 19, 2024
3fa9f27
setting threads_per_worker param to 1
gtramonte Jul 22, 2024
2e21cf2
upgrade dask version
gtramonte Jul 22, 2024
5fa752e
upgrade dask version
gtramonte Jul 22, 2024
374ebb3
setting dask workers to 4
gtramonte Jul 22, 2024
6ae159b
setting dask workers to 1
gtramonte Jul 22, 2024
158fa47
removed dask cluster from executor
gtramonte Jul 22, 2024
e6aca53
checking with geokube version 0.2.6b2
gtramonte Jul 22, 2024
79463dd
test with new geokube image and dask cluster
gtramonte Jul 22, 2024
9405931
removed async keyword from process function
gtramonte Jul 22, 2024
00bf6a3
adding certainty back in the driver
gtramonte Jul 22, 2024
b5dea5c
adding pattern handling for filters
gtramonte Jul 25, 2024
c300925
resolving patter before using the open_datacube to get the list of fi…
gtramonte Jul 25, 2024
e4b1307
fix path should be a list of files not a Series
gtramonte Jul 25, 2024
997eb13
using geokube.Dataset apply function to apply postprocess on all Data…
gtramonte Jul 25, 2024
81b8650
removed unused functions
gtramonte Jul 25, 2024
0708a72
setting worker number to 4
gtramonte Jul 25, 2024
9f77ab2
applying resampling to datacubes
gtramonte Jul 26, 2024
9876f1d
removed chunk in postprocess
gtramonte Jul 26, 2024
7cda4fd
applying chunking again
gtramonte Jul 26, 2024
533fdff
setting number of thread per worker to 8
gtramonte Jul 26, 2024
991f9cf
adding options for Dask cluster
gtramonte Jul 29, 2024
15dbb9c
n_workers and thread per worker should be integer
gtramonte Jul 29, 2024
0e129e9
Adding healtchecks for API pods via cron
gtramonte Jul 30, 2024
626ecd5
missing new line at EOF
gtramonte Jul 30, 2024
a55c25a
launching service cron at API start
gtramonte Jul 30, 2024
a8b045e
starting service first
gtramonte Jul 30, 2024
e17dfef
modifying entrypoint
gtramonte Jul 30, 2024
38fe6df
missing curl in api image
gtramonte Jul 30, 2024
67e6c9b
adding healtchek to executors
gtramonte Jul 31, 2024
6565728
fix to dcat_ap_it api
gtramonte Aug 1, 2024
352c4c7
other fixes to dcat_ap_it api
gtramonte Aug 2, 2024
d7dd851
fix for dcat_ap_it italian validator
gtramonte Aug 2, 2024
051320b
fix for datatime.strptime
gtramonte Aug 2, 2024
a06f748
Merge pull request #36 from CMCC-Foundation/hotfix_v0.1a1
gtramonte Aug 5, 2024
7d33613
final adjustments to dcat_ap_it api
gtramonte Aug 6, 2024
c66a62c
Merge branch 'main' into feature_18
gtramonte Aug 6, 2024
2892c10
fix geokube version
gtramonte Aug 20, 2024
e0fe5ee
fix iot environmental accrual periodicity and double slash in paths
gtramonte Aug 29, 2024
33a704f
datatype should be dateTime
gtramonte Aug 30, 2024
39d75d2
update gitignore
gtramonte Sep 26, 2024
30 changes: 20 additions & 10 deletions .github/workflows/build_on_pull_request.yaml
@@ -20,58 +20,68 @@ jobs:
         build
         --user
     - name: Build a binary wheel and a source for drivers
-      run: python3 -m build ./drivers
+      run: python3 -m build ./drivers
     - name: Set Docker image tag name
       run: echo "TAG=$(date +'%Y.%m.%d.%H.%M')" >> $GITHUB_ENV
+    - name: TAG ECHO
+      run: echo ${{ env.TAG }}
     - name: Login to Scaleway Container Registry
       uses: docker/login-action@v2
       with:
         username: nologin
         password: ${{ secrets.DOCKER_PASSWORD }}
         registry: ${{ vars.DOCKER_REGISTRY }}
     - name: Set up Docker Buildx
-      uses: docker/setup-buildx-action@v2
+      uses: docker/setup-buildx-action@v2
     - name: Build and push drivers
-      uses: docker/build-push-action@v4
+      uses: docker/build-push-action@v5
       with:
         context: ./drivers
         file: ./drivers/Dockerfile
         push: true
+        cache-from: type=gha
+        cache-to: type=gha,mode=max
         build-args: |
           REGISTRY=${{ vars.GEOKUBE_REGISTRY }}
         tags: |
           ${{ vars.DOCKER_REGISTRY }}/geolake-drivers:${{ env.TAG }}
-          ${{ vars.DOCKER_REGISTRY }}/geolake-drivers:latest
+          ${{ vars.DOCKER_REGISTRY }}/geolake-drivers:latest
     - name: Build and push datastore component
-      uses: docker/build-push-action@v4
+      uses: docker/build-push-action@v5
       with:
         context: ./datastore
         file: ./datastore/Dockerfile
         push: true
         build-args: |
           REGISTRY=${{ vars.DOCKER_REGISTRY }}
+        cache-from: type=gha
+        cache-to: type=gha,mode=max
         tags: |
           ${{ vars.DOCKER_REGISTRY }}/geolake-datastore:${{ env.TAG }}
-          ${{ vars.DOCKER_REGISTRY }}/geolake-datastore:latest
+          ${{ vars.DOCKER_REGISTRY }}/geolake-datastore:latest
     - name: Build and push api component
-      uses: docker/build-push-action@v4
+      uses: docker/build-push-action@v5
       with:
         context: ./api
         file: ./api/Dockerfile
         push: true
         build-args: |
           REGISTRY=${{ vars.DOCKER_REGISTRY }}
+        cache-from: type=gha
+        cache-to: type=gha,mode=max
         tags: |
           ${{ vars.DOCKER_REGISTRY }}/geolake-api:${{ env.TAG }}
-          ${{ vars.DOCKER_REGISTRY }}/geolake-api:latest
+          ${{ vars.DOCKER_REGISTRY }}/geolake-api:latest
     - name: Build and push executor component
-      uses: docker/build-push-action@v4
+      uses: docker/build-push-action@v5
       with:
         context: ./executor
         file: ./executor/Dockerfile
         push: true
         build-args: |
           REGISTRY=${{ vars.DOCKER_REGISTRY }}
+        cache-from: type=gha
+        cache-to: type=gha,mode=max
         tags: |
           ${{ vars.DOCKER_REGISTRY }}/geolake-executor:${{ env.TAG }}
-          ${{ vars.DOCKER_REGISTRY }}/geolake-executor:latest
+          ${{ vars.DOCKER_REGISTRY }}/geolake-executor:latest
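The tag step in this workflow derives the image tag from the build time with `date +'%Y.%m.%d.%H.%M'`, and each build step pushes that tag together with `latest`. A minimal Python sketch of the same naming scheme, for illustration only: `build_tag`, `image_refs`, and the example registry value are hypothetical and are not part of the workflow.

```python
from datetime import datetime


def build_tag(now: datetime) -> str:
    """Mirror the shell format `date +'%Y.%m.%d.%H.%M'` used by the tag step."""
    return now.strftime("%Y.%m.%d.%H.%M")


def image_refs(registry: str, component: str, tag: str) -> list[str]:
    """Return the timestamped and `latest` references pushed for one component."""
    name = f"{registry}/geolake-{component}"
    return [f"{name}:{tag}", f"{name}:latest"]


if __name__ == "__main__":
    tag = build_tag(datetime(2024, 5, 3, 14, 7))
    # "registry.example.com/ns" stands in for vars.DOCKER_REGISTRY (assumed value).
    for ref in image_refs("registry.example.com/ns", "api", tag):
        print(ref)
```

Because the tag has minute resolution, two runs that start within the same minute would produce the same timestamped tag.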
17 changes: 13 additions & 4 deletions .github/workflows/build_on_release.yaml
@@ -32,45 +32,54 @@ jobs:
     - name: Set up Docker Buildx
       uses: docker/setup-buildx-action@v2
     - name: Build and push drivers
-      uses: docker/build-push-action@v4
+      uses: docker/build-push-action@v5
       with:
         context: ./drivers
         file: ./drivers/Dockerfile
         push: true
         build-args: |
           REGISTRY=${{ vars.GEOKUBE_REGISTRY }}
+          TAG=v0.2a6
+        cache-from: type=gha
+        cache-to: type=gha,mode=max
         tags: |
           ${{ vars.GEOLAKE_REGISTRY }}/geolake-drivers:${{ env.RELEASE_TAG }}
     - name: Build and push datastore component
-      uses: docker/build-push-action@v4
+      uses: docker/build-push-action@v5
       with:
         context: ./datastore
         file: ./datastore/Dockerfile
         push: true
         build-args: |
           REGISTRY=${{ vars.GEOLAKE_REGISTRY }}
           TAG=${{ env.RELEASE_TAG }}
+        cache-from: type=gha
+        cache-to: type=gha,mode=max
         tags: |
           ${{ vars.GEOLAKE_REGISTRY }}/geolake-datastore:${{ env.RELEASE_TAG }}
     - name: Build and push api component
-      uses: docker/build-push-action@v4
+      uses: docker/build-push-action@v5
       with:
         context: ./api
         file: ./api/Dockerfile
         push: true
         build-args: |
           REGISTRY=${{ vars.GEOLAKE_REGISTRY }}
           TAG=${{ env.RELEASE_TAG }}
+        cache-from: type=gha
+        cache-to: type=gha,mode=max
         tags: |
           ${{ vars.GEOLAKE_REGISTRY }}/geolake-api:${{ env.RELEASE_TAG }}
     - name: Build and push executor component
-      uses: docker/build-push-action@v4
+      uses: docker/build-push-action@v5
       with:
         context: ./executor
         file: ./executor/Dockerfile
         push: true
         build-args: |
           REGISTRY=${{ vars.GEOLAKE_REGISTRY }}
           TAG=${{ env.RELEASE_TAG }}
+        cache-from: type=gha
+        cache-to: type=gha,mode=max
         tags: |
           ${{ vars.GEOLAKE_REGISTRY }}/geolake-executor:${{ env.RELEASE_TAG }}
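The `REGISTRY` and `TAG` build arguments passed by these steps are consumed by `ARG` declarations in the component Dockerfiles; `api/Dockerfile` in this PR, for example, starts from `$REGISTRY/geolake-datastore:$TAG` with defaults `rg.fr-par.scw.cloud/geolake` and `latest`. A rough Python sketch of how the overrides and defaults combine; `resolve_base_image` is a hypothetical helper and the tag `v0.1.1` in the usage line is an assumed value.

```python
from string import Template
from typing import Dict, Optional

# Defaults copied from the ARG lines of api/Dockerfile in this PR.
DEFAULT_ARGS = {"REGISTRY": "rg.fr-par.scw.cloud/geolake", "TAG": "latest"}

# Image reference from the FROM line of api/Dockerfile.
FROM_IMAGE = Template("$REGISTRY/geolake-datastore:$TAG")


def resolve_base_image(build_args: Optional[Dict[str, str]] = None) -> str:
    """Apply build-arg overrides on top of the Dockerfile defaults."""
    args = {**DEFAULT_ARGS, **(build_args or {})}
    return FROM_IMAGE.substitute(args)


if __name__ == "__main__":
    print(resolve_base_image())                   # no build args: defaults apply
    print(resolve_base_image({"TAG": "v0.1.1"}))  # assumed release tag
```

Passing `TAG` in the release workflow makes each component build on the base image of the same release rather than on whatever `latest` currently points to.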
6 changes: 4 additions & 2 deletions .gitignore
@@ -112,5 +112,7 @@ venv.bak/
 _catalogs/
 _old/
 
-# Netcdf files
-*.nc
+.DS_Store
+db_init
+*.zarr
+*.nc
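The four patterns in the new `.gitignore` lines contain no slash, so Git applies them to names at any directory depth. A small Python sketch that approximates this with shell-style matching on each path component; it is not a full gitignore implementation (no negation, anchoring, or `**` handling), and `is_ignored` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns taken from the new .gitignore lines in this PR.
IGNORE_PATTERNS = [".DS_Store", "db_init", "*.zarr", "*.nc"]


def is_ignored(path: str) -> bool:
    """True if any component of the path matches one of the slash-free patterns."""
    parts = PurePosixPath(path).parts
    return any(
        fnmatchcase(part, pattern)
        for part in parts
        for pattern in IGNORE_PATTERNS
    )


if __name__ == "__main__":
    for candidate in ["data/sample.nc", "store.zarr/0.0", "api/app/main.py"]:
        print(candidate, is_ignored(candidate))
```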
51 changes: 51 additions & 0 deletions CITATION.cff
@@ -0,0 +1,51 @@
+# This CITATION.cff file was generated with cffinit.
+# Visit https://bit.ly/cffinit to generate yours today!
+
+cff-version: 1.2.0
+title: geolake
+message: >-
+  If you use this software, please cite it using the
+  metadata from this file.
+type: software
+authors:
+  - given-names: Marco
+    family-names: Mancini
+    orcid: 'https://orcid.org/0000-0002-9150-943X'
+  - given-names: Jakub
+    family-names: Walczak
+    orcid: 'https://orcid.org/0000-0002-5632-9484'
+  - given-names: Mirko
+    family-names: Stojiljković
+  - given-names: Valentina
+    family-names: Scardigno
+    orcid: 'https://orcid.org/0000-0002-0123-5368'
+identifiers:
+  - type: doi
+    value: 10.5281/zenodo.10598417
+repository-code: 'https://github.com/CMCC-Foundation/geolake'
+abstract: >+
+  geolake is an open source framework for management,
+  storage, and analytics of Earth Science data. geolake
+  implements the concept of a data lake as a central
+  location that holds a large amount of data in its native
+  and raw format. geolake does not impose any schema when
+  ingesting the data, however it provides a unified Data
+  Model and API for geoscientific datasets. The data is kept
+  in the original format and storage, and the in-memory data
+  structure is built on-the-fly for the processing analysis.
+
+  The system has been designed using a cloud-native
+  architecture, based on containerized microservices, that
+  facilitates the development, deployment and maintenance of
+  the system itself. It has been implemented by integrating
+  different open source frameworks, tools and libraries and
+  can be easily deployed using the Kubernetes platform and
+  related tools such as kubectl.
+
+keywords:
+  - python framework
+  - earth science
+  - data analytics
+license: Apache-2.0
+version: 0.1.0
+date-released: '2024-01-29'
2 changes: 2 additions & 0 deletions README.md
@@ -1,3 +1,5 @@
+[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.10598417.svg)](https://doi.org/10.5281/zenodo.10598417)
+
 # geolake
 
 ## Description
10 changes: 10 additions & 0 deletions api/Dockerfile
@@ -1,9 +1,19 @@
 ARG REGISTRY=rg.fr-par.scw.cloud/geolake
 ARG TAG=latest
 FROM $REGISTRY/geolake-datastore:$TAG
+
+RUN apt update && apt install -y cron curl
+
 WORKDIR /app
 COPY requirements.txt /code/requirements.txt
 RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt
 COPY app /app
 EXPOSE 80
+
+COPY ./healtcheck.* /opt/
+
+RUN chmod +x /opt/healtcheck.sh
+RUN crontab -u root /opt/healtcheck.cron
+
 CMD ["uvicorn", "app.main:app", "--proxy-headers", "--host", "0.0.0.0", "--port", "80"]
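The Dockerfile installs `cron` and `curl` and registers `/opt/healtcheck.cron`, but the contents of `healtcheck.sh` and `healtcheck.cron` are not shown in this diff. The sketch below only illustrates the kind of probe such a job might run, written in Python rather than shell with `curl`. The function name, the probed URL, and the success criterion (HTTP 200) are all assumptions, as is whatever action the real script takes on failure.

```python
import urllib.error
import urllib.request
from typing import Callable


def is_healthy(url: str, timeout: float = 5.0,
               opener: Callable = urllib.request.urlopen) -> bool:
    """Return True when the endpoint answers with HTTP 200 within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    # Port 80 matches EXPOSE 80 above; the path "/" is an assumed endpoint.
    print(is_healthy("http://localhost:80/"))
```

The `opener` parameter exists only so the probe can be exercised without a running server.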
