Feature #5 #9

Open
wants to merge 33 commits into
base: release_0.1.1
8112d89
Update api with adjust_for_intake branch
Jan 11, 2024
3a608ef
Update datastore with adjust_for_intake branch
Jan 11, 2024
49bec47
Update executor with adjust_for_intake branch
Jan 11, 2024
0935326
Removed old geokube packages
Jan 11, 2024
e4365d0
Removed old geoquery
Jan 11, 2024
9484255
Merge intake drivers
Jan 11, 2024
3a25e54
Add workflows
Jan 12, 2024
291a1e6
Remove db folder
Jan 12, 2024
1bd8355
Prepare single workflow for docker images of all components
Jan 15, 2024
29d93e7
Build wheel for driver
Jan 15, 2024
97b1ef3
Update path for intake wheel in Docker use
Jan 15, 2024
bdb7719
Add action for production
Jan 15, 2024
5f42f3d
Update docker context
Jan 15, 2024
f1771da
Fix variable name in staging
vale95-eng Jan 16, 2024
bf18bfb
Add docs, part 1
Jan 22, 2024
a3a3e99
Change var name for registry
vale95-eng Jan 23, 2024
3a454a1
Merge branch 'feature_1' into feature_5
Jan 24, 2024
5360b1b
Update docs
Jan 24, 2024
8d2a761
Add authors
Jan 24, 2024
0355260
Update use dir urls
Jan 26, 2024
6e1dd32
Add built sites
Jan 26, 2024
ae60e0d
Update logo
Jan 29, 2024
b21fb78
Add README
Jan 29, 2024
3978b8b
Update sites
Jan 29, 2024
592479a
Update sites
Jan 29, 2024
f588146
Update logo file path
Jan 29, 2024
395af72
Add action to publish docs when needed
Jan 29, 2024
752e11d
Remove ghaction
Jan 29, 2024
fb2c3d4
Update readme
Jan 29, 2024
621ef11
Merge commit
Jan 31, 2024
8abb251
Solve conflicts. Update docs
Jan 31, 2024
e7b932a
Update client docs
Jan 31, 2024
28893c6
Add docs for endpoints
Jan 31, 2024
79 changes: 79 additions & 0 deletions .github/workflows/build-production.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
name: Build Docker images for geolake components and push to the repository

on:
  push:
    tags:
      - 'v*'
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.x"
      - name: Install build
        run: >-
          python3 -m
          pip install
          build
          --user
      - name: Build a binary wheel and a source for drivers
        run: python3 -m build ./drivers
      - name: Set Docker image tag name
        run: echo "TAG=$(date +'%Y.%m.%d.%H.%M')" >> $GITHUB_ENV
      - name: Login to Scaleway Container Registry
        uses: docker/login-action@v2
        with:
          username: nologin
          password: ${{ secrets.DOCKER_PASSWORD }}
          registry: ${{ vars.DOCKER_REGISTRY }}
      - name: Get release tag
        run: echo "RELEASE_TAG=${GITHUB_REF#refs/*/}" >> $GITHUB_ENV
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Build and push drivers
        uses: docker/build-push-action@v4
        with:
          context: ./drivers
          file: ./drivers/Dockerfile
          push: true
          build-args: |
            REGISTRY=${{ vars.DOCKER_REGISTRY }}
          tags: |
            ${{ vars.DOCKER_REGISTRY }}/geolake-drivers:${{ env.RELEASE_TAG }}
            ${{ vars.DOCKER_REGISTRY }}/geolake-drivers:latest
      - name: Build and push datastore component
        uses: docker/build-push-action@v4
        with:
          context: ./datastore
          file: ./datastore/Dockerfile
          push: true
          build-args: |
            REGISTRY=${{ vars.DOCKER_REGISTRY }}
          tags: |
            ${{ vars.DOCKER_REGISTRY }}/geolake-datastore:${{ env.RELEASE_TAG }}
            ${{ vars.DOCKER_REGISTRY }}/geolake-datastore:latest
      - name: Build and push api component
        uses: docker/build-push-action@v4
        with:
          context: ./api
          file: ./api/Dockerfile
          push: true
          build-args: |
            REGISTRY=${{ vars.DOCKER_REGISTRY }}
          tags: |
            ${{ vars.DOCKER_REGISTRY }}/geolake-api:${{ env.RELEASE_TAG }}
            ${{ vars.DOCKER_REGISTRY }}/geolake-api:latest
      - name: Build and push executor component
        uses: docker/build-push-action@v4
        with:
          context: ./executor
          file: ./executor/Dockerfile
          push: true
          build-args: |
            REGISTRY=${{ vars.DOCKER_REGISTRY }}
          tags: |
            ${{ vars.DOCKER_REGISTRY }}/geolake-executor:${{ env.RELEASE_TAG }}
            ${{ vars.DOCKER_REGISTRY }}/geolake-executor:latest
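Note on the `Get release tag` step: the shell expansion `${GITHUB_REF#refs/*/}` strips the shortest leading match of `refs/*/`, so a ref such as `refs/tags/v0.1.1` becomes `v0.1.1`. The snippet below is a minimal Python sketch of that transformation, for illustration only. The function name `release_tag_from_ref` is hypothetical and is not part of this PR.

```python
import re


def release_tag_from_ref(github_ref: str) -> str:
    """Mimic the shell expansion ${GITHUB_REF#refs/*/}.

    Removes the shortest leading 'refs/<segment>/' prefix and returns the
    rest. A ref that does not start with such a prefix is returned unchanged,
    as the shell expansion would do.
    """
    # [^/]* keeps the match as short as possible: it stops at the first slash
    # after 'refs/', which is what the single '#' operator does in the shell.
    return re.sub(r"^refs/[^/]*/", "", github_ref, count=1)


if __name__ == "__main__":
    print(release_tag_from_ref("refs/tags/v0.1.1"))
```

Since the workflow only triggers on tags matching `v*`, the resulting `RELEASE_TAG` should be the tag name itself. The `TAG` value set by the earlier `Set Docker image tag name` step does not appear in any `tags:` entry of this workflow.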
77 changes: 77 additions & 0 deletions .github/workflows/build-staging.yml
@@ -0,0 +1,77 @@
name: Build Docker images for geolake components and push to the repository

on:
  pull_request:
    types: [opened, synchronize]
  workflow_dispatch:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.x"
      - name: Install build
        run: >-
          python3 -m
          pip install
          build
          --user
      - name: Build a binary wheel and a source for drivers
        run: python3 -m build ./drivers
      - name: Set Docker image tag name
        run: echo "TAG=$(date +'%Y.%m.%d.%H.%M')" >> $GITHUB_ENV
      - name: Login to Scaleway Container Registry
        uses: docker/login-action@v2
        with:
          username: nologin
          password: ${{ secrets.DOCKER_PASSWORD }}
          registry: ${{ vars.DOCKER_REGISTRY }}
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Build and push drivers
        uses: docker/build-push-action@v4
        with:
          context: ./drivers
          file: ./drivers/Dockerfile
          push: true
          build-args: |
            REGISTRY=${{ vars.DOCKER_REGISTRY }}
          tags: |
            ${{ vars.DOCKER_REGISTRY }}/geolake-drivers:${{ env.TAG }}
            ${{ vars.DOCKER_REGISTRY }}/geolake-drivers:latest
      - name: Build and push datastore component
        uses: docker/build-push-action@v4
        with:
          context: ./datastore
          file: ./datastore/Dockerfile
          push: true
          build-args: |
            REGISTRY=${{ vars.DOCKER_REGISTRY }}
          tags: |
            ${{ vars.DOCKER_REGISTRY }}/geolake-datastore:${{ env.TAG }}
            ${{ vars.DOCKER_REGISTRY }}/geolake-datastore:latest
      - name: Build and push api component
        uses: docker/build-push-action@v4
        with:
          context: ./api
          file: ./api/Dockerfile
          push: true
          build-args: |
            REGISTRY=${{ vars.DOCKER_REGISTRY }}
          tags: |
            ${{ vars.DOCKER_REGISTRY }}/geolake-api:${{ env.TAG }}
            ${{ vars.DOCKER_REGISTRY }}/geolake-api:latest
      - name: Build and push executor component
        uses: docker/build-push-action@v4
        with:
          context: ./executor
          file: ./executor/Dockerfile
          push: true
          build-args: |
            REGISTRY=${{ vars.DOCKER_REGISTRY }}
          tags: |
            ${{ vars.DOCKER_REGISTRY }}/geolake-executor:${{ env.TAG }}
            ${{ vars.DOCKER_REGISTRY }}/geolake-executor:latest
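Note on the staging image tags: the `Set Docker image tag name` step builds `TAG` from `date +'%Y.%m.%d.%H.%M'`, i.e. a minute-resolution timestamp. A minimal Python sketch of the same format follows, assuming the timestamp is passed in explicitly so the result is reproducible. The function name `staging_tag` is hypothetical and not part of this PR.

```python
from datetime import datetime


def staging_tag(now: datetime) -> str:
    """Format a timestamp the way `date +'%Y.%m.%d.%H.%M'` does.

    The strftime directives have the same meaning as in the shell command:
    four-digit year, then zero-padded month, day, hour and minute.
    """
    return now.strftime("%Y.%m.%d.%H.%M")


if __name__ == "__main__":
    # In the workflow, the value comes from the runner's clock instead.
    print(staging_tag(datetime(2024, 1, 16, 9, 5)))
```

Because the tag has minute resolution, two runs that start within the same minute would produce the same tag, and every run also moves the `latest` tag.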
40 changes: 31 additions & 9 deletions README.md
@@ -1,6 +1,9 @@
# geolake
<div align="center">
<img src="docs/img/logo.svg" width="40%" height="40%">
</div>

## Description

## 📖 Description

**geolake** is an open source framework for management, storage, and analytics of Earth Science data. geolake implements the concept of a data lake as a central location that holds a large amount of data in its native and raw format.

@@ -10,13 +13,32 @@ The system has been designed using a cloud-native architecture, based on contain

It uses [geokube](https://github.com/CMCC-Foundation/geokube) as an Analytics Engine to perform geospatial operations.

## Authors

## 🖋️ Authors
**Project Lead**:
[Marco Mancini](https://github.com/km4rcus)

1. [Marco Mancini](https://github.com/km4rcus) <a href="https://orcid.org/0000-0002-5632-9484"><img alt="ORCID logo" src="https://info.orcid.org/wp-content/uploads/2019/11/orcid_16x16.png" width="16" height="16" /></a>
**Main Developers**
- [Jakub Walczak](https://github.com/jamesWalczak)
- [Mirko Stojiljkovic](https://github.com/MMStojiljkovic)
- [Valentina Scardigno](https://github.com/vale95-eng)

1. [Jakub Walczak](https://github.com/jamesWalczak) <a href="https://orcid.org/0000-0002-5632-9484"><img alt="ORCID logo" src="https://info.orcid.org/wp-content/uploads/2019/11/orcid_16x16.png" width="16" height="16" /></a>
1.[Mirko Stojiljkovic](https://github.com/MMStojiljkovic) <a href="https://orcid.org/0000-0003-2256-1645"><img alt="ORCID logo" src="https://info.orcid.org/wp-content/uploads/2019/11/orcid_16x16.png" width="16" height="16" /></a>
1. [Valentina Scardigno](https://github.com/vale95-eng) <a href="https://orcid.org/0000-0002-0123-5368"><img alt="ORCID logo" src="https://info.orcid.org/wp-content/uploads/2019/11/orcid_16x16.png" width="16" height="16" /></a>


## 📜 Cite Us
```bibtex
@SOFTWARE{geolake,
author = {Mancini, Marco and
Walczak, Jakub and
Stojiljković, Mirko and
Scardigno, Valentina},
title = {geolake},
month = jan,
year = 2024,
note = {{Available in GitHub: https://github.com/CMCC-Foundation/geolake}},
}
```

## 🙏 Acknowledgement
This work was funded by ...

## 🎬 References
See [References](https://opengeolake.github.io/) page for details
4 changes: 4 additions & 0 deletions api/Dockerfile
@@ -1,4 +1,8 @@
<<<<<<< HEAD
ARG REGISTRY=rg.nl-ams.scw.cloud/geogeolake-production
=======
ARG REGISTRY=rg.fr-par.scw.cloud/geolake
>>>>>>> release_0.1.1
ARG TAG=latest
FROM $REGISTRY/geolake-datastore:$TAG
WORKDIR /app
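Note on this hunk and the `api/app` hunks that follow: the added lines include `<<<<<<< HEAD`, `=======`, and `>>>>>>> release_0.1.1`, which look like unresolved merge-conflict markers committed into the files. A minimal Python sketch of a check that could flag such lines before merging is shown below. The helper `find_conflict_markers` and its regular expression are assumptions made for illustration, not something this PR contains.

```python
import re

# A marker is exactly seven '<', '=' or '>' characters at the start of a line,
# followed by a space (and a label) or by the end of the line.
CONFLICT_MARKER = re.compile(r"^(<{7}|={7}|>{7})( |$)")


def find_conflict_markers(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for lines that look like conflict markers."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if CONFLICT_MARKER.match(line):
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    sample = (
        "<<<<<<< HEAD\n"
        "ARG REGISTRY=first\n"
        "=======\n"
        "ARG REGISTRY=second\n"
        ">>>>>>> release_0.1.1\n"
        "ARG TAG=latest\n"
    )
    for number, line in find_conflict_markers(sample):
        print(number, line)
```

This is a heuristic: a line consisting of exactly seven `=` characters (for example a Markdown setext underline) would also be reported.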
16 changes: 16 additions & 0 deletions api/app/auth/backend.py
@@ -9,11 +9,19 @@
from dbmanager.dbmanager import DBManager

import exceptions as exc
<<<<<<< HEAD
from auth.models import GeoLakeUser
from auth import scopes


class GeoLakeAuthenticationBackend(AuthenticationBackend):
=======
from auth.models import DDSUser
from auth import scopes


class DDSAuthenticationBackend(AuthenticationBackend):
>>>>>>> release_0.1.1
"""Class managing authentication and authorization"""

async def authenticate(self, conn):
@@ -25,7 +33,11 @@ async def authenticate(self, conn):
def _manage_user_token_auth(self, user_token: str):
try:
user_id, api_key = self.get_authorization_scheme_param(user_token)
<<<<<<< HEAD
except exc.BaseGeoLakeException as err:
=======
except exc.BaseDDSException as err:
>>>>>>> release_0.1.1
raise err.wrap_around_http_exception()
user_dto = DBManager().get_user_details(user_id)
eligible_scopes = [scopes.AUTHENTICATED] + self._get_scopes_for_user(
@@ -35,7 +47,11 @@ def _manage_user_token_auth(self, user_token: str):
raise exc.AuthenticationFailed(
user_dto
).wrap_around_http_exception()
<<<<<<< HEAD
return AuthCredentials(eligible_scopes), GeoLakeUser(username=user_id)
=======
return AuthCredentials(eligible_scopes), DDSUser(username=user_id)
>>>>>>> release_0.1.1

def _get_scopes_for_user(self, user_dto) -> list[str]:
if user_dto is None:
7 changes: 7 additions & 0 deletions api/app/auth/manager.py
@@ -1,10 +1,17 @@
"""Module with access/authentication functions"""
from typing import Optional

<<<<<<< HEAD
from utils.api_logging import get_geolake_logger
import exceptions as exc

log = get_geolake_logger(__name__)
=======
from utils.api_logging import get_dds_logger
import exceptions as exc

log = get_dds_logger(__name__)
>>>>>>> release_0.1.1


def is_role_eligible_for_product(
12 changes: 12 additions & 0 deletions api/app/auth/models.py
@@ -2,7 +2,11 @@
from starlette.authentication import SimpleUser


<<<<<<< HEAD
class GeoLakeUser(SimpleUser):
=======
class DDSUser(SimpleUser):
>>>>>>> release_0.1.1
"""Immutable class containing information about the authenticated user"""

def __init__(self, username: str) -> None:
@@ -13,7 +17,11 @@ def id(self):
return self.username

def __eq__(self, other) -> bool:
<<<<<<< HEAD
if not isinstance(other, GeoLakeUser):
=======
if not isinstance(other, DDSUser):
>>>>>>> release_0.1.1
return False
if self.username == other.username:
return True
@@ -23,7 +31,11 @@ def __ne__(self, other):
return self != other

def __repr__(self):
<<<<<<< HEAD
return f"<GeoLakeUser(username={self.username}>"
=======
return f"<DDSUser(username={self.username}>"
>>>>>>> release_0.1.1

def __delattr__(self, name):
if getattr(self, name, None) is not None:
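Note on the comparison methods in `api/app/auth/models.py`: in the context lines, `__ne__` returns `self != other`, which in Python would normally dispatch back to `__ne__` and recurse, and the `__repr__` f-string appears to be missing a closing parenthesis. The following is a minimal, self-contained sketch of one way equality could be written. `SketchUser` is a hypothetical stand-in that does not inherit from `starlette.authentication.SimpleUser`, so it is not the class from this PR.

```python
class SketchUser:
    """Hypothetical stand-in for a user object identified by its username."""

    def __init__(self, username: str) -> None:
        self.username = username

    def __eq__(self, other) -> bool:
        # Returning NotImplemented lets Python fall back to its default
        # handling when the other object is of an unrelated type.
        if not isinstance(other, SketchUser):
            return NotImplemented
        return self.username == other.username

    def __ne__(self, other) -> bool:
        # Delegate to __eq__ instead of using '!=', which would call
        # __ne__ again.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self) -> int:
        # Defining __eq__ disables the default hash, so restore one that
        # is consistent with equality.
        return hash(self.username)

    def __repr__(self) -> str:
        return f"<SketchUser(username={self.username})>"


if __name__ == "__main__":
    print(SketchUser("alice") == SketchUser("alice"))
    print(SketchUser("alice") != SketchUser("bob"))
```

In Python 3, omitting `__ne__` entirely would also work, since the default implementation inverts the result of `__eq__`.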
8 changes: 8 additions & 0 deletions api/app/callbacks/on_startup.py
@@ -1,9 +1,17 @@
"""Module with functions called during API server startup"""
<<<<<<< HEAD
from utils.api_logging import get_geolake_logger

from datastore.datastore import Datastore

log = get_geolake_logger(__name__)
=======
from utils.api_logging import get_dds_logger

from datastore.datastore import Datastore

log = get_dds_logger(__name__)
>>>>>>> release_0.1.1


def _load_cache() -> None:
15 changes: 8 additions & 7 deletions api/app/endpoint_handlers/dataset.py
@@ -1,18 +1,19 @@
"""Modules realizing logic for dataset-related endpoints"""
import os
import pika
import json
from typing import Optional

from fastapi.responses import FileResponse

from dbmanager.dbmanager import DBManager, RequestStatus
from geoquery.geoquery import GeoQuery
from geoquery.task import TaskList
from intake_geokube.queries.geoquery import GeoQuery
from intake_geokube.queries.workflow import Workflow
from datastore.datastore import Datastore, DEFAULT_MAX_REQUEST_SIZE_GB
from datastore import exception as datastore_exception

from utils.metrics import log_execution_time
from utils.api_logging import get_dds_logger
from utils.api_logging import get_geolake_logger
from auth.manager import (
is_role_eligible_for_product,
)
@@ -22,7 +23,7 @@

from . import request

log = get_dds_logger(__name__)
log = get_geolake_logger(__name__)
data_store = Datastore()

MESSAGE_SEPARATOR = os.environ["MESSAGE_SEPARATOR"]
@@ -285,7 +286,7 @@ def async_query(
user_id=user_id,
dataset=dataset_id,
product=product_id,
query=query.original_query_json(),
query=json.dumps(query.model_dump_original()),
)

# TODO: find a separator; for the moment use "\"
@@ -370,7 +371,7 @@ def sync_query(
@log_execution_time(log)
def run_workflow(
user_id: str,
workflow: TaskList,
workflow: Workflow,
):
"""Realize the logic for the endpoint:

@@ -382,7 +383,7 @@ def run_workflow(
----------
user_id : str
ID of the user executing the query
workflow : TaskList
workflow : Workflow
Workflow to perform

Returns