
MEP v8.0 #576

Merged — 101 commits, merged Sep 12, 2024

Commits
e7c6729
feat(mvt): add artificial area vector tiles endpoint
alexisig Jul 30, 2024
fad5581
fix(tests): remove test depending on s3
alexisig Jul 30, 2024
7121bd5
temp
alexisig Aug 6, 2024
eed7c1f
fix(airflow): remove example dag
alexisig Aug 6, 2024
9f7e14b
feat(airflow): set get_table_name type hinting
alexisig Aug 6, 2024
65d03d6
fix(is_artificial): change style of comments
alexisig Aug 6, 2024
87e390b
fix(land): remove schema
alexisig Aug 6, 2024
20ca757
temp
alexisig Aug 7, 2024
2242991
feat(airflow): add gpu
alexisig Aug 9, 2024
f011ca4
temp
alexisig Aug 12, 2024
456bacf
feat(ingestion): add app deps
alexisig Aug 12, 2024
5495237
misc
alexisig Aug 14, 2024
b58c0a6
temp
alexisig Aug 15, 2024
72e1181
temp
alexisig Aug 15, 2024
667d3d2
temp
alexisig Aug 15, 2024
894f05e
temp
alexisig Aug 17, 2024
776e981
feat(airflow): prepare zone_urba table for ingestion from airflow
alexisig Aug 18, 2024
74dca1f
chore(airflow): remove unneeded commands
alexisig Aug 19, 2024
1b88239
feat(airflow): separate building and loading for ocsge
alexisig Aug 20, 2024
ad4321b
feat(airflow): add mattermost integration
alexisig Aug 20, 2024
f27997d
feat(airflow): add gpu
alexisig Aug 22, 2024
ba1da77
chore(commands): remove unused
alexisig Aug 22, 2024
17e18c9
feat(airflow): move code to include folder
alexisig Aug 22, 2024
be1bc2d
feat(airflow): make land models managed=False
alexisig Aug 26, 2024
de2c61f
feat(dbt): add many_to_many models for lands
alexisig Aug 27, 2024
e6e93d1
feat(dbt): add many_to_many models for lands
alexisig Aug 27, 2024
e42c5a4
Revert "feat(dbt): add many_to_many models for lands"
alexisig Aug 27, 2024
686fe35
feat(airflow): add expose ports to config
alexisig Aug 27, 2024
379eef8
temp
alexisig Aug 27, 2024
7d2ebd9
temp
alexisig Aug 27, 2024
ea09c8c
temp
alexisig Aug 27, 2024
cf99e81
feat(airflow): add all ocsge url in source.json
alexisig Aug 28, 2024
77cddf6
feat(airflow): increase concurrency to 4 threads
alexisig Aug 28, 2024
be8520e
temp
alexisig Aug 29, 2024
8116684
feat(dbt): add many indexes
alexisig Aug 29, 2024
85ae386
add more indexes
alexisig Aug 29, 2024
dd3ec14
increase threads to 4
alexisig Aug 29, 2024
1c9ad60
feat(dbt): improve macro performance
alexisig Aug 29, 2024
a3d73fc
feat(dbt): optimize incremental condition
alexisig Aug 29, 2024
95d05b5
feat(airflow): do not throw error if ocsge url does not exist
alexisig Aug 29, 2024
98604c5
temp
alexisig Aug 29, 2024
ae6b96c
feat(airflow): allow multiple type of normalizations
alexisig Aug 30, 2024
eeaf540
feat(airflow): add noramlization ruel for ocsge
alexisig Aug 30, 2024
db05570
feat(zonage_urbanisme): make_valid on geom
alexisig Aug 30, 2024
b9a449e
temp
alexisig Aug 30, 2024
8a4e298
fix(dbt): remove unique condition on zone construite
alexisig Sep 1, 2024
3ac893b
feat(airflow): add ocsge 22 source
alexisig Sep 3, 2024
8aa9dbb
feat(dbt): readd map_color to commune
alexisig Sep 3, 2024
558f490
feat(airflow): allow update of staging, production or dev
alexisig Sep 4, 2024
c1e18f1
fea(update_app): set source to dbt
alexisig Sep 4, 2024
2da179b
feat(airflow): add indexes
alexisig Sep 4, 2024
ec554b7
temp
alexisig Sep 4, 2024
0bb7673
feat(update_add): name created btree index
alexisig Sep 4, 2024
a1e5413
feat(update_app): add index type
alexisig Sep 4, 2024
0d1b0f2
temp
alexisig Sep 4, 2024
022cf38
temp
alexisig Sep 4, 2024
5444031
temp
alexisig Sep 4, 2024
85600bb
temp
alexisig Sep 4, 2024
e634b06
temp
alexisig Sep 4, 2024
24a25a4
temp
alexisig Sep 4, 2024
680a9ee
feat(dbt): make index creation concurent
alexisig Sep 4, 2024
c8cecb7
feat(dbt): revert make index creation concurent
alexisig Sep 4, 2024
394a27b
feat(update_app): change order of tasks
alexisig Sep 4, 2024
32db191
feat(container): add staging and prod psycopg2 dep
alexisig Sep 4, 2024
9c56cd0
feat(update_app): add gdal and psycopg conn
alexisig Sep 4, 2024
e8367ac
feat(airflow): add github action to automate deploy
alexisig Sep 4, 2024
31c3b57
feat(update_app): remove gist index creation as it is automatic with …
alexisig Sep 4, 2024
a64fe20
feat(update_app): remove prod connection
alexisig Sep 4, 2024
5e6b810
feat(admin_express): add support for drom com
alexisig Sep 5, 2024
10fbe3d
feat(admin_express): add + to model creation in airflow
alexisig Sep 5, 2024
01754fe
feat(dbt): add srid_source
alexisig Sep 5, 2024
3a48ce8
feat(dbt): update selector for admin express
alexisig Sep 5, 2024
b6c89db
feat(departement): add srid_source
alexisig Sep 5, 2024
e4e8e1d
feat(aritificial_commune): add srid_source to table
alexisig Sep 5, 2024
f34f965
feat(updateapp): add index on ocsge table
alexisig Sep 5, 2024
7340ac1
feat(ocsge): add url for 04
alexisig Sep 5, 2024
3e86f50
feat(update_app): set max_action_runs to 1
alexisig Sep 5, 2024
2c74efb
feat(dbt): remove test for for_app_commune
alexisig Sep 5, 2024
bf36866
docs(version): bump to 8
alexisig Sep 6, 2024
22b0d9b
feat(update_app): fix index name
alexisig Sep 6, 2024
9104bfa
feat(dbt): add pre-commit lint & fix
alexisig Sep 6, 2024
6929fa3
feat(dbt): add sqlfluff linter
alexisig Sep 6, 2024
af4d935
feat(precommit): skip sqlflull in ci
alexisig Sep 6, 2024
6c839f7
feat(dbt): add pools to prevent airflow from triggering dbt twice at …
alexisig Sep 6, 2024
f9f5856
feat(dbt): add dbt pool
alexisig Sep 6, 2024
a705112
fix(for_app_commune): set surface to 0 if missing
alexisig Sep 10, 2024
60577d7
feat(update_app): allow runing only certain tasks
alexisig Sep 10, 2024
aaba21b
temp
alexisig Sep 10, 2024
e80fa45
temp
alexisig Sep 10, 2024
7f1ee70
temp
alexisig Sep 10, 2024
885b28c
temp
alexisig Sep 10, 2024
63e33bf
temp
alexisig Sep 10, 2024
4da7114
feat(cog2023): hide from search results
alexisig Sep 10, 2024
81f0cb2
feat(airflow): deploy from staging
alexisig Sep 11, 2024
7f8d3c6
feat(RNUPackagesNoticeView): remove unused file
alexisig Sep 11, 2024
533bff2
feat(admin_express): make tables managed=False
alexisig Sep 11, 2024
1b8a676
fix(search): typo
alexisig Sep 11, 2024
385d44a
fix(search): typo
alexisig Sep 11, 2024
3bb600e
Merge pull request #554 from MTES-MCT/feat-airflow
alexisig Sep 11, 2024
351d9c2
feat(update_app): enable connection to production
alexisig Sep 12, 2024
7113d11
Merge pull request #575 from MTES-MCT/feat-enable-update-app-production
alexisig Sep 12, 2024
2 changes: 2 additions & 0 deletions .astro/config.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
project:
name: sparte
1 change: 1 addition & 0 deletions .astro/dag_integrity_exceptions.txt
@@ -0,0 +1 @@
# Add dag files to exempt from parse test below. ex: dags/<test-file>
130 changes: 130 additions & 0 deletions .astro/test_dag_integrity_default.py
@@ -0,0 +1,130 @@
"""Test the validity of all DAGs. **USED BY DEV PARSE COMMAND DO NOT EDIT**"""

import logging
import os
from contextlib import contextmanager

import pytest
from airflow.hooks.base import BaseHook
from airflow.models import Connection, DagBag, Variable
from airflow.utils.db import initdb

# init airflow database
initdb()

# The following code patches errors caused by missing OS Variables, Airflow Connections, and Airflow Variables


# =========== MONKEYPATCH BaseHook.get_connection() ===========
def basehook_get_connection_monkeypatch(key: str, *args, **kwargs):
print(f"Attempted to fetch connection during parse returning an empty Connection object for {key}")
return Connection(key)


BaseHook.get_connection = basehook_get_connection_monkeypatch
# # =========== /MONKEYPATCH BASEHOOK.GET_CONNECTION() ===========


# =========== MONKEYPATCH OS.GETENV() ===========
def os_getenv_monkeypatch(key: str, *args, **kwargs):
default = None
if args:
default = args[0] # os.getenv should get at most 1 arg after the key
if kwargs:
default = kwargs.get("default", None) # and sometimes kwarg if people are using the sig

env_value = os.environ.get(key, None)

if env_value:
return env_value # if the env_value is set, return it
if key == "JENKINS_HOME" and default is None: # fix https://github.com/astronomer/astro-cli/issues/601
return None
if default:
return default # otherwise return whatever default has been passed
return f"MOCKED_{key.upper()}_VALUE" # if absolutely nothing has been passed - return the mocked value


os.getenv = os_getenv_monkeypatch
# # =========== /MONKEYPATCH OS.GETENV() ===========
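The fallback order implemented by `os_getenv_monkeypatch` above can be exercised in isolation. This is a sketch only: the helper name `mocked_getenv` and the sample variable names are illustrative, not part of the codebase.

```python
import os


def mocked_getenv(key, default=None):
    # Same resolution order as the monkeypatch above: a real environment
    # value wins, then the JENKINS_HOME special case, then an explicit
    # default, and finally a deterministic mocked placeholder.
    env_value = os.environ.get(key)
    if env_value:
        return env_value
    if key == "JENKINS_HOME" and default is None:
        # see https://github.com/astronomer/astro-cli/issues/601
        return None
    if default:
        return default
    return f"MOCKED_{key.upper()}_VALUE"


os.environ["DEMO_SET"] = "from-env"
print(mocked_getenv("DEMO_SET"))           # from-env
print(mocked_getenv("DEMO_MISSING", "d"))  # d
print(mocked_getenv("demo_missing"))       # MOCKED_DEMO_MISSING_VALUE
```

Because the placeholder is derived from the key, DAG code that merely interpolates an environment variable into a string still parses cleanly at collection time.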

# =========== MONKEYPATCH VARIABLE.GET() ===========


class magic_dict(dict):
def __init__(self, *args, **kwargs):
self.update(*args, **kwargs)

def __getitem__(self, key):
return {}.get(key, "MOCKED_KEY_VALUE")


_no_default = object() # allow falsey defaults


def variable_get_monkeypatch(key: str, default_var=_no_default, deserialize_json=False):
print(f"Attempted to get Variable value during parse, returning a mocked value for {key}")

if default_var is not _no_default:
return default_var
if deserialize_json:
return magic_dict()
return "NON_DEFAULT_MOCKED_VARIABLE_VALUE"


Variable.get = variable_get_monkeypatch
# # =========== /MONKEYPATCH VARIABLE.GET() ===========
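Taken together, the `Variable.get` patch resolves in three steps: an explicit default wins (including falsey ones, thanks to the sentinel), a JSON variable comes back as a `magic_dict` that answers every key, and anything else gets a fixed placeholder string. A standalone sketch of that behaviour (names mirror the snippet above; `variable_get` stands in for the patched method):

```python
_no_default = object()  # sentinel so falsey defaults like "" or 0 still count


class magic_dict(dict):
    # Answers every key with the same mocked value, so DAG code that
    # indexes into a deserialized JSON Variable keeps parsing.
    def __getitem__(self, key):
        return "MOCKED_KEY_VALUE"


def variable_get(key, default_var=_no_default, deserialize_json=False):
    if default_var is not _no_default:
        return default_var
    if deserialize_json:
        return magic_dict()
    return "NON_DEFAULT_MOCKED_VARIABLE_VALUE"


print(variable_get("x", default_var=""))              # "" is preserved, not replaced
print(variable_get("x", deserialize_json=True)["a"])  # MOCKED_KEY_VALUE
print(variable_get("x"))                              # NON_DEFAULT_MOCKED_VARIABLE_VALUE
```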


@contextmanager
def suppress_logging(namespace):
"""
Suppress logging within a specific namespace to keep tests "clean" during build
"""
logger = logging.getLogger(namespace)
old_value = logger.disabled
logger.disabled = True
try:
yield
finally:
logger.disabled = old_value
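The `suppress_logging` context manager above can be exercised on its own; this sketch shows a logger being silenced inside the block and restored afterwards (the logger name `demo` is illustrative):

```python
import logging
from contextlib import contextmanager


@contextmanager
def suppress_logging(namespace):
    # Temporarily disable a named logger, restoring its prior state on exit.
    logger = logging.getLogger(namespace)
    old_value = logger.disabled
    logger.disabled = True
    try:
        yield
    finally:
        logger.disabled = old_value


log = logging.getLogger("demo")
print(log.disabled)      # False by default
with suppress_logging("demo"):
    print(log.disabled)  # True inside the block
print(log.disabled)      # restored to False afterwards
```

Restoring the previous `disabled` value (rather than unconditionally setting it to `False`) means nested or pre-disabled loggers keep their state.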


def get_import_errors():
"""
Generate a tuple for import errors in the dag bag, and include DAGs without errors.
"""
with suppress_logging("airflow"):
dag_bag = DagBag(include_examples=False)

def strip_path_prefix(path):
return os.path.relpath(path, os.environ.get("AIRFLOW_HOME"))

# Initialize an empty list to store the tuples
result = []

# Iterate over the items in import_errors
for k, v in dag_bag.import_errors.items():
result.append((strip_path_prefix(k), v.strip()))

# Check if there are DAGs without errors
for file_path in dag_bag.dags:
# Check if the file_path is not in import_errors, meaning no errors
if file_path not in dag_bag.import_errors:
result.append((strip_path_prefix(file_path), "No import errors"))

return result


@pytest.mark.parametrize("rel_path, rv", get_import_errors(), ids=[x[0] for x in get_import_errors()])
def test_file_imports(rel_path, rv):
"""Test for import errors on a file"""
if os.path.exists(".astro/dag_integrity_exceptions.txt"):
with open(".astro/dag_integrity_exceptions.txt", "r") as f:
exceptions = f.readlines()
print(f"Exceptions: {exceptions}")
if (rv != "No import errors") and rel_path not in exceptions:
# If rv is not "No import errors," consider it a failed test
raise Exception(f"{rel_path} failed to import with message \n {rv}")
else:
# If rv is "No import errors," consider it a passed test
print(f"{rel_path} passed the import test")
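The collection logic in `get_import_errors` reduces to pairing each file with either its stripped error message or a "No import errors" sentinel, which the parametrized test then checks. A minimal sketch with plain dictionaries (the helper name `collect_results` and sample paths are illustrative):

```python
def collect_results(import_errors, dag_files):
    # Mirror of the loops in get_import_errors: failed files carry their
    # stripped error text, clean files carry the sentinel string.
    results = [(path, err.strip()) for path, err in import_errors.items()]
    results += [
        (path, "No import errors")
        for path in dag_files
        if path not in import_errors
    ]
    return results


errors = {"dags/broken.py": "ModuleNotFoundError: No module named 'x'\n"}
dags = ["dags/broken.py", "dags/ok.py"]
print(collect_results(errors, dags))
```

Feeding these tuples to `pytest.mark.parametrize` yields one test per DAG file, so a single broken import fails only its own test case rather than the whole suite.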
8 changes: 8 additions & 0 deletions .dockerignore
@@ -0,0 +1,8 @@
astro
.git
.env
airflow_settings.yaml
logs/
.venv
airflow.db
airflow.cfg
19 changes: 19 additions & 0 deletions .github/workflows/deploy_airflow.yml
@@ -0,0 +1,19 @@
name: Deploy to production Airflow
on:
push:
branches:
- "staging"

jobs:
deploy:
name: Deploy
runs-on: ubuntu-latest
steps:
- name: executing remote ssh git pull
uses: appleboy/ssh-action@master
with:
host: ${{ secrets.AIRFLOW_SSH_HOST }}
username: ${{ secrets.AIRFLOW_SSH_USER }}
key: ${{ secrets.AIRFLOW_SSH_KEY }}
port: ${{ secrets.AIRFLOW_SSH_PORT }}
script: cd ~/sparte && git pull
2 changes: 1 addition & 1 deletion .github/workflows/pr.yml
@@ -36,7 +36,7 @@ jobs:
- uses: actions/checkout@v4
- uses: pre-commit/[email protected]
env:
SKIP: ggshield
SKIP: ggshield,sqlfluff-lint,sqlfluff-fix
test:
runs-on: ubuntu-latest
steps:
18 changes: 13 additions & 5 deletions .pre-commit-config.yaml
@@ -27,11 +27,6 @@ repos:
hooks:
- id: black
language_version: python3
# - repo: https://github.com/pycqa/bandit
# rev: 1.7.0
# hooks:
# - id: bandit
# args: ['-iii', '-ll']
- repo: https://github.com/PyCQA/autoflake
rev: v2.2.1
hooks:
@@ -41,6 +36,19 @@
hooks:
- id: flake8
args: ['--config', 'flake8']
- repo: https://github.com/sqlfluff/sqlfluff
rev: 3.1.1
hooks:
- id: sqlfluff-lint
additional_dependencies: [
'dbt-postgres==1.8.2',
'sqlfluff-templater-dbt'
]
- id: sqlfluff-fix
additional_dependencies: [
'dbt-postgres==1.8.2',
'sqlfluff-templater-dbt'
]
- repo: https://github.com/gitguardian/ggshield
rev: v1.24.0
hooks:
25 changes: 25 additions & 0 deletions .sqlfluff
@@ -0,0 +1,25 @@
[sqlfluff]
templater = dbt
dialect = postgres

[sqlfluff:templater:jinja]
apply_dbt_builtins = True


[sqlfluff:templater:dbt]
project_dir = airflow/include/sql/sparte
profiles_dir = ~/.dbt


[sqlfluff:layout:type:alias_expression]
# We want non-default spacing _before_ the alias expressions.
spacing_before = align
# We want to align them within the next outer select clause.
# This means for example that alias expressions within the FROM
# or JOIN clause would _not_ be aligned with them.
align_within = select_clause
# The point at which to stop searching outward for siblings, which
# in this example would likely be the boundary of a CTE. Stopping
# when we hit brackets is usually a good rule of thumb for this
# configuration.
align_scope = bracketed
8 changes: 8 additions & 0 deletions airflow/.astro/config.yaml
@@ -0,0 +1,8 @@
airflow:
expose_port: true
project:
name: airflow
postgres:
port: 5433
webserver:
port: 9090
1 change: 1 addition & 0 deletions airflow/.astro/dag_integrity_exceptions.txt
@@ -0,0 +1 @@
# Add dag files to exempt from parse test below. ex: dags/<test-file>
130 changes: 130 additions & 0 deletions airflow/.astro/test_dag_integrity_default.py
@@ -0,0 +1,130 @@
"""Test the validity of all DAGs. **USED BY DEV PARSE COMMAND DO NOT EDIT**"""

import logging
import os
from contextlib import contextmanager

import pytest
from airflow.hooks.base import BaseHook
from airflow.models import Connection, DagBag, Variable
from airflow.utils.db import initdb

# init airflow database
initdb()

# The following code patches errors caused by missing OS Variables, Airflow Connections, and Airflow Variables


# =========== MONKEYPATCH BaseHook.get_connection() ===========
def basehook_get_connection_monkeypatch(key: str, *args, **kwargs):
print(f"Attempted to fetch connection during parse returning an empty Connection object for {key}")
return Connection(key)


BaseHook.get_connection = basehook_get_connection_monkeypatch
# # =========== /MONKEYPATCH BASEHOOK.GET_CONNECTION() ===========


# =========== MONKEYPATCH OS.GETENV() ===========
def os_getenv_monkeypatch(key: str, *args, **kwargs):
default = None
if args:
default = args[0] # os.getenv should get at most 1 arg after the key
if kwargs:
default = kwargs.get("default", None) # and sometimes kwarg if people are using the sig

env_value = os.environ.get(key, None)

if env_value:
return env_value # if the env_value is set, return it
if key == "JENKINS_HOME" and default is None: # fix https://github.com/astronomer/astro-cli/issues/601
return None
if default:
return default # otherwise return whatever default has been passed
return f"MOCKED_{key.upper()}_VALUE" # if absolutely nothing has been passed - return the mocked value


os.getenv = os_getenv_monkeypatch
# # =========== /MONKEYPATCH OS.GETENV() ===========

# =========== MONKEYPATCH VARIABLE.GET() ===========


class magic_dict(dict):
def __init__(self, *args, **kwargs):
self.update(*args, **kwargs)

def __getitem__(self, key):
return {}.get(key, "MOCKED_KEY_VALUE")


_no_default = object() # allow falsey defaults


def variable_get_monkeypatch(key: str, default_var=_no_default, deserialize_json=False):
print(f"Attempted to get Variable value during parse, returning a mocked value for {key}")

if default_var is not _no_default:
return default_var
if deserialize_json:
return magic_dict()
return "NON_DEFAULT_MOCKED_VARIABLE_VALUE"


Variable.get = variable_get_monkeypatch
# # =========== /MONKEYPATCH VARIABLE.GET() ===========


@contextmanager
def suppress_logging(namespace):
"""
Suppress logging within a specific namespace to keep tests "clean" during build
"""
logger = logging.getLogger(namespace)
old_value = logger.disabled
logger.disabled = True
try:
yield
finally:
logger.disabled = old_value


def get_import_errors():
"""
Generate a tuple for import errors in the dag bag, and include DAGs without errors.
"""
with suppress_logging("airflow"):
dag_bag = DagBag(include_examples=False)

def strip_path_prefix(path):
return os.path.relpath(path, os.environ.get("AIRFLOW_HOME"))

# Initialize an empty list to store the tuples
result = []

# Iterate over the items in import_errors
for k, v in dag_bag.import_errors.items():
result.append((strip_path_prefix(k), v.strip()))

# Check if there are DAGs without errors
for file_path in dag_bag.dags:
# Check if the file_path is not in import_errors, meaning no errors
if file_path not in dag_bag.import_errors:
result.append((strip_path_prefix(file_path), "No import errors"))

return result


@pytest.mark.parametrize("rel_path, rv", get_import_errors(), ids=[x[0] for x in get_import_errors()])
def test_file_imports(rel_path, rv):
"""Test for import errors on a file"""
if os.path.exists(".astro/dag_integrity_exceptions.txt"):
with open(".astro/dag_integrity_exceptions.txt", "r") as f:
exceptions = f.readlines()
print(f"Exceptions: {exceptions}")
if (rv != "No import errors") and rel_path not in exceptions:
# If rv is not "No import errors," consider it a failed test
raise Exception(f"{rel_path} failed to import with message \n {rv}")
else:
# If rv is "No import errors," consider it a passed test
print(f"{rel_path} passed the import test")
8 changes: 8 additions & 0 deletions airflow/.dockerignore
@@ -0,0 +1,8 @@
astro
.git
.env
airflow_settings.yaml
logs/
.venv
airflow.db
airflow.cfg
11 changes: 11 additions & 0 deletions airflow/.gitignore
@@ -0,0 +1,11 @@
.git
.env
.DS_Store
airflow_settings.yaml
__pycache__/
astro
.venv
airflow-webserver.pid
webserver_config.py
airflow.cfg
airflow.db
3 changes: 3 additions & 0 deletions airflow/Dockerfile
@@ -0,0 +1,3 @@
FROM quay.io/astronomer/astro-runtime:11.8.0
RUN mkdir /home/astro/.dbt
COPY ./dbt_profile.yml /home/astro/.dbt/profiles.yml
File renamed without changes.