
Commit

Merge branch 'master' into doc
hadrilec committed Jan 2, 2025
2 parents 46d0d7d + 180d1b3 commit 974b7cd
Showing 37 changed files with 380 additions and 447 deletions.
6 changes: 2 additions & 4 deletions .github/workflows/examples.yml
@@ -30,8 +30,7 @@ jobs:
- name: Test all examples
env:
-          insee_key: ${{ secrets.INSEE_KEY }}
-          insee_secret: ${{ secrets.INSEE_SECRET }}
+          sirene_key: ${{ secrets.SIRENE_KEY }}
run: |
pip install jupytext
pip install .[full]
@@ -48,8 +47,7 @@ jobs:
- name: Test idbank list download
env:
-          insee_key: ${{ secrets.INSEE_KEY }}
-          insee_secret: ${{ secrets.INSEE_SECRET }}
+          sirene_key: ${{ secrets.SIRENE_KEY }}
run: |
pip install .
python -c "from pynsee.macrodata._dwn_idbank_files import _dwn_idbank_files;_dwn_idbank_files()"
10 changes: 4 additions & 6 deletions .github/workflows/pkgTests.yml
@@ -10,7 +10,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
-        python-version: ["3.8", "3.9", "3.10", "3.11"]
+        python-version: ["3.9", "3.10", "3.11", "3.12"]

steps:
- uses: actions/checkout@v2
@@ -22,7 +22,7 @@ jobs:
run: |
#sudo apt-get install libgeos-dev
python -m pip install --upgrade pip
-        pip install flake8 pytest pytest-cov geopandas nbconvert matplotlib descartes
+        pip install flake8 pytest pytest-cov geopandas nbconvert matplotlib descartes parameterized
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
if [ -f requirements-extra.txt ]; then pip install -r requirements-extra.txt; fi
- name: Lint with flake8
@@ -33,8 +33,7 @@
flake8 . --count --ignore=E722,C901 --exit-zero --max-complexity=10 --max-line-length=127 --statistics
- name: Test examples
env:
-          insee_key: ${{ secrets.INSEE_KEY }}
-          insee_secret: ${{ secrets.INSEE_SECRET }}
+          sirene_key: ${{ secrets.SIRENE_KEY }}
run: |
pip install jupytext
pip install -r requirements.txt
@@ -48,8 +47,7 @@
cd ../..
- name: Test with pytest
env:
-          insee_key: ${{ secrets.INSEE_KEY }}
-          insee_secret: ${{ secrets.INSEE_SECRET }}
+          sirene_key: ${{ secrets.SIRENE_KEY }}
run: |
pytest -v --cov
- name: "Upload coverage to Codecov"
10 changes: 4 additions & 6 deletions .github/workflows/pkgTests_pull_requests.yml
@@ -10,7 +10,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
-        python-version: ["3.8", "3.9", "3.10", "3.11"]
+        python-version: ["3.9", "3.10", "3.11", "3.12"]

steps:
- uses: actions/checkout@v4
@@ -24,7 +24,7 @@ jobs:
run: |
#sudo apt-get install libgeos-dev
python -m pip install --upgrade pip
-        pip install flake8 pytest pytest-cov geopandas nbconvert matplotlib descartes
+        pip install flake8 pytest pytest-cov geopandas nbconvert matplotlib descartes parameterized
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
if [ -f requirements-extra.txt ]; then pip install -r requirements-extra.txt; fi
- name: Lint with flake8
@@ -35,8 +35,7 @@
flake8 . --count --ignore=E722,C901 --exit-zero --max-complexity=10 --max-line-length=127 --statistics
- name: Test examples
env:
-          insee_key: ${{ secrets.INSEE_KEY }}
-          insee_secret: ${{ secrets.INSEE_SECRET }}
+          sirene_key: ${{ secrets.SIRENE_KEY }}
run: |
pip install jupytext
pip install -r requirements.txt
@@ -50,8 +49,7 @@
cd ../..
- name: Test with pytest
env:
-          insee_key: ${{ secrets.INSEE_KEY }}
-          insee_secret: ${{ secrets.INSEE_SECRET }}
+          sirene_key: ${{ secrets.SIRENE_KEY }}
run: |
pytest -v --cov
- name: "Upload coverage to Codecov"
2 changes: 1 addition & 1 deletion .github/workflows/release.yml
@@ -20,7 +20,7 @@ jobs:
matrix:
os: [ubuntu-20.04]
#[ubuntu-20.04, windows-2019, macOS-10.15]
-        python-version: ["3.8"]
+        python-version: ["3.12"]
#["3.7", "3.8", "3.9", "3.10"]

steps:
4 changes: 1 addition & 3 deletions README.md
@@ -20,9 +20,7 @@ It benefits from the developments made by teams working on APIs at INSEE and IGN

## Installation & API subscription

-The files available on [insee.fr](https://www.insee.fr) and IGN data, i.e. the use of `download` and `geodata` modules, do not require authentication.
-Credentials are necessary to access some of the INSEE APIs available through `pynsee` by the modules `macrodata`, `localdata`, `metadata` and `sirene`.
-API credentials can be created here : [portail-api.insee.fr](https://portail-api.insee.fr/)
+Credentials are necessary to access SIRENE API available through `pynsee` by the module `sirene`. API credentials can be created here : [portail-api.insee.fr](https://portail-api.insee.fr/). All other modules are freely accessible.

```python

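The CI workflows changed in this commit pass the SIRENE credential to `pynsee` through a `sirene_key` environment variable set from repository secrets. Outside CI, the same variable can be exported before using the `sirene` module; a minimal sketch (the key value below is a placeholder, not a real credential):

```python
import os

# Placeholder for a key created on portail-api.insee.fr;
# the workflows above set this same variable from repo secrets.
os.environ["sirene_key"] = "my-sirene-api-key"

print(os.environ["sirene_key"])
```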
4 changes: 1 addition & 3 deletions docs/readme.rst
@@ -65,9 +65,7 @@ It benefits from the developments made by teams working on APIs at INSEE and IGN
Installation & API subscription
-------------------------------

-The files available on `insee.fr <https://www.insee.fr>`_ and IGN data, i.e. the use of `download` and `geodata` modules, do not require authentication.
-Credentials are necessary to access some of the INSEE APIs available through `pynsee` by the modules `macrodata`, `localdata`, `metadata` and `sirene`.
-API credentials can be created here : `portail-api.insee.fr <https://portail-api.insee.fr/>`_
+Credentials are necessary to access SIRENE API available through `pynsee` by the module `sirene`. API credentials can be created here : `portail-api.insee.fr <https://portail-api.insee.fr/>`_. All other modules are freely accessible.

.. code-block:: python
6 changes: 4 additions & 2 deletions pynsee/download/_download_pb.py
@@ -4,7 +4,7 @@

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

-from pynsee.utils.requests_params import _get_requests_proxies
+from pynsee.utils.requests_params import _get_requests_proxies, _get_requests_session


def _download_pb(url: str, fname: str, total: int = None):
@@ -19,8 +19,10 @@ def _download_pb(url: str, fname: str, total: int = None):
"""

     proxies = _get_requests_proxies()
+    session = _get_requests_session()

-    resp = requests.get(url, proxies=proxies, stream=True, verify=False)
+    with session as s:
+        resp = s.get(url, proxies=proxies, stream=True, verify=False)

if total is None:
total = int(resp.headers.get("content-length", 0))
10 changes: 6 additions & 4 deletions pynsee/download/_get_file_list_internal.py
@@ -1,14 +1,16 @@
 import io
 import zipfile
-import pkg_resources
+import importlib
 import json


 def _get_file_list_internal():

-    zip_file = pkg_resources.resource_stream(
-        __name__, "data/liste_donnees.zip"
-    )
+    try:
+        zip_file = str(importlib.resources.files(__name__)) + "/data/liste_donnees.zip"
+    except:
+        import pkg_resources
+        zip_file = pkg_resources.resource_stream(__name__, "data/liste_donnees.zip")

with zipfile.ZipFile(zip_file, "r") as zip_ref:
zip_file = io.BytesIO(zip_ref.read("liste_donnees.json"))
12 changes: 6 additions & 6 deletions pynsee/download/download_file.py
@@ -31,13 +31,13 @@ def download_file(id, variables=None, update=False, silent=False):
"""

     with tempfile.TemporaryDirectory() as tmpdir:
-        try:
+        # try:

-            dwn = _download_store_file(tmpdir, id, update=update)
-            df = _load_data_from_schema(dwn, variables=variables)
+        dwn = _download_store_file(tmpdir, id, update=update)
+        df = _load_data_from_schema(dwn, variables=variables)

-        except:
-            warnings.warn("Download failed")
-            df = pd.DataFrame()
+        # except:
+        #     warnings.warn("Download failed")
+        #     df = pd.DataFrame()

     return df
42 changes: 15 additions & 27 deletions pynsee/download/get_file_list.py
@@ -31,33 +31,21 @@ def get_file_list():
df = df.reset_index(drop=True)
df = _move_col_before(df, "id", "nom")

-    df.columns = [
-        "id",
-        "name",
-        "label",
-        "collection",
-        "link",
-        "type",
-        "zip",
-        "big_zip",
-        "data_file",
-        "tab",
-        "first_row",
-        "api_rest",
-        "md5",
-        "size",
-        "label_col",
-        "date_ref",
-        "meta_file",
-        "separator",
-        "type_col",
-        "long_col",
-        "val_col",
-        "encoding",
-        "last_row",
-        "missing_value",
-    ]

+    rename_col_dict = {
+        "nom": "name",
+        "libelle": "label",
+        "lien": "link",
+        "fichier_donnees": "data_file",
+        "onglet": "tab",
+        "premiere_ligne": "first_row",
+        "fichier_meta": "meta_file",
+        "separateur": "separator",
+        "derniere_ligne": "last_row",
+        "valeurs_manquantes": "missing_value",
+        "disponible": "available"
+    }
+    df = df.rename(columns = rename_col_dict)

df = df[~df.link.str.contains("https://api.insee.fr")]

warning_metadata_download()
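Switching from a wholesale `df.columns = [...]` assignment to `DataFrame.rename` with an explicit mapping, as the new code does, is more robust: labels absent from the frame are simply ignored instead of raising a length-mismatch error when the column count changes. A minimal sketch with a few of the French-to-English mappings (the row values are illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    {"nom": ["example-name"], "libelle": ["example-label"], "lien": ["https://example.org"]}
)

rename_col_dict = {
    "nom": "name",
    "libelle": "label",
    "lien": "link",
    "colonne_absente": "ignored",  # unmatched keys are silently skipped
}
df = df.rename(columns=rename_col_dict)

print(list(df.columns))  # ['name', 'label', 'link']
```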
11 changes: 5 additions & 6 deletions pynsee/geodata/_find_wfs_closest_match.py
@@ -1,17 +1,16 @@
-import os
-import sys

 import difflib

 from pynsee.geodata._get_geodata import _get_geodata
 from pynsee.geodata.get_geodata_list import get_geodata_list
+from pynsee.utils.HiddenPrints import HiddenPrints

 string = "ADMINEXPRESS-COG.LATEST:departement"


 def _find_wfs_closest_match(string=string):
-    sys.stdout = open(os.devnull, 'w')
-    wfs = get_geodata_list()
-    sys.stdout = sys.__stdout__
-
+    with HiddenPrints():
+        wfs = get_geodata_list()

list_sugg = list(wfs.Identifier.unique())
suggestions = difflib.get_close_matches(
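`HiddenPrints` replaces the manual `sys.stdout` juggling the old code did by hand, and a context manager restores stdout even if the wrapped call raises. A context manager of that shape can be sketched as follows (`hidden_prints` is an illustrative stand-in, not pynsee's actual class):

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def hidden_prints():
    """Silence print() inside the block, restoring stdout afterwards."""
    original = sys.stdout
    sys.stdout = open(os.devnull, "w")
    try:
        yield
    finally:
        sys.stdout.close()
        sys.stdout = original


with hidden_prints():
    print("this line is swallowed by /dev/null")

print("stdout is restored")
```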
2 changes: 0 additions & 2 deletions pynsee/localdata/__init__.py
@@ -1,7 +1,6 @@
from .get_area_list import get_area_list
from .get_geo_list import get_geo_list
from .get_local_data import get_local_data
-from .get_included_area import get_included_area
from .get_nivgeo_list import get_nivgeo_list
from .get_local_metadata import get_local_metadata
from .get_population import get_population
@@ -15,7 +14,6 @@
"get_area_list",
"get_geo_list",
"get_local_data",
-    "get_included_area",
"get_nivgeo_list",
"get_local_metadata",
"get_population",
25 changes: 14 additions & 11 deletions pynsee/localdata/_find_latest_local_dataset.py
@@ -1,5 +1,4 @@

-import sys
import os
import re
from tqdm import trange
@@ -11,11 +10,12 @@
from pynsee.localdata._get_insee_local_onegeo import _get_insee_local_onegeo
from pynsee.utils._create_insee_folder import _create_insee_folder
from pynsee.utils._hash import _hash
+from pynsee.utils.HiddenPrints import HiddenPrints

import logging
logger = logging.getLogger(__name__)

-def _find_latest_local_dataset(dataset_version, variables, nivgeo, codegeo, update):
+def _find_latest_local_dataset(dataset_version, variables, nivgeo, codegeo, update, backwardperiod = 6):

filename = _hash("".join([dataset_version] + ['_find_latest_local_dataset']))
insee_folder = _create_insee_folder()
@@ -26,7 +26,7 @@ def _find_latest_local_dataset(dataset_version, variables, nivgeo, codegeo, upda
datasetname = dataset_version.replace('latest', '').replace('GEO', '')

current_year = int(datetime.datetime.today().strftime('%Y'))
-    backwardperiod = 10

list_geo_dates = range(current_year, current_year-backwardperiod, -1)
list_data_dates = range(current_year, current_year-backwardperiod, -1)

@@ -44,11 +44,10 @@ def _find_latest_local_dataset(dataset_version, variables, nivgeo, codegeo, upda
dv = list_dataset_version[dvindex]

         try:
-            sys.stdout = open(os.devnull, 'w')
-            df = _get_insee_local_onegeo(
-                variables, dv, nivgeo=nivgeo, codegeo=codegeo
-            )
-            sys.stdout = sys.__stdout__
+            with HiddenPrints():
+                df = _get_insee_local_onegeo(
+                    variables, dv, nivgeo=nivgeo, codegeo=codegeo
+                )

if type(df) == pd.core.frame.DataFrame:
if len(df.index) == 1:
@@ -63,11 +62,15 @@ def _find_latest_local_dataset(dataset_version, variables, nivgeo, codegeo, upda
else:
dataset_version = dv
break

-        pickle.dump(dataset_version, open(file_localdata, "wb"))

+        f = open(file_localdata, "wb")
+        pickle.dump(str(dataset_version), f)
+        f.close()
     else:
         try:
-            dataset_version = pickle.load(open(file_localdata, "rb"))
+            f = open(file_localdata, "rb")
+            dataset_version = pickle.load(f)
+            f.close()
except:
os.remove(file_localdata)
dataset_version = _find_latest_local_dataset(
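The explicit `open`/`close` pairs added above can leak a file handle if `pickle` raises in between; the usual idiom is a `with` block, which closes the file in all cases. A sketch of the same cache round-trip (the file name and version string are placeholders):

```python
import os
import pickle
import tempfile

cache_file = os.path.join(tempfile.gettempdir(), "dataset_version_cache.pkl")

# Write the cached value; the file is closed even if pickle.dump raises
with open(cache_file, "wb") as f:
    pickle.dump("GEO2024RP2021", f)

# Read it back the same way
with open(cache_file, "rb") as f:
    dataset_version = pickle.load(f)

print(dataset_version)  # GEO2024RP2021
```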