Commit

Fix issue downloading NEET data
gilesdring committed Sep 19, 2023
1 parent 76db5d9 commit c04fb34
Showing 9 changed files with 702 additions and 669 deletions.
1 change: 1 addition & 0 deletions Pipfile
@@ -11,6 +11,7 @@ dvc = "*"
 openpyxl = "*"
 papermill = "*"
 ipykernel = "*"
+requests = "*"

 [dev-packages]

1,143 changes: 588 additions & 555 deletions Pipfile.lock

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions scripts/cpi/dvc.lock
@@ -88,8 +88,8 @@ stages:
       size: 8893
     - path: working/metadata.csv
       hash: md5
-      md5: 4ec0da4994782d25aca721561815e260
-      size: 7650
+      md5: 0da329f21f3d8a08af2216103729f224
+      size: 7642
     outs:
     - path: src/_data/sources/cpi/cpi_barchart.csv
       hash: md5
12 changes: 6 additions & 6 deletions scripts/labour-market/dvc.lock
@@ -13,8 +13,8 @@ stages:
       size: 608
     - path: working/LMS_data.csv
       hash: md5
-      md5: 87e8d61a9a784f9db2a56b1a67bccb8f
-      size: 16991591
+      md5: 0fa1500fa1ad5467cf0710943f012825
+      size: 16991592
     outs:
     - path: data/labour-market/monthly-rolling.csv
       hash: md5
@@ -29,13 +29,13 @@ stages:
       size: 359985
     - path: ../../scripts/util/
       hash: md5
-      md5: 13221061b7639bdb76f26c9ffb1fc256.dir
-      size: 13180
+      md5: bbd9c17bdb21207e9126ae17bc04e2fc.dir
+      size: 13118
       nfiles: 9
     - path: ../../working/metadata.csv
       hash: md5
-      md5: 4ec0da4994782d25aca721561815e260
-      size: 7650
+      md5: 0da329f21f3d8a08af2216103729f224
+      size: 7642
     - path: config.py
       hash: md5
       md5: c9187b99fb9c48a24be707561d15eaa2
9 changes: 4 additions & 5 deletions scripts/util/downloader.py
@@ -1,8 +1,7 @@
-from urllib.request import build_opener, install_opener, urlretrieve
+import requests


 def download_file(url, filename, headers=[]):
-    opener = build_opener()
-    opener.addheaders = headers
-    install_opener(opener)
-    urlretrieve(url, filename)
+    response = requests.get(url)
+    with open(filename, 'wb') as file:
+        file.write(response.content)
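
Review note: the new download_file still accepts headers but no longer passes them to the request, and it writes the response body to disk even when the server answers with an error status. The following is a minimal sketch of one possible follow-up, not part of this commit. It assumes callers may still pass headers as a list of (name, value) pairs, as the old opener.addheaders form did; the timeout parameter and its default are hypothetical additions.

```python
import requests


def download_file(url, filename, headers=None, timeout=30):
    """Download url to filename, forwarding optional request headers.

    Sketch only: `timeout` and the None default are assumptions, not part
    of the committed code.
    """
    # Accept either a mapping or a list of (name, value) pairs; requests
    # expects a mapping. None avoids sharing a mutable default argument.
    request_headers = dict(headers or {})
    response = requests.get(url, headers=request_headers, timeout=timeout)
    # Raise on 4xx/5xx so an error page is never written as data.
    response.raise_for_status()
    with open(filename, "wb") as file:
        file.write(response.content)
```

Existing calls of the form download_file(url, filename) would behave as before on success; the difference is that a failed request raises requests.HTTPError instead of silently saving the error body.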
8 changes: 4 additions & 4 deletions scripts/vacancies/dvc.lock
@@ -13,8 +13,8 @@ stages:
       size: 5733
     - path: working/metadata.csv
       hash: md5
-      md5: 4ec0da4994782d25aca721561815e260
-      size: 7650
+      md5: 0da329f21f3d8a08af2216103729f224
+      size: 7642
     - path: working/vacancies/vacancies_by_sector.csv
       hash: md5
       md5: d87aa1a94f87db2274f2b87c9ee0d9ab
@@ -44,8 +44,8 @@ stages:
       size: 822
     - path: working/LMS_data.csv
       hash: md5
-      md5: 87e8d61a9a784f9db2a56b1a67bccb8f
-      size: 16991591
+      md5: 0fa1500fa1ad5467cf0710943f012825
+      size: 16991592
     - path: working/lookups/LMS_variable_lookup.csv
       md5: faf1d16226f95f8d2448b90d1d5868a0
       size: 137661
10 changes: 5 additions & 5 deletions working/LMS_data.csv.dvc
@@ -1,12 +1,12 @@
-md5: ff0510b134124df1867a91a9882bc780
+md5: d7f482f9d61f988b2c308c01a34be96a
 deps:
-- checksum: '"fa83c30aceea0bbe5a6d6427acef0b8d3bc53c483821062d7cf889dadf00517b"'
-  size: 16991591
+- checksum: '"c699a3037cbb8a20a37a025397a3d530eaa8ff7be0578333bcd51f2d3355404d"'
+  size: 16991592
   path:
     https://github.com/economic-analytics/edd/blob/main/data/csv/LMS_data.csv?raw=true
   hash: md5
 outs:
-- md5: 87e8d61a9a784f9db2a56b1a67bccb8f
-  size: 16991591
+- md5: 0fa1500fa1ad5467cf0710943f012825
+  size: 16991592
   path: LMS_data.csv
   hash: md5
10 changes: 5 additions & 5 deletions working/metadata.csv.dvc
@@ -1,13 +1,13 @@
-md5: f92e654b731dfc09c164adf560421856
+md5: 6e55818c9d84fc8f1da2c5ad320405d8
 # frozen: true
 deps:
-- checksum: '"3e601232be60001167fd43759b008e7b6bc434ae603a7ae628fee84867623ac4"'
-  size: 7650
+- checksum: '"b3fbe22876b819ad38760dc198b67328657e026030573b7aab6b5ece9441900f"'
+  size: 7642
   hash: md5
   path:
     https://raw.githubusercontent.com/economic-analytics/edd/main/data-raw/edd_dict.csv
 outs:
-- md5: 4ec0da4994782d25aca721561815e260
-  size: 7650
+- md5: 0da329f21f3d8a08af2216103729f224
+  size: 7642
   hash: md5
   path: metadata.csv
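
Review note: the .dvc files above record an md5 and a size for each downloaded output, so a local copy can be checked against them by hand. The sketch below is an illustration of that check and is not how DVC itself is invoked; matches_dvc_record is a hypothetical helper name, and it assumes the recorded md5 is the plain MD5 of the file bytes, which is what the hash: md5 field is taken to mean here.

```python
import hashlib
import os


def matches_dvc_record(path, expected_md5, expected_size, chunk_size=1 << 20):
    """Return True if the file at path has the recorded size and MD5.

    Hypothetical helper for illustration; assumes the recorded md5 is the
    plain MD5 digest of the file contents.
    """
    # Cheap check first: a size mismatch means the hash cannot match.
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.md5()
    with open(path, "rb") as file:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: file.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5
```

For example, matches_dvc_record("working/metadata.csv", "0da329f21f3d8a08af2216103729f224", 7642) would compare a local copy against the values recorded in this commit.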