Refactor: improve the data file structure
luke-strange committed Aug 7, 2024
1 parent 2d9e938 commit fb21452
Showing 27 changed files with 62,279 additions and 63,449 deletions.
7 changes: 7 additions & 0 deletions data/README.md
@@ -0,0 +1,7 @@
Data files are grouped by topic or dataset, e.g. Affordable homes.
Each topic contains two directories: `site` and `standard`.
In `standard`, data are stored in a standardised format. These always include the `geography_code`, `geography_name`, `date`, `Measure` and `value` columns. These files are used to generate metadata and for manually checking what is in the file, if needed.
In `site`, data are stored in `parquet` files in the shape needed to power a visualisation. This is usually a wide (or pivoted) version of the `standard` files.
In some cases, for example a `headlines.csv` file, the data are in a unique format to drive a particular visualisation type, e.g. an OI Lume `dashboard`.
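
As a rough illustration of how a `standard` file relates to its wide `site` counterpart, the sketch below reshapes long rows into one row per geography and date, with one column per `Measure`. It is not the repository's actual pipeline: the function name `pivot_standard`, the sample rows, the geography codes and the measure names are all made up for the example. It uses only the Python standard library and stops short of writing `parquet`, which would need a third-party library.

```python
import csv
import io

# Columns the README says every `standard` file includes.
REQUIRED = ["geography_code", "geography_name", "date", "Measure", "value"]


def pivot_standard(rows):
    """Reshape long `standard` rows into wide rows, one column per Measure.

    Hypothetical helper for illustration; not part of the repository.
    """
    wide = {}
    for row in rows:
        missing = [c for c in REQUIRED if c not in row]
        if missing:
            raise ValueError(f"missing columns: {missing}")
        # One output row per (geography, date) combination.
        key = (row["geography_code"], row["geography_name"], row["date"])
        wide.setdefault(key, {})[row["Measure"]] = row["value"]
    return [
        {"geography_code": code, "geography_name": name, "date": date, **measures}
        for (code, name, date), measures in wide.items()
    ]


# Made-up sample data in the standardised (long) layout.
SAMPLE = """geography_code,geography_name,date,Measure,value
E00000001,Exampleton,2023-01-01,Social rent,10
E00000001,Exampleton,2023-01-01,Shared ownership,4
E00000002,Sampleford,2023-01-01,Social rent,7
"""

wide_rows = pivot_standard(csv.DictReader(io.StringIO(SAMPLE)))
for wide_row in wide_rows:
    print(wide_row)
```

Values stay as strings here because `csv.DictReader` does not convert types; a real pipeline would likely cast `value` to a number and decide how to fill measures that are absent for a given geography and date.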

Any questions, suggestions, or improvements - let me know!
9,797 changes: 0 additions & 9,797 deletions data/affordable-homes/by_tenure.csv

This file was deleted.

Binary file not shown.
23,469 changes: 23,469 additions & 0 deletions data/affordable-homes/standard/by_tenure.csv

Large diffs are not rendered by default.
