Add-data-Senegal CEO 2022 set 1&2 #340

Merged
merged 10 commits into from
Aug 30, 2023
6 changes: 3 additions & 3 deletions data/datasets.dvc
@@ -1,6 +1,6 @@
 outs:
-- md5: b6a08170b543289fc043576b00e8a65c.dir
-  size: 658002555
-  nfiles: 45
+- md5: 001feb4ecdaa108deaf43002ef840c11.dir
+  size: 663324332
+  nfiles: 46
   path: datasets
   hash: md5
6 changes: 3 additions & 3 deletions data/raw.dvc
@@ -1,6 +1,6 @@
 outs:
-- md5: 53662f45a86eb8f39bd26f87f3b98e6e.dir
-  size: 442437587
-  nfiles: 380
+- md5: f63283bc4a661fb36f405f0dc99da064.dir
+  size: 442919107
+  nfiles: 382
   path: raw
   hash: md5
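
A quick way to read the two `.dvc` hunks is to compute what changed between the old and new directory entries. The sketch below hard-codes the `size` and `nfiles` values shown in the diff; the helper name `dvc_delta` is hypothetical and not part of DVC or this repository.

```python
# Values copied from the data/datasets.dvc and data/raw.dvc hunks.
DVC_CHANGES = {
    "datasets": {
        "old": {"size": 658002555, "nfiles": 45},
        "new": {"size": 663324332, "nfiles": 46},
    },
    "raw": {
        "old": {"size": 442437587, "nfiles": 380},
        "new": {"size": 442919107, "nfiles": 382},
    },
}


def dvc_delta(old, new):
    """Return the change in byte size and file count (hypothetical helper)."""
    return {key: new[key] - old[key] for key in ("size", "nfiles")}


deltas = {
    path: dvc_delta(change["old"], change["new"])
    for path, change in DVC_CHANGES.items()
}

for path, delta in deltas.items():
    print(f"{path}: {delta['nfiles']:+d} files, {delta['size']:+,d} bytes")
```

The file counts are consistent with two new raw CSVs (Set 1 and Set 2) and one new processed dataset, although the diff itself does not list the individual files.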
11 changes: 11 additions & 0 deletions data/report.txt
@@ -258,6 +258,17 @@ eo_data_skipped 82



Senegal_CEO_2022 (Timesteps: 16)
----------------------------------------------------------------------------
disagreement: 10.5%
eo_data_complete 1342
eo_data_skipped 158
✔ training amount: 276, positive class: 4.7%
✔ validation amount: 516, positive class: 6.6%
✔ testing amount: 550, positive class: 10.7%



HawaiiAgriculturalLandUse2020 (Timesteps: 24)
----------------------------------------------------------------------------
eo_data_complete 4834
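
The Senegal_CEO_2022 numbers in the report can be sanity-checked against each other. The sketch below copies the figures from the hunk and assumes (this is an interpretation, not something the report states) that the three split amounts should add up to `eo_data_complete` and roughly follow the configured `train_val_test=(0.2, 0.4, 0.4)`. The positive counts are estimates because the report only gives rounded percentages.

```python
# Figures copied from the Senegal_CEO_2022 block in data/report.txt.
EO_DATA_COMPLETE = 1342
SPLITS = {
    # name: (amount, positive class in percent)
    "training": (276, 4.7),
    "validation": (516, 6.6),
    "testing": (550, 10.7),
}
# Target proportions from the RawLabels entries in datasets.py.
TARGETS = {"training": 0.2, "validation": 0.4, "testing": 0.4}

# Assumption: every example with complete EO data lands in exactly one split.
total = sum(amount for amount, _ in SPLITS.values())

# Observed share of each split among all complete examples.
shares = {name: amount / total for name, (amount, _) in SPLITS.items()}

# Estimated positive counts, reconstructed from the rounded percentages.
approx_positives = {
    name: round(amount * pct / 100) for name, (amount, pct) in SPLITS.items()
}

for name in SPLITS:
    print(
        f"{name}: share={shares[name]:.3f} "
        f"(target {TARGETS[name]}), approx. positives={approx_positives[name]}"
    )
```

Under these assumptions the amounts sum to 1342 and each observed share stays within two percentage points of its target.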
24 changes: 24 additions & 0 deletions datasets.py
@@ -1120,6 +1120,30 @@ def load_labels(self) -> pd.DataFrame:
            ),
        ),
    ),
    CustomLabeledDataset(
        dataset="Senegal_CEO_2022",
        country="Senegal",
        raw_labels=(
            RawLabels(
                filename="ceo-Senegal-March-2022---March-2023-Stratified-sample-(Set-1)-sample-data-2023-08-28.csv",  # noqa: E501
                class_prob=lambda df: (df["Does this pixel contain active cropland?"] == "Crop"),
                start_year=2022,
                train_val_test=(0.2, 0.4, 0.4),
                latitude_col="lat",
                longitude_col="lon",
                filter_df=clean_ceo_data,
            ),
            RawLabels(
                filename="ceo-Senegal-March-2022---March-2023-Stratified-sample-(Set-2)-sample-data-2023-08-28.csv",  # noqa: E501
                class_prob=lambda df: (df["Does this pixel contain active cropland?"] == "Crop"),
                start_year=2022,
                train_val_test=(0.2, 0.4, 0.4),
                latitude_col="lat",
                longitude_col="lon",
                filter_df=clean_ceo_data,
            ),
        ),
    ),
    HawaiiAgriculturalLandUse2020(),
    KenyaCEO2019(),
    HawaiiCorrective2020(),
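
For readers unfamiliar with the `class_prob` callable, here is a minimal sketch of what the expression in the new entries evaluates to on a small DataFrame. The sample rows are invented for illustration, and the "Non-crop" answer text is an assumption; only the column name, the "Crop" value, and the `lat`/`lon` column names come from the diff. How `RawLabels` consumes the result internally is not shown in this PR and is not modelled here.

```python
import math

import pandas as pd

LABEL_COL = "Does this pixel contain active cropland?"


def class_prob(df: pd.DataFrame) -> pd.Series:
    """Mirror of the lambda in the diff: True where the answer is "Crop"."""
    return df[LABEL_COL] == "Crop"


# Hypothetical rows standing in for a CEO export (values are made up).
sample = pd.DataFrame(
    {
        "lat": [14.1, 14.2, 14.3, 14.4],
        "lon": [-16.1, -16.2, -16.3, -16.4],
        LABEL_COL: ["Crop", "Non-crop", "Crop", "Non-crop"],
    }
)

is_crop = class_prob(sample)            # boolean Series, one entry per row
crop_fraction = float(is_crop.mean())   # share of rows labeled "Crop"

# The split proportions used in both RawLabels entries should sum to 1.
train_val_test = (0.2, 0.4, 0.4)
assert math.isclose(sum(train_val_test), 1.0)

print(is_crop.tolist(), crop_fraction)
```

Both `RawLabels` entries use the same expression, so any answer other than the exact string "Crop" is treated as the negative class in this sketch.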