Documentation for EO newbies #95
Comments
Meant to tag @gabrieltseng |
Hi @dks4-hw , Thanks for the feedback! I'll work on adding some better documentation. In the meantime, to help you get started: All the data is accessible through the
So if I wanted to use this to train a model to identify crop vs. non crop in France, I might do it like this:
from sklearn.ensemble import RandomForestClassifier
from cropharvest.datasets import Task, CropHarvest
from cropharvest.countries import get_country_bbox
my_dataset = CropHarvest(
    # the first argument to the dataset is the (already existing)
    # folder into which the data will be downloaded / already exists
    "data",
    Task(
        # get_country_bbox returns a list of bounding boxes.
        # the one representing Metropolitan France is the
        # 2nd box
        bounding_box=get_country_bbox("France")[1],
        normalize=True
    )
)
X, y = my_dataset.as_array(flatten_x=True)
model = RandomForestClassifier(random_state=0)
model.fit(X, y)
I hope this helps to get started; in the meantime, I'll write up some more thorough documentation. |
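Hard-coding the index `[1]` depends on the order in which the boxes are returned. One more explicit option, offered only as a minimal sketch, is to pick the box with the largest extent. The `Box` tuple below is a stand-in that mirrors the fields visible in the `BBox` repr quoted in this thread (`min_lat`, `max_lat`, `min_lon`, `max_lon`, `name`). The helper `largest_box`, the name `France_1`, and the second box's coordinates are illustrative assumptions, not part of cropharvest.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    # Stand-in for cropharvest's BBox; field names follow the repr shown in this thread.
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float
    name: str


def largest_box(boxes: List[Box]) -> Box:
    """Return the box covering the largest lat/lon extent (in square degrees)."""
    return max(
        boxes,
        key=lambda b: (b.max_lat - b.min_lat) * (b.max_lon - b.min_lon),
    )


boxes = [
    # First box as reported in this thread (its coordinates cover Corsica).
    Box(41.384912109374994, 43.021484375, 8.565625000000011, 9.556445312500017, "France_0"),
    # Rough, illustrative extent for mainland France; NOT taken from cropharvest.
    Box(42.3, 51.1, -4.8, 8.2, "France_1"),
]

chosen = largest_box(boxes)
print(chosen.name)
```

With the real library, the same idea would apply to the list returned by `get_country_bbox("France")`, assuming its elements expose these attributes.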
Hello, I'm trying to run this exact example. But after
my_dataset = CropHarvest(
    Task(
        # get_country_bbox returns a list of bounding boxes
        bounding_box=get_country_bbox("France")[0],
        normalize=True
    )
)
it returns
Traceback (most recent call last):
  File "C:\Users\leand\AppData\Local\Temp\ipykernel_19248\3361455196.py", line 1, in <module>
    my_dataset = CropHarvest(
  File "C:\Users\leand\anaconda3\envs\crop\lib\site-packages\cropharvest\datasets.py", line 203, in __init__
    super().__init__(root, download, filenames=(FEATURES_DIR, TEST_FEATURES_DIR))
  File "C:\Users\leand\anaconda3\envs\crop\lib\site-packages\cropharvest\datasets.py", line 60, in __init__
    self.root = Path(root)
  File "C:\Users\leand\anaconda3\envs\crop\lib\pathlib.py", line 1042, in __new__
    self = cls._from_parts(args, init=False)
  File "C:\Users\leand\anaconda3\envs\crop\lib\pathlib.py", line 683, in _from_parts
    drv, root, parts = self._parse_args(args)
  File "C:\Users\leand\anaconda3\envs\crop\lib\pathlib.py", line 667, in _parse_args
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not Task
Any ideas of what is wrong? Thank you very much.
EDIT: running Task(bounding_box=BBox(min_lat=41.384912109374994, max_lat=43.021484375, min_lon=8.565625000000011, max_lon=9.556445312500017, name='France_0'), target_label='crop', balance_negative_crops=False, test_identifier=None, normalize=True) but then for the next part |
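The traceback ends at `self.root = Path(root)`: the first positional argument is treated as the data folder, so a `Task` passed in that position reaches `Path(...)` and is rejected. A minimal reproduction follows. It uses hypothetical stand-in classes (`FakeTask`, `FakeDataset`) rather than cropharvest itself, and the parameter name `task` is an assumption; only the `Path(root)` line is taken from the traceback.

```python
from pathlib import Path


class FakeTask:
    """Hypothetical stand-in for cropharvest's Task."""


class FakeDataset:
    """Hypothetical stand-in: the first positional argument is the data folder."""

    def __init__(self, root, task=None):
        # Mirrors the failing line in the traceback: Path() rejects non-path objects.
        self.root = Path(root)
        self.task = task


try:
    FakeDataset(FakeTask())  # folder omitted, so the task lands in `root`
    failed = False
except TypeError:
    failed = True

ok = FakeDataset("data", FakeTask())  # folder first, then the task
print(failed, ok.root)
```

Passing the folder (for example `"data"`) as the first argument, as in the maintainer's example, avoids the error.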
Hi @kolrocket; apologies. There was a bug in the example above, which is now fixed. I've confirmed the code runs:
>>> from sklearn.ensemble import RandomForestClassifier
>>> from cropharvest.datasets import Task, CropHarvest
>>> from cropharvest.countries import get_country_bbox
>>> my_dataset = CropHarvest("data", Task(bounding_box=get_country_bbox("France")[1], normalize=True))
>>> X, y = my_dataset.as_array(flatten_x=True)
>>> X.shape, y.shape
((6603, 216), (6603,))
>>> model = RandomForestClassifier(random_state=0)
>>> model.fit(X, y)
RandomForestClassifier(random_state=0) |
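The second dimension of `X`, 216, is what `flatten_x=True` would yield if each sample were a 12-timestep by 18-band array flattened row by row. That layout is an assumption here, inferred from 12 × 18 = 216 and not stated in this thread. A small sketch of such a flattening in plain Python:

```python
# Assumed per-sample layout: 12 timesteps, each with 18 band values.
TIMESTEPS, BANDS = 12, 18

# Dummy sample whose values encode their own flattened position.
sample = [[float(t * BANDS + b) for b in range(BANDS)] for t in range(TIMESTEPS)]

# Row-major flattening: all bands of timestep 0, then timestep 1, and so on.
flat = [value for timestep in sample for value in timestep]

print(len(flat))
```

Under this assumption, columns 0 to 17 of a flattened row would hold the first timestep, columns 18 to 35 the second, and so on.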
Thank you again! |
This looks like a terrific ML resource with a powerful API, but the documentation is a bit lean, especially for EO newbies. The map in README.md suggests the dataset has strong image coverage of Europe and North America, but the example code is limited to Togo, with benchmarks for Kenya and Brazil.
Can we use cropharvest to feed data for Europe or North America to ML models? I am guessing we need to supplement the downloaded features with features for the geographies we want to perform ML on. How do we use cropharvest to do that? It is not obvious.
Forgive me if the dataset is intended for Kenya/Brazil/Togo only and I have misunderstood. As EO professionals you will be familiar with the sentinelsat library, whose documentation is brilliant for EO newbies but which does not produce ML-ready products. Could you produce something as explanatory but with an ML-ready output?