feat: api to get data extracts for the given aoi and category #960
@spwoodcock This works when the user uploads a single AOI.
Could we load the GeoJSON with shapely, then just get the bounds (bbox) of the features? That would give us a single bounding geometry to pass to osm-rawdata.
I guess it might be okay if I take the convex_hull of the multipolygon and use that as the boundary.
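For illustration, here is a minimal sketch of the bounding-box idea from the comments above. It is deliberately dependency-free, so it is not the shapely-based code discussed in this PR; with shapely, `shape(geometry).bounds` and `.convex_hull` give the same box and a tighter hull. The helper names (`iter_coords`, `geojson_bbox`, `bbox_to_polygon`) and the sample AOI are hypothetical.

```python
def iter_coords(geometry):
    """Yield every (x, y) position in a GeoJSON geometry, at any nesting depth."""
    if geometry["type"] == "GeometryCollection":
        for member in geometry["geometries"]:
            yield from iter_coords(member)
        return
    stack = [geometry["coordinates"]]
    while stack:
        item = stack.pop()
        if item and isinstance(item[0], (int, float)):
            # A position: [x, y] or [x, y, z]; only x and y matter here.
            yield item[0], item[1]
        else:
            stack.extend(item)


def geojson_bbox(geojson):
    """Return (minx, miny, maxx, maxy) for a FeatureCollection, Feature or geometry."""
    if geojson["type"] == "FeatureCollection":
        geometries = [feature["geometry"] for feature in geojson["features"]]
    elif geojson["type"] == "Feature":
        geometries = [geojson["geometry"]]
    else:
        geometries = [geojson]
    xs, ys = zip(*(pt for geom in geometries for pt in iter_coords(geom)))
    return min(xs), min(ys), max(xs), max(ys)


def bbox_to_polygon(bbox):
    """Turn a bbox into a single closed GeoJSON Polygon."""
    minx, miny, maxx, maxy = bbox
    ring = [[minx, miny], [maxx, miny], [maxx, maxy], [minx, maxy], [minx, miny]]
    return {"type": "Polygon", "coordinates": [ring]}


# Hypothetical AOI made of two separate polygons.
aoi = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {}, "geometry": {
            "type": "Polygon",
            "coordinates": [[[0, 0], [2, 0], [2, 1], [0, 1], [0, 0]]]}},
        {"type": "Feature", "properties": {}, "geometry": {
            "type": "Polygon",
            "coordinates": [[[5, 3], [6, 3], [6, 4], [5, 4], [5, 3]]]}},
    ],
}

bbox = geojson_bbox(aoi)
boundary = bbox_to_polygon(bbox)
```

A bounding box can cover much more area than the uploaded polygons, which is why the convex hull may be the better choice for scattered AOIs.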
Great work! Thanks @nrjadkry 🙏 The only thing I would like to do is test the performance. I assume the extract is returned quickly enough for good UX in the creation flow. If so, this is definitely the best solution! (I think raw-data-api creates a file in S3 for every request, but it's temporary and deleted after 90 days.)
@spwoodcock Performance is not much of an issue for small or medium-sized polygons. It might take some time for large polygons.
One small efficiency gain:
The YAML file details are in the osm-rawdata and raw-data-api docs 👍
I have passed a YAML file to PostgresClient, but it is one of the existing YAML files for the categories in osm-fieldwork. Those YAML files do not have filter keys, so I also need to filter the data after it is fetched.
It's a different file from the one for osm-fieldwork. See the raw-data-api docs. That said, the docs actually show it as a JSON file, so I'm not sure if the YAML definition is valid yet.
Looking into it, the YAML definition is valid and handled by osm-rawdata. I would use YAML over JSON. You can create the YAML file using QueryConfig. (Creating the file is inefficient, but there is an open issue in the osm-rawdata todo list to accept BytesIO objects instead.)
Also, Rob generally has an example usage of the code in his It might help if I pull this out into a wrapper convenience function (so we can call via both code or CLI easily).
Rob has done this in the
I have done the same here. Those filters are created inside the execQuery function by osm-rawdata itself, so we don't need to create the config YAML file here.
Yes, you are right that execQuery is used to execute the query and return the result, so that is still required (I got that wrong above). But the main function also has a config file passed in as a required argument. We also need to pass a config file to filter the result directly, instead of using osm-fieldwork.filter_data.FilterData. (This applies the filter in the underlying SQL query when the data is fetched, instead of fetching all data and then filtering, which could significantly reduce the web payload returned.)
I will update osm-rawdata to accept a BytesIO config file in the coming days. In the meantime we need to generate the config file on disk under /tmp as a workaround (being sure to delete it afterwards).
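As a rough sketch of that workaround, the pattern below writes the config text to a temporary file, hands the path to whatever needs it, and deletes the file even if the call fails. The helper `with_temp_config`, the callback `fake_consumer` and the placeholder config text are hypothetical; the real osm-rawdata call and config contents are not shown here.

```python
import os
import tempfile


def with_temp_config(config_text, run):
    """Write config_text to a temp file, call run(path), and always delete the file."""
    # tempfile.gettempdir() resolves to /tmp on most Linux systems.
    fd, path = tempfile.mkstemp(suffix=".yaml", dir=tempfile.gettempdir())
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(config_text)
        return run(path)
    finally:
        # Runs on success and on error, so the file never outlives the call.
        os.remove(path)


seen = {}


def fake_consumer(path):
    """Stand-in for the library call that needs a config file path."""
    seen["path"] = path
    with open(path, encoding="utf-8") as handle:
        return handle.read()


result = with_temp_config("# placeholder config, not a real osm-rawdata file\n", fake_consumer)
```

Once the library accepts a BytesIO object, the same config text could be passed in memory and this helper dropped.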
I have now passed the config file as well.
I have passed the config file from osm-fieldwork, since we just need to check whether data exists for the category.
Looks good to me 🎉 Just passing the osm-fieldwork YAML config is the best option in this case, as it's just a preview 👍 When we create the actual data extract we would have to use the JSON format, as it supports more params: https://hotosm.github.io/osm-rawdata/json/ While the YAML format only contains the
I have made an API that returns the data extracts available in the provided AOI (Area of Interest) from OSM using raw-data-api.
This API accepts an AOI and the category for which data extracts are required as parameters.