There is a lot of repeated code across each county's scraper and parser scripts, and a lot of code that every script should have but doesn't (like retrying on transient network failures). Additionally, a handful of counties use shared systems (e.g., https://www.mptsweb.net/).
Ideally, a county script would instantiate a class with a few variables (CSV file location, URL template, etc.), define a parse_html() function, and call a method that takes care of everything else.
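To make the idea concrete, here is a rough sketch of what such a base class might look like. Everything in it is an assumption for discussion, not existing code in this repo: the names `CountyScraper`, `download`, `fetch`, and `run`, the `{parcel_id}` placeholder in the URL template, and treating the CSV location as the output file. It uses only the standard library and retries with exponential backoff on network errors and 5xx responses.

```python
import csv
import time
import urllib.error
import urllib.request


class CountyScraper:
    """Hypothetical base class; a county script subclasses it and defines parse_html()."""

    def __init__(self, csv_path, url_template, max_retries=3,
                 backoff_seconds=1.0, sleep=time.sleep):
        self.csv_path = csv_path          # assumed here to be the output CSV
        self.url_template = url_template  # e.g. "https://example.invalid/{parcel_id}"
        self.max_retries = max_retries
        self.backoff_seconds = backoff_seconds
        self.sleep = sleep                # injectable so tests do not really wait

    def download(self, url):
        """Make a single request and return the response body as text."""
        with urllib.request.urlopen(url, timeout=30) as response:
            return response.read().decode("utf-8", errors="replace")

    def fetch(self, url):
        """Call download(), retrying on transient failures with exponential backoff."""
        last_error = None
        for attempt in range(self.max_retries):
            try:
                return self.download(url)
            except urllib.error.HTTPError as error:
                if error.code < 500:
                    raise  # 4xx responses will not be fixed by retrying
                last_error = error
            except OSError as error:  # URLError, timeouts, connection resets
                last_error = error
            if attempt < self.max_retries - 1:
                self.sleep(self.backoff_seconds * 2 ** attempt)
        raise last_error

    def parse_html(self, html):
        """Return a dict of column name to value for one page."""
        raise NotImplementedError("each county script defines parse_html()")

    def run(self, parcel_ids):
        """Fetch and parse every parcel, then write all rows to csv_path."""
        rows = []
        for parcel_id in parcel_ids:
            html = self.fetch(self.url_template.format(parcel_id=parcel_id))
            rows.append({"parcel_id": parcel_id, **self.parse_html(html)})
        if rows:
            with open(self.csv_path, "w", newline="") as f:
                writer = csv.DictWriter(f, fieldnames=list(rows[0]))
                writer.writeheader()
                writer.writerows(rows)
        return rows
```

With something like this, a county script would shrink to a subclass that overrides `parse_html()` plus a single `run(...)` call, and counties on a shared system could share one subclass.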
I'm working on this as part of Placer (#17). I'm creating this issue to track and discuss the work.
@typpo One question I have so far is related to my Placer work. You recommend the geojson script step. It seems like it'd be easier to do this in Python (with, e.g., pyshp) to minimize the number of steps that someone has to follow. Have you found that the geojson script is better for one reason or another?
Reusable classes would be very useful! Thanks for getting this started.
Using pyshp would be nicer and cleaner than the geojson conversion. I'm in the habit of converting to geojson first just so I can see what type of data is in the shapefile (for example: is all the required address info there? Is there zoning info? Does it use lat/lng or XY coordinates?).
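For what it's worth, the same kind of inspection could probably be done from Python without the conversion step. Below is a minimal sketch, assuming a pyshp `Reader` object and relying only on its `fields`, `bbox`, and `shapeTypeName` attributes. The helper name `describe_shapefile` is made up for illustration, and the lat/lng check is only a heuristic: a bounding box that fits inside valid longitude and latitude ranges is probably, but not certainly, in degrees.

```python
def describe_shapefile(reader):
    """Summarize a pyshp Reader: attribute columns, geometry type, coordinate guess."""
    # pyshp lists a "DeletionFlag" pseudo-field first; skip it.
    field_names = [field[0] for field in reader.fields[1:]]
    min_x, min_y, max_x, max_y = reader.bbox
    # Heuristic only: projected XY coordinates are usually far outside these ranges.
    looks_like_latlng = (-180 <= min_x <= max_x <= 180
                         and -90 <= min_y <= max_y <= 90)
    return {
        "fields": field_names,
        "shape_type": reader.shapeTypeName,
        "looks_like_latlng": looks_like_latlng,
    }
```

Usage would be along the lines of `import shapefile` followed by `describe_shapefile(shapefile.Reader("parcels.shp"))`, where `parcels.shp` is a placeholder path. Scanning the returned field names answers the address and zoning questions.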