Quality Control for AHP_parser #11
Comments
Which URLs/sites are these? Is it this one: CA_city_websites_final.csv?
Hello,
Ok, cool. Could you please specify which URLs to check the status of? Is it one of the columns in this file, 'CA_city_websites_final.csv'?
Sorry about the delay in responding! I'm talking about the URLs returned by the html-request scraper. I would encourage you to try running the scraper on your own to find any issues, but you can also find the output on this Google Sheet: https://docs.google.com/spreadsheets/d/11offSYz2irnjI-9tILkcI-ClclRUZ0pyhXtPy-G4i8g/edit?usp=sharing All columns besides CITY and CITY_URL need to be quality-checked.
Update: html-request scraper 2 has been renamed to AHP_parser
It would be good to verify that the URLs we're pulling in are actually valid and load without errors.
Can someone please run a simple loop over the AHP_parser output to request the sites and pull the status codes? If we're getting anything besides 200 codes, then we have some problems.
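For anyone picking this up, here is a rough sketch of what such a loop could look like. It is not taken from AHP_parser itself: the function names `fetch_status` and `check_urls`, the timeout, and the User-Agent header are all assumptions made for illustration, and it uses only the Python standard library. Note that `urlopen` follows redirects, so the code reported is the one from the final page.

```python
import http.client
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or an 'error: ...' string on failure.

    Hypothetical helper, not part of AHP_parser.
    """
    try:
        # Some sites reject requests without a browser-like User-Agent (assumption).
        request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; the code is still useful.
        return err.code
    except (ValueError, OSError, http.client.HTTPException) as err:
        # Malformed URLs, DNS failures, timeouts, refused connections, etc.
        return f"error: {err}"


def check_urls(urls, fetch=fetch_status):
    """Request every URL; return (all results, only the non-200 results)."""
    results = {url: fetch(url) for url in urls}
    problems = {url: status for url, status in results.items() if status != 200}
    return results, problems
```

To use it, you would collect the URLs from the scraper output columns into a list and call `results, problems = check_urls(urls)`; anything left in `problems` is a site that did not come back with a 200 and needs a closer look. The `fetch` argument is there so the loop can be tried with a stub before hitting the real sites.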