A Python-based tool for collecting and analyzing Google Street View panorama data across New York City boroughs. You can modify the script to collect data from other areas.
This project samples coordinates throughout NYC boroughs and collects associated Google Street View panorama data. It uses a multi-threaded approach to efficiently gather panorama metadata including location, date, and copyright information.
- Generates coordinate grid points within NYC borough boundaries
- Searches for Street View panoramas near sampled coordinates
- Collects panorama metadata (date, copyright, location, etc.)
- Multi-threaded processing for improved performance
- Progress tracking and statistics
- Python 3.x
- Google Maps API key
pip install -r requirements.txt
All data is stored in a local SQLite database named gsv.db
. This database is created automatically when running the scripts and maintains the state throughout the entire scraping process. The database persists between runs, allowing for interrupted scraping jobs to be resumed.
python 01-sample-coordinates.py
This script samples coordinates throughout NYC boroughs and saves them to a SQLite database. It uses a grid of points with a specified spacing (by default 5 meters
, but can be adjusted).
python 02-search-and-update-metadata.py
This script searches for panoramas near the sampled coordinates and updates the metadata in the database. It uses a multi-threaded approach to efficiently gather panorama metadata.
python 03-search-date-and-copyright.py
At the search step, we only get the panorama id, lat, lon, and heading. We need to get the date and copyright information to get a complete panorama.
python 04-check-progress.py
While the search and update steps are running, you can use this script to check the progress.
This project is licensed under the MIT License. See the LICENSE file for details.
This project uses the following open-source packages:
- geopandas (BSD-3-Clause License)
- python-dotenv (BSD-3-Clause License)
- streetview
For full license texts of dependencies, please see their respective repositories.
The NYC boroughs geojson data in geojson/Borough Boundaries.geojson
is from NYC Open Data.