
A Python tool that systematically collects Google Street View panorama metadata across New York City boroughs, gathering location, date, and copyright information for each sampled coordinate.


yz3440/nyc-gsv-collector


NYC Street View Data Collector

A Python-based tool for collecting and analyzing Google Street View panorama data across New York City boroughs. You can modify the script to collect data from other areas.

Overview

This project samples coordinates throughout NYC boroughs and collects associated Google Street View panorama data. It uses a multi-threaded approach to efficiently gather panorama metadata including location, date, and copyright information.

Features

  • Generates coordinate grid points within NYC borough boundaries
  • Searches for Street View panoramas near sampled coordinates
  • Collects panorama metadata (date, copyright, location, etc.)
  • Multi-threaded processing for improved performance
  • Progress tracking and statistics

Prerequisites

  • Python 3.x
  • Google Maps API key

Install dependencies

pip install -r requirements.txt

Data Storage

All data is stored in a local SQLite database named gsv.db. The database is created automatically the first time a script runs and holds the state of the entire scraping process. Because it persists between runs, an interrupted scraping job can be resumed where it left off.
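The resume behavior can be illustrated with a minimal sketch. The table name `sample_coords`, its columns, and the helper functions below are assumptions made for illustration; the actual schema in gsv.db may differ.

```python
import sqlite3


def open_db(path="gsv.db"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    # Hypothetical schema: one row per sampled coordinate plus a "searched" flag.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS sample_coords (
               id INTEGER PRIMARY KEY,
               lat REAL NOT NULL,
               lon REAL NOT NULL,
               searched INTEGER NOT NULL DEFAULT 0,
               UNIQUE (lat, lon)
           )"""
    )
    conn.commit()
    return conn


def add_coords(conn, coords):
    """Insert (lat, lon) pairs; re-running skips rows that already exist."""
    conn.executemany(
        "INSERT OR IGNORE INTO sample_coords (lat, lon) VALUES (?, ?)", coords
    )
    conn.commit()


def pending(conn, limit=1000):
    """Return coordinates that have not been searched yet."""
    return conn.execute(
        "SELECT id, lat, lon FROM sample_coords WHERE searched = 0 "
        "ORDER BY id LIMIT ?",
        (limit,),
    ).fetchall()


def mark_searched(conn, coord_id):
    """Record that a coordinate was processed so a restart skips it."""
    conn.execute("UPDATE sample_coords SET searched = 1 WHERE id = ?", (coord_id,))
    conn.commit()
```

With this layout a restarted job only needs to call `pending()` again, since finished rows are already flagged in the file on disk.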

Usage

Sample Coordinates

python 01-sample-coordinates.py

This script samples coordinates throughout the NYC boroughs and saves them to the SQLite database. It generates a grid of points at a configurable spacing (5 meters by default).
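The idea of grid sampling inside a boundary can be sketched with the standard library alone. This is not the script's actual implementation: the function names, the flat meters-to-degrees conversion, and the single-ring ray-casting test are assumptions chosen to keep the example self-contained. Real borough geometries are multi-part polygons, so each part would need the same treatment.

```python
import math

# Rough meters per degree of latitude; an approximation, not a geodesic value.
METERS_PER_DEG_LAT = 111_320.0


def point_in_polygon(lon, lat, ring):
    """Ray-casting test against one ring of (lon, lat) vertices (GeoJSON order)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def sample_grid(ring, spacing_m=5.0):
    """Return (lat, lon) grid points spaced roughly spacing_m apart inside ring."""
    lons = [p[0] for p in ring]
    lats = [p[1] for p in ring]
    min_lon, max_lon = min(lons), max(lons)
    min_lat, max_lat = min(lats), max(lats)

    # Convert the metric spacing to degrees at the polygon's middle latitude.
    mid_lat = math.radians((min_lat + max_lat) / 2)
    dlat = spacing_m / METERS_PER_DEG_LAT
    dlon = spacing_m / (METERS_PER_DEG_LAT * math.cos(mid_lat))

    rows = int((max_lat - min_lat) / dlat) + 1
    cols = int((max_lon - min_lon) / dlon) + 1
    points = []
    for r in range(rows):
        lat = min_lat + r * dlat
        for c in range(cols):
            lon = min_lon + c * dlon
            if point_in_polygon(lon, lat, ring):
                points.append((lat, lon))
    return points
```

Halving the spacing roughly quadruples the number of points, so a 5 meter grid over a whole borough produces a very large table.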

Search Panoramas near Coordinates

python 02-search-and-update-metadata.py

This script searches for panoramas near each sampled coordinate and updates the metadata in the database. Searches run in multiple threads to speed up collection.
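A multi-threaded search loop of this kind might look like the sketch below. The `search_all` helper and the `search_fn` callback are hypothetical names. The callback stands in for whatever panorama lookup the script performs, so the example makes no network calls.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def search_all(coords, search_fn, max_workers=8):
    """Run search_fn(lat, lon) for every coordinate on a thread pool.

    Returns (results, errors), both keyed by the (lat, lon) tuple, so a single
    failed request does not abort the whole batch.
    """
    results = {}
    errors = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(search_fn, lat, lon): (lat, lon) for lat, lon in coords
        }
        for future in as_completed(futures):
            coord = futures[future]
            try:
                results[coord] = future.result()
            except Exception as exc:  # keep going; record the failure
                errors[coord] = exc
    return results, errors
```

Collecting results in the calling thread and writing them to SQLite from there avoids sharing one connection across workers, which `sqlite3` rejects by default.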

Update Panoramas with Date and Copyright

python 03-search-date-and-copyright.py

The search step returns only the panorama ID, latitude, longitude, and heading. This script fetches the date and copyright information needed to complete each panorama record.
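The README does not say which endpoint the script calls. As one possibility, the sketch below assumes the Street View Static API metadata endpoint, which accepts a `pano` ID and an API `key` and returns JSON containing `status`, `date`, `copyright`, and `location`. The helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint: the Street View Static API metadata service.
METADATA_URL = "https://maps.googleapis.com/maps/api/streetview/metadata"


def metadata_url(pano_id, api_key):
    """Build the request URL for one panorama ID."""
    return f"{METADATA_URL}?{urlencode({'pano': pano_id, 'key': api_key})}"


def parse_metadata(payload):
    """Extract the fields of interest from a decoded JSON response.

    Returns None when the status is not "OK" (for example "ZERO_RESULTS").
    The date may be missing for some panoramas, so it is read with .get().
    """
    if payload.get("status") != "OK":
        return None
    location = payload.get("location", {})
    return {
        "pano_id": payload.get("pano_id"),
        "date": payload.get("date"),
        "copyright": payload.get("copyright"),
        "lat": location.get("lat"),
        "lon": location.get("lng"),
    }
```

The response body could be fetched with any HTTP client and decoded with `json.loads` before being passed to `parse_metadata`.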

Check Progress

python 04-check-progress.py

While the search and update steps are running, you can use this script to check the progress.
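A progress check can be a single aggregate query against the database. The sketch below assumes a hypothetical table `sample_coords` with a 0/1 `searched` column; the real table and column names in gsv.db may differ.

```python
import sqlite3


def progress(conn):
    """Return total rows, searched rows, and the percentage completed."""
    total, done = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(searched), 0) FROM sample_coords"
    ).fetchone()
    percent = 100.0 * done / total if total else 0.0
    return {"total": total, "searched": done, "percent": round(percent, 1)}
```

While another script is writing, opening the file read-only with `sqlite3.connect("file:gsv.db?mode=ro", uri=True)` keeps the check from modifying anything.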

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

This project uses the following open-source packages:

For full license texts of dependencies, please see their respective repositories.

The NYC boroughs geojson data in geojson/Borough Boundaries.geojson is from NYC Open Data.
