Repository for the code used for download and processing of the NUS Global Streetscapes dataset, developed by the Urban Analytics Lab (UAL) at the National University of Singapore (NUS).
You can read more about this project on its website too. It includes an overview of the project together with the background, paper, examples, FAQ, etc.
The journal paper can be found here and the dataset is hosted on Hugging Face. For users who have no access to Hugging Face, the dataset is also available on Baidu Cloud Disk (code: 98tr).
This repository also contains a detailed Wiki with tutorials.
Global Streetscapes is an open dataset made up of 10 million Street View Images (SVIs) spanning 688 cities from 212 countries and regions, crowdsourced from Mapillary and KartaView. The map below illustrates the geographical coverage of the dataset.
Apart from its original metadata, each image has been enriched with a wide range of geospatial, temporal, contextual, semantic, and perceptual information adding up to 346 unique features, as shown in the illustration below.
The plots below illustrate the class or value distribution among the 10 million images for (A) continents covered, (B) settlement typology (degree of urbanisation), (C) OSM road type, (D) camera projection type, (E) season, (F) hour of the day, (G) transportation mode, and (H) perception scores.
To install requirements for CV (computer vision) related tasks (i.e. `code/model_training`):
Install Python 3.10.14
pip install -r requirements-cv-linux.txt
Note that some packages might require a Linux system.
To install requirements for non-CV related tasks (i.e. `code/raw_download`, `code/download_imgs`, `code/enrichment`):
Install Python 3.10.1
pip install -r requirements-non_cv.txt
Please visit our data repository to download the dataset.
`info.csv` outlines the meaning of each variable in this dataset.
We recommend you download the Anaconda Python Distribution and use Jupyter to get an understanding of the data.
Example notebooks are found in `notebooks/` and figures and plots in `imgs/`.
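For a first look at one of the tables outside Jupyter, the standard library is enough. The sketch below previews the header and first rows of any CSV file; the in-memory sample and its column names (`field`, `description`) are invented stand-ins, not the actual layout of `info.csv`.

```python
import csv
import io
from itertools import islice

def preview_csv(handle, n_rows=5):
    """Return the column names and the first n_rows rows of an open CSV file."""
    reader = csv.DictReader(handle)
    rows = list(islice(reader, n_rows))
    return reader.fieldnames, rows

# Stand-in for a real file; these columns and values are hypothetical.
sample = io.StringIO(
    "field,description\n"
    "uuid,hypothetical image identifier\n"
    "lat,hypothetical latitude\n"
)
columns, rows = preview_csv(sample, n_rows=1)
print(columns)
print(rows)
```

To use it on a downloaded table, pass a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` points to wherever you saved the file.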
Our data repository hosts only the tabular data (`.csv`) due to resource constraints.
If you wish to download the imagery data (`.jpeg`) of Global Streetscapes, we recommend following the instructions on this Wiki.
The detailed documentation on how this dataset was created and enriched can be found in this repo's Wiki and in the original publication.
Interested users can adapt the scripts to download and enrich new data as well.
The charts below show the class distribution for each of the eight contextual attributes that we have manually labelled for a subset of the dataset, and some example images for each class. We used this manually labelled subset for training computer vision models that were used to label the remaining data.
The Model training wiki page elaborates on the steps to train and run the models.
The following attributes were manually labelled; the model chosen for each attribute is MaxViT.
Attribute | Data type | # classes | Values |
---|---|---|---|
Platform | String | 6 | driving/walking/cycling surface, railway, fields, tunnel |
Weather | String | 5 | clear, cloudy, rainy, snowy, foggy |
View direction | String | 2 | front/back, side |
Lighting condition | String | 3 | day, night, dusk/dawn |
Panoramic status | Boolean | 2 | true, false |
Quality | String | 3 | good, slightly poor, very poor |
Glare | Boolean | 2 | yes, no |
Reflection | Boolean | 2 | yes, no |
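
The labels in the table above lend themselves to filtering images before analysis. The following is a minimal sketch using only the standard library; the column names (`uuid`, `quality`, `lighting_condition`, `glare`) and the sample rows are hypothetical and should be checked against `info.csv` before use.

```python
import csv
import io

# Hypothetical column names and rows; check info.csv for the real ones.
sample = io.StringIO(
    "uuid,quality,lighting_condition,glare\n"
    "a1,good,day,no\n"
    "b2,very poor,night,no\n"
    "c3,good,day,yes\n"
)

def keep(row):
    """Keep daytime images of good quality without glare."""
    return (
        row["quality"] == "good"
        and row["lighting_condition"] == "day"
        and row["glare"] == "no"
    )

selected = [row["uuid"] for row in csv.DictReader(sample) if keep(row)]
print(selected)
```
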
Model performance:
Attribute | Model | Accuracy | Precision | Recall | F1 Score |
---|---|---|---|---|---|
Panoramic status | MaxViT | 0.999 | 0.995 | 0.995 | 0.995 |
Lighting condition | MaxViT | 0.962 | 0.916 | 0.897 | 0.905 |
Glare | MaxViT | 0.941 | 0.602 | 0.698 | 0.631 |
View direction | MaxViT | 0.874 | 0.735 | 0.912 | 0.780 |
Quality | MaxViT | 0.799 | 0.398 | 0.515 | 0.410 |
Reflection | MaxViT | 0.787 | 0.745 | 0.788 | 0.757 |
Weather | MaxViT | 0.755 | 0.664 | 0.608 | 0.599 |
Platform | MaxViT | 0.683 | 0.574 | 0.582 | 0.567 |
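
In most rows above, the F1 score is not the harmonic mean of the listed precision and recall (for Glare, that would be about 0.646 rather than 0.631). This is what one would expect if each metric is computed per class and then averaged. The averaging scheme is not stated here, so the macro averaging in the sketch below is an assumption, and the labels are toy values rather than dataset results.

```python
from collections import Counter

def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 for two label lists."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative

    def ratio(num, den):
        return num / den if den else 0.0

    labels = sorted(set(y_true) | set(y_pred))
    precisions = [ratio(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [ratio(tp[c], tp[c] + fn[c]) for c in labels]
    # Per-class F1 first, then the unweighted mean over classes.
    f1s = [ratio(2 * p * r, p + r) for p, r in zip(precisions, recalls)]
    n = len(labels)
    return {
        "accuracy": ratio(sum(tp.values()), len(y_true)),
        "precision": ratio(sum(precisions), n),
        "recall": ratio(sum(recalls), n),
        "f1": ratio(sum(f1s), n),
    }

# Toy, imbalanced two-class labels (invented, not taken from the dataset).
y_true = ["no"] * 8 + ["yes"] * 2
y_pred = ["no"] * 6 + ["yes"] * 2 + ["yes"] * 2
scores = macro_metrics(y_true, y_pred)
harmonic = 2 * scores["precision"] * scores["recall"] / (
    scores["precision"] + scores["recall"]
)
print(scores, round(harmonic, 3))
```

On this toy input the macro F1 (about 0.762) is lower than the harmonic mean of the macro precision and recall (about 0.808), the same pattern seen in most rows of the table.
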
For the following attributes, pre-trained models were run directly to infer the labels.
Attribute | Data type | Values | Model used |
---|---|---|---|
Instance segmentation | Integer | Pixel count, instance count | Mask2Former |
Scene recognition | String | Place type | VGG16 |
Human perception | Float | Score from 0 to 10 for each category (safety, lively, beautiful, wealthy, boring, and depressing) | Visual transformer |
If you use Global Streetscapes, please cite the paper:
Hou Y, Quintana M, Khomiakov M, Yap W, Ouyang J, Ito K, Wang Z, Zhao T, Biljecki F (2024): Global Streetscapes — A comprehensive dataset of 10 million street-level images across 688 cities for urban science and analytics. ISPRS Journal of Photogrammetry and Remote Sensing 215: 216-238. doi:10.1016/j.isprsjprs.2024.06.023
BibTeX:
@article{2024_global_streetscapes,
author = {Hou, Yujun and Quintana, Matias and Khomiakov, Maxim and Yap, Winston and Ouyang, Jiani and Ito, Koichi and Wang, Zeyu and Zhao, Tianhong and Biljecki, Filip},
doi = {10.1016/j.isprsjprs.2024.06.023},
journal = {ISPRS Journal of Photogrammetry and Remote Sensing},
pages = {216-238},
title = {Global Streetscapes -- A comprehensive dataset of 10 million street-level images across 688 cities for urban science and analytics},
volume = {215},
year = {2024}
}
Besides the published paper, a free version (postprint / author-accepted manuscript) can be downloaded here.