nathanbiller/whatthestreet-datawrangling

 
 


What the Street!?

Part 1: Data Wrangling

[Image: screenshot of What the Street!?]

What the Street!? grew out of the question “How do new and old mobility concepts change our cities?”. The question was raised by Michael Szell and Stephan Bogner during their residency at moovel lab. With the support of the lab team, they set out to wrangle data from cities around the world to develop and design this unique mobility space report.

What the Street!? was built from open-source software and resources. Thanks to the OpenStreetMap contributors and many other pieces, we were able to put together the puzzle of urban mobility space seen above.

Implemented project URL: https://whatthestreet.com/

Read more about the technical details behind What the Street!? on our blog https://move-lab.space/blog/about-what-the-street and in the scientific paper https://www.cogitatiopress.com/urbanplanning/article/view/1209/.

Demo Video

Codebase

The complete codebase consists of two independent parts:

  1. Data Wrangling
  2. Front & Backend

This is part 1. It wrangles the OpenStreetMap data and creates the SVG files underlying the visuals of What the Street!?.

You can find part 2 here: https://github.com/mszell/whatthestreet

Note about code quality

The code in this repository was produced for the specific use case of wrangling data for What the Street!?. Since this is not live production code but code to pre-process data, we did not strictly apply software development best practices. The code grew organically alongside the various deadlines and requirements that came from front & backend development along the way. Nevertheless, we commented and documented it as well as possible to make the process reproducible.

Initial setup

  • Get NodeJS
  • Get osmconvert.c. In the source, set border__edge_M to 1300004 so it can handle larger poly files. Compile: gcc osmconvert.c -lz -O3 -o osmconvert
  • Get osmfilter
  • Get MongoDB
  • Get QGIS, install the osmpoly_export plugin
  • Get Anaconda (Python 3.*)
  • Get SVGnest-batch
  • Extra Python libraries to install: pymongo, shapely, haversine, osmnx (install rtree first!), pyprind
  • Get mongosm from Stephan Bogner's fork, and in options.js of mongosm/lib, set populateGeometry: false. To get dependencies, run: npm install

Adding a city

Follow the steps in this order. Unless noted otherwise, run the commands in a terminal. The examples use Berlin.

Process OSM data

1. Create geo files

  1. Get a shapefile of the city boundary. If you cannot find one, use Overpass Turbo with a query like the following. Mind the correct admin_level, and replace the bounding box (south, west, north, east) with one that covers your city:
    [out:json]; ( node[boundary="administrative"][admin_level=6](48.46, 8.79, 48.93, 9.50); way[boundary="administrative"][admin_level=6](48.46, 8.79, 48.93, 9.50); ) ;(._;>;); out skel;
  2. Load the file into QGIS, save a duplicate with the correct CRS (EPSG:4326), and export it as berlin_boundary.poly via Vector > Export OSM Poly
  3. Download and unpack the osm.bz2 file that contains the city from Geofabrik, e.g. berlin-latest
  4. Crop the OSM file to the boundary using osmconvert: ./osmconvert berlin-latest.osm -B=berlin_boundary.poly --drop-broken-refs -o=berlin_cropped.osm
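The boundary query in step 1 repeats the same bounding box and admin_level in two places, so it is easy to change one and forget the other. The following is a minimal sketch of a helper that fills in both from one set of arguments. The function name build_boundary_query is hypothetical and not part of this repository; the output follows the query text shown above.

```python
def build_boundary_query(south, west, north, east, admin_level):
    """Return an Overpass QL query for administrative boundaries.

    Hypothetical helper (not part of this repository). The bounding
    box order is south, west, north, east, as in the query above.
    """
    bbox = f"({south}, {west}, {north}, {east})"
    selector = f'[boundary="administrative"][admin_level={admin_level}]'
    return (
        "[out:json]; "
        f"( node{selector}{bbox}; way{selector}{bbox}; ); "
        "(._;>;); out skel;"
    )


if __name__ == "__main__":
    # Same bounding box and admin_level as the example query above.
    print(build_boundary_query(48.46, 8.79, 48.93, 9.50, 6))
```

Paste the printed query into Overpass Turbo. The right admin_level differs between countries and cities, so check it against OpenStreetMap before exporting.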

2. Load into mongoDB

  1. Load the OSM data into MongoDB via mongosm: node mongosm.js --max_old_space_size=8192 -db berlin_raw -f berlin_cropped.osm (before running the script for the first time, run npm install to install the dependencies)
  2. Set cityname parameter (in Jupyter notebook) and execute 01_generategeometries.ipynb

3. Street names

Note: osmfilter seems buggy and does not remove everything we ask it to drop. That is why the remaining errors have to be removed manually at the end.

  1. Use osmfilter to create a temporary OSM file containing only the relevant streets (to derive names from): ./osmfilter berlin_cropped.osm --keep="highway=residential =primary =secondary =tertiary =unclassified" --drop="public_transport=stop_position public_transport=platform public_transport man_made boundary leisure amenity highway=traffic_signals =motorway_junction =bus_stop railway building entrance=yes barrier=gate barrier shop" > temp.osm
  2. Extract the names and export them as CSV: ./osmconvert temp.osm --all-to-nodes --csv="name" > temp.csv
  3. Sort alphabetically and discard duplicates: sort -u temp.csv > citydata/berlin_streetnames.txt
  4. Check manually and delete obvious errors.
  5. Sort by string length: cat citydata/berlin_streetnames.txt | awk '{ print length, $0 }' | sort -n -s | cut -d" " -f2- > citydata/berlin_streetnames_bylength.txt
  6. Check again manually for obvious errors.
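The two sorting steps above (sort -u, then the awk/sort/cut pipeline) can also be sketched in Python, which may be handy on systems without these tools. This is an illustrative equivalent, not code from this repository: the function names are hypothetical, blank lines are dropped (sort -u would keep one), and ordering is by Unicode code point, so it can differ from a locale-aware sort.

```python
def unique_sorted(lines):
    """Strip whitespace, drop blank lines and duplicates, sort alphabetically."""
    return sorted({line.strip() for line in lines if line.strip()})


def by_length(names):
    """Sort by string length; sorted() is stable, like `sort -n -s`."""
    return sorted(names, key=len)


if __name__ == "__main__":
    # File names follow the Berlin example used in the steps above.
    with open("temp.csv", encoding="utf-8") as src:
        names = unique_sorted(src)
    with open("citydata/berlin_streetnames.txt", "w", encoding="utf-8") as out:
        out.write("\n".join(names) + "\n")
    with open("citydata/berlin_streetnames_bylength.txt", "w", encoding="utf-8") as out:
        out.write("\n".join(by_length(names)) + "\n")
```

The manual checks are still needed afterwards. Sorting by length mainly helps because very short or very long entries are often the erroneous ones.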

4. Generate streets

  1. Set cityname parameter (in Jupyter notebook) and execute 02_unwindbike.ipynb
  2. Set cityname parameter (in Jupyter notebook) and execute 03_unwindrail.ipynb
  3. Set cityname parameter (in Jupyter notebook) and execute 04_unwindstreet.ipynb

Generate parking spots

1. Create SVGs

  1. Serve SVGnest-batch locally (e.g. python3 -m http.server, or python -m SimpleHTTPServer 8000 with Python 2)
  2. Set cityname parameter and execute 05_parkingtosvgbike.ipynb step by step.

    • This involves executing SVGnest-batch in between!
    • If SVGnest fails → execute 06_parkingtosvgbikealt.ipynb instead!
    • In the end, an SVG (all.svg) like the following is created:
      [Image: SVG of bike parking spots]
  3. Set cityname parameter and execute 07_parkingtosvgcar.ipynb step by step.

    • This involves executing SVGnest-batch in between!
    • In the end, an SVG (all.svg) like the following is created (shown rotated):
      [Image: SVG of car parking spots]

2. Add neighborhood information to SVG/mongoDB

Open 08_add_neighborhoods and run node index.js to get instructions

3. Add parking space size information to SVG

Open 09_add_parking_space_size and run node index.js to get instructions

Calculate area for streets/rails

This adds size information to the mongoDB

1. Calculate area

Open 10_calculate_area and run node index.js to get instructions

2. Add area

Open 11_add_area and run node index.js to get instructions

Set up landmark

0. Finding a landmark

Search for a suitable landmark in the city (around the size of Central Park in New York or Mt. Tabor in Portland).

1. Tracing

The outline can be traced via geojson.io, but any tool that produces a GeoJSON file should work. Make sure you trace the outline as a polygon so that its area can be calculated.

2. Area size information

  1. Import geojson to geojson.io
  2. Click on shape
  3. Select info
  4. Extract m² information and update citymetadata.json
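To cross-check the m² value read from geojson.io, the area of the traced polygon can be estimated directly from its coordinates. The sketch below treats the Earth as a sphere with the WGS84 equatorial radius and sums a standard line integral over the ring edges. It is an approximation under those assumptions, so small deviations from the geojson.io value are to be expected. The function names are hypothetical and not part of this repository.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 equatorial radius, used as sphere radius


def ring_area_m2(ring):
    """Approximate area of a closed ring of [lon, lat] positions in degrees."""
    total = 0.0
    # GeoJSON rings repeat the first position at the end, so consecutive
    # pairs cover every edge of the ring.
    for p1, p2 in zip(ring, ring[1:]):
        total += math.radians(p2[0] - p1[0]) * (
            2 + math.sin(math.radians(p1[1])) + math.sin(math.radians(p2[1]))
        )
    return abs(total) * EARTH_RADIUS_M ** 2 / 2


def polygon_area_m2(geometry):
    """Area of a GeoJSON Polygon geometry: outer ring minus any holes."""
    if geometry["type"] != "Polygon":
        raise ValueError("expected a GeoJSON Polygon geometry")
    outer, *holes = geometry["coordinates"]
    return ring_area_m2(outer) - sum(ring_area_m2(hole) for hole in holes)


if __name__ == "__main__":
    # Illustrative test shape: a 0.01° x 0.01° square at the equator.
    square = {
        "type": "Polygon",
        "coordinates": [[[0, 0], [0.01, 0], [0.01, 0.01], [0, 0.01], [0, 0]]],
    }
    print(round(polygon_area_m2(square)), "m²")
```

For a traced landmark, load the file with json.load and pass the geometry of the polygon feature to polygon_area_m2.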

3. Convert to SVG

  1. Run 15_landmarkReference in order to obtain a reference square (you only have to do this step once)
  2. Merge the geojson of the landmark together with the reference
  3. Install the plugin SimpleSVG for QGIS
  4. Open the geojson and from 'Tab', select save as svg
  5. Save both the SVG and the GeoJSON to GDrive

4. Edit in Sketch

  1. Import svgs
  2. Scale the import so that its reference square has the same width in pixels as the reference squares of the other landmarks
  3. Style like other Landmarks
  4. Simplify shape if necessary
  5. Flatten text
  6. Export

Generate street coils

Open 12_generate_coils and run node index.js to get instructions.

Note: Running this script will result in large file sizes.

Update citymetadata.json

  1. Use 13_get_information
  2. Use 16_getSvgHeights

Team

Concept and Coding

Direction

Benedikt Groß

Website Front & Backend Engineering

Thibault Durand

Website Implementation Assistant

Tobias Lauer

Visual Design

Anagrama

Extended Team

  • Raphael Reimann
  • Joey Lee
  • Daniel Schmid
  • Tilman Häuser

Server Setup and Migration

Florian Porada

City Data Wrangling Assistant

Johannes Wachs

Data Sources

OpenStreetMap, a free alternative to services like Google Maps. Please contribute if you notice poor data quality.

https://donate.openstreetmap.org/

Acknowledgements

About

What the Street for Grand Rapids, Michigan!?
