The Python environment is managed using pipenv.
You'll need a working version of pip installed and on the path. You should have this if you have Python installed.
Install pipenv using pip install pipenv.
Next, navigate to this repository in the terminal.
Run pipenv sync to install all packages (specified in Pipfile.lock).
You can then use pipenv shell to activate the virtual environment, and install additional packages with pipenv install <package_name>.
If you're using VS Code, you can open a Jupyter notebook and choose the kernel associated with the virtual environment you created. This should allow you to run the notebooks with the required packages.
gtfrt2gtfs_interpolation - Current method for matching buses to the timetable. Makes use of current_stop_sequence and current_status in GTFS-RT, along with interpolation to fill gaps where we don't track buses.
gtfsrt_to_csv.ipynb - Code for converting and cleaning raw GTFS-RT data into CSV format. Useful as a pre-processing step.
demo.ipynb - Older method for matching buses to the timetable using distance and bearing between stops and buses.
intersection.ipynb - Script to calculate GeoJSON intersections for equal-time isochrones over multiple days' data.
utils.py - General utility functions to keep main notebooks tidy.
gtfs-realtime-utils.py - GTFS Realtime-specific utility functions to keep notebooks tidy.
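To illustrate the idea of interpolating over gaps in tracking, here is a minimal sketch. It is not the code from the notebook: the function name fill_gaps, the input layout, and the choice to interpolate in proportion to scheduled time are all assumptions made for this example.

```python
def fill_gaps(timetable, observed):
    """Estimate missing arrival times between two observed stops.

    timetable: scheduled arrival timestamps, one per stop, in stop order.
    observed:  realtime arrival timestamps in the same order, or None
               where the bus was not tracked.
    Returns a list of (timestamp, interpolated) pairs, where interpolated
    is 0 for observed values and 1 for estimated or missing ones.
    """
    result = [(t, 0) if t is not None else (None, 1) for t in observed]
    known = [i for i, t in enumerate(observed) if t is not None]

    # Only gaps with an observation on both sides are filled;
    # leading and trailing gaps stay as None.
    for left, right in zip(known, known[1:]):
        span = timetable[right] - timetable[left]
        for i in range(left + 1, right):
            # Fraction of the scheduled journey between the two observed stops.
            frac = (timetable[i] - timetable[left]) / span if span else 0.0
            estimate = observed[left] + frac * (observed[right] - observed[left])
            result[i] = (round(estimate), 1)
    return result


# Made-up values: the bus is seen at the first and last stop only.
print(fill_gaps([0, 100, 200, 300], [10, None, None, 340]))
```

Scaling by scheduled time rather than by stop count keeps estimates sensible when stops are unevenly spaced in the timetable.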
A notebook to calculate populations inside isochrones for England.
Functions are in utils.py. The main script is calculator.ipynb.
Realtime bus data that powers our bus tracking tool.
process.ipynb is the main script for producing the data.
Data is organised by region (using NUTS codes).
Each data file is named by a unique ID, which is either scraped from bustimes.org or, failing that, built as route_short_name-agency_noc.json, e.g. X84-FLDS.json.
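The naming rule above can be sketched as follows. The function name data_file_name and the bustimes_id parameter are hypothetical, and the sketch assumes the scraped ID is used directly as the file stem.

```python
def data_file_name(route_short_name, agency_noc, bustimes_id=None):
    """Return the JSON file name for a route.

    Prefers the ID scraped from bustimes.org (assumed to be usable as a
    file stem); falls back to route_short_name-agency_noc otherwise.
    """
    if bustimes_id:
        return f"{bustimes_id}.json"
    return f"{route_short_name}-{agency_noc}.json"


print(data_file_name("X84", "FLDS"))  # X84-FLDS.json
```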
- id: a unique ID of the bus route.
- name: human-readable equivalent of the ID.
- agency_name: name of the company operating the bus.
- agency_noc: national operator code.
- bustimesorg: boolean. Whether or not the meta info was matched with bustimes.org.
Keys are shape_id from the GTFS timetable.
Values are an array of arrays containing lon/lat pairs, e.g. [[lon_1, lat_1], ..., [lon_n, lat_n]].
There can be more than one shape for each route.
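Because the pairs are stored longitude first, they match the coordinate order GeoJSON expects. A minimal sketch of turning the shapes dictionary into a GeoJSON FeatureCollection follows; the function name and the sample coordinates are made up for illustration.

```python
import json


def shapes_to_geojson(shapes):
    """Convert {shape_id: [[lon, lat], ...]} into a GeoJSON FeatureCollection."""
    features = []
    for shape_id, coords in shapes.items():
        features.append({
            "type": "Feature",
            "properties": {"shape_id": shape_id},
            # GeoJSON positions are [longitude, latitude], as in the data files.
            "geometry": {"type": "LineString", "coordinates": coords},
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical shapes for a route with two shapes.
example_shapes = {
    "shape_out": [[-1.55, 53.80], [-1.54, 53.81], [-1.53, 53.82]],
    "shape_back": [[-1.53, 53.82], [-1.55, 53.80]],
}
print(json.dumps(shapes_to_geojson(example_shapes), indent=2))
```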
Dictionary of stops on the route.
Keys are the stop_id of each stop on the route.
Values are:
- name: human-readable name of the stop
- lon: longitude
- lat: latitude
- bearing: direction of travel on an 8-point compass.
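As an illustration of what an 8-point compass bearing means, the sketch below maps a bearing in degrees to one of eight points. The labels (N, NE, ...) and the function name are assumptions; the exact values stored in the data files may be encoded differently.

```python
# Assumed labels, clockwise from north; each covers a 45-degree sector.
COMPASS_POINTS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def to_compass_point(bearing_degrees):
    """Map a bearing in degrees (clockwise from north) to an 8-point label."""
    # Shift by half a sector so that, e.g., 350 and 10 degrees both map to N.
    index = int((bearing_degrees % 360 + 22.5) // 45) % 8
    return COMPASS_POINTS[index]


for degrees in (0, 45, 180, 350):
    print(degrees, to_compass_point(degrees))
```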
trips: the stop_id and arrival times of buses for stops on the route, according to both the timetable and the realtime data.
trips is an array. Each item of the array is itself an array of stop times. Each stop time is an array of the form [stop_id, realtime, timetable, interpolated]:
- stop_id is a unique identifier of the stop.
- realtime is a Unix timestamp of when the bus arrived at the stop.
- timetable is a timestamp of when the bus was timetabled to arrive at the stop.
- interpolated is either 1 or 0, where 0 means the timestamp was observed in the live data and 1 means it is an interpolated value.
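A short sketch of reading one trip from the trips array and computing the delay at each stop. It assumes that timetable, like realtime, is a Unix timestamp in seconds; the stop IDs and times below are made up, and stop_delays is a hypothetical helper.

```python
def stop_delays(trip, include_interpolated=True):
    """Return (stop_id, delay_seconds) for each stop time in a trip.

    A positive delay means the bus arrived later than timetabled.
    Set include_interpolated=False to keep only observed arrivals.
    """
    delays = []
    for stop_id, realtime, timetable, interpolated in trip:
        if interpolated == 1 and not include_interpolated:
            continue
        delays.append((stop_id, realtime - timetable))
    return delays


# One trip: [stop_id, realtime, timetable, interpolated] per stop.
example_trip = [
    ["stop_a", 1700000060, 1700000000, 0],
    ["stop_b", 1700000300, 1700000240, 1],
    ["stop_c", 1700000500, 1700000480, 0],
]
print(stop_delays(example_trip))
print(stop_delays(example_trip, include_interpolated=False))
```

Filtering on the interpolated flag is useful when only directly observed arrivals should count towards punctuality statistics.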
DEPRECATED: Exploring use of R5R to create travel time isochrones.
extract.py - Extract the gtfsrt.bin file from the gtfsrt.zip that comes from BODS, for all the live location data we downloaded.
rename.py - Rename file endings to .zip.
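The extraction step can be approximated with the standard library's zipfile module. This is a sketch, not the contents of extract.py: the function name is hypothetical, and it assumes the archive member is called gtfsrt.bin as described above.

```python
import io
import zipfile


def read_gtfsrt_bin(zip_source, member="gtfsrt.bin"):
    """Return the raw bytes of `member` from a zip archive.

    zip_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        return archive.read(member)


# Demonstration with an in-memory archive holding placeholder bytes.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("gtfsrt.bin", b"placeholder-bytes")
buffer.seek(0)
print(len(read_gtfsrt_bin(buffer)))
```

With a file on disk, the same function can be called with a path, e.g. read_gtfsrt_bin("gtfsrt.zip").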
Documentation on the GTFS and GTFS-RT formats can be read at https://gtfs.org/documentation/overview/. It takes a bit of time to get familiar with the GTFS format, but the documentation is helpful and worth referring to.
You can validate GTFS timetables at https://gtfs-validator.mobilitydata.org/.