Skip to content

Backend: Implementation

Daniel Tischner edited this page Jul 18, 2018 · 11 revisions
Table of Contents

Overview

The backend is organized in the following way:

  • Main - Entry point of the program, forwards to Application.
  • Application - The backend itself, initializes and starts the whole application
  • commands - Representing and parsing command line arguments
    • CommandParser - Parses command line arguments into CommandDate objects
    • CommandDate - The model which represents command line arguments
  • config - Read, store and provide data of configuration files
    • ConfigLoader - Loads configuration files into the ConfigStore
    • ConfigStore - Provider of configuration data
  • db - Interaction with a database that stores application relevant data
    • ExternalDatabase - External database to use if configuration is set accordingly
    • MemoryDatabase - Internal fall-back database to use if configuration is set accordingly
    • DatabaseUtil - Contains all SQL queries which ExternalDatabase uses
    • OsmDatabaseHandler - Fills the database with OSM data
  • parsing - Parsing external data-sets into the internal model
    • DataParser - Parses all data-sets and forwards entities to registered handler
    • osm
      • OsmParser - Parses OSM files into entities
      • OsmReducer - Reduces OSM files using a filter, is used by the reduce command line argument
    • gtfs
      • GtfsParser - Parses GTFS files into entities
  • routing - Answering routing queries, computation of shortest paths
    • algorithms - Algorithms used for answering routing queries
    • model
      • graph
        • IGraph - Model of the graph, implementations like RoadGraph, TransitGraph, LinkGraph
        • INode - Model of nodes, implementations like RoadNode, TransitNode
        • IEdge - Model of egdes, implementations like RoadEdge, TransitEdge, LinkEdge
      • timetable
        • Timetable - Specialized model for representing transit data. Consisting of stops, trips, connections and footpaths
    • parsing
      • osm
        • OsmRoadBuilder - Creates instances of RoadNode and RoadEdge out of OSM entities
        • OsmRoadHandler - Fills a road graph with OSM data
      • gtfs
        • GtfsConnectionBuilder - Creates instances of TransitNode and TransitEdge out of GTFS entities
        • GtfsRealisticTimeExpandedHandler - Fills a transit graph with GTFS data for a realistic time expanded model
        • GtfsTimetableHandler - Fills a timetable transit model with GTFS data
    • server
      • RoutingServer - Server which offers the REST API for answering routing requests
      • model
        • RoutingRequest - POJO representing the request, directly constructed out of the JSON request
        • RoutingResponse - POJO representing the response, directly parsed into a JSON response
  • searching - Answering search queries, computation of inverted indices and search algorithms
    • name
      • model
        • NodeNameSet - Data-structure to query name searches on
      • server
        • NameSearchServer - Server which offers the REST API for answering name search requests
        • model
          • NameSearchRequest - POJO representing the request, directly constructed out of the JSON request
          • NameSearchResponse - POJO representing the response, directly parsed into a JSON response
    • nearest
      • server
        • NearestSearchServer - Server which offers the REST API for answering nearest search requests
        • model
          • NearestSearchRequest - POJO representing the request, directly constructed out of the JSON request
          • NearestSearchResponse - POJO representing the response, directly parsed into a JSON response
  • util - Utility methods not necessarily related to the application itself
    • CleanUtil - Cleans the database, caches and serialized data, used by the clean command line argument
    • http - Low level HTTP communication
    • collections - Various specialized collections used to optimize the application

Algorithms

For answering routing requests the application offers the following algorithms:

  • metrics
    • AsTheCrowFliesMetric - As-the-crow-flies metric for ISpatial objects, see As the crow flies
    • landmark
      • LandmarkMetric - Approximates distance by comparing shortest paths from objects to landmarks, see Computing the shortest path: A search meets graph theory
      • GreedyFarthestLandmarks - Landmark provider which greedily selects landmarks that are farthest away from each other
      • RandomLandmarks - Landmark provider which selects landmarks randomly
  • scc
  • nearestneighbor
    • CoverTree - Implementation of a Cover-Tree for computing nearest neighbor nodes, see Cover tree
  • shortestpath
    • dijkstra
      • Dijkstra - Implementation of Dijkstra's algorithm for computing shortest paths, see Dijkstra's algorithm
      • modules
        • ModuleDijkstra - Subclass of Dijkstra which allows modification of the algorithm by using various modules
        • A-StarModule - Implementation of the A-Star algorithm as module, a variant of Dijkstra's algorithm which uses a heuristic for approximation, see A* search algorithm
        • TransitModule - A module which respects time information for LinkGraphs linking road to transit nodes
        • MultiModalModule - A module which respects transportation mode restrictions when traveling along edges
    • connectionscan
      • ConnectionScan - Implementation of the Connection Scan algorithm for computing shortest paths on timetable data, see arxiv.org/abs/1703.05997
    • hybridmodel
      • HybridRoadTimetable - Algorithm that combines an algorithm for a road network with an algorithm for a transit timetable. Can for example be used with a Dijkstra instance together with ConnectionScan.

For answering name search requests the application uses the LexiSearch API. The main involved classes are:

  • QGramProvider - Splits a word into n-grams, see n-gram
  • NodeNameSet - Data-set to query on, optimized by using an inverted index which links n-grams to containing words, see Inverted index
  • FuzzyPrefixQuery - Searches names by using a fuzzy prefix query on the given data-set, see Approximate string matching and Edit distance

Control flow

The program starts with the Main#main(args) method which creates an instance of Application and calls initialize() on it.

Initialization

The method Application#initialize() configures the logger and loads configuration settings. After that initializeApi() is called if the command is start.

It creates a database, either ExternalDatabase or MemoryDatabase, depending on the configuration. Then a graph instance is created, either an empty one or it is deserialized from a cache, depending on the configuration.

Then a DataParser is created and handler like OsmDatabaseHandler and OsmRoadHandler are created and registered to the parser. The parser runs and parses all data-sets. Entities are forwarded to the handler which fill the database and the model with the corresponding data.

The model and database are now ready to go. If desired, according to the configuration, the model is serialized to speedup future calls.

Now the routing, name and nearest search server are initialized using initializeRouting(), initializeNameSearch() and initializeNearestSearch(). This includes initialization of algorithms and data-structures. In particular, it includes pre-computation for routing queries which may take a while depending on the model size.

Start

After initialization has finished the main method calls Application#start which starts the corresponding action:

  • start - Calls start() on the routing and name search server
  • reduce - Calls Application#startReducer
  • clean - Calls CleanUtil#clean

When the server are started they create a new thread which listens on the configured port for requests. The APIs are now ready for usage and will answer requests.

Shutdown

Shutting down the API is done using Application#shutdown(). It passes the shutdown request to all modules of the application. In particular, it calls shutdown() on both server.

The Application#initialize method registers Application#shutdown() as shutdown-hook to the JVM. Thus, destroying the JVM (for example by passing Control-C) will correctly trigger the shutdown() method.

Parsing

The class DataParser is responsible for determining the data-sets to parse. Handler can be registered to it and it will then forward parsed entities to them.

OSM

The DataParser uses the class OsmParser for parsing OSM files. It parses data directly into OSM entities and forwards them to the handler.

The OsmDatabaseHandler collects all nodes, ways and relations and forwards them to the database by IRoutingDatabase#offerOsmEntities. The database will then insert all relevant data it did not already contain.

The OsmRoadHandler collects only ways. Therefore, it uses a filter to only collect relevant ways. From the remaining ways it collects all included nodes and adds them and the corresponding edges to the graph. However, at this point the handler does not know spatial data of nodes. Therefore, the handler queues in requests to the database which are then later executed in order to fill spatial data for all nodes. As the graph does not use OSM IDs for nodes, the handler pushes the mapping of the internal to OSM IDs to the database.

GTFS

The DataParser uses the class GtfsParser for parsing GTFS files. It parses data directly into GTFS entities and forwards them to the handler.

The GtfsRealisticTimeExpandedHandler collects entities and constructs a realistic time expanded graph out of them.

The GtfsTimetableHandler collects entities and fills a timetable transit model with them.

Server

The routing, name and nearest search server are structured fairly simple. When calling start() they create a new thread which listens on the configured port for requests. For client handling they use a ExecutorService which offers a cached thread pool. Thus, every request is executed in an independent thread.

For this purpose a ClientHandler is used. It parses the HTTP request and does the HTTP communication and general request validation. If a JSON request is valid it parses it into the corresponding request POJO (RoutingRequest, NameSearchRequest or NearestSearchRequest). Then a RequestHandler is created and handleRequest is called on it.

The RequestHandler computes the response, for example by using shortest path or search algorithms. It builds the response POJO (RoutingResponse, NameSearchResponse or NearestSearchResponse) and sends it to the client.