Creating a River
Find a URL that contains useful or interesting data. It must be public (no authentication). This data should change over time. An HTTP request is sent to each URL, and the response body is passed to your parser.
Create a directory in /rivers and name it something unique. This is where all the rivers go.
Write a JavaScript file called parse.js that exports a function. The function parses the response body and extracts a stream of data over time. See an example parser for NYC Traffic data. The function looks like this:
module.exports = function(config, body, url, fieldCallback, propertyCallback) {
// 1. parse the body
// 2. call the callbacks with data
};
We will get to the callbacks in a minute...
Put your river's configuration in config.yml. Each URL in sources is requested at the interval specified, and the response body text is sent to your parser. You must provide a list of fields and a list of properties.
Fields are the keys to values within your data that change over time. For example, the fields for traffic paths might be speed and travelTime. These are temporal data labels, and your parser is expected to provide values for these fields every time it is called. Example fields from the nyc-traffic river config:
fields:
- Speed
- TravelTime
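Because the parser later passes field values as an array that corresponds to this list, one way to keep the two aligned is to build the array from the field names. The following is a minimal sketch, not part of River View: the helper name toFieldValues and the shape of the record object are hypothetical, and using null for a missing value is an assumption about what downstream code will accept.

```javascript
// Hypothetical helper (not part of River View): builds the ordered array of
// values for one data object from the field names listed in config.yml.
function toFieldValues(fields, record) {
    return fields.map(function(name) {
        // Assumption: a missing field becomes null so positions still line up.
        return Object.prototype.hasOwnProperty.call(record, name) ? record[name] : null;
    });
}

// Field names copied from the nyc-traffic example above.
var fields = ['Speed', 'TravelTime'];
var values = toFieldValues(fields, { TravelTime: 100, Speed: 23.1 });
console.log(values); // [ 23.1, 100 ]
```

Driving the array from the configured names means the order of keys in the parsed record does not matter.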
Properties are like metadata. They are treated as static, but they may change over time, and you can update them every time your parser is called. Example properties from the nyc-traffic river config:
properties:
- Borough
- linkName
- linkId
- linkPoints
- Owner
- Transcom_id
You push data into River View by calling the fieldCallback and propertyCallback callbacks in your parser.
fieldCallback(error, id, timestamp, fieldValues);
Where:
- error (Error) is any error that occurred while parsing the data that prevented completion (when specified, this should be the only argument given)
- id (string) is the unique identifier for the data object being updated with new data (example: an id to a traffic route)
- timestamp (integer) is the UNIX timestamp in seconds (NOT milliseconds!) for the data (MUST match the timezone string in the config)
- fieldValues (array) is the actual scalar data values corresponding to the fields defined in the config (example: for a traffic config containing the fields [Speed, TravelTime], the data should look something like [23.1, 100])
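As an illustration of the arguments described above, here is a minimal sketch of a parser that reports field values. The JSON layout of the body (a routes array with id, updatedMs, speed, and travelTime) is invented for this example and does not describe any real source. Timezone handling is not shown; the sketch assumes the source already supplies an epoch time in milliseconds.

```javascript
// Sketch of a parse.js that reports field values. The body layout
// ({ routes: [{ id, updatedMs, speed, travelTime }] }) is hypothetical.
function parse(config, body, url, fieldCallback, propertyCallback) {
    var payload;
    try {
        payload = JSON.parse(body);
    } catch (err) {
        // When parsing fails, the error is the only argument given.
        return fieldCallback(err);
    }
    (payload.routes || []).forEach(function(route) {
        // The source value is in milliseconds; the callback expects seconds.
        var timestamp = Math.floor(route.updatedMs / 1000);
        // Values follow the order of the config's fields: [Speed, TravelTime].
        fieldCallback(null, String(route.id), timestamp, [route.speed, route.travelTime]);
    });
}

module.exports = parse;
```

The callback is invoked once per data object, so a single response body can update many ids.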
propertyCallback(error, id, dataProperties);
Where:
- error (Error) is any error that occurred while parsing the data that prevented completion (when specified, this should be the only argument given)
- id (string) is the unique identifier for the data object being updated with new data (example: an id to a traffic route)
- dataProperties (object) is a key/value object with keys matching the properties defined in the config
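A parser might report properties along these lines. This is a sketch under stated assumptions: the body layout (a links array) is invented for illustration, only three of the nyc-traffic property names are used, and the stub callbacks at the bottom are a local check that runs outside River View.

```javascript
// Sketch only: the body layout ({ links: [{ linkId, Borough, linkName }] })
// is hypothetical and does not describe the real NYC traffic feed.
function parse(config, body, url, fieldCallback, propertyCallback) {
    var payload;
    try {
        payload = JSON.parse(body);
    } catch (err) {
        // When parsing fails, the error is the only argument given.
        return propertyCallback(err);
    }
    (payload.links || []).forEach(function(link) {
        // Keys match property names listed in the config.
        propertyCallback(null, String(link.linkId), {
            Borough: link.Borough,
            linkName: link.linkName,
            linkId: link.linkId
        });
    });
}

// Local check with stub callbacks and made-up data.
var body = JSON.stringify({ links: [{ linkId: 42, Borough: 'Queens', linkName: 'Example Rd' }] });
var seen = [];
parse({}, body, 'http://example.com/data', function() {}, function(error, id, dataProperties) {
    seen.push({ error: error, id: id, dataProperties: dataProperties });
});
console.log(seen[0].id, seen[0].dataProperties.Borough); // 42 Queens
```

Calling the parser with stub callbacks like this is a quick way to see what it would emit before wiring it into a river.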