The Data Highway is a service that allows data to be easily produced and consumed via JSON messages over HTTPS/WSS. Data is first defined using a schema and a "road" is created which will accept messages that conform to this schema. Producers of data sets thus only need to define the structure of their data and are then able to send their data to a REST endpoint and not be concerned with what happens next. Data Highway will ensure that this data is made available for streaming consumption and also stored reliably in a "data lake" in the cloud for access by end users.
Paver is Data Highway's administration endpoint. It provides the following features:
- Road (synonymous with Kafka topic) creation.
- Schema registration and (soft) deletion.
- Data-at-rest to Hive/S3 configuration.
- Road-level producer and consumer authorisation.
Onramp is Data Highway's producer endpoint. It allows users to submit messages to roads in JSON format over HTTPS.
Offramp is Data Highway's consumer endpoint. It allows users to consume messages from roads in JSON format over WSS.
Tollbooth is the core of Data Highway. It provides the mechanism by which mutations to a road's model are persisted. Mutations can come from users (Paver) or internal agents. Anything wishing to make a mutation submits a JSON Patch onto a deltas Kafka topic. Tollbooth consumes this topic, continuously applying patches to models and persisting them back onto the main Model (compact) topic.
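As a rough illustration of what such a mutation might look like, the sketch below builds a single-operation JSON Patch (RFC 6902) document in shell. The helper name `json_patch_replace` and the `/enabled` path are assumptions made for illustration (the path mirrors the `enabled` field used when creating a road); the actual model layout and the message envelope expected on the deltas topic may differ.

```shell
# Hypothetical helper: print a JSON Patch (RFC 6902) document containing
# one "replace" operation. $1 is a JSON Pointer, $2 is a raw JSON value.
json_patch_replace() {
  printf '[{"op":"replace","path":"%s","value":%s}]\n' "$1" "$2"
}

# Example: a patch that would disable a road (the path is an assumption).
json_patch_replace /enabled false
```

Tollbooth would apply a patch of this shape to the current model and write the result back to the compacted Model topic.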
Traffic Control is the Kafka Agent. It is primarily responsible for managing Kafka topics in response to changes in models.
Loading Bay is responsible for orchestrating the landing of data to S3 on a configured interval and managing Hive tables - creation, schema mutation and the addition of partitions.
Try Test Drive, an in-memory version of Data Highway that exposes all the public-facing endpoints in a single Spring Boot application or Docker container.
docker run -p 8080:8080 hotelsdotcom/road-test-drive:<tag>
Using a local instance of Test Drive, try creating a road, registering a schema, and producing and consuming messages using the built-in user account user:pass.
Note: For the examples below, cURL will prompt for a password, which is pass.
curl -sk \
-u user \
-X POST \
-H "Content-Type: application/json" \
-d '{
"name": "my_road",
"description": "My Road",
"teamName": "TEAM",
"contactEmail": "[email protected]",
"partitionPath": "$.foo",
"enabled": true,
"authorisation": {
"onramp": {
"cidrBlocks": ["0.0.0.0/0"],
"authorities": ["*"]
},
"offramp": {
"authorities": {
"*": ["PUBLIC"]
}
}
}
}' https://localhost:8080/paver/v1/roads
curl -sk \
-u user \
-X POST \
-H "Content-Type: application/json" \
-d '{
"type" : "record",
"name" : "my_record",
"fields" : [
{"name":"foo","type":"string"},
{"name":"bar","type":"string"}
]
}' https://localhost:8080/paver/v1/roads/my_road/schemas
curl -sk \
-u user \
-H "Content-Type: application/json" \
-d '[{"foo":"foo1","bar":"bar1"}]' \
https://localhost:8080/onramp/v1/roads/my_road/messages
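The payload above is a JSON array, which suggests that several messages can be sent in one request. Assuming that is the case, a small helper (hypothetical, not part of Data Highway) can assemble the array from individual JSON objects:

```shell
# Hypothetical helper: join JSON objects passed as arguments into one
# JSON array. The function body runs in a subshell so the IFS change
# does not leak into the caller.
to_json_array() (
  IFS=,
  printf '[%s]\n' "$*"
)

payload="$(to_json_array '{"foo":"foo1","bar":"bar1"}' '{"foo":"foo2","bar":"bar2"}')"
printf '%s\n' "$payload"
```

The result can then be passed to cURL as `-d "$payload"` in the Onramp request.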
echo '{"type":"REQUEST","count":1}' |\
websocat -nk 'wss://localhost:8080/offramp/v2/roads/my_road/streams/my_stream/messages?defaultOffset=EARLIEST'
See: websocat
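Because the query string contains `?`, some shells may try to expand an unquoted URL as a glob pattern. A minimal sketch, assuming the same URL layout and message shape as the example above, builds both from variables so they can be quoted safely. The helper names `offramp_url` and `offramp_request` are hypothetical.

```shell
# Hypothetical helpers mirroring the URL and message shapes used above.
# offramp_url ROAD STREAM [DEFAULT_OFFSET]  (offset defaults to EARLIEST)
offramp_url() {
  printf 'wss://localhost:8080/offramp/v2/roads/%s/streams/%s/messages?defaultOffset=%s\n' \
    "$1" "$2" "${3:-EARLIEST}"
}

# offramp_request COUNT: print a REQUEST message with the given count.
offramp_request() {
  printf '{"type":"REQUEST","count":%d}\n' "$1"
}

offramp_url my_road my_stream
offramp_request 1
```

With a local Test Drive instance running, these could be combined as `offramp_request 1 | websocat -nk "$(offramp_url my_road my_stream)"`.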
Build and load docker images to the local docker daemon:
mvn clean package -Djib.goal=dockerBuild
Build without docker images:
mvn clean package -Djib.skip
Build and push docker images to a repo:
mvn clean package -Ddocker.repo=my.docker.repo
Special thanks to the following for making data-highway possible!
| Teiva Harsanyi 💻 | Kryiakos Sideris 💻 | Sandeep Solanki 💻 |
|---|---|---|
This project follows the all-contributors specification.
This project is available under the Apache 2.0 License.
Copyright 2019 Expedia Inc.