This repository has been archived by the owner on Nov 28, 2023. It is now read-only.
Initial Historics Reader & Twitter API Reader Release
This release includes support for Gnip Historics and use of the Twitter API.
Changes
- The configuration file format for the Gnip reader has changed: retries, buffer_size and buffer_timeout are now children of a hosebird property, which is a sibling of Gnip. See README.md for an example.
- A new Twitter API reader is provided. Note that only one of the Gnip reader and the Twitter API reader can currently run at a time. A new Twitter API managed source is required to take advantage of this reader; contact DataSift to have this enabled on your account. The default configured Kafka topic is now twitter rather than twitter_gnip, and both the Gnip reader and the Twitter API reader write to it.
- Gnip Historics components are included in 1.0.19. A Historics API service will run on port 8888 of the provisioned machine, to which Gnip Historics job IDs may be sent as described in README.md. A Historics Reader service will now be installed; it executes every 5 minutes and processes any Gnip Historics jobs which have completed and are available for download. Interactions within the Historics files are sent to Kafka, as with our other reader components. A web frontend to this API is in development, which will simplify the submission of Gnip jobs and allow easy listing of in-progress and completed jobs.
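
The configuration change described in the first item can be sketched as a small migration helper. This is only an illustration under stated assumptions: it assumes the configuration has been loaded into a Python dict, that the three settings previously lived under a key named `gnip`, and that the new parent key is named `hosebird`. The function name `migrate_gnip_config` is hypothetical, and README.md remains the authoritative reference for the actual file format.

```python
import copy

# Settings that, per the release notes, now belong under the hosebird property.
HOSEBIRD_KEYS = ("retries", "buffer_size", "buffer_timeout")


def migrate_gnip_config(old_config):
    """Return a copy of old_config with the hosebird settings moved.

    Assumption: the old layout kept these settings under a "gnip" key.
    The input dict is left unmodified.
    """
    new_config = copy.deepcopy(old_config)
    gnip = new_config.get("gnip", {})
    # "hosebird" is created as a sibling of "gnip" if it does not exist yet.
    hosebird = new_config.setdefault("hosebird", {})
    for key in HOSEBIRD_KEYS:
        if key in gnip:
            hosebird[key] = gnip.pop(key)
    return new_config
```

Any other keys under `gnip` are left where they are, so the helper only touches the three settings named in this release.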
Upgrade steps
To take advantage of the new features, create a new AMI/VM using Packer or Vagrant, following the Quick Deployment steps in README.md, and switch to using this new instance. Unfortunately, a manual upgrade is not supported because of the new components and heavy modifications to the Chef provisioning process.
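
After switching to the new instance, one quick sanity check is whether the Historics API port mentioned above accepts TCP connections. The sketch below does only that. It does not submit a job, since the request format is described in README.md and is not assumed here. The function name `historics_api_reachable` and the host value in the usage comment are hypothetical.

```python
import socket


def historics_api_reachable(host, port=8888, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    This only confirms that something is listening on the port; it does
    not verify that the listener is the Historics API service.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved host names.
        return False


# Example (hypothetical address of the newly provisioned machine):
# historics_api_reachable("203.0.113.10")
```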