Flume Source to live stream Blockchain data into HDFS #11

Open
jornfranke opened this issue Feb 9, 2017 · 1 comment

jornfranke commented Feb 9, 2017

Live-stream Bitcoin blockchain data into HDFS for immediate analysis, and make it available to other applications as well (e.g. via Kafka).

This could be implemented as a Flume source or as a Kafka producer.
The Flume source should:

  1. Provide Bitcoin Blocks to any Flume Channel
  2. Provide Bitcoin block metadata (e.g. number of confirmations, results of checksum validation) to any Flume channel. Metadata always relates to a single block and describes its full current state, not deltas. For example, the number of confirmations is always the currently known total, not the number of newly observed confirmations. Otherwise each consuming application would have to maintain this state itself, which usually leads to inconsistent information (e.g. a confirmation count that differs from the real one). As a consequence, the Flume source needs a backend to manage state, which should ideally be configurable. Via JDBC one could connect to a variety of NoSQL databases (e.g. HBase, Ignite).
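The "full state, not deltas" requirement in item 2 can be illustrated with a small, language-neutral sketch. It is written in Python for brevity; an actual Flume source would be implemented in Java against the Flume API. All names here (`BlockMetadata`, `InMemoryStateBackend`, `MetadataTracker`) are hypothetical, the in-memory dictionary only stands in for the configurable state backend, and chain reorganizations are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockMetadata:
    """Full metadata snapshot for one block (hypothetical event shape)."""
    block_hash: str
    height: int
    confirmations: int  # absolute total, never an increment


class InMemoryStateBackend:
    """Stand-in for the configurable state backend (e.g. a JDBC-backed store)."""

    def __init__(self):
        self._heights = {}  # block hash -> height on the main chain

    def put(self, block_hash, height):
        self._heights[block_hash] = height

    def items(self):
        # Return (hash, height) pairs ordered by height.
        return sorted(self._heights.items(), key=lambda kv: kv[1])


class MetadataTracker:
    """Keeps the state so that consumers never have to sum up deltas."""

    def __init__(self, backend):
        self._backend = backend
        self._tip_height = -1

    def on_new_block(self, block_hash, height):
        """Record a new block and return full snapshots for all known blocks."""
        self._backend.put(block_hash, height)
        self._tip_height = max(self._tip_height, height)
        # A block at the tip has 1 confirmation; each block on top adds one.
        return [
            BlockMetadata(h, ht, self._tip_height - ht + 1)
            for h, ht in self._backend.items()
        ]
```

Each call returns complete snapshots, so a downstream store can simply overwrite the row for a block. After feeding two consecutive blocks at heights 100 and 101, the second call reports 2 confirmations for the first block and 1 for the second.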

Unit and integration tests must be provided.
An example manual needs to be provided that explains how to integrate the Flume source into any cluster that has Flume support deployed. As a baseline, it should show Bitcoin blocks being stored in HDFS files using append mode with a configurable file size (e.g. 128 MB), and metadata being stored in an updatable fashion in HBase.
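The append-with-configurable-file-size behaviour can be sketched as follows. In a real deployment this would be handled by the sink configuration (Flume's HDFS sink has an `hdfs.rollSize` property for this purpose) rather than by custom code. The sketch below only mimics the idea on a local filesystem, and the class name `RollingAppender` and the file naming scheme are assumptions made for illustration.

```python
import os


class RollingAppender:
    """Appends records to numbered files, rolling once a file reaches roll_size bytes."""

    def __init__(self, directory, roll_size=128 * 1024 * 1024, prefix="blocks"):
        self._directory = directory
        self._roll_size = roll_size  # e.g. 128 MB, configurable
        self._prefix = prefix
        self._index = 0

    def _path(self):
        return os.path.join(self._directory, f"{self._prefix}-{self._index:05d}.bin")

    def append(self, record: bytes) -> str:
        """Append one serialized block and return the file it was written to."""
        path = self._path()
        # Roll to a new file if the current one has already reached the limit.
        if os.path.exists(path) and os.path.getsize(path) >= self._roll_size:
            self._index += 1
            path = self._path()
        with open(path, "ab") as f:  # "ab" opens the file in binary append mode
            f.write(record)
        return path
```

A file may end up slightly larger than `roll_size`, because the check happens before a record is written and records are never split across files.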

jornfranke self-assigned this on Feb 9, 2017
jornfranke commented

We will design an architecture for blockchain analytics and provide selected implementations at https://github.com/ZuInnoTe/cryptoledgerstreamer

jornfranke changed the title from "Flume Source to live stream Bitcoin data into HDFS" to "Flume Source to live stream Blockchain data into HDFS" on Mar 21, 2018