-
Notifications
You must be signed in to change notification settings - Fork 51
Use Hive to analyse Namecoin data
This is a Hive SQL application demonstrating some of the capabilities of the hadoopcryptoledger library. It enables transparent access to Namecoin Blockchain data via Hive SQL. It can be linked with any other table in any other format in Hive, such as ORC or Parquet. It has successfully been tested with the Cloudera Quickstart VM 5.5 and HDP 2.5, but other Hadoop distributions should work equally well.
Namecoin describes itself as a distributed blockchain based domain name and identity system. Namecoin data has the same data structures as Bitcoin data, but has 1) special output scripts for name operations and 2) by using merged mining/AuxPOW as an incentive for Bitcoin miners to mine as well Namecoins. Both introduces additional data structures. The first one is addressed by additional methods (cf. Useful Utility functions) and the second one by supporting reading of AuxPOW information, which needs to be activated that you can properly process Namecoin blockchain data (see here). Finally, you need to configure the Namecoin network magic instead of the Bitcoin one (cf. Support for Altcoins based on Bitcoin)
See here how to fetch Namecoin blockchain data.
After it has been copied you are ready to use the example.
Instructions to deploy the Hive Serde for the Bitcoin Blockchain.
These are some examples. We assumed that you deployed the Hive Serde and created in the database 'blockchains' the table 'NamecoinBlockChain':
create external table NamecoinBlockchain ROW FORMAT SERDE 'org.zuinnote.hadoop.bitcoin.hive.serde.BitcoinBlockSerde' STORED AS INPUTFORMAT 'org.zuinnote.hadoop.bitcoin.format.mapred.BitcoinBlockFileInputFormat' OUTPUTFORMAT 'org.apache.hadoop.mapred.lib.NullOutputFormat' LOCATION '/user/input/namecoin' TBLPROPERTIES("hadoopcryptoledger.bitcoinblockinputformat.filter.magic"="F9BEB4FE","hadoopcryptoledger.bitcoinblockinputformat.readauxpow"="true");
Note that we need to explicitly activate the readauxpow option and define the Namecoin network magic.
The following example counts the number of blocks:
select count(*) from NamecoinBlockchain;
The following example counts the number of transactions:
select count(*) from NamecoinBlockchain LATERAL VIEW explode(transactions) exploded_transactions;
The following example calculates the total output of all Namecoin transactions:
select SUM(expout.value) FROM (select * from NamecoinBlockchain LATERAL VIEW explode(transactions) exploded_transactions as exptran) transaction_table LATERAL VIEW explode(exptran.listofoutputs) exploded_outputs as expout;
The following examples illustrate the Namecoin UDFs that enable you to read the namecoin operations (e.g. new, firstupdate, update) and the associated data (e.g. domain name, domain information). You need to deploy the Hive UDF beforehand.
The following example extracts the type of Namecoin name operation from the transactions:
SELECT hclNamecoinGetNameOperation(expout.txoutscript) FROM (select * from NamecoinBlockchain LATERAL VIEW explode(transactions) exploded_transactions as exptran) transaction_table LATERAL VIEW explode (exptran.listofoutputs) exploded_outputs as expout;
Example output:
unknown
OP_NAME_NEW
unknown
OP_NAME_UPDATE
OP_NAME_FIRSTUPDATE
unknown
This example extracts further information from the Namecoin operations firstupdate and update. This includes information, such as domain name or domain information:
SELECT hclNamecoinExtractField(expout.txoutscript) FROM (select * from NamecoinBlockchain LATERAL VIEW explode(transactions) exploded_transactions as exptran) transaction_table LATERAL VIEW explode (exptran.listofoutputs) exploded_outputs as expout;
Example output:
NULL
["d/panelka","{\"ip\":\"8.8.8.8\",\"map\":{\"*\":{\"ip\":\"8.8.8.8\"}}}"]
["d/flashupd","{\"ip\":\"145.249.106.228\",\"map\":{\"*\":{\"ip\":\"145.249.106.228\"}}}"]
NULL
Please note that not all transactions contain Namecoin data. Additionally, per domain only the latest firstupdate or update operation contains all the information. Hence, you must NOT merge the results of all operations. The new operation only creates a hash that is referred in a subsequent firstupdate operation.
Find more examples here: https://github.com/ZuInnoTe/hadoopcryptoledger/blob/master/examples/hive-namecoin/hive-namecoin.sql
Blog about Namecoin analytics: https://snippetessay.wordpress.com/2017/10/10/big-data-analytics-on-bitcoins-first-altcoin-namecoin/
Understanding the structure of Bitcoin data:
Blocks: https://en.bitcoin.it/wiki/Block
Transactions: https://en.bitcoin.it/wiki/Transactions
Namecoin Webpage: https://namecoin.org
Namecoin DNS specification: https://wiki.namecoin.org/index.php?title=Domain_Name_Specification
Namecoin ID specification: https://wiki.namecoin.org/index.php?title=Identity