Skip to content

Use Hive to analyse Namecoin data

Jörn Franke edited this page Feb 16, 2018 · 4 revisions

This is a Hive SQL application demonstrating some of the capabilities of the hadoopcryptoledger library. It enables transparent access to Namecoin Blockchain data via Hive SQL. It can be linked with any other table in any other format in Hive, such as ORC or Parquet. It has successfully been tested with the Cloudera Quickstart VM 5.5 and HDP 2.5, but other Hadoop distributions should work equally well.

Namecoin describes itself as a distributed blockchain based domain name and identity system. Namecoin data has the same data structures as Bitcoin data, but has 1) special output scripts for name operations and 2) by using merged mining/AuxPOW as an incentive for Bitcoin miners to mine as well Namecoins. Both introduces additional data structures. The first one is addressed by additional methods (cf. Useful Utility functions) and the second one by supporting reading of AuxPOW information, which needs to be activated that you can properly process Namecoin blockchain data (see here). Finally, you need to configure the Namecoin network magic instead of the Bitcoin one (cf. Support for Altcoins based on Bitcoin)

Getting blockchain data

See here how to fetch Namecoin blockchain data.

After it has been copied you are ready to use the example.

Deploying the Hive Serde

Instructions to deploy the Hive Serde for the Bitcoin Blockchain.

Example SQL Commands

These are some examples. We assumed that you deployed the Hive Serde and created in the database 'blockchains' the table 'NamecoinBlockChain':

create external table NamecoinBlockchain ROW FORMAT SERDE 'org.zuinnote.hadoop.bitcoin.hive.serde.BitcoinBlockSerde' STORED AS INPUTFORMAT 'org.zuinnote.hadoop.bitcoin.format.mapred.BitcoinBlockFileInputFormat' OUTPUTFORMAT 'org.apache.hadoop.mapred.lib.NullOutputFormat' LOCATION '/user/input/namecoin' TBLPROPERTIES("hadoopcryptoledger.bitcoinblockinputformat.filter.magic"="F9BEB4FE","hadoopcryptoledger.bitcoinblockinputformat.readauxpow"="true");

Note that we need to explicitly activate the readauxpow option and define the Namecoin network magic.

The following example counts the number of blocks:

select count(*) from NamecoinBlockchain;

The following example counts the number of transactions:

select count(*) from NamecoinBlockchain LATERAL VIEW explode(transactions) exploded_transactions;

The following example calculates the total output of all Namecoin transactions:

select SUM(expout.value) FROM (select * from NamecoinBlockchain LATERAL VIEW explode(transactions) exploded_transactions as exptran) transaction_table LATERAL VIEW explode(exptran.listofoutputs) exploded_outputs as expout;

Example SQL Commands using Namecoin UDFs

The following examples illustrate the Namecoin UDFs that enable you to read the namecoin operations (e.g. new, firstupdate, update) and the associated data (e.g. domain name, domain information). You need to deploy the Hive UDF beforehand.

The following example extracts the type of Namecoin name operation from the transactions:

SELECT hclNamecoinGetNameOperation(expout.txoutscript) FROM (select * from NamecoinBlockchain LATERAL VIEW explode(transactions) exploded_transactions as exptran) transaction_table LATERAL VIEW explode (exptran.listofoutputs) exploded_outputs as expout;

Example output:

unknown                                                                                                                                                                                                     
OP_NAME_NEW                                                                                                                                                                                                 
unknown                                                                                                                                                                                                     
OP_NAME_UPDATE                                                                                                                                                                                              
OP_NAME_FIRSTUPDATE                                                                                                                                                                                         
unknown   

This example extracts further information from the Namecoin operations firstupdate and update. This includes information, such as domain name or domain information:

SELECT hclNamecoinExtractField(expout.txoutscript) FROM (select * from NamecoinBlockchain LATERAL VIEW explode(transactions) exploded_transactions as exptran) transaction_table LATERAL VIEW explode (exptran.listofoutputs) exploded_outputs as expout;

Example output:

NULL                                                                                                                                                                                                        
["d/panelka","{\"ip\":\"8.8.8.8\",\"map\":{\"*\":{\"ip\":\"8.8.8.8\"}}}"]                                                                                                                                   
["d/flashupd","{\"ip\":\"145.249.106.228\",\"map\":{\"*\":{\"ip\":\"145.249.106.228\"}}}"]                                                                                                                  
NULL                                                                                                                                                                                                        

Please note that not all transactions contain Namecoin data. Additionally, per domain only the latest firstupdate or update operation contains all the information. Hence, you must NOT merge the results of all operations. The new operation only creates a hash that is referred in a subsequent firstupdate operation.

Find more examples here: https://github.com/ZuInnoTe/hadoopcryptoledger/blob/master/examples/hive-namecoin/hive-namecoin.sql

and https://github.com/ZuInnoTe/hadoopcryptoledger/blob/master/examples/hive-namecoin/hive-namecoin-udf.sql

More Information

Blog about Namecoin analytics: https://snippetessay.wordpress.com/2017/10/10/big-data-analytics-on-bitcoins-first-altcoin-namecoin/

Understanding the structure of Bitcoin data:

Blocks: https://en.bitcoin.it/wiki/Block

Transactions: https://en.bitcoin.it/wiki/Transactions

Namecoin Webpage: https://namecoin.org

Namecoin DNS specification: https://wiki.namecoin.org/index.php?title=Domain_Name_Specification

Namecoin ID specification: https://wiki.namecoin.org/index.php?title=Identity

Clone this wiki locally