Heron is a stream processing engine open sourced by Twitter as a successor to Apache Storm [[ref]](http://twitter.github.io/heron). Heron executes user workflows, called topologies [ref], to process data streams. A Heron topology can be deployed on a YARN cluster on HDInsight using the scripts in this repository.
- Each Heron topology is a long-running service and is deployed as a long-running YARN job.
- The YARN scheduler [[ref]](http://twitter.github.io/heron/docs/operators/deployment/schedulers/yarn/) for Heron is built on the Apache REEF framework [ref].

This is a work in progress. The installer script and the scheduler for Heron are still evolving.
- Create the cluster
  - Begin creating a new HDInsight cluster.
  - Select the `Standard Storm on Linux (3.4)` cluster type.
  - Configure the `Credentials`, the desired `Storage account`, and the cluster size. The Heron topology will run on `YARN` nodes, which are collocated with `Supervisor` nodes.
  - Create the cluster.
- Deploy Heron using `Script Actions` on the cluster. Select Optional Configuration -> Script Actions.
  - Find the ZooKeeper host names using the `Cluster Dashboard` (Ambari dashboard). This information will be needed later.
  - On the HDInsight Cluster view, select `Script Actions` to initiate the Heron installation.
  - Provide the Heron installer script URL: `https://raw.githubusercontent.com/hdinsight/HeronOnHDInsight/master/src/scripts/heron-installer-v01.sh`
  - Select the `Nimbus` and `Supervisor` nodes as installation targets. The script installs the Heron client on the `Nimbus` nodes (the head nodes in this case) and installs dependencies on the `Supervisor` (worker) nodes.
  - The script takes the following parameters:
    1. Required `ZooKeeper` host name: `-z <zk_host_name>`, for example `zk0-heron`. Use the value obtained from the `Cluster Dashboard` above.
    1. Optional Heron version string: `-v <version>`, for example `0.14.4.SNAPSHOT`. The default value is `0.14.3`.
    1. Optional flag to overwrite an existing installation: `-f`
    1. Sample parameter string: `-z zk0-heron -v 0.14.4.SNAPSHOT -f`
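
To make the parameter semantics above concrete, here is a minimal sketch of how a script could parse `-z`, `-v` and `-f` with the shell's `getopts` builtin. It is not the code of `heron-installer-v01.sh`; the function name `parse_installer_args` and the variables `ZK_HOST`, `HERON_VERSION` and `FORCE_OVERWRITE` are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of parsing the documented installer parameters.
# This is NOT the real heron-installer-v01.sh; all names are assumptions.

parse_installer_args() {
  OPTIND=1                 # reset so the function can be called repeatedly
  ZK_HOST=""               # required: ZooKeeper host name (-z)
  HERON_VERSION="0.14.3"   # optional: documented default version (-v)
  FORCE_OVERWRITE=0        # optional: overwrite existing installation (-f)

  # The leading ":" enables silent error reporting, handled in the case arms.
  while getopts ":z:v:f" opt; do
    case "$opt" in
      z) ZK_HOST="$OPTARG" ;;
      v) HERON_VERSION="$OPTARG" ;;
      f) FORCE_OVERWRITE=1 ;;
      :) echo "Option -$OPTARG requires a value" >&2; return 1 ;;
      *) echo "Unknown option: -$OPTARG" >&2; return 1 ;;
    esac
  done

  if [ -z "$ZK_HOST" ]; then
    echo "Missing required parameter: -z <zk_host_name>" >&2
    return 1
  fi
}

# Usage with the sample parameter string from the list above.
parse_installer_args -z zk0-heron -v 0.14.4.SNAPSHOT -f &&
  echo "zk=$ZK_HOST version=$HERON_VERSION force=$FORCE_OVERWRITE"
```

With the sample parameter string, the sketch reports the ZooKeeper host `zk0-heron`, version `0.14.4.SNAPSHOT`, and the overwrite flag set. Omitting `-v` leaves the documented default `0.14.3` in place, and omitting `-z` is treated as an error because that parameter is required.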
- Connect (SSH) to one of the nodes where the Heron client is installed, the `head` nodes in this case.
- Use the [Heron CLI] to submit the topology. For example: `heron submit yarn /usr/heron/heron/examples/heron-examples.jar com.twitter.heron.examples.ExclamationTopology ExclamationTopology`
- Execute the `heron-tracker` and `heron-ui` commands on a `head` node.
- Establish an SSH tunnel ([[ref]](https://azure.microsoft.com/en-us/documentation/articles/hdinsight-linux-ambari-ssh-tunnel/)) to access the Heron dashboard. Access `<ip_of_head_node>:8889`.
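
The tunnel step above might look like the following sketch, run from a workstation. It only prints the commands instead of executing them. Everything in it is an assumption for illustration: the SSH user `sshuser`, the cluster name `mycluster`, the `<cluster>-ssh.azurehdinsight.net` host pattern, the local SOCKS port `9876`, and the placeholder head node address `10.0.0.4`. Only port `8889` comes from the steps above.

```shell
#!/usr/bin/env bash
# Sketch: build the commands for a SOCKS tunnel and a probe of the Heron UI.
# All default values below are placeholders (assumptions); substitute your own.

SSH_USER="${SSH_USER:-sshuser}"            # hypothetical SSH user name
CLUSTER_NAME="${CLUSTER_NAME:-mycluster}"  # hypothetical HDInsight cluster name
SOCKS_PORT="${SOCKS_PORT:-9876}"           # arbitrary free local port
HEAD_NODE_IP="${HEAD_NODE_IP:-10.0.0.4}"   # stands in for <ip_of_head_node>

# -D opens a local SOCKS proxy; -N runs no remote command; -f backgrounds ssh;
# -C compresses; -q is quiet; -T allocates no terminal; -n detaches stdin.
tunnel_command() {
  echo "ssh -C -q -T -n -N -f -D ${SOCKS_PORT} ${SSH_USER}@${CLUSTER_NAME}-ssh.azurehdinsight.net"
}

# The Heron dashboard address as seen from inside the cluster network.
heron_ui_url() {
  echo "http://${HEAD_NODE_IP}:8889"
}

# curl can route a request through the SOCKS proxy to check the dashboard.
probe_command() {
  echo "curl --socks5-hostname localhost:${SOCKS_PORT} $(heron_ui_url)"
}

tunnel_command
probe_command
```

A browser configured to use `localhost:9876` as a SOCKS proxy can then open the dashboard address directly. Printing the commands first makes it easy to review the user, host, and port before opening a connection.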