This document covers the procedure for building a Storm cluster using the StormStarter script. When executed against a Red Hat-based (yum) system, the script will give you a working Storm cluster.
Essentially, to get a working cluster, you will:
- Set up 5 VMs (any odd number will do, but 5 is recommended)
- git clone https://github.com/scott-mead/storm
- Configure the cluster.yml config file
- Configure SSH
- Run the ‘StormStarter’ script
Building a cluster with 5 VMs allows you to withstand two node failures, since a majority (3 of 5) is still available. Always build clusters in odd numbers so that ZooKeeper's quorum-keeping operations can elect a master.
I’m using salt-cloud to quickly provision EC2 nodes. You can use whichever method you’d like, but if you don’t already have a provisioning tool, I highly recommend salt-cloud.
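For reference, here is a minimal sketch of provisioning the five nodes with salt-cloud, assuming you already have an EC2 provider and a cloud profile configured (the profile name `ec2_centos` is made up here; substitute your own):

```bash
# Provision five nodes in parallel from an existing EC2 cloud profile.
# "ec2_centos" is a hypothetical profile name -- use the profile you configured.
salt-cloud -P -p ec2_centos storm1 storm2 storm3 storm4 storm5
```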
Get a copy of the ‘StormStarter’ git repository from https://github.com/scott-mead/storm.
```bash
git clone https://github.com/scott-mead/storm
```
The cluster.yml file is your primary config file. Look at the example below and modify it to your needs.
The configuration file contains one or more groups of servers that you will be addressing.
```yaml
group_name:
  username: "sshusername"
  node1:
    hostname: "externally addressable name"
    int_ip: "host internal ip"
    int_name: "cluster1"
```
Example config
```yaml
dev_cluster:
  username: "ec2-user"
  node1:
    hostname: "c1.example.com"
    int_ip: "192.168.185.1"
    int_name: "cluster1"
  node2:
    hostname: "c2.example.com"
    int_ip: "192.168.185.2"
    int_name: "cluster2"
  node3:
    hostname: "c3.example.com"
    int_ip: "192.168.185.3"
    int_name: "cluster3"
prod_cluster:
  username: "ec2-user"
  node1:
    hostname: "p1.example.com"
    int_ip: "192.168.185.1"
    int_name: "cluster1"
  node2:
    hostname: "p2.example.com"
    int_ip: "192.168.185.2"
    int_name: "cluster2"
  node3:
    hostname: "p3.example.com"
    int_ip: "192.168.185.3"
    int_name: "cluster3"
```
- group_name is what you will pass to StormStarter -g
  - hostname
    - should be the externally addressable EC2 name, i.e. ec2-52-109-12-100.compute-1.amazonaws.com
  - int_ip
    - the internal (a.k.a. non-public / private) IP address; this will be used by the nodes to communicate with one another
  - int_name
    - an internal name set up by StormStarter. If you change this, you will need to modify most of the chef-managed configuration files within the package. I do NOT recommend this.

# Configure SSH

Edit the $HOME/.ssh/config file, add the key for your AWS hosts, and disable prompting.

```
Host *aws*
    IdentityFile /home/user/myawskey.pem
    StrictHostKeyChecking no
```

You can either use wildcards like I did above, or list each host individually.

```
Host ec2-52-109-12-100.compute-1.amazonaws.com
    IdentityFile /home/user/myawskey.pem
    StrictHostKeyChecking no

Host ec2-52-109-12-101.compute-1.amazonaws.com
    IdentityFile /home/user/myawskey.pem
    StrictHostKeyChecking no
```

Once you have set this up, validate that you can ssh to each host with no password prompt.

# Run StormStarter

Required options:

- -c <path to config file>
- -g <group name>

The group name is the top of the YAML tree, so if we had configured
```yaml
dev_cluster:
  username: "ec2-user"
  node1:
    hostname: "c1.example.com"
    int_ip: "192.168.185.1"
    int_name: "cluster1"
  node2:
    hostname: "c2.example.com"
    int_ip: "192.168.185.2"
    int_name: "cluster2"
  node3:
    hostname: "c3.example.com"
    int_ip: "192.168.185.3"
    int_name: "cluster3"
prod_cluster:
  username: "ec2-user"
  node1:
    hostname: "p1.example.com"
    int_ip: "192.168.185.1"
    int_name: "cluster1"
  node2:
    hostname: "p2.example.com"
    int_ip: "192.168.185.2"
    int_name: "cluster2"
  node3:
    hostname: "p3.example.com"
    int_ip: "192.168.185.3"
    int_name: "cluster3"
```
I could use either ‘dev_cluster’ or ‘prod_cluster’ as the group name.
Action:

You have 4 possible actions (a full example invocation follows this list):

- provision
  - This will copy files, clone the repository and install
  - You can run this against a system with an existing install to update configs
- start
  - This will connect to each node and run service supervisord start
- stop
  - This will connect to each node and run service supervisord stop
- restart
  - This will connect to each node and run service supervisord restart