This repository uses Ansible to secure cloud provider servers and to install and configure CometBFT-based chains for validator, sentry, and relayer node types, as well as Horcrux.
- Secure server setup
- Extendable to most CometBFT-based (formerly known as Tendermint) chains
- Supports both mainnet and testnet
- Supports Horcrux installation and node config updates
- Stable playbooks and roles with customizable variables
- Supports essential functions (snapshots, state-sync, public RPC/API endpoints) through separate playbooks
Run the desired playbook with the following arguments:
# Node Setup
ansible-playbook setup.yml -e "target=<mainnet|testnet|horcrux_cluster>" -e "ssh_port=<non_standard_ssh_port>"
# Install/Configure Chain
ansible-playbook main.yml -e "target=<mainnet|testnet>" -e "chain=<chain>"
# Install/Configure Horcrux
ansible-playbook horcrux.yml -e "target=<horcrux_cluster|horcrux_cluster_testnet>"
# Configure Prometheus for Chain
ansible-playbook support_prometheus.yml -e "target=<mainnet|testnet|horcrux_cluster>" -e "chain=<chain>"
# Configure Tenderduty for Chain
ansible-playbook support_tenderduty.yml -e "target=<mainnet|testnet|horcrux_cluster>" -e "chain=<chain>"
For every chain where we run a validator on mainnet, we run 2 sentry nodes connected to a Horcrux cluster of 3 cosigner nodes.
Leveraging Horcrux provides high availability while maintaining high security and avoiding double signing via its consensus and failover-detection mechanisms. It allows multiple sentry nodes to connect to the cosigner nodes, which reduces downtime and block-signing failures and increases the fault tolerance and resiliency of blockchain operations.
Typically, a cloud provider hands you a machine with root access and an insecure default setup. This Ansible playbook is designed to address those issues. It is based on Ubuntu 22.04, but it should be applicable to other Ubuntu images. To run this playbook, you need a user with sudo privileges; as a security measure, the playbook deliberately does not create one, so that you avoid operating as root. This playbook will perform the following (see the illustrative sketch after the list):
- Set the hostname (based on inventory file)
- Update server: Simply update and upgrade all applications shipped with the OS.
- Install and configure essential software dependencies
- Install the ufw firewall
- Install fail2ban
- Install cosmovisor
- Optionally install node exporter (configurable in inventory)
- Optionally install promtail (configurable in inventory)
- Optionally install nginx (configurable in inventory)
- Disable the default ssh port (22) and set up the alternative port.
- Deny all incoming traffic by default.
- Enable the firewall, allowing the alternative ssh port only from the bastion/jumpbox IP.
- Disable root account access.
- Disable password authentication.
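For illustration only, here is a minimal Ansible sketch of what these hardening steps look like in practice. The play, task names, and module choices below are assumptions for the example and are not taken from this repository's roles; `ssh_port` and `bastion_ip` are the inventory values described later in this README.

```yaml
# Minimal sketch of the hardening steps; not this repository's actual tasks.
- name: Harden server (illustrative example)
  hosts: all
  become: true
  tasks:
    - name: Install ufw and fail2ban
      ansible.builtin.apt:
        name: [ufw, fail2ban]
        state: present
        update_cache: true

    - name: Move ssh to the alternative port and disable root/password login
      ansible.builtin.lineinfile:
        path: /etc/ssh/sshd_config
        regexp: "{{ item.regexp }}"
        line: "{{ item.line }}"
      loop:
        - { regexp: '^#?Port ', line: "Port {{ ssh_port }}" }
        - { regexp: '^#?PermitRootLogin ', line: "PermitRootLogin no" }
        - { regexp: '^#?PasswordAuthentication ', line: "PasswordAuthentication no" }
      notify: Restart sshd

    - name: Deny all incoming traffic by default
      community.general.ufw:
        state: enabled
        policy: deny
        direction: incoming

    - name: Allow the alternative ssh port from the bastion/jumpbox only
      community.general.ufw:
        rule: allow
        proto: tcp
        from_ip: "{{ bastion_ip }}"
        to_port: "{{ ssh_port }}"

  handlers:
    - name: Restart sshd
      ansible.builtin.service:
        name: ssh
        state: restarted
```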
Look at the `sample.inventory.yml` file for an example of how to structure the configuration of your CometBFT clusters:
- `target`: Required. Whether `mainnet` or `testnet`.
- `ansible_host`: Required. The IP address of the server(s).
- `ssh_port`: Required. Alternate ssh port to configure on the server. This can be different per host; by default, the same port is applied to all servers.
- `server_hostname`: Required. Sets the hostname to this value.
- `bastion_ip`: Required. Bastion/jumpbox IP allowed ssh access to the server. It can also be an address range.
Change the file name from `sample.inventory.yml` to `inventory.yml` and update the values accordingly.
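For orientation, an inventory along these lines would satisfy the variables above. This is a hypothetical, trimmed example with placeholder hostnames and IPs; the authoritative layout is in `sample.inventory.yml`.

```yaml
# Hypothetical inventory sketch; see sample.inventory.yml for the real structure.
all:
  children:
    mainnet:
      hosts:
        sentry-1:
          ansible_host: 203.0.113.10
          ssh_port: 2222
          server_hostname: sentry-1
          bastion_ip: 198.51.100.5
    horcrux_cluster:
      hosts:
        cosigner-1:
          ansible_host: 203.0.113.20
          ssh_port: 2222
          server_hostname: cosigner-1
          bastion_ip: 198.51.100.5
```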
# Node Setup
ansible-playbook setup.yml -e "target=<mainnet|testnet|horcrux_cluster>" -e "ssh_port=<non_standard_ssh_port>"
As mentioned above, we run 2 sentry nodes connected to a Horcrux cluster of 3 cosigner nodes. However, this repo also supports configuring chain nodes as a `validator`, `sentry` or `relayer`, each with different settings described below. If you do not wish to use Horcrux, set the type to `validator` for each corresponding node.
We have 2 strong opinions about the node configuration:
- Each chain has its own custom 3-digit port prefix. This prevents port collisions if you run multiple nodes on the same server. For example, you can configure Babylon with the custom port prefix 109 and Osmosis with 110; which prefix to use is up to you (see the sketch after this list).
- Each type of node has its own settings based on our experience. For example, the main node (validator) uses 100/0/ pruning, the sentry node uses 1000/100/ pruning, and the relayer uses 50000/100/ pruning. These settings are enforced unless you fork the code.
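To make the port-prefix idea concrete, the sketch below shows one plausible mapping from a 3-digit prefix to node ports. The `custom_port_prefix` variable name and the exact ports derived from it are assumptions for illustration; the real names and templates live in this repository's vars and roles.

```yaml
# Hypothetical illustration of the 3-digit port prefix; the variable name and
# derived ports are assumptions, not this repo's guaranteed behavior.
custom_port_prefix: 109      # e.g. Babylon
# Default CometBFT port -> prefixed port:
#   p2p  26656 -> 10956
#   rpc  26657 -> 10957
```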
Look at the `sample.inventory.yml` file for an example of how to structure the configuration of your CometBFT clusters. All these values can be set per mainnet/testnet, host, chain, or globally (a hypothetical host entry follows the list below).
- `target`: Required. Whether `mainnet` or `testnet`.
- `ansible_host`: Required. The IP address of the server.
- `chain`: Required. The chain network name to install/configure (should match a file under `vars/<mainnet|testnet>`).
- `type`: Required. It can be `validator`, `sentry` or `relayer`. Each is opinionated in its configuration settings.
- `ansible_user`: The sample file assumes `ubuntu`, but feel free to use another username. This user needs sudo privileges.
- `ansible_port`: The sample file assumes `22`. If you ran the node setup playbook, it should match `ssh_port`.
- `ansible_ssh_private_key_file`: Path to the ssh key file.
- `var_file`: Tells the program where to look for the variable file.
- `user_dir`: The user's home directory. In the sample inventory file this is a computed variable based on `ansible_user`. It assumes a non-root user whose home directory is `/home/{{ansible_user}}`.
- `path`: Ensures the `ansible_user` can access the `go` executable.
- `node_name`: Your node name or moniker for the `config.toml` file.
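Putting the host-level variables together, a single sentry-node entry could look like the hypothetical excerpt below. The hostname, IPs, paths, and `var_file` location are placeholders; check `sample.inventory.yml` for how the repo actually groups and computes these values.

```yaml
# Hypothetical host entry for a chain node; all values are placeholders.
mainnet:
  hosts:
    osmosis-sentry-1:
      ansible_host: 203.0.113.10
      chain: osmosis
      type: sentry
      ansible_user: ubuntu
      ansible_port: 2222
      ansible_ssh_private_key_file: ~/.ssh/id_ed25519
      var_file: vars/mainnet/osmosis.yaml       # placeholder path
      user_dir: "/home/{{ ansible_user }}"
      path: "/home/{{ ansible_user }}/go/bin"   # so the user can find the go executable
      node_name: my-moniker
```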
There are additional variables under `group_vars/all.yml` for global configuration applied to all chains (an illustrative excerpt follows this list):
- `node_exporter_version`: Node exporter version to install.
- `promtail_version`: Promtail version to install.
- `go_version`: Go version to install.
- `cosmovisor_version`: Cosmovisor version to install.
- `cosmovisor_service_name`: Systemctl prefix for the chain's cosmovisor service.
- `node_exporter`: Default is `true`. Change it to `false` if you do not want to install node_exporter. If true, enables the prometheus port in `config.toml`.
- `promtail`: Default is `false`. Change it to `true` if you want to install promtail.
- `nginx`: Default is `false`. Change it to `true` if you want to install nginx.
- `log_monitor`: Enter your monitor server IP if you install promtail.
- `log_name`: The server's name for the promtail service.
- `pagerduty_key`: The PagerDuty key if you use Tenderduty.
- `enableapi`: Default is `false`. Set to `true` if you want to enable the API endpoint.
- `enablegrpc`: Default is `false`. Set to `true` if you want to enable the gRPC endpoint.
- `publicrpc`: Default is `false`. Set to `true` if you want to allow the RPC port on the server.
- `external_address`: IP address to set as the external address in `config.toml`.
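An illustrative `group_vars/all.yml` excerpt using the variables above is shown below. The version numbers and IPs are placeholders chosen for the example, not recommendations from this repository.

```yaml
# Hypothetical group_vars/all.yml excerpt; versions and IPs are placeholders.
node_exporter_version: "1.7.0"
promtail_version: "2.9.4"
go_version: "1.21.6"
cosmovisor_version: "v1.5.0"
cosmovisor_service_name: cosmovisor

node_exporter: true        # also enables the prometheus port in config.toml
promtail: false
nginx: false

log_monitor: 203.0.113.50  # only relevant when promtail is true
log_name: sentry-1
pagerduty_key: ""          # only relevant when using Tenderduty

enableapi: false
enablegrpc: false
publicrpc: false
external_address: 203.0.113.10
```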
Look at `vars/<mainnet|testnet>/<chain>.yaml` for chain-specific variables.
# Install/Configure Chain
ansible-playbook main.yml -e "target=<mainnet|testnet>" -e "chain=<chain>"
This playbook will install Horcrux, a multi-party-computation (MPC) signing service for CometBFT, on the servers defined in `inventory.yml` under `horcrux_cluster`.
- `ansible_host`: Required. The IP address of the server.
- `type`: Should always be set to `horcrux`.
- `restart_horcrux`: Defaults to `true`. Change to `false` if you do not want the horcrux service to restart after a config update.
- `nodes`: The priv-val interface listen addresses of the chain sentry nodes to add to the config.
There are additional variables under `group_vars/all.yml` for global configuration applied to all horcrux cosigner nodes (a hypothetical inventory excerpt follows this list):
- `horcrux_repo`: Repo URL where the horcrux code resides.
- `horcrux_version`: Horcrux version to install.
- `horcrux_cosigner_port`: Defaults to `2222`. Port the cosigner nodes listen on.
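A hypothetical `horcrux_cluster` excerpt combining the host and global variables might look like the sketch below. The IPs, the ports in the `nodes` list, and the version are placeholders; the repo URL shown is the upstream Horcrux project.

```yaml
# Hypothetical horcrux_cluster inventory excerpt; addresses and versions are placeholders.
horcrux_cluster:
  hosts:
    cosigner-1:
      ansible_host: 203.0.113.21
      type: horcrux
      restart_horcrux: true
      nodes:
        - tcp://203.0.113.10:1234   # sentry 1 priv_validator_laddr
        - tcp://203.0.113.11:1234   # sentry 2 priv_validator_laddr
  vars:
    horcrux_repo: https://github.com/strangelove-ventures/horcrux
    horcrux_version: v3.3.0         # placeholder version
    horcrux_cosigner_port: 2222
```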
# Install/Configure Horcrux
ansible-playbook horcrux.yml
This playbook will configure a new Prometheus target with info from the chain.yml on the servers defined in `inventory.yml` under `telemetry`.
- `target`: Required. Whether `mainnet` or `testnet`.
- `chain`: Required. The chain network name to install/configure (should match a file under `vars/<mainnet|testnet>`).
- `var_file`: Tells the program where to look for the variable file.
- `cosmos_prom_file`: Tells the program the filename of the Prometheus targets file for the chains (see the example after this list).
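For context, the file that `cosmos_prom_file` names is a standard Prometheus file-based service-discovery target list. A minimal example is shown below; the target address, port, and labels are placeholders.

```yaml
# Illustrative Prometheus file_sd target file; address, port, and labels are placeholders.
- targets:
    - "203.0.113.10:26660"   # CometBFT prometheus endpoint on a sentry
  labels:
    chain: osmosis
    network: mainnet
```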
# Configure Prometheus for Chain
ansible-playbook support_prometheus.yml -e "target=<mainnet|testnet>" -e "chain=<chain>"
This playbook will configure a new Tenderduty chain with info from the chain.yml on the servers defined in `inventory.yml` under `telemetry`.
- `target`: Required. Whether `mainnet` or `testnet`.
- `chain`: Required. The chain network name to install/configure (should match a file under `vars/<mainnet|testnet>`).
- `var_file`: Tells the program where to look for the variable file.
- `tender_config_file`: Tells the program the filename of the Tenderduty config file for the chains (see the sketch after this list).
- `tender_url`: Tells the program the URL to check for liveness and health after editing the `tender_config_file`.
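For reference, the chain section that ends up in the file named by `tender_config_file` is a Tenderduty chain entry, roughly shaped like the sketch below. The chain id, valoper address, and node URL are placeholders, and the exact fields written by this playbook may differ; see the Tenderduty documentation for the full format.

```yaml
# Illustrative Tenderduty chain entry; values are placeholders.
chains:
  "Osmosis":
    chain_id: osmosis-1
    valoper_address: osmovaloper1exampleexampleexampleexample
    public_fallback: false
    nodes:
      - url: tcp://203.0.113.10:26657
        alert_if_down: true
```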
# Configure Tenderduty for Chain
ansible-playbook support_tenderduty.yml -e "target=<mainnet|testnet>" -e "chain=<chain>"
- Horcrux uses secp256k1 keys to encrypt (ECIES) and sign (ECDSA) cosigner-to-cosigner p2p communication, by encrypting the payloads that are sent over gRPC between cosigners. For security reasons, this step must be done manually, and the key files should be copied to each cosigner accordingly after running the following command:
horcrux create-ecies-shards --shards 3
- Horcrux uses threshold Ed25519 cryptography to sign a block payload on the cosigners and combine the resulting signatures into a signature that can be validated against your validator's Ed25519 public key. On the local machine that contains your full `priv_validator_key.json` key file(s), run the following command and copy the resulting files to each cosigner accordingly:
horcrux create-ed25519-shards --chain-id <chain> --key-file /path/to/priv_validator_key.json --threshold 2 --shards 3
- To avoid double-signing issues, manually supply signer state data to each cosigner.
For more information, refer to the documentation.
| Playbook | Description |
|---|---|
| `main.yml` | The main playbook to set up a node |
| `node_alertmanager.yml` | Installs and configures Alertmanager |
| `node_tenderduty.yml` | Installs Tenderduty |
| `setup.yml` | Secures the server with ssh config changes and firewall rules, and installs dependencies |
| `support_backup_node.yml` | Installs the snapshot, state_sync, resync, genesis and prune scripts on a backup node |
| `support_snapshot.yml` | Installs the snapshot script and a cron job |
| `support_state_sync.yml` | Installs the state-sync script |
| `support_resync.yml` | Installs a weekly scheduled state-sync and recovery script |
| `support_genesis.yml` | Installs a script to upload genesis |
| `support_prune.yml` | Installs a script to prune using cosmprund |
| `support_public_endpoints.yml` | Sets up an Nginx reverse proxy for public RPC/API |
| `support_seed.yml` | Installs a seed node with Tenderseed. You need a `node_key.json.j2` file so the node_id stays consistent |
| `support_price_feeder.yml` | Installs price feeders for selected chains (such as Umee, Kujira, etc.) |
| `support_scripts.yml` | Installs scripts to make node operations easier |
| `support_sync_snapshot.yml` | Syncs a node from a snapshot |
| `support_remove_node.yml` | Removes a node and cleans up |
| `support_update_min_gas.yml` | Updates the minimum gas price |
| `horcrux.yml` | Installs a horcrux cluster |
| `support_horcrux_config.yml` | Adds additional nodes to the horcrux config |
| `support_chain_horcrux.yml` | Updates `priv_validator_laddr` with the horcrux port |
| `support_bastion_firewall.yml` | Allows additional IPs to connect to the bastion |
| `support_prometheus.yml` | Configures Prometheus for a given chain |
| `support_tenderduty.yml` | Configures Tenderduty for a given chain |
ansible-playbook support_seed.yml -e "target=<mainnet|testnet>" -e "chain=<chain>" -e "[email protected]:36656"
ansible-playbook support_scripts.yml -e "target=<mainnet|testnet>"
Currently, we have 4 supported scripts. Their usage is documented below using Juno as an example:
./scripts/bank_balances/juno.sh
./scripts/bank_send/juno.sh ADDRESS 1000000ujuno
./scripts/distribution_withdrawal/juno.sh
./scripts/gov_vote/juno.sh 1 yes
ansible-playbook support_horcrux_config.yml
We believe we can always improve, so feel free to fork this repo and create a PR with your changes so that other people can also benefit from them.
This would not have been possible without the help of the people listed below. Thank you very much for providing this framework and creating an environment of collaboration while promoting automation, reliability, and security:
- Polkachu
- Strangelove Labs Inc.
Because this repo tries to accommodate as many Tendermint-based chains as possible, it cannot adapt to all edge cases. Here are some known issues and how to resolve them.
| Chain | Issue | Solution |
|---|---|---|
| Axelar | Some extra lines at the end of app.toml | Delete the extra lines and adjust the settings those lines are supposed to change |
| Canto | Genesis file needs to be unwrapped from .result.genesis | Unwrap the genesis with a jq command |
| Injective | Some extra lines at the end of app.toml | Delete the extra lines and adjust the settings those lines are supposed to change |
| Kichain | Some extra lines at the end of app.toml | Delete the extra lines and adjust the settings those lines are supposed to change |
| Celestia testnet | Inconsistent config.toml variable naming convention | Manually adjust the config.toml file |