This project is inspired by Eth Docker, and they are complementary. Whereas Eth Docker focuses on installing the Ethereum clients, this project focuses on setting up the Linux server itself, on which Eth Docker will run. Eth Server uses Ansible to automate SSH hardening, VPN setup, chrony, monitoring, etc. In short:
- eth-server sets up your staking server
- then eth-docker sets up your Ethereum clients
So far, this project works for me and I feel like sharing it. My current motivation is to inspire others, not necessarily to extend the project so that it meets everyone's custom needs and setup.
Eth Docker is real utility software. This project here is mainly hacked together.
- keep it simple: it should be easy for most people to onboard to the project and tweak it to their needs.
- it is opinionated: there are a million possible combinations of OS and software to create a staking server. As much as we want diversity in the staking ecosystem, this project only focuses on a simple and rather mainstream solution.
- production ready: although simple, this Ansible playbook should reliably set up a secure server, ready for staking.
This project is meant for solo stakers with a single server.
Out of scope are:
- multi-tenant setup
- managing a fleet of N servers
I don't enjoy configuring my server, and I always forget what I've done and how. So I want to automate and document those mundane tasks. Moreover, if my server dies, I want to be able to recreate it within minutes, in a deterministic and reproducible fashion. Ansible was designed for that!
This Ansible playbook does the following:
- it upgrades packages.
- it installs Fail2Ban.
- it installs UFW and sets it up, only allowing the SSH, execution and consensus client ports.
- it hardens the SSH config.
- it updates the chrony config to improve time synchronization. See the Reddit discussion on missing attestations, chrony and time sync drift.
- it installs a Cloudflare Tunnel that is used to SSH into the server from the public internet without having to expose the SSH port.
- it updates the `systemd-networkd-wait-online` service to avoid blocking the server at boot time (at least in my case).
- it downloads eth-docker and uses it to install Docker (making use of those great scripts).
- it installs a poor man's heartbeat script as a systemd daemon to send heartbeats to Grafana Cloud OnCall every 15 seconds (a minimal sketch follows this list).
- it installs the Grafana Agent to monitor the server (logs and metrics, including Docker containers). It also takes care of setting up the Docker integration.
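Purely as an illustration of the heartbeat idea, a minimal sketch could look like the following; the actual script shipped by the playbook may differ, and `HEARTBEAT_URL` is a placeholder for the unique heartbeat endpoint that your Grafana Cloud OnCall integration generates.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a "poor man's" heartbeat, meant to run as a
# systemd service (e.g. with Restart=always).
set -euo pipefail

HEARTBEAT_URL="${HEARTBEAT_URL:?set the OnCall heartbeat URL}"

while true; do
  # A simple GET on the heartbeat URL tells OnCall the server is still alive.
  # Failures are ignored so a transient network blip doesn't kill the loop.
  curl --silent --fail --max-time 10 "${HEARTBEAT_URL}" > /dev/null || true
  sleep 15
done
```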
The Cloudflare Tunnel works. But there are alternatives such as Tailscale. The beauty of these tools is that they are very simple to install and work without exposing any ports to the public internet, thus greatly reducing the attack surface of the server compared to running a VPN on the server. On the other hand, you're trusting a 3rd party. It works for me.
I want to keep my monitoring stack decoupled from Docker (in case Docker fails, for example), and I want to keep it super simple. I also want to keep my monitoring stack decoupled from eth-docker: although eth-docker offers monitoring as part of its config, I prefer to reduce its scope to only managing my Ethereum clients. Lastly, instead of running many components such as Prometheus, Loki and other tools, the Grafana Agent ticks all the boxes for me.
Sometimes I still miss attestations, and that's frustrating. So I've set up probes both ways:
- Grafana Cloud pings my clients with TCP probes every minute (synthetic monitoring)
- My server is sending heartbeats to Grafana Cloud OnCall every 15s
This is useful for correlating network issues, or ruling them out. So far it points to my server being very reliable, while my ISP is sometimes less so... This is a residential connection and the ISP doesn't offer the same guarantees as for businesses. Not much I can do. Nonetheless, I'm still happy to see my heartbeat healthy.
You don't need most of the points below to test the Ansible scripts locally: on your laptop, Vagrant is used to spin up Ubuntu in a VM that Ansible runs the playbooks against. The points below are for the real staking server; if you're only testing locally on your laptop, feel free to skip this section.
Laptop
- install Vagrant
- install VirtualBox to be used by Vagrant
- install Ansible
- install the cloudflared client
Internet Box
- either get a static public IP for your server, or set up dynamic name resolution (most internet boxes offer that). My ISP doesn't offer a static IP, but I do have a dynamic name of the form `<my-sub-domain>.<my-isp-domain>.com`. That name will be used to create TCP health checks in Grafana Cloud. If you don't plan on setting up health checks, then you don't need to set up dynamic name resolution in your home router.
- set up port forwarding for your execution (default `30303`) and consensus (default `9000`) client ports, for both UDP and TCP (a quick check is sketched after this list). If you use Cloudflare Tunnel or Tailscale, you shouldn't need to expose any other ports: i.e. my SSH port 22 is NOT exposed.
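Once the forwarding rules are in place, a quick sanity check from outside your LAN (e.g. via a phone hotspot) could look like the sketch below; `<my-dynamic-name>` stands for the dynamic DNS name of your internet box, and the ports assume the defaults above.

```bash
# Check that the router forwards the default client ports (TCP side only;
# UDP is hard to verify reliably with nc).
nc -vz <my-dynamic-name> 30303   # execution client
nc -vz <my-dynamic-name> 9000    # consensus client
```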
Domain name
- buy a domain name that will be used to connect to your server from the public internet through the Cloudflare Tunnel. Use any registrar of your choice, or Cloudflare directly.
Cloudflare
- Create a Cloudflare account
- Onboard your domain (or buy the domain directly from Cloudflare)
- Create 2 Tunnels (one for prod, and one for local) in the Cloudflare console and copy-paste the tokens into the inventory files in `inventories`, under the key `cloudflare.token`.
Grafana Cloud
- create an account (free tier)
- add your API key to the file `inventories/prod.yaml` at the key `grafana.grafana_cloud_api_key`.
- set up the metrics endpoint and update the keys `grafana.metrics_username` and `grafana.prometheus_url` in the config file `inventories/prod.yaml`.
- set up the logs endpoint and update the keys `grafana.logs_username` and `grafana.loki_url` in the config file `inventories/prod.yaml`.
- define a name for your server with the key `grafana.instance_name` in the config file `inventories/prod.yaml` (a sketch of how these keys fit together follows this list).
- create TCP probes (synthetic monitoring) on your EL and CL client ports, using the dynamic name of your ISP internet box.
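To give an idea of how the keys above fit together, here is a hypothetical sketch of a prod inventory. The authoritative layout is `inventories/example.yaml` in the repo, so copy that file rather than this snippet, and treat all values below as placeholders.

```bash
# Hypothetical sketch only: the real key layout comes from inventories/example.yaml.
cat > inventories/prod.yaml <<'EOF'
cloudflare:
  token: "<prod-tunnel-token>"
grafana:
  instance_name: "<name-of-your-server>"
  grafana_cloud_api_key: "<grafana-cloud-api-key>"
  metrics_username: "<metrics-username>"
  prometheus_url: "<prometheus-remote-write-url>"
  logs_username: "<logs-username>"
  loki_url: "<loki-push-url>"
EOF
```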
Server
- get your hardware.
- flash a USB stick with Ubuntu 22.04 LTS server (or the OS of your choice).
- get yourself a keyboard and a screen.
- install the OS. Create strong passwords (a strong BIOS password is also recommended) and save them carefully (ideally in a password manager).
- make sure you have correctly expanded your volumes and that the OS sees the full capacity of your disks.
- make sure you use your SSH key when SSHing into the server: if you rely on username/password authentication and run the Ansible playbook, you will be locked out of the server, because the playbook disables password authentication.
Laptop
- create an SSH key pair and upload the public key to the server.
- edit your SSH config `~/.ssh/config` with the server host and the key to be used (a sketch follows below).
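As a sketch, the entry could look like the following; the host alias, hostname, user and key path are placeholders, and the `ProxyCommand` line only applies if you SSH through the Cloudflare Tunnel (it requires cloudflared on the laptop and a tunnel hostname routed to SSH).

```bash
# Hypothetical example entry; adjust names, hostname and key path to your setup.
cat >> ~/.ssh/config <<'EOF'
Host eth-server
  HostName ssh.<your-domain>.com
  User <your-user>
  IdentityFile ~/.ssh/id_ed25519
  IdentitiesOnly yes
  # Only when going through the Cloudflare Tunnel:
  ProxyCommand cloudflared access ssh --hostname %h
EOF
```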
Configuration is stored in `inventories/`. Because it contains secrets, a low-tech solution is simply to ignore those files in `.gitignore`. Copy the file `example.yaml` to `prod.yaml` and `local.yaml`:
- `inventories/local.yaml` is used with Vagrant, to test things locally.
- `inventories/prod.yaml` is used with your server, the real thing.
If you prefer not to maintain your secrets in files (recommended), you can also pass the values with the `--extra-vars` flag. For example:

```bash
ansible-playbook -i inventories/prod.yaml playbooks/main.yaml --extra-vars='{
  "cloudflare": {
    "secret_token": "<your_secret_token>"
  }
}'
```
It also works with environment variables:

```bash
CLOUDFLARE_TOKEN="<your_secret_token>"

ansible-playbook -i inventories/prod.yaml playbooks/main.yaml --extra-vars='{
  "cloudflare": {
    "secret_token": '"${CLOUDFLARE_TOKEN}"'
  }
}'
```
The `--extra-vars` flag always overwrites variables defined elsewhere.
You can test everything on your laptop (locally) thanks to Vagrant. Of all the prerequisites above, you only need:
- the Grafana API key, usernames and config. It's actually cool to already see metrics and logs flowing into Grafana Cloud from the Vagrant VM.
- the Cloudflare token for the tunnel. You will also see the tunnel showing up as "UP" in the UI.
If you don't want the 2 points above, comment out (or delete) that code in the Ansible playbook.
Edit the local config file `inventories/local.yaml` according to your setup. Good practice is to create distinct API keys for the local and prod configs.
Clone the project on your laptop:

```bash
git clone git@github.com:salanfe/eth-server.git
```

and add your API keys to the local config file `inventories/local.yaml`.
Then, start a virtual Ubuntu server on your laptop with Vagrant:

```bash
vagrant up
```

Once the virtual server is up and running, run the playbook with the local config:

```bash
ansible-playbook -i inventories/local.yaml playbooks/main.yaml --diff
```

SSH into the virtual server to double-check the config and see for yourself the changes applied by Ansible:

```bash
vagrant ssh
```
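A few spot checks you could run inside the VM are sketched below; the exact service names (e.g. `grafana-agent`) depend on what the playbook installs in your setup, so treat them as assumptions.

```bash
sudo ufw status verbose           # firewall rules applied by the playbook
sudo fail2ban-client status sshd  # Fail2Ban jail protecting SSH
chronyc tracking                  # time synchronization status
systemctl status grafana-agent    # metrics/logs shipping (service name is an assumption)
docker ps                         # Docker installed via the eth-docker scripts
```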
Feel free to edit the playbook `playbooks/main.yaml` as much as you want: e.g. you can remove Grafana and Cloudflare.
Create new API keys for Cloudflare Tunnel and Grafana Cloud that are dedicated to "prod", i.e. your real staking server. Edit the file `inventories/prod.yaml` accordingly.
Do a dry run first:

```bash
ansible-playbook -i inventories/prod.yaml playbooks/main.yaml --ask-become-pass --diff --check
```

Carefully check the output. Then run the playbook:

```bash
ansible-playbook -i inventories/prod.yaml playbooks/main.yaml --ask-become-pass --diff
```

If you want to pass secrets dynamically, you can use the `--extra-vars` flag (see the config section above).
Below are possible ideas for future improvements:
- add Tailscale and/or a VPN server as alternatives to Cloudflare Tunnel. Having both Tailscale and Cloudflare Tunnel installed can offer a fail-safe alternative.
- fine tune the grafana agent config: scraping logs, relabeling, etc.
- get a definitive solution for the chrony config. Using the Google NTP servers works so far, but there's probably a better solution. Additionally, the consensus protocol doesn't expect leap-smeared time, while the Google NTP servers serve smeared time.
- add more tests and validation
- get more eyes on the ansible playbook
- Terraform the 3rd parties (Grafana Cloud, Cloudflare). Those are less critical than the server itself, and doing things by hand is probably fine, as it's mainly a one-time setup. Nonetheless, having Terraform could speed up onboarding of new joiners (to be balanced with the extra complexity of introducing yet another tool).
- use Ansible to manage the eth-docker `.env` file. But there would need to be reconciliation logic going both ways, making it probably more error prone than necessary. Nonetheless, if the server dies, having a copy of the `.env` file is also a good thing.
- better solution to manage secrets (e.g. I like working with Google Cloud... add a script to pull secrets from Google Secret Manager). E.g. see below:
```bash
GCP_PROJECT_ID="<ID of your GCP project>"
GCP_SECRET_NAME="<name of the secret>"

ansible-playbook -i inventories/prod.yaml playbooks/main.yaml --extra-vars='{
  "cloudflare": {
    "secret_token": '"$(gcloud secrets versions access latest --secret=${GCP_SECRET_NAME} --project=${GCP_PROJECT_ID})"'
  }
}'
```
Currently running a node with the following hardware:
- Case: Silverstone SST-SG05BB-Lite (Mini ITX, Mini DTX)
- Motherboard: Supermicro X12STL-IF (reference)
- SSD: Samsung 980 Pro with Heatsink (2000 GB, M.2 2280)
- RAM: Crucial DDR4 ECC UDIMM 2Rx8 3200 (1 x 32GB, 3200 MHz, DDR4-RAM, DIMM)
- CPU: Intel Core i3-10105F (LGA 1200, 3.70 GHz, 4-Core)
- PSU: Cooler Master V Series SFX (750 W)
Running Besu and Teku to contribute to client diversity with minority clients.
- https://github.com/eth-educators/eth-docker: no need to introduce that one.
- https://github.com/CryptoManufaktur-io/backend-ansible: the people behind eth-docker also maintain an Ansible repo for running nodes. Much more advanced (but also more complex), it is worth a look.
Submit an issue or open a merge request.
Here's my ethereum address if you feel like buying me a coffee ☕❤️ 0x48707A2D8cf862E14401e4DeDB94cD6b97bd67d7