Implement automation for archival snapshots #428

Closed
wants to merge 45 commits into from
45 commits
cf2eb58
Implement new archival service for lite and diff snapshots
sudo-shashank Mar 22, 2024
5ebd15c
Use Ansible
sudo-shashank Mar 26, 2024
47630a4
use env vars
sudo-shashank Mar 26, 2024
c8afdd4
lint fix
sudo-shashank Mar 27, 2024
16b1770
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 27, 2024
bac9bf2
Merge branch 'main' into shashank/archival-snapshots
sudo-shashank Mar 27, 2024
ca24207
add CI workflow
sudo-shashank Apr 1, 2024
d4af8c6
Merge branch 'main' into shashank/archival-snapshots
sudo-shashank Apr 1, 2024
d145f33
setup cron job
sudo-shashank Apr 2, 2024
246a45b
setup pre requisites
sudo-shashank Apr 8, 2024
d8b8719
fix
sudo-shashank Apr 8, 2024
f2cb21d
Merge branch 'main' into shashank/archival-snapshots
sudo-shashank Apr 8, 2024
9cd8cef
fix logs
sudo-shashank Apr 8, 2024
5374561
fix shellcheck
sudo-shashank Apr 8, 2024
98382d2
fix init.sh
sudo-shashank Apr 8, 2024
78e2794
Handle silent failure cases
sudo-shashank Apr 9, 2024
7c998f0
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 9, 2024
f340688
fmt
sudo-shashank Apr 9, 2024
d40d717
Merge branch 'main' into shashank/archival-snapshots
sudo-shashank Apr 9, 2024
dc58c93
testing ci connection
sudo-shashank Apr 9, 2024
d6ea9a9
testing ci
sudo-shashank Apr 9, 2024
f5151b4
try ci fix
sudo-shashank Apr 9, 2024
1be6808
fix ssh config in ci
sudo-shashank Apr 9, 2024
d94d7bd
fix ssh config
sudo-shashank Apr 9, 2024
44e5fd8
fix ssh config setup
sudo-shashank Apr 9, 2024
33269e8
use correct ssh key var
sudo-shashank Apr 9, 2024
d3bca3f
use web/factory/ssh-agent
sudo-shashank Apr 9, 2024
ff52e7e
test connection with cloudflared proxy
sudo-shashank Apr 9, 2024
52f6270
fix cloudflare installation
sudo-shashank Apr 9, 2024
68a1732
fix cloudflare pkg link
sudo-shashank Apr 9, 2024
63bd263
fix cloudflare pkg installation
sudo-shashank Apr 9, 2024
85932cf
use archie private key
sudo-shashank Apr 9, 2024
e0252f7
fix ssh connection
sudo-shashank Apr 9, 2024
ba4ddd6
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 9, 2024
9b55ec9
add to known host
sudo-shashank Apr 10, 2024
a7142f5
check ssh connection
sudo-shashank Apr 10, 2024
d8aa928
fix
sudo-shashank Jun 6, 2024
f34ba6f
Merge branch 'main' into shashank/archival-snapshots
sudo-shashank Jun 6, 2024
5f4cd5b
fix forest pkg uzip
sudo-shashank Jun 6, 2024
9bea38f
fix unzip path
sudo-shashank Jun 6, 2024
b732021
cleanup
sudo-shashank Jun 6, 2024
8b0127f
Merge branch 'main' into shashank/archival-snapshots
sudo-shashank Jun 19, 2024
80a9c30
Merge branch 'main' into shashank/archival-snapshots
sudo-shashank Jul 1, 2024
016a9d3
handle script failure
sudo-shashank Jul 2, 2024
3842343
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 2, 2024
57 changes: 57 additions & 0 deletions .github/workflows/deploy-archival-snapshots.yml
@@ -0,0 +1,57 @@
name: Export archival snapshots

on:
  schedule:
    - cron: '0 0 * * *'
  pull_request:
    paths:
      - 'ansible/archival-snapshots/**'
  push:
    paths:
      - 'ansible/archival-snapshots/**'
  workflow_dispatch:

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v2

      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'

      - name: Install Ansible
        run: |
          sudo apt-get update
          sudo apt-get install -y ansible

      - name: Download and install Cloudflared
        run: |
          wget https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64.deb
          sudo dpkg -i cloudflared-linux-amd64.deb
          cloudflared --version

      - name: Configure ssh-agent
        uses: webfactory/[email protected]
        with:
          ssh-private-key: ${{ secrets.ARCHIE_PRIVATE_KEY }}

      - name: Store SSH key
        env:
          SSH_PRIVATE_KEY: ${{ secrets.ARCHIE_PRIVATE_KEY }}
        run: |
          cat "$GITHUB_WORKSPACE/ansible/archival-snapshots/resources/ssh_config" >> ~/.ssh/config
          echo "$SSH_PRIVATE_KEY" > ~/.ssh/id_rsa_archie
          chmod 600 ~/.ssh/id_rsa_archie

      - name: Run Ansible playbook
        env:
          ANSIBLE_HOST_KEY_CHECKING: "False"
          ARCHIVAL_SLACK_TOKEN: ${{ secrets.SLACK_TOKEN }}
          ENDPOINT: https://2238a825c5aca59233eab1f221f7aefb.r2.cloudflarestorage.com/
        run: |
          ansible-playbook -i ansible/archival-snapshots/inventory.ini ansible/archival-snapshots/playbook.yml
2 changes: 2 additions & 0 deletions ansible/archival-snapshots/inventory.ini
@@ -0,0 +1,2 @@
[remote_server]
archie.chainsafe.dev ansible_user=archie ansible_ssh_private_key_file=~/.ssh/id_rsa_archie
66 changes: 66 additions & 0 deletions ansible/archival-snapshots/playbook.yml
@@ -0,0 +1,66 @@
---
- name: Automate packaging, transferring, and executing a script
  hosts: remote_server
  vars:
    local_resources_path: "resources"
    remote_resources_path: "/mnt/md0/exported/archival"
    zip_file_name: "resources.zip"
    forest_version: "v0.17.2"
    forest_release_url: "https://github.com/ChainSafe/forest/releases/download/{{ forest_version }}/forest-{{ forest_version }}-linux-amd64.zip"
  tasks:
    - name: Check if AWS CLI is installed
      ansible.builtin.command:
        cmd: "which aws"
      register: aws_installed
      changed_when: false
      ignore_errors: true

    - name: Install AWS CLI if not installed
      ansible.builtin.shell:
        cmd: "sudo apt-get update && sudo apt-get install -y awscli"
      when: aws_installed.rc != 0

    - name: Check if Ruby is installed
      ansible.builtin.command:
        cmd: "which ruby"
      register: ruby_installed
      changed_when: false
      ignore_errors: true

    - name: Install Ruby if not installed
      ansible.builtin.shell:
        cmd: "sudo apt-get update && sudo apt-get install -y ruby-full"
      when: ruby_installed.rc != 0

    - name: Zip the resources folder
      ansible.builtin.command:
        cmd: "zip -r {{ zip_file_name }} ."
        chdir: "{{ local_resources_path }}"
      delegate_to: localhost

    - name: Transfer the zip file to the remote server
      ansible.builtin.copy:
        src: "{{ local_resources_path }}/{{ zip_file_name }}"
        dest: "{{ remote_resources_path }}/{{ zip_file_name }}"

    - name: Unzip the resources folder on the remote server
      ansible.builtin.command:
        cmd: "unzip -o {{ zip_file_name }}"
        chdir: "{{ remote_resources_path }}"

    - name: Download Forest release package
      ansible.builtin.get_url:
        url: "{{ forest_release_url }}"
        dest: "{{ remote_resources_path }}/forest.zip"

    - name: Unzip the Forest release package on the remote server
      ansible.builtin.command:
        cmd: "unzip -jo {{ remote_resources_path }}/forest.zip -d {{ remote_resources_path }}/forest/"

    - name: Execute the init.sh script
      ansible.builtin.shell:
        cmd: "nohup ./init.sh > init.log 2>&1 &"
        chdir: "{{ remote_resources_path }}"
      environment:
        ARCHIVAL_SLACK_TOKEN: "{{ lookup('env', 'ARCHIVAL_SLACK_TOKEN') }}"
        ENDPOINT: "{{ lookup('env', 'ENDPOINT') }}"
3 changes: 3 additions & 0 deletions ansible/archival-snapshots/resources/config.toml
@@ -0,0 +1,3 @@
[client]
data_dir = "/mnt/md0/forest-archival-data"
encrypt_keystore = false
41 changes: 41 additions & 0 deletions ansible/archival-snapshots/resources/diff_script.sh
@@ -0,0 +1,41 @@
#!/usr/bin/env bash

set -euxo pipefail

FOREST=/mnt/md0/exported/archival/forest/forest-tool
UPLOADED_DIFFS=/mnt/md0/exported/archival/uploaded-diff-snaps.txt
UPLOAD_QUEUE="/mnt/md0/exported/archival/upload_files.txt"

EPOCH_START="$1"
shift
DIFF_STEP=3000
DIFF_COUNT=10
GENESIS_TIMESTAMP=1598306400
SECONDS_PER_EPOCH=30

# Clear Upload List
if [ -f "$UPLOAD_QUEUE" ]; then
# Clear the contents of the file
true > "$UPLOAD_QUEUE"
fi


aws --profile prod --endpoint "$ENDPOINT" s3 ls "s3://forest-archive/mainnet/diff/" > "$UPLOADED_DIFFS"

for i in $(seq 1 $DIFF_COUNT); do
EPOCH=$((EPOCH_START+DIFF_STEP*i))
EPOCH_TIMESTAMP=$((GENESIS_TIMESTAMP + EPOCH*SECONDS_PER_EPOCH))
DATE=$(date --date=@"$EPOCH_TIMESTAMP" -u -I)
FILE_NAME="forest_diff_mainnet_${DATE}_height_$((EPOCH-DIFF_STEP))+$DIFF_STEP.forest.car.zst"
FILE="/mnt/md0/exported/archival/diff_snapshots/$FILE_NAME"
if ! grep -qF "$FILE_NAME" "$UPLOADED_DIFFS"; then
if ! test -f "$FILE"; then
# Export diff snapshot
"$FOREST" archive export --depth "$DIFF_STEP" --epoch "$EPOCH" --diff $((EPOCH-DIFF_STEP)) --diff-depth 900 --output-path "$FILE" "$@"
fi
# Add exported diff snapshot to upload queue
echo "$FILE" >> "$UPLOAD_QUEUE"
else
echo "Skipping $FILE"
fi
done
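The naming arithmetic in the loop above can be checked in isolation. The following is a minimal sketch, not part of the PR: `diff_file_name` is a hypothetical helper that reuses the constants from `diff_script.sh` and assumes GNU `date` is available, as the script itself does.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Constants copied from diff_script.sh.
GENESIS_TIMESTAMP=1598306400
SECONDS_PER_EPOCH=30
DIFF_STEP=3000

# Hypothetical helper (an assumption for illustration): print the diff
# snapshot file name for the i-th window after a given start epoch.
diff_file_name() {
  local epoch_start="$1" i="$2"
  local epoch=$((epoch_start + DIFF_STEP * i))
  local ts=$((GENESIS_TIMESTAMP + epoch * SECONDS_PER_EPOCH))
  local day
  day=$(date --date=@"$ts" -u -I)
  echo "forest_diff_mainnet_${day}_height_$((epoch - DIFF_STEP))+${DIFF_STEP}.forest.car.zst"
}

# First window after epoch 0 covers heights 0 to 3000.
diff_file_name 0 1
# → forest_diff_mainnet_2020-08-25_height_0+3000.forest.car.zst
```

Each name records the base height and the step, so ten windows of 3000 epochs cover one 30000-epoch lite snapshot interval.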
46 changes: 46 additions & 0 deletions ansible/archival-snapshots/resources/init.sh
@@ -0,0 +1,46 @@
#!/bin/bash

## Enable strict error handling, command tracing, and pipefail
set -euxo pipefail

# Initialize snapshots directory
LITE_SNAPSHOT_DIR="/mnt/md0/exported/archival/lite_snapshots"
DIFF_SNAPSHOT_DIR="/mnt/md0/exported/archival/diff_snapshots"
FULL_SNAPSHOTS_DIR=/mnt/md0/exported/archival/snapshots

if [ ! -d "$LITE_SNAPSHOT_DIR" ]; then
mkdir -p "$LITE_SNAPSHOT_DIR"
echo "Created $LITE_SNAPSHOT_DIR"
else
echo "$LITE_SNAPSHOT_DIR exists"
fi

if [ ! -d "$DIFF_SNAPSHOT_DIR" ]; then
mkdir -p "$DIFF_SNAPSHOT_DIR"
echo "Created $DIFF_SNAPSHOT_DIR"
else
echo "$DIFF_SNAPSHOT_DIR exists"
fi

if [ ! -d "$FULL_SNAPSHOTS_DIR" ]; then
mkdir -p "$FULL_SNAPSHOTS_DIR"
echo "Created $FULL_SNAPSHOTS_DIR"
else
echo "$FULL_SNAPSHOTS_DIR exists"
fi

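# Trigger main script. Capture its exit status with `||` so that `set -e`
# does not abort this script before the failure notification is sent.
EXIT_STATUS=0
./main.sh || EXIT_STATUS=$?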

# Notify on slack channel
if [ "$EXIT_STATUS" -eq 0 ]; then
echo "Script executed successfully"
ruby notify.rb success
else
echo "Script execution failed"
ruby notify.rb failure
fi

exit "$EXIT_STATUS"
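Because `init.sh` runs with `set -e`, a bare failing command ends the script at that line, so a later `$?` check only ever sees success. A minimal sketch of one way to record the status instead, assuming a hypothetical helper named `run_and_capture` (not part of the PR):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: run a command and store its exit status in
# CAPTURED_STATUS. The `||` keeps `set -e` from aborting on failure.
run_and_capture() {
  CAPTURED_STATUS=0
  "$@" || CAPTURED_STATUS=$?
}

run_and_capture false
echo "status: $CAPTURED_STATUS"
```

The same pattern lets the success and failure branches of the Slack notification both be reached.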
112 changes: 112 additions & 0 deletions ansible/archival-snapshots/resources/main.sh
@@ -0,0 +1,112 @@
#!/bin/bash

## Enable strict error handling, command tracing, and pipefail
set -euxo pipefail

## Set constants
GENESIS_TIMESTAMP=1598306400
SECONDS_PER_EPOCH=30

## Set forest artifacts path
FOREST="/mnt/md0/exported/archival/forest/forest"
FOREST_CLI="/mnt/md0/exported/archival/forest/forest-cli"
FOREST_TOOL="/mnt/md0/exported/archival/forest/forest-tool"

LITE_SNAPSHOT_DIR=/mnt/md0/exported/archival/lite_snapshots
FULL_SNAPSHOTS_DIR=/mnt/md0/exported/archival/snapshots

## Fetch last snapshot details
LAST_SNAPSHOT=$(aws --profile prod --endpoint "$ENDPOINT" s3 ls "s3://forest-archive/mainnet/lite/" | sort | tail -n 1 | awk '{print $NF}')
LAST_EPOCH=$(echo "$LAST_SNAPSHOT" | awk -F'_' '{gsub(/[^0-9]/, "", $6); print $6}')
LAST_FULL_SNAPSHOT_PATH="$FULL_SNAPSHOTS_DIR/$LAST_SNAPSHOT"

if [ ! -f "$LAST_FULL_SNAPSHOT_PATH" ]; then
echo "Downloading last snapshot: $LAST_FULL_SNAPSHOT_PATH"
aws --profile prod --endpoint "$ENDPOINT" s3 cp "s3://forest-archive/mainnet/lite/$LAST_SNAPSHOT" "$LAST_FULL_SNAPSHOT_PATH"
echo "Last snapshot downloaded: $LAST_FULL_SNAPSHOT_PATH"
else
echo "$LAST_FULL_SNAPSHOT_PATH snapshot exists."
fi

# Clean forest db
$FOREST_TOOL db destroy --force

echo "Starting forest daemon"
nohup $FOREST --no-gc --config ./config.toml --save-token ./admin_token --rpc-address 127.0.0.1:3456 --metrics-address 127.0.0.1:5000 --import-snapshot "$LAST_FULL_SNAPSHOT_PATH" > forest.log 2>&1 &
FOREST_NODE_PID=$!

sleep 30
echo "Forest process started with PID: $FOREST_NODE_PID"

# Function to kill Forest daemon
function kill_forest_daemon {
echo "Killing Forest daemon with PID: $FOREST_NODE_PID"
kill -KILL $FOREST_NODE_PID
}

# Set trap to kill Forest daemon on script exit or error
trap kill_forest_daemon EXIT

# Set required env variables
function set_fullnode_api_info {
ADMIN_TOKEN=$(cat admin_token)
export FULLNODE_API_INFO="$ADMIN_TOKEN:/ip4/127.0.0.1/tcp/3456/http"
echo "Using: $FULLNODE_API_INFO"
}
set_fullnode_api_info

# Wait for network to sync
echo "Waiting for forest to sync to latest network head"
$FOREST_CLI sync wait

# Get latest epoch using sync status
echo "Current Height: $LAST_EPOCH"
LATEST_EPOCH=$($FOREST_CLI sync status | grep "Height:" | awk '{print $2}')
echo "Latest Height: $LATEST_EPOCH"

while ((LATEST_EPOCH - LAST_EPOCH > 30000)); do
set_fullnode_api_info
NEW_EPOCH=$((LAST_EPOCH + 30000))
echo "Next Height: $NEW_EPOCH"

# Export full snapshot to generate lite and diff snapshots
EPOCH_TIMESTAMP=$((GENESIS_TIMESTAMP + NEW_EPOCH*SECONDS_PER_EPOCH))
DATE=$(date --date=@"$EPOCH_TIMESTAMP" -u -I)
NEW_SNAPSHOT="forest_snapshot_mainnet_${DATE}_height_${NEW_EPOCH}.forest.car.zst"
if [ ! -f "$FULL_SNAPSHOTS_DIR/$NEW_SNAPSHOT" ]; then
echo "Exporting snapshot: $FULL_SNAPSHOTS_DIR/$NEW_SNAPSHOT"
echo "USING FULLNODE API: $FULLNODE_API_INFO"
$FOREST_CLI snapshot export --tipset "$NEW_EPOCH" --depth 30000 -o "$FULL_SNAPSHOTS_DIR/$NEW_SNAPSHOT" > export.txt
echo "Snapshot exported: $FULL_SNAPSHOTS_DIR/$NEW_SNAPSHOT"
else
echo "$FULL_SNAPSHOTS_DIR/$NEW_SNAPSHOT already exists."
fi

# Generate and upload lite snapshot
if [ ! -f "$LITE_SNAPSHOT_DIR/$NEW_SNAPSHOT" ]; then
echo "Generating Lite snapshot: $LITE_SNAPSHOT_DIR/$NEW_SNAPSHOT"
$FOREST_TOOL archive export --epoch "$NEW_EPOCH" --output-path "$LITE_SNAPSHOT_DIR" "$FULL_SNAPSHOTS_DIR/$NEW_SNAPSHOT"
echo "Lite snapshot generated: $LITE_SNAPSHOT_DIR/$NEW_SNAPSHOT"
else
echo "$NEW_SNAPSHOT lite snapshot already exists."
fi
echo "Uploading Lite snapshot: $LITE_SNAPSHOT_DIR/$NEW_SNAPSHOT"
aws --profile prod --endpoint "$ENDPOINT" s3 cp "$LITE_SNAPSHOT_DIR/$NEW_SNAPSHOT" "s3://forest-archive/mainnet/lite/"
echo "Lite snapshot uploaded: $LITE_SNAPSHOT_DIR/$NEW_SNAPSHOT"

# Generate and upload diff snapshots
if [ ! -f "$LAST_FULL_SNAPSHOT_PATH" ]; then
echo "File does not exist. Exporting..."
$FOREST_CLI snapshot export --tipset "$LAST_EPOCH" --depth 30000 -o "$LAST_FULL_SNAPSHOT_PATH"
else
echo "$LAST_FULL_SNAPSHOT_PATH file exists."
fi
echo "Generating Diff snapshots: $LAST_EPOCH - $NEW_EPOCH"
./diff_script.sh "$LAST_EPOCH" "$LAST_FULL_SNAPSHOT_PATH" "$FULL_SNAPSHOTS_DIR/$NEW_SNAPSHOT"
echo "Diff snapshots generated successfully"
echo "Uploading Diff snapshots"
./upload_diff.sh "$ENDPOINT"
echo "Diff snapshots uploaded successfully"

LAST_EPOCH=$NEW_EPOCH
done
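The height bookkeeping in `main.sh` can be exercised without a running node. The sketch below is an illustration under stated assumptions: `epoch_from_name` wraps the same `awk` expression the script uses, while `pending_heights` and the sample file name are hypothetical and only mirror the `while` loop condition.

```shell
#!/usr/bin/env bash
set -euo pipefail

STEP=30000  # epochs between lite snapshots, as in main.sh

# Same awk expression as main.sh: take the 6th "_"-separated field
# of the snapshot name and keep only its digits.
epoch_from_name() {
  echo "$1" | awk -F'_' '{gsub(/[^0-9]/, "", $6); print $6}'
}

# Hypothetical helper: list the heights the loop would export, given
# the last uploaded height and the current chain head.
pending_heights() {
  local last="$1" latest="$2"
  while ((latest - last > STEP)); do
    last=$((last + STEP))
    echo "$last"
  done
}

last=$(epoch_from_name "forest_snapshot_mainnet_2023-12-23_height_3500000.forest.car.zst")
pending_heights "$last" 3565000
```

Note that the comparison is strict, so a head exactly 30000 epochs past the last snapshot does not trigger an export yet.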
17 changes: 17 additions & 0 deletions ansible/archival-snapshots/resources/notify.rb
@@ -0,0 +1,17 @@
# frozen_string_literal: true

require 'slack-ruby-client'

CHANNEL = '#forest-dump'
SLACK_TOKEN = ENV.fetch('ARCHIVAL_SLACK_TOKEN')
STATUS = ARGV[0]

client = Slack::Web::Client.new(token: SLACK_TOKEN)

message = if STATUS == 'success'
'✅ Lite and Diff snapshots updated. 🌲🌳🌲🌳🌲'
else
'❌ Failed to update Lite and Diff snapshots. 🔥🌲🔥'
end

client.chat_postMessage(channel: CHANNEL, text: message, as_user: true)
3 changes: 3 additions & 0 deletions ansible/archival-snapshots/resources/ssh_config
@@ -0,0 +1,3 @@
Host archie.chainsafe.dev
ProxyCommand /usr/local/bin/cloudflared access ssh --hostname %h
User archie
11 changes: 11 additions & 0 deletions ansible/archival-snapshots/resources/upload_diff.sh
@@ -0,0 +1,11 @@
#!/bin/bash

## Enable strict error handling, command tracing, and pipefail
set -euxo pipefail

ENDPOINT="$1"

while read -r file; do
# Upload the file to the S3 bucket
aws --profile prod --endpoint "$ENDPOINT" s3 cp "$file" "s3://forest-archive/mainnet/diff/"
done < /mnt/md0/exported/archival/upload_files.txt
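The upload loop trusts every line of the queue file. As a hedged sketch of a pre-upload check, the hypothetical helper `existing_queue_entries` below (not part of the PR) prints only queued paths that exist on disk and reports the rest on stderr; the queue format of one path per line is taken from `diff_script.sh`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: read a queue file with one path per line, print
# the paths that exist, and warn on stderr about the ones that do not.
existing_queue_entries() {
  local queue="$1" file
  while IFS= read -r file; do
    # Skip blank lines.
    [ -n "$file" ] || continue
    if [ -f "$file" ]; then
      printf '%s\n' "$file"
    else
      echo "missing: $file" >&2
    fi
  done < "$queue"
}

# Usage (path taken from the scripts above):
#   existing_queue_entries /mnt/md0/exported/archival/upload_files.txt
```

Filtering first would let a run report missing exports up front rather than failing partway through the uploads.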