Merge pull request #420 from celo-org/node-chart-sequencer
[WIP] OP node stopped flag depending on RID
alvarof2 authored Nov 1, 2024
2 parents 54ac9d7 + 1a73e1e commit 0680090
Showing 19 changed files with 458 additions and 69 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -36,6 +36,7 @@ The charts are published to the OCI registry at `oci://us-west1-docker.pkg.dev/d
- [op-bootnode](./charts/op-bootnode/README.md) - Celo implementation for op-bootnode (Optimism Rollup)
- [op-conductor](./charts/op-conductor/README.md) - Helm chart deploying OP Conductor, a HA controller for op-node
- [op-conductor-mon](./charts/op-conductor-mon/README.md) - A Helm chart for OP Conductor monitoring
- [op-conductor-start-tool](./charts/op-conductor-start-tool/README.md) - OP Conductor start tool Cel2 network
- [op-geth](./charts/op-geth/README.md) - Celo implementation for op-geth execution engine (Optimism Rollup)
- [op-geth-bootnode](./charts/op-geth-bootnode/README.md) - Celo implementation for op-geth-bootnode execution engine (Optimism Rollup)
- [op-node](./charts/op-node/README.md) - Celo implementation for op-node consensus engine (Optimism Rollup)
27 changes: 27 additions & 0 deletions charts/op-conductor-start-tool/Chart.yaml
@@ -0,0 +1,27 @@
---
name: op-conductor-start-tool
apiVersion: v2
version: 0.0.1
description: OP Conductor start tool Cel2 network
home: https://clabs.co
sources:
- https://celo.org
- https://docs.celo.org
- https://clabs.co
- https://github.com/celo-org
keywords:
- celo
- blockchain
- optimism
- rollup
- ethereum
- layer2
- op-stack
- op-conductor
maintainers:
  - name: cLabs
    email: [email protected]
    url: https://clabs.co
type: application
icon: https://pbs.twimg.com/profile_images/1613170131491848195/InjXBNx9_400x400.jpg
appVersion: v1.0.0
43 changes: 43 additions & 0 deletions charts/op-conductor-start-tool/README.md
@@ -0,0 +1,43 @@
# op-conductor-start-tool

![Version: 0.0.1](https://img.shields.io/badge/Version-0.0.1-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: v1.0.0](https://img.shields.io/badge/AppVersion-v1.0.0-informational?style=flat-square)

OP Conductor start tool Cel2 network

**Homepage:** <https://clabs.co>

## Maintainers

| Name | Email | Url |
| ---- | ------ | --- |
| cLabs | <[email protected]> | <https://clabs.co> |

## Source Code

* <https://celo.org>
* <https://docs.celo.org>
* <https://clabs.co>
* <https://github.com/celo-org>

## Values

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| image.pullPolicy | string | `"IfNotPresent"` | |
| image.repository | string | `"alpine"` | |
| image.tag | float | `3.19` | |
| opConductor.consensus.namePattern | string | `"op-conductor-consensus"` | |
| opConductor.consensus.port | string | `"50050"` | |
| opConductor.protocol | string | `"http"` | |
| opConductor.replicas | int | `3` | |
| opConductor.rpc.namePattern | string | `"op-conductor-rpc"` | |
| opConductor.rpc.port | string | `"8545"` | |
| opNode.namePattern | string | `"op-node-sequencer-rpc"` | |
| opNode.port | string | `"9545"` | |
| opNode.protocol | string | `"http"` | |
| opNode.replicas | int | `3` | |
| schedule | string | `"0 0 30 2 0"` | |
| suspend | bool | `true` | |

----------------------------------------------
Autogenerated from chart metadata using [helm-docs v1.14.2](https://github.com/norwoodj/helm-docs/releases/v1.14.2)
52 changes: 52 additions & 0 deletions charts/op-conductor-start-tool/templates/_helpers.tpl
@@ -0,0 +1,52 @@
{{/* vim: set filetype=mustache: */}}
{{/*
Expand the name of the chart.
*/}}
{{- define "op-conductor-start-tool.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" -}}
{{- end -}}

{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "op-conductor-start-tool.fullname" -}}
{{- if .Values.fullnameOverride -}}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- if contains $name .Release.Name -}}
{{- .Release.Name | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{- end -}}
{{- end -}}

{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "op-conductor-start-tool.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" -}}
{{- end -}}

{{/*
Common labels
*/}}
{{- define "op-conductor-start-tool.labels" -}}
helm.sh/chart: {{ include "op-conductor-start-tool.chart" . }}
{{ include "op-conductor-start-tool.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}

{{/*
Selector labels
*/}}
{{- define "op-conductor-start-tool.selectorLabels" -}}
app.kubernetes.io/name: {{ include "op-conductor-start-tool.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
174 changes: 174 additions & 0 deletions charts/op-conductor-start-tool/templates/cronjob.yaml
@@ -0,0 +1,174 @@
apiVersion: batch/v1
kind: CronJob
metadata:
  name: {{ .Release.Name }}
  labels:
    {{- include "op-conductor-start-tool.labels" . | nindent 4 }}
    component: op-conductor-start-tool
spec:
  schedule: "{{ .Values.schedule }}"
  suspend: {{ .Values.suspend }}
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      backoffLimit: 0
      template:
        metadata:
          labels:
            {{- include "op-conductor-start-tool.labels" . | nindent 12 }}
        spec:
          containers:
          - name: switch-and-start
            image: {{ .Values.image.repository }}:{{ .Values.image.tag }}
            imagePullPolicy: {{ .Values.image.pullPolicy }}
            command:
            - /bin/sh
            - -c
            args:
            - |
              apk add curl
              apk add jq
              first=0
              last=$(( {{ .Values.opNode.replicas }} - 1 ))
              echo "Check OP Node status"
              i=0
              while [ $i -lt {{ .Values.opNode.replicas }} ]
              do
                echo "Checking OP Node $i"
                curl -X POST -H "Content-Type: application/json" --data \
                  '{"jsonrpc":"2.0","method":"admin_sequencerActive","params":[],"id":1}' \
                  {{ .Values.opNode.protocol }}://{{ .Values.opNode.namePattern }}-$i:{{ .Values.opNode.port }} -s | jq .result > /tmp/RESULT_NODE_${i}_ACTIVE
                echo "OP Node $i active? $(cat /tmp/RESULT_NODE_${i}_ACTIVE)"
                i=$((i + 1))
              done
              echo "Check OP Conductor status"
              i=0
              while [ $i -lt {{ .Values.opConductor.replicas }} ]
              do
                echo "Checking OP Conductor $i"
                curl -X POST -H "Content-Type: application/json" --data \
                  '{"jsonrpc":"2.0","method":"conductor_stopped","params":[],"id":1}' \
                  {{ .Values.opConductor.protocol }}://{{ .Values.opConductor.rpc.namePattern }}-$i:{{ .Values.opConductor.rpc.port }} -s | jq .result > /tmp/RESULT_COND_${i}_STOPPED
                echo "OP Conductor $i stopped? $(cat /tmp/RESULT_COND_${i}_STOPPED)"
                curl -X POST -H "Content-Type: application/json" --data \
                  '{"jsonrpc":"2.0","method":"conductor_paused","params":[],"id":1}' \
                  {{ .Values.opConductor.protocol }}://{{ .Values.opConductor.rpc.namePattern }}-$i:{{ .Values.opConductor.rpc.port }} -s | jq .result > /tmp/RESULT_COND_${i}_PAUSED
                echo "OP Conductor $i paused? $(cat /tmp/RESULT_COND_${i}_PAUSED)"
                curl -X POST -H "Content-Type: application/json" --data \
                  '{"jsonrpc":"2.0","method":"conductor_sequencerHealthy","params":[],"id":1}' \
                  {{ .Values.opConductor.protocol }}://{{ .Values.opConductor.rpc.namePattern }}-$i:{{ .Values.opConductor.rpc.port }} -s | jq .result > /tmp/RESULT_COND_${i}_SEQ_HEALTHY
                echo "OP Conductor $i sequencer healthy? $(cat /tmp/RESULT_COND_${i}_SEQ_HEALTHY)"
                curl -X POST -H "Content-Type: application/json" --data \
                  '{"jsonrpc":"2.0","method":"conductor_leader","params":[],"id":1}' \
                  {{ .Values.opConductor.protocol }}://{{ .Values.opConductor.rpc.namePattern }}-$i:{{ .Values.opConductor.rpc.port }} -s | jq .result > /tmp/RESULT_COND_${i}_LEADER
                echo "OP Conductor $i leader? $(cat /tmp/RESULT_COND_${i}_LEADER)"
                i=$((i + 1))
              done
              if [ $(cat /tmp/RESULT_NODE_0_ACTIVE) = "true" ] && [ $(cat /tmp/RESULT_NODE_2_ACTIVE) = "false" ] && \
                 [ $(cat /tmp/RESULT_COND_2_STOPPED) = "false" ] && \
                 [ $(cat /tmp/RESULT_COND_2_PAUSED) = "true" ] && \
                 [ $(cat /tmp/RESULT_COND_2_SEQ_HEALTHY) = "true" ] && \
                 [ $(cat /tmp/RESULT_COND_2_LEADER) = "true" ]; then
                echo "Requirements for switch and start are met"
                SWITCH=true
              else
                echo "Requirements for switch and start are NOT met. Skipping..."
                SWITCH=false
              fi
              if [ $SWITCH = true ]; then
                echo "Stopping OP Node $first"
                LAST_UNSAFE_HASH=$(curl -X POST -H "Content-Type: application/json" --data \
                  '{"jsonrpc":"2.0","method":"admin_stopSequencer","params":[],"id":1}' \
                  {{ .Values.opNode.protocol }}://{{ .Values.opNode.namePattern }}-$first:{{ .Values.opNode.port }} -s | jq -r .result)
                echo "Stopped OP Node $first with unsafe hash $LAST_UNSAFE_HASH. Starting OP Node $last..."
                curl -X POST -H "Content-Type: application/json" --data \
                  '{"jsonrpc":"2.0","method":"admin_startSequencer","params":["'${LAST_UNSAFE_HASH}'"],"id":1}' \
                  {{ .Values.opNode.protocol }}://{{ .Values.opNode.namePattern }}-$last:{{ .Values.opNode.port }} -s
                echo "Started OP Node $last, checking OP Node $last status..."
                STARTED=$(curl -X POST -H "Content-Type: application/json" --data \
                  '{"jsonrpc":"2.0","method":"admin_sequencerActive","params":[],"id":1}' \
                  {{ .Values.opNode.protocol }}://{{ .Values.opNode.namePattern }}-$last:{{ .Values.opNode.port }} -s | jq .result)
                if [ "$STARTED" = true ]; then
                  echo "OP Node $last is active"
                else
                  echo "Failed to start OP Node $last"
                  echo "admin_sequencerActive result: $STARTED"
                  echo "Falling back to activating OP Node $first"
                  curl -X POST -H "Content-Type: application/json" --data \
                    '{"jsonrpc":"2.0","method":"admin_startSequencer","params":["'${LAST_UNSAFE_HASH}'"],"id":1}' \
                    {{ .Values.opNode.protocol }}://{{ .Values.opNode.namePattern }}-$first:{{ .Values.opNode.port }} -s
                  exit 1
                fi
              fi
              echo "Checking raft cluster..."
              CLUSTER_MEMBERS=$(curl -X POST -H "Content-Type: application/json" --data \
                '{"jsonrpc":"2.0","method":"conductor_clusterMembership","params":[],"id":1}' \
                {{ .Values.opConductor.protocol }}://{{ .Values.opConductor.rpc.namePattern }}-$last:{{ .Values.opConductor.rpc.port }} -s | jq '.result.servers | length')
              echo "Current raft cluster members: $CLUSTER_MEMBERS"
              if [ "$CLUSTER_MEMBERS" = 1 ]; then
                echo "Forming raft cluster..."
                echo "Checking OP Conductor $last is leader"
                CHECK_LEADER=$(curl -X POST -H "Content-Type: application/json" --data \
                  '{"jsonrpc":"2.0","method":"conductor_leader","params":[],"id":1}' \
                  {{ .Values.opConductor.protocol }}://{{ .Values.opConductor.rpc.namePattern }}-$last:{{ .Values.opConductor.rpc.port }} -s | jq .result)
                if [ "$CHECK_LEADER" = false ]; then
                  echo "OP Conductor $last is NOT the leader, exiting..."
                  exit 1
                fi
                i=0
                echo "Sending addServerAsVoter to OP Conductor $last"
                while [ $i -lt $(( {{ .Values.opNode.replicas }} - 1 )) ]
                do
                  echo "Sending addServerAsVoter to OP Conductor $last for member $i"
                  curl -X POST -H "Content-Type: application/json" --data \
                    '{"jsonrpc":"2.0","method":"conductor_addServerAsVoter","params":["'${i}'", "{{ .Values.opConductor.consensus.namePattern }}-'${i}':{{ .Values.opConductor.consensus.port }}", 0],"id":1}' \
                    {{ .Values.opConductor.protocol }}://{{ .Values.opConductor.rpc.namePattern }}-$last:{{ .Values.opConductor.rpc.port }} -s
                  i=$((i + 1))
                done
                CLUSTER_MEMBERS=$(curl -X POST -H "Content-Type: application/json" --data \
                  '{"jsonrpc":"2.0","method":"conductor_clusterMembership","params":[],"id":1}' \
                  {{ .Values.opConductor.protocol }}://{{ .Values.opConductor.rpc.namePattern }}-$last:{{ .Values.opConductor.rpc.port }} -s | jq '.result.servers | length')
                if [ "$CLUSTER_MEMBERS" = {{ .Values.opNode.replicas }} ]; then
                  echo "Done forming raft cluster"
                else
                  echo "Failed to form raft cluster. Exiting..."
                  curl -X POST -H "Content-Type: application/json" --data \
                    '{"jsonrpc":"2.0","method":"conductor_clusterMembership","params":[],"id":1}' \
                    {{ .Values.opConductor.protocol }}://{{ .Values.opConductor.rpc.namePattern }}-$last:{{ .Values.opConductor.rpc.port }} -s
                  exit 1
                fi
              else
                echo "Raft cluster already formed"
              fi
              if [ $(cat /tmp/RESULT_COND_0_PAUSED) = "true" ] && \
                 [ $(cat /tmp/RESULT_COND_1_PAUSED) = "true" ] && \
                 [ $(cat /tmp/RESULT_COND_2_PAUSED) = "true" ]; then
                echo "Conductors are paused"
                UNPAUSE=true
              else
                echo "Conductors are NOT paused. Skipping..."
                UNPAUSE=false
              fi
              if [ $UNPAUSE = true ]; then
                echo "Unpausing OP Conductors..."
                i=0
                while [ $i -lt {{ .Values.opConductor.replicas }} ]
                do
                  echo "Sending conductor_resume to OP Conductor $i"
                  curl -X POST -H "Content-Type: application/json" --data \
                    '{"jsonrpc":"2.0","method":"conductor_resume","params":[],"id":1}' \
                    {{ .Values.opConductor.protocol }}://{{ .Values.opConductor.rpc.namePattern }}-$i:{{ .Values.opConductor.rpc.port }} -s
                  i=$((i + 1))
                done
                echo "OP Conductors unpaused."
              fi
          restartPolicy: Never
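
Review note: the script in `cronjob.yaml` repeats the same curl-and-jq pattern for every JSON-RPC call and compares the unquoted output of `cat` inside `[ ... ]` tests. A minimal sketch of how that could be factored into helpers, assuming POSIX sh with `curl` and `jq` available as in the CronJob container. The names `rpc_body`, `rpc_result`, and `is_true` are hypothetical and do not exist in the chart:

```shell
#!/bin/sh
# Sketch only: rpc_body, rpc_result and is_true are hypothetical helper names,
# not part of the op-conductor-start-tool chart.

# Build a JSON-RPC 2.0 request body.
# $1 = method name, $2 = params as a JSON array (defaults to []).
rpc_body() {
  printf '{"jsonrpc":"2.0","method":"%s","params":%s,"id":1}' "$1" "${2:-[]}"
}

# POST the request to the URL in $1 and print the raw .result field.
# Remaining arguments are passed to rpc_body (method, optional params).
rpc_result() {
  url=$1
  shift
  curl -s -X POST -H "Content-Type: application/json" \
    --data "$(rpc_body "$@")" "$url" | jq -r .result
}

# Succeed only when the argument is exactly "true".
# An empty or missing argument counts as false.
is_true() {
  [ "${1:-}" = "true" ]
}
```

With helpers like these, a check such as `is_true "$(rpc_result "$url" admin_sequencerActive)"` keeps the expansion quoted, so an empty response is treated as false rather than leaving the `[` test with a missing operand.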
21 changes: 21 additions & 0 deletions charts/op-conductor-start-tool/values.yaml
@@ -0,0 +1,21 @@
---
schedule: "0 0 30 2 0"
suspend: true
image:
  repository: alpine
  tag: 3.19
  pullPolicy: IfNotPresent
opNode:
  replicas: 3
  protocol: "http"
  namePattern: "op-node-sequencer-rpc"
  port: "9545"
opConductor:
  replicas: 3
  protocol: "http"
  rpc:
    namePattern: "op-conductor-rpc"
    port: "8545"
  consensus:
    namePattern: "op-conductor-consensus"
    port: "50050"
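
Review note: the CronJob script derives each replica's endpoint from these values as `<protocol>://<namePattern>-<index>:<port>`, with the index running from 0 to `replicas - 1`. A small sketch of that expansion, useful for checking which hostnames a given set of values will produce. `list_endpoints` is a hypothetical helper and not part of the chart:

```shell
#!/bin/sh
# Sketch only: list_endpoints is a hypothetical helper, not part of the chart.

# Print one endpoint per replica, following the same pattern the CronJob
# script uses: <protocol>://<namePattern>-<index>:<port>.
# $1 = protocol, $2 = namePattern, $3 = port, $4 = replicas
list_endpoints() {
  protocol=$1
  name=$2
  port=$3
  replicas=$4
  i=0
  while [ "$i" -lt "$replicas" ]; do
    printf '%s://%s-%s:%s\n' "$protocol" "$name" "$i" "$port"
    i=$((i + 1))
  done
}
```

Calling `list_endpoints http op-node-sequencer-rpc 9545 3` with the defaults above lists the three op-node URLs the job would query, which makes it easier to confirm that matching Service names exist before unsuspending the CronJob.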
2 changes: 1 addition & 1 deletion charts/op-conductor/Chart.yaml
@@ -1,7 +1,7 @@
---
name: op-conductor
apiVersion: v2
-version: 0.0.5
+version: 0.0.6
description: Helm chart deploying OP Conductor, a HA controller for op-node
home: https://clabs.co
sources:
12 changes: 7 additions & 5 deletions charts/op-conductor/README.md
@@ -1,6 +1,6 @@
# op-conductor

-![Version: 0.0.5](https://img.shields.io/badge/Version-0.0.5-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: v1.8.0](https://img.shields.io/badge/AppVersion-v1.8.0-informational?style=flat-square)
+![Version: 0.0.6](https://img.shields.io/badge/Version-0.0.6-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: v1.8.0](https://img.shields.io/badge/AppVersion-v1.8.0-informational?style=flat-square)

Helm chart deploying OP Conductor, a HA controller for op-node

@@ -24,8 +24,10 @@ Helm chart deploying OP Conductor, a HA controller for op-node

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| config.consensus.addr | string | `""` | |
| config.consensus.port | int | `50050` | |
| config.execution.namePattern | string | `""` | |
| config.execution.port | string | `""` | |
| config.execution.protocol | string | `""` | |
| config.execution.rpc | string | `"http://op-geth:8545"` | |
| config.healthcheck.interval | int | `10` | |
| config.healthcheck.minPeerCount | int | `1` | |
@@ -38,11 +40,11 @@ Helm chart deploying OP Conductor, a HA controller for op-node
| config.metrics.enabled | bool | `true` | |
| config.metrics.port | int | `7300` | |
| config.network | string | `""` | |
| config.node.namePattern | string | `""` | |
| config.node.port | string | `""` | |
| config.node.protocol | string | `""` | |
| config.node.rpc | string | `"http://op-node:8547"` | |
| config.paused | bool | `false` | |
| config.raft.bootstrap | bool | `false` | |
| config.raft.server.id | int | `1` | |
| config.raft.storage.dir | string | `"/raft"` | |
| config.rpc.addr | string | `"0.0.0.0"` | |
| config.rpc.enableAdmin | bool | `false` | |
| config.rpc.enableProxy | bool | `true` | |
11 changes: 11 additions & 0 deletions charts/op-conductor/templates/configmap-scripts.yaml
@@ -0,0 +1,11 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ template "op-conductor.fullname" . }}-scripts
  labels:
    {{- include "op-conductor.labels" . | nindent 4 }}
data:
  download-rollup.sh: |-
    {{- include (print $.Template.BasePath "/scripts/_download-rollup.tpl") . | nindent 4 }}
  split-config-parameters.sh: |-
    {{- include (print $.Template.BasePath "/scripts/_split-config-parameters.tpl") . | nindent 4 }}
11 changes: 11 additions & 0 deletions charts/op-conductor/templates/scripts/_download-rollup.tpl
@@ -0,0 +1,11 @@
#!/usr/bin/env sh
set -e

datadir="{{ .Values.persistence.mountPath }}"
if [ ! -f $datadir/.initialized ]; then
  wget -qO $datadir/rollup.json "{{ .Values.init.rollup.url }}"
  touch $datadir/.initialized
  echo "Successfully downloaded rollup files"
else
  echo "Already downloaded, skipping."
fi
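
Review note: `_download-rollup.tpl` relies on a marker file so that the download runs only once per data directory. The same pattern can be sketched generically as below. `run_once` is a hypothetical helper name, and the marker file name `.initialized` is taken from the script above:

```shell
#!/bin/sh
# Sketch only: run_once is a hypothetical helper, not part of the chart.

# run_once DIR CMD [ARGS...]
# Run CMD only if DIR/.initialized does not exist yet. The marker is created
# only after CMD succeeds, so a failed run is retried on the next invocation.
run_once() {
  dir=$1
  shift
  if [ ! -f "$dir/.initialized" ]; then
    "$@" && touch "$dir/.initialized"
  else
    echo "Already downloaded, skipping."
  fi
}
```

In this sketch the data directory is quoted, so a mount path containing spaces would still work. Whether that matters depends on the values passed to the chart.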