Kes expiration fixes (#1839)
## Description
Improvements to KES expiration information calculation/display on
cntools launch, as well as pool -> list and pool -> show pages.

## Where should the reviewer start?
<!--- Describe where reviewer should start testing -->

## Motivation and context
<!--- Why is this change required? What problem does it solve? -->

## Which issue it fixes?
<!--- Link to issue: Closes #issue-number -->

## How has this been tested?
<!--- Describe how you tested changes -->

---------

Co-authored-by: Ola <[email protected]>
Co-authored-by: Greg B <[email protected]>
3 people authored Nov 26, 2024
1 parent 3360a8c commit 4853494
Showing 3 changed files with 35 additions and 24 deletions.
4 changes: 4 additions & 0 deletions docs/Scripts/cntools-changelog.md
@@ -6,6 +6,10 @@ All notable changes to this tool will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [13.3.1] - 2024-11-25
+#### Fixed
+- Corrected KES expiration information calculation on cntools launch, pool -> list and pool -> show screens
+
 ## [13.3.0] - 2024-11-21
 #### Added
 - Own votes cast (SPO|DRep|CC) shown in proposal list.
5 changes: 3 additions & 2 deletions scripts/cnode-helper-scripts/cntools.library
@@ -15,7 +15,7 @@ CNTOOLS_MAJOR_VERSION=13
 # Minor: Changes and features of minor character that can be applied without breaking existing functionality or workflow
 CNTOOLS_MINOR_VERSION=3
 # Patch: Backwards compatible bug fixes. No additional functionality or major changes
-CNTOOLS_PATCH_VERSION=0
+CNTOOLS_PATCH_VERSION=1
 
 CNTOOLS_VERSION="${CNTOOLS_MAJOR_VERSION}.${CNTOOLS_MINOR_VERSION}.${CNTOOLS_PATCH_VERSION}"
 DUMMYFEE=20000
@@ -762,13 +762,14 @@ isPoolRegistered() {
 unset p_active_epoch_no p_vrf_key_hash p_margin p_fixed_cost p_pledge p_reward_addr p_owners p_relays p_meta_url p_meta_hash p_meta_json p_pool_status
 unset p_retiring_epoch p_op_cert p_op_cert_counter p_active_stake p_epoch_block_cnt p_live_stake p_live_delegators p_live_saturation
 if [[ ${CNTOOLS_MODE} != "LIGHT" ]]; then
-[[ -f "${POOL_FOLDER}/${1}/${POOL_REGCERT_FILENAME}" ]] && return 2 || return 1
+[[ -f "${POOL_FOLDER}/${1}/${POOL_REGCERT_FILENAME}" ]] && return 2 || (rm -f "${POOL_FOLDER}/${1}/${POOL_CURRENT_KES_START}" && return 1)
 else
 getPoolID "$1"
 HEADERS=("${KOIOS_API_HEADERS[@]}" -H "Content-Type: application/json")
 println ACTION "curl -sSL -f -X POST ${HEADERS[*]} -d '{\"_pool_bech32_ids\":[\"${pool_id_bech32}\"]}' ${KOIOS_API}/pool_info"
 ! pool_info=$(curl -sSL -f -X POST "${HEADERS[@]}" -d '{"_pool_bech32_ids":["'${pool_id_bech32}'"]}' "${KOIOS_API}/pool_info" 2>&1) && error_msg=${pool_info} && return 0
 if [[ ${pool_info} = '[]' ]]; then
+# possibly more cleanup needed, like rm -rf "${POOL_FOLDER}/${1}/${POOL_CURRENT_KES_START}" and ${POOL_REGCERT_FILENAME} if retirement was issued outside of CNTools?
 return 1
 fi
 pool_info_tsv=$(jq -r '[
50 changes: 28 additions & 22 deletions scripts/cnode-helper-scripts/cntools.sh
@@ -237,9 +237,12 @@ kes_rotation_needed="no"
 if [[ ${CHECK_KES} = true ]]; then
 
 while IFS= read -r -d '' pool; do
-unset pool_kes_start
-[[ ${CNTOOLS_MODE} = "LOCAL" ]] && getNodeMetrics
-[[ (-z ${remaining_kes_periods} || ${remaining_kes_periods} -eq 0) && -f "${pool}/${POOL_CURRENT_KES_START}" ]] && unset remaining_kes_periods && pool_kes_start="$(cat "${pool}/${POOL_CURRENT_KES_START}")"
+if [[ ! -f "${pool}/${POOL_CURRENT_KES_START}" ]]; then
+continue
+fi
+
+unset remaining_kes_periods
+pool_kes_start="$(cat "${pool}/${POOL_CURRENT_KES_START}")"
 
 if ! kesExpiration ${pool_kes_start}; then println ERROR "${FG_RED}ERROR${NC}: failure during KES calculation for ${FG_GREEN}$(basename ${pool})${NC}" && waitToProceed && continue; fi
 
@@ -2801,25 +2804,26 @@ function main {
 println "$(printf "%-21s : ${FG_LGRAY}%s${NC}" "ID (hex)" "${pool_id}")"
 [[ -n ${pool_id_bech32} ]] && println "$(printf "%-21s : ${FG_LGRAY}%s${NC}" "ID (bech32)" "${pool_id_bech32}")"
 println "$(printf "%-21s : %s" "Registered" "${pool_registered}")"
-unset pool_kes_start
-if [[ ${CNTOOLS_MODE} = "LOCAL" ]]; then
-getNodeMetrics
-else
+
+if [[ ${pool_registered} = *YES* ]]; then
+unset pool_kes_start
+unset remaining_kes_periods
 [[ -f "${pool}/${POOL_CURRENT_KES_START}" ]] && pool_kes_start="$(cat "${pool}/${POOL_CURRENT_KES_START}")"
-fi
-if ! kesExpiration ${pool_kes_start}; then
-println "$(printf "%-21s : ${FG_LGRAY}%s${NC} - ${FG_RED}%s${NC}%s${FG_GREEN}%s${NC}" "KES expiration date" "ERROR" ": failure during KES calculation for " "$(basename ${pool})")"
-else
-if [[ ${expiration_time_sec_diff} -lt ${KES_ALERT_PERIOD} ]]; then
-if [[ ${expiration_time_sec_diff} -lt 0 ]]; then
-println "$(printf "%-21s : ${FG_LGRAY}%s${NC} - ${FG_RED}%s${NC} %s ago" "KES expiration date" "${kes_expiration}" "EXPIRED!" "$(timeLeft ${expiration_time_sec_diff:1})")"
+
+if ! kesExpiration ${pool_kes_start}; then
+println "$(printf "%-21s : ${FG_LGRAY}%s${NC} - ${FG_RED}%s${NC}%s${FG_GREEN}%s${NC}" "KES expiration date" "ERROR" ": failure during KES calculation for " "$(basename ${pool})")"
+else
+if [[ ${expiration_time_sec_diff} -lt ${KES_ALERT_PERIOD} ]]; then
+if [[ ${expiration_time_sec_diff} -lt 0 ]]; then
+println "$(printf "%-21s : ${FG_LGRAY}%s${NC} - ${FG_RED}%s${NC} %s ago" "KES expiration date" "${kes_expiration}" "EXPIRED!" "$(timeLeft ${expiration_time_sec_diff:1})")"
+else
+println "$(printf "%-21s : ${FG_LGRAY}%s${NC} - ${FG_RED}%s${NC} %s until expiration" "KES expiration date" "${kes_expiration}" "ALERT!" "$(timeLeft ${expiration_time_sec_diff})")"
+fi
+elif [[ ${expiration_time_sec_diff} -lt ${KES_WARNING_PERIOD} ]]; then
+println "$(printf "%-21s : ${FG_LGRAY}%s${NC} - ${FG_YELLOW}%s${NC} %s until expiration" "KES expiration date" "${kes_expiration}" "WARNING!" "$(timeLeft ${expiration_time_sec_diff})")"
 else
-println "$(printf "%-21s : ${FG_LGRAY}%s${NC} - ${FG_RED}%s${NC} %s until expiration" "KES expiration date" "${kes_expiration}" "ALERT!" "$(timeLeft ${expiration_time_sec_diff})")"
+println "$(printf "%-21s : ${FG_LGRAY}%s${NC}" "KES expiration date" "${kes_expiration}")"
 fi
-elif [[ ${expiration_time_sec_diff} -lt ${KES_WARNING_PERIOD} ]]; then
-println "$(printf "%-21s : ${FG_LGRAY}%s${NC} - ${FG_YELLOW}%s${NC} %s until expiration" "KES expiration date" "${kes_expiration}" "WARNING!" "$(timeLeft ${expiration_time_sec_diff})")"
-else
-println "$(printf "%-21s : ${FG_LGRAY}%s${NC}" "KES expiration date" "${kes_expiration}")"
 fi
 fi
 done < <(find "${POOL_FOLDER}" -mindepth 1 -maxdepth 1 -type d -print0 | sort -z)
@@ -3119,7 +3123,6 @@ function main {
 println "$(printf "%-21s : ${FG_LBLUE}%s${NC} %%" "Saturation" "${p_live_saturation}")"
 fi
 
-unset pool_kes_start
 if [[ -n ${KOIOS_API} ]]; then
 [[ ${p_op_cert_counter} != null ]] && kes_counter_str="${FG_LBLUE}${p_op_cert_counter}${FG_LGRAY} - use counter ${FG_LBLUE}$((p_op_cert_counter+1))${FG_LGRAY} for rotation in offline mode.${NC}" || kes_counter_str="${FG_LGRAY}No blocks minted so far with active operational certificate. Use counter ${FG_LBLUE}0${FG_LGRAY} for rotation in offline mode.${NC}"
 println "$(printf "%-21s : %s" "KES counter" "${kes_counter_str}")"
@@ -3137,9 +3140,12 @@ function main {
 fi
 println "$(printf "%-21s : %s" "KES counter" "${kes_counter_str}")"
 getNodeMetrics
-else
-[[ -f "${POOL_FOLDER}/${pool_name}/${POOL_CURRENT_KES_START}" ]] && pool_kes_start="$(cat "${POOL_FOLDER}/${pool_name}/${POOL_CURRENT_KES_START}")"
 fi
+
+unset pool_kes_start
+[[ -f "${POOL_FOLDER}/${pool_name}/${POOL_CURRENT_KES_START}" ]] && pool_kes_start="$(cat "${POOL_FOLDER}/${pool_name}/${POOL_CURRENT_KES_START}")"
+unset remaining_kes_periods
+
 if ! kesExpiration ${pool_kes_start}; then
 println "$(printf "%-21s : ${FG_LGRAY}%s${NC} - ${FG_RED}%s${NC}%s${FG_GREEN}%s${NC}" "KES expiration date" "ERROR" ": failure during KES calculation for " "$(basename ${pool})")"
 else

6 comments on commit 4853494

@getwildr (Collaborator) commented on 4853494 Dec 27, 2024

Hello,

It looks like something might be wrong with this change, specifically around cntools.sh L240.

I am running cntools.sh on my online node (no cold keys here) and get an ALERT that KES has EXPIRED although it has not (GLV says everything is OK, and I can tell it is right).
Indeed, the code you modified no longer runs getNodeMetrics and uses only the kes.start file. But on an online node this file contains the copy from the first import, which is more or less the KES start from when the pool was first created/registered.
I tried removing this file to check whether there is a fallback, but then I get a KES calculation error in POOL >> SHOW.

Here is some useful data taken from my environment:

** WARNING **
Pool POOL in need of KES key rotation
Keys expired! : 993d 00:00:00 ago

press any key to proceed ..

CNTools terminated, cleaning up...

ubuntu@cn:/opt/cardano/cnode/scripts$ curl -s IP:PORT/metrics | grep -i kes
cardano_node_metrics_operationalCertificateStartKESPeriod_int 1093
cardano_node_metrics_operationalCertificateExpiryKESPeriod_int 1155
cardano_node_metrics_remainingKESPeriods_int 46
cardano_node_metrics_currentKESPeriod_int 1109
ubuntu@cn:/opt/cardano/cnode/scripts$ cat ../priv/pool/POOL/kes.start
385

After updating kes.start with the number seen in the metrics, i.e. 1093, everything is back to normal.
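
The "993d" figure is consistent with the stale kes.start value of 385. Below is a minimal sketch of that arithmetic, not the code CNTools runs. It takes the maximum number of KES evolutions from the quoted metrics (1155 - 1093 = 62) and assumes 129600 slots per KES period with 1-second slots; neither value is stated in this thread, so check them against your network's genesis files. The helper names `metric_value` and `kes_days_overdue` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: reproduces the "993d" figure from the metrics quoted above.

# Print the value of one metric from Prometheus-style text on stdin.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Days since KES expiry implied by a given start period.
# Args: current_period start_period max_evolutions slots_per_period
# ASSUMPTION: 1-second slots. A negative result means the key is still valid.
kes_days_overdue() {
  local current=$1 start=$2 max_evolutions=$3 slots_per_period=$4
  echo $(( (current - (start + max_evolutions)) * slots_per_period / 86400 ))
}

metrics='cardano_node_metrics_operationalCertificateStartKESPeriod_int 1093
cardano_node_metrics_operationalCertificateExpiryKESPeriod_int 1155
cardano_node_metrics_remainingKESPeriods_int 46
cardano_node_metrics_currentKESPeriod_int 1109'

start=$(metric_value cardano_node_metrics_operationalCertificateStartKESPeriod_int <<< "$metrics")
expiry=$(metric_value cardano_node_metrics_operationalCertificateExpiryKESPeriod_int <<< "$metrics")
current=$(metric_value cardano_node_metrics_currentKESPeriod_int <<< "$metrics")
max_evolutions=$(( expiry - start ))   # derived from the quoted metrics
slots_per_period=129600                # ASSUMPTION: not stated in this thread

# Stale value from kes.start versus the value the running node reports.
stale=$(kes_days_overdue "$current" 385 "$max_evolutions" "$slots_per_period")
fresh=$(kes_days_overdue "$current" "$start" "$max_evolutions" "$slots_per_period")
echo "kes.start=385: ${stale} days overdue"
echo "kes.start=${start}: ${fresh} days overdue (negative means still valid)"
```

Under these assumptions the stale value yields 993 days overdue, while the start period reported by the node yields -69, i.e. 46 remaining periods of 1.5 days each.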

@Scitz0 (Contributor) commented on 4853494 Dec 27, 2024

I will check if we can re-add the online check when in LOCAL mode.

But the idea is that when rotating KES offline, you also move over the kes.start file updated on the offline machine. You can also set CHECK_KES=false in the User Variables section at the top of the cntools.sh file to disable the KES check on startup (just remove the # in front of the line).
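
One way the LOCAL-mode online check mentioned above could look is sketched here. This is not the maintainers' implementation: `choose_kes_start` is a hypothetical helper, and it assumes the start period reported by the running node has already been extracted from its metrics.

```shell
#!/usr/bin/env bash
# Sketch only: choose which KES start period to trust.
# Hypothetical helper, not part of CNTools.

# Args: mode metrics_start file_start
# In LOCAL mode a non-empty value reported by the running node wins;
# otherwise fall back to the value read from kes.start.
choose_kes_start() {
  local mode=$1 metrics_start=$2 file_start=$3
  if [[ ${mode} = "LOCAL" && -n ${metrics_start} ]]; then
    echo "${metrics_start}"
  else
    echo "${file_start}"
  fi
}

# Values taken from the report earlier in this thread.
local_choice=$(choose_kes_start "LOCAL" "1093" "385")
light_choice=$(choose_kes_start "LIGHT" "" "385")
echo "LOCAL: ${local_choice}, LIGHT: ${light_choice}"
```

With the values from the report, LOCAL mode would pick 1093 from the node, while a mode without node metrics would still rely on the copied kes.start file.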

@getwildr (Collaborator) commented on 4853494 Jan 9, 2025 via email

@Scitz0 (Contributor) commented on 4853494 Jan 16, 2025

@getwildr If you want a change around this, the best thing is to first raise a proper issue; comments on old commits like these are often missed and/or forgotten. Please be as specific as possible in the issue, without assuming that people have read the comments here.

@getwildr (Collaborator) commented on 4853494 Jan 16, 2025 via email

@getwildr (Collaborator) commented on 4853494 Jan 16, 2025 via email
