Skale admin doesn't restart schain monitoring if monitoring hangs #1044

Closed
oleksandrSydorenkoJ opened this issue Feb 7, 2024 · 3 comments · Fixed by #1098
Labels
bug Something isn't working

Comments

oleksandrSydorenkoJ commented Feb 7, 2024

Describe the bug
In Skale Admin, two monitor processes run for each chain: one tracks the status of contracts (creating chains, rotating ERC), and the other checks the status of each Skaled container. Sometimes the chain monitoring process hangs and stops initiating checks. If a problem then occurs inside the Skaled container, the monitor does not perform the recovery or restart actions it is supposed to run for that Skaled.

Preconditions:
16 nodes
8 active chains
2 GETH nodes, each serving 8 nodes
auto_repair option disabled in the container config stream

Versions:
skalenetwork/admin:2.5.4-develop.1

To Reproduce
No clear reproduction steps are known.
To investigate, the failure has to occur again with the log size for Skale admin increased on QANet.
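
A rough sketch of what "increasing the log size" could look like, assuming the admin logs through Python's standard `logging` module. The logger name, file path, size limits, and the `make_debug_logger` helper are hypothetical and are not taken from the Skale admin code. Including the process id in each record helps show which monitor process stopped writing.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_debug_logger(path: str,
                      max_bytes: int = 100 * 1024 * 1024,
                      backups: int = 5) -> logging.Logger:
    """Return a logger that keeps up to `backups` rotated files of `max_bytes` each.

    All names and defaults here are illustrative assumptions.
    """
    logger = logging.getLogger("admin-debug")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    # %(process)d records the pid, so a silent monitor process is easy to spot.
    handler.setFormatter(logging.Formatter(
        "%(asctime)s pid=%(process)d %(name)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```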

Expected behavior
Skale admin should run 2 monitors after Geth nodes are synced

Actual state:
In some cases, Skale admin starts only contract monitoring and does not restart schain monitoring when it hangs.
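
A minimal sketch of the restart-on-hang behaviour described above, assuming each monitor reports a heartbeat after every check iteration. `MonitorWatchdog`, `beat`, `check`, and the `restart` hook are hypothetical names for illustration. They are not the Skale admin API and not necessarily the approach taken in #1098.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MonitorWatchdog:
    """Tracks the last heartbeat of each monitor and restarts stale ones."""
    timeout: float                    # seconds without a heartbeat before a monitor counts as hung
    restart: Callable[[str], None]    # hypothetical hook: kill and respawn the named monitor
    clock: Callable[[], float] = time.monotonic
    _last_beat: Dict[str, float] = field(default_factory=dict, init=False)

    def beat(self, name: str) -> None:
        """Record a heartbeat; a monitor would call this after each check iteration."""
        self._last_beat[name] = self.clock()

    def check(self) -> List[str]:
        """Restart every monitor whose last heartbeat is older than `timeout`."""
        now = self.clock()
        hung = [n for n, t in self._last_beat.items() if now - t > self.timeout]
        for name in hung:
            self.restart(name)
            self._last_beat[name] = now  # give the restarted monitor a fresh window
        return hung
```

The admin's main loop would call `check()` periodically. A monitor that is still alive as a process but has stopped iterating is caught because its heartbeat goes stale, which a plain liveness check on the process would miss.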

Temporary workaround:
Restart the Skale admin container manually

@PolinaKiporenko

Investigate during the 2.3 release.

@DmytroNazarenko DmytroNazarenko modified the milestones: SKALE 2.3, SKALE 2.4 Feb 7, 2024
@PolinaKiporenko PolinaKiporenko moved this to Ready For Pickup in SKALE Engineering 🚀 Feb 21, 2024
@PolinaKiporenko PolinaKiporenko modified the milestones: SKALE 2.4, SKALE 2.5 Apr 9, 2024
@PolinaKiporenko

Temporary workaround: repair by restarting the admin container.

@badrogger badrogger moved this from Ready For Pickup to In Progress in SKALE Engineering 🚀 Jun 4, 2024
@badrogger badrogger moved this from In Progress to Ready For Pickup in SKALE Engineering 🚀 Jul 18, 2024
@badrogger badrogger moved this from Ready For Pickup to In Progress in SKALE Engineering 🚀 Jul 19, 2024
@badrogger badrogger linked a pull request Aug 30, 2024 that will close this issue
@github-project-automation github-project-automation bot moved this from Code Review to Ready For Release Candidate in SKALE Engineering 🚀 Oct 21, 2024
@PolinaKiporenko PolinaKiporenko moved this from Ready For Release Candidate to Merged To Release Candidate in SKALE Engineering 🚀 Oct 21, 2024
@yatsunastya

Checked ✅
Version: 2.8.0-beta.1

Because it is not clear how to verify this ticket, it was decided to check only that the basic functionality isn't broken.

@EvgeniyZZ EvgeniyZZ moved this from QA to Done in SKALE Engineering 🚀 Nov 5, 2024