Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skaled slow processing transctions during catchup #1640

Closed
oleksandrSydorenkoJ opened this issue Sep 7, 2023 · 11 comments · Fixed by #1770
Closed

Skaled slow processing transctions during catchup #1640

oleksandrSydorenkoJ opened this issue Sep 7, 2023 · 11 comments · Fixed by #1770
Assignees
Labels
bug Something isn't working skaled
Milestone

Comments

@oleksandrSydorenkoJ
Copy link

oleksandrSydorenkoJ commented Sep 7, 2023

Related to #1560
Actual for Core, archive and sync nodes

Preconditions
Active chain with MTM

Versions
3.17.0-beta.8
3.17.0-develop.49

Steps to reproduce

  1. Stop skaled container on node A
  2. Run load test of tokent transfer with average block's stats
440 - amount of transactions per block
177289 -  block size in bytes
45098688 - gas usage per block 
  1. Restart skaled container on node A from the catchup
  2. Check skaled logs

Expected state:
Blocks should be catching up faster than new block are mined

Actual state
Skaled processed transactions in catchup block the same time as skaled mined new (or slower)

catchup_3_17_0_develop_49.txt

catchup_3_17_0_beta-8.txt

@PolinaKiporenko
Copy link
Contributor

be investigate during 2.2 release

@PolinaKiporenko PolinaKiporenko moved this from Ready For Pickup to In Progress in SKALE Engineering 🚀 Sep 11, 2023
@oleksandrSydorenkoJ
Copy link
Author

3.17.0-beta.8

root@ip-172-31-59-101:~# cat  catchup_full_3_17_beta_8.txt  | grep CATCHUP
[2023-09-07 15:34:18.898] [25:catchup] [info] 10826:CATCHUP_GOT_BLOCKS:COUNT:383:STARTBLOCK:10827:FROM_NODE:1.1.1.1
[2023-09-07 15:51:10.180] [25:main] [info] 11209:CATCHUP_PROCESSED_BLOCKS:COUNT: 383
[2023-09-07 15:51:31.323] [25:catchup] [info] 11209:CATCHUP_GOT_BLOCKS:COUNT:394:STARTBLOCK:11210:FROM_NODE:2.2.2.2
[2023-09-07 16:12:53.283] [25:main] [info] 11603:CATCHUP_PROCESSED_BLOCKS:COUNT: 394

3.16.1

[2023-09-08 15:51:52.151] [18:catchup] [info] 384129:CATCHUP_GOT_BLOCKS:COUNT:439:STARTBLOCK:384130:FROM_NODE:1.1.1.1
[2023-09-08 15:55:07.808] [18:main] [info] 384568:CATCHUP_PROCESSED_BLOCKS:COUNT: 439
[2023-09-08 15:55:29.313] [18:catchup] [info] 384568:CATCHUP_GOT_BLOCKS:COUNT:392:STARTBLOCK:384569:FROM_NODE:2.2.2.2.
[2023-09-08 15:58:40.330] [18:main] [info] 384960:CATCHUP_PROCESSED_BLOCKS:COUNT: 392
[2023-09-08 15:59:00.348] [18:catchup] [info] 384960:CATCHUP_GOT_BLOCKS:COUNT:272:STARTBLOCK:384961:FROM_NODE:3.3.3.3
[2023-09-08 15:59:07.594] [18:main] [info] 385232:CATCHUP_PROCESSED_BLOCKS:COUNT: 272
[2023-09-08 15:59:39.757] [18:main] [info] 385235:CATCHUP_PROCESSED_BLOCKS:COUNT: 1

@olehnikolaiev olehnikolaiev moved this from In Progress to Ready For Release Candidate in SKALE Engineering 🚀 Sep 22, 2023
@DmytroNazarenko
Copy link
Collaborator

@DmytroNazarenko
Copy link
Collaborator

skaled: 3.17.0-beta.11

@DmytroNazarenko DmytroNazarenko moved this from Ready For Release Candidate to Merged To Release Candidate in SKALE Engineering 🚀 Sep 23, 2023
@oleksandrSydorenkoJ
Copy link
Author

oleksandrSydorenkoJ commented Sep 25, 2023

Relevant for 3.17.0-beta.11

3.16.1 - 0.5 seconds per block
3.17.0-beta.11 - 1.6 seconds per block

DTM:" << _catchupDownloadTimeMs
PTM:" << catchupProcessTimeMs
LTM:" << catchupLockTimeMs

[2023-09-25 16:56:08.227] [19:catchup] [info] 98620:CATCHUP_GOT_BLOCKS:COUNT:376:STARTBLOCK:98621:FROM_NODE:1.1.1.1
[2023-09-25 17:01:56.041] [19:main] [info] 98996:CATCHUP_PROCESSED_BLOCKS:COUNT:376:DTM:19562:PTM:347810:LTM:0
[2023-09-25 17:02:00.752] [19:catchup] [info] 98996:CATCHUP_GOT_BLOCKS:COUNT:81:STARTBLOCK:98997:FROM_NODE:2.2.2.2
btrfs subvolume delete -c /data_dir/snapshots/97324/28e07f34
...
btrfs property set -ts /data_dir/snapshots/99053/blocks_19.db ro true
[2023-09-25 17:04:19.257] [19:main] [info] 99077:CATCHUP_PROCESSED_BLOCKS:COUNT:80:DTM:4398:PTM:138505:LTM:11000
[2023-09-25 17:04:21.157] [19:catchup] [info] 99077:CATCHUP_GOT_BLOCKS:COUNT:34:STARTBLOCK:99078:FROM_NODE:3.3.3.3
[2023-09-25 17:05:09.082] [19:main] [info] 99111:CATCHUP_PROCESSED_BLOCKS:COUNT:34:DTM:1765:PTM:47925:LTM:0
[2023-09-25 17:05:09.823] [19:catchup] [info] 99111:CATCHUP_GOT_BLOCKS:COUNT:12:STARTBLOCK:99112:FROM_NODE:4.4.4.4
[2023-09-25 17:05:25.073] [19:main] [info] 99123:CATCHUP_PROCESSED_BLOCKS:COUNT:12:DTM:510:PTM:15250:LTM:0
[2023-09-25 17:05:25.545] [19:catchup] [info] 99123:CATCHUP_GOT_BLOCKS:COUNT:4:STARTBLOCK:99124:FROM_NODE:5.5.5.5

@github-project-automation github-project-automation bot moved this from Merged To Release Candidate to Ready For Pickup in SKALE Engineering 🚀 Sep 25, 2023
@PolinaKiporenko PolinaKiporenko modified the milestones: 2.2, SKALE 2.3 Sep 26, 2023
@olehnikolaiev olehnikolaiev moved this from Ready For Pickup to In Progress in SKALE Engineering 🚀 Oct 5, 2023
@olehnikolaiev
Copy link
Contributor

olehnikolaiev commented Oct 5, 2023

Relevant for 3.17.0-beta.11

3.16.1 - 0.5 seconds per block 3.17.0-beta.11 - 1.6 seconds per block

DTM:" << _catchupDownloadTimeMs PTM:" << catchupProcessTimeMs LTM:" << catchupLockTimeMs

[2023-09-25 16:56:08.227] [19:catchup] [info] 98620:CATCHUP_GOT_BLOCKS:COUNT:376:STARTBLOCK:98621:FROM_NODE:1.1.1.1
[2023-09-25 17:01:56.041] [19:main] [info] 98996:CATCHUP_PROCESSED_BLOCKS:COUNT:376:DTM:19562:PTM:347810:LTM:0
[2023-09-25 17:02:00.752] [19:catchup] [info] 98996:CATCHUP_GOT_BLOCKS:COUNT:81:STARTBLOCK:98997:FROM_NODE:2.2.2.2
btrfs subvolume delete -c /data_dir/snapshots/97324/28e07f34
...
btrfs property set -ts /data_dir/snapshots/99053/blocks_19.db ro true
[2023-09-25 17:04:19.257] [19:main] [info] 99077:CATCHUP_PROCESSED_BLOCKS:COUNT:80:DTM:4398:PTM:138505:LTM:11000
[2023-09-25 17:04:21.157] [19:catchup] [info] 99077:CATCHUP_GOT_BLOCKS:COUNT:34:STARTBLOCK:99078:FROM_NODE:3.3.3.3
[2023-09-25 17:05:09.082] [19:main] [info] 99111:CATCHUP_PROCESSED_BLOCKS:COUNT:34:DTM:1765:PTM:47925:LTM:0
[2023-09-25 17:05:09.823] [19:catchup] [info] 99111:CATCHUP_GOT_BLOCKS:COUNT:12:STARTBLOCK:99112:FROM_NODE:4.4.4.4
[2023-09-25 17:05:25.073] [19:main] [info] 99123:CATCHUP_PROCESSED_BLOCKS:COUNT:12:DTM:510:PTM:15250:LTM:0
[2023-09-25 17:05:25.545] [19:catchup] [info] 99123:CATCHUP_GOT_BLOCKS:COUNT:4:STARTBLOCK:99124:FROM_NODE:5.5.5.5

@oleksandrSydorenkoJ could you please add skaled logs regarding the 3.17.0-beta.11 performance? values of SWT are appreciated

@oleksandrSydorenkoJ
Copy link
Author

Relevant for 3.17.0-beta.11
3.16.1 - 0.5 seconds per block 3.17.0-beta.11 - 1.6 seconds per block
DTM:" << _catchupDownloadTimeMs PTM:" << catchupProcessTimeMs LTM:" << catchupLockTimeMs

[2023-09-25 16:56:08.227] [19:catchup] [info] 98620:CATCHUP_GOT_BLOCKS:COUNT:376:STARTBLOCK:98621:FROM_NODE:1.1.1.1
[2023-09-25 17:01:56.041] [19:main] [info] 98996:CATCHUP_PROCESSED_BLOCKS:COUNT:376:DTM:19562:PTM:347810:LTM:0
[2023-09-25 17:02:00.752] [19:catchup] [info] 98996:CATCHUP_GOT_BLOCKS:COUNT:81:STARTBLOCK:98997:FROM_NODE:2.2.2.2
btrfs subvolume delete -c /data_dir/snapshots/97324/28e07f34
...
btrfs property set -ts /data_dir/snapshots/99053/blocks_19.db ro true
[2023-09-25 17:04:19.257] [19:main] [info] 99077:CATCHUP_PROCESSED_BLOCKS:COUNT:80:DTM:4398:PTM:138505:LTM:11000
[2023-09-25 17:04:21.157] [19:catchup] [info] 99077:CATCHUP_GOT_BLOCKS:COUNT:34:STARTBLOCK:99078:FROM_NODE:3.3.3.3
[2023-09-25 17:05:09.082] [19:main] [info] 99111:CATCHUP_PROCESSED_BLOCKS:COUNT:34:DTM:1765:PTM:47925:LTM:0
[2023-09-25 17:05:09.823] [19:catchup] [info] 99111:CATCHUP_GOT_BLOCKS:COUNT:12:STARTBLOCK:99112:FROM_NODE:4.4.4.4
[2023-09-25 17:05:25.073] [19:main] [info] 99123:CATCHUP_PROCESSED_BLOCKS:COUNT:12:DTM:510:PTM:15250:LTM:0
[2023-09-25 17:05:25.545] [19:catchup] [info] 99123:CATCHUP_GOT_BLOCKS:COUNT:4:STARTBLOCK:99124:FROM_NODE:5.5.5.5

@oleksandrSydorenkoJ could you please add skaled logs regarding the 3.17.0-beta.11 performance? values of SWT are appreciated

skaled_slow_transaction_processing_during_catchup.txt

@PolinaKiporenko
Copy link
Contributor

The fix is ready but not merged yet @olehnikolaiev please link the corresponding PR once you merged it, it can't be properly tested on DevNet due to low disk storage, waiting for QA testing

olehnikolaiev added a commit that referenced this issue Dec 5, 2023
@olehnikolaiev olehnikolaiev moved this from In Progress to Blocked in SKALE Engineering 🚀 Dec 11, 2023
@olehnikolaiev olehnikolaiev moved this from Blocked to In Progress in SKALE Engineering 🚀 Dec 26, 2023
@olehnikolaiev olehnikolaiev linked a pull request Dec 27, 2023 that will close this issue
@olehnikolaiev olehnikolaiev moved this from In Progress to Code Review in SKALE Engineering 🚀 Dec 27, 2023
@olehnikolaiev olehnikolaiev moved this from Code Review to Ready For Release Candidate in SKALE Engineering 🚀 Jan 12, 2024
@DmytroNazarenko
Copy link
Collaborator

skaled: 3.18.0-beta.0

@DmytroNazarenko DmytroNazarenko moved this from Ready For Release Candidate to Merged To Release Candidate in SKALE Engineering 🚀 Jan 23, 2024
@EvgeniyZZ EvgeniyZZ moved this from Merged To Release Candidate to QA in SKALE Engineering 🚀 Jan 23, 2024
@OleksanderSalamatov
Copy link

Verified

Skaled: 3.18.0

Each block contains from 50 to 1000 txs from 1000 users, tested via load tests in qa-test-automation.
Result:
Stopped node catchups around 34753 tx/m, while regular mining speed of these blocks was 4953 tx/m, so catchup is faster ~7 times

Short logs for comparison:
stopped_tx+blocks.log
running_tx+blocks.log

@EvgeniyZZ EvgeniyZZ moved this from QA to Done in SKALE Engineering 🚀 Mar 25, 2024
@oleksandrSydorenkoJ
Copy link
Author

Actual for skaled:3.20.0 with connected 3 indexer nodes
Related to https://github.com/skalenetwork/internal-support/issues/1099

Each of these nodes increased the number of catchuped blocks over some time:

17_node_slow_catchup.txt
11_node_slow_catchup.txt
14_node_slow_catchup.txt
6_node_slow_catchup.txt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working skaled
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

7 participants