[Merged by Bors] - Improve stability of system tests #6486

Closed
wants to merge 41 commits into from

fasmat
Member

@fasmat fasmat commented Nov 22, 2024

Motivation

This PR tries to improve the stability of the system tests.

Description

  • sendTransactions now cancels much earlier when something goes wrong. Previously it would occasionally miss the stop layer and continue indefinitely; now it stops on any layer at or after stop. Additionally, the context used to query the nodes via gRPC is cancelled in a timely fashion to prevent a test from getting stuck there.
  • I tried to get rid of the hard-coded sleep in watchLayers by changing the asserted status from approved to applied, but even when a layer is applied an address may still not be spawned, so I re-added the sleep.
  • Updated the images used during system tests (old_smesher, post_init, post_service, certifier), since they were all outdated
  • Decreased the number of nodes used in a typical test from 30 to 20 and reduced the number of poets in tests from 3 to 2
  • Added additional logging to the fetcher, because in some tests transactions that were seemingly fetched from peers are later not found in the DB and are not fetched again (still investigating)
  • Made the test runner more verbose: instead of printing logs only from failing tests, it now prints logs from all tests, to avoid investigating the wrong test after a failed test run (should be reverted once tests are more stable again)
  • Changed GOMAXPROCS from a hard-coded 4 to the limits.cpu value set on the k8s cluster
  • Fixed a possible deadlock in partition tests where a Go channel could fill up, causing the test to never complete
  • Persisting active sets in the proposal builder can cause SQL_BUSY errors during system tests, so I changed it to use TxImmediate for now
  • If transactions fail to be sent (e.g. because they were sent to the node before the account they reference has been spawned), the node now stops waiting for the transaction results in a timely manner instead of waiting until the test timeout hits
  • Instead of sleeping between spawning an account and sending a transaction, the code now waits at least 2 layers before sending the first transaction
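
For readers unfamiliar with the sendTransactions change, here is a minimal sketch of the idea described in the first bullet: compare the observed layer against the stop layer with `>=` rather than `==`, and give every node query its own deadline. This is not the code from this PR; `shouldStop`, `sendUntil`, `demo`, and the one-second timeout are hypothetical names and values chosen for illustration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// shouldStop reports whether sending must end at the observed layer.
// Using >= instead of == means a missed layer event cannot keep the loop alive.
func shouldStop(layer, stop uint32) bool {
	return layer >= stop
}

// sendUntil calls send for every observed layer before stop.
// Each call gets its own deadline so a hanging node cannot block the loop.
func sendUntil(
	ctx context.Context,
	layers <-chan uint32,
	stop uint32,
	perCall time.Duration,
	send func(context.Context, uint32) error,
) error {
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case layer, ok := <-layers:
			if !ok || shouldStop(layer, stop) {
				return nil
			}
			callCtx, cancel := context.WithTimeout(ctx, perCall)
			err := send(callCtx, layer)
			cancel() // release the per-call timer right away
			if err != nil {
				return fmt.Errorf("send in layer %d: %w", layer, err)
			}
		}
	}
}

// demo feeds layers 8, 9, 11 and 12; the stop layer 10 itself is never observed.
func demo() ([]uint32, error) {
	layers := make(chan uint32, 4)
	for _, l := range []uint32{8, 9, 11, 12} {
		layers <- l
	}
	close(layers)
	var sent []uint32
	err := sendUntil(context.Background(), layers, 10, time.Second,
		func(_ context.Context, l uint32) error {
			sent = append(sent, l)
			return nil
		})
	return sent, err
}

func main() {
	sent, err := demo()
	fmt.Println(sent, err) // prints "[8 9] <nil>"
}
```

With an equality check, the loop in `demo` would skip past layer 10 and keep sending in layers 11 and 12; the `>=` comparison ends it at the first layer on or after stop.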
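
The deadlock from a full channel mentioned above follows a common Go pattern: senders block forever once the receiver stops reading. The sketch below shows one general way to avoid it, by pairing every send with `ctx.Done()` in a `select`. It is an assumption about the class of bug, not the actual fix applied in the partition tests; `report` and `runWorkers` are hypothetical helpers.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// report delivers one value without blocking forever: if nobody reads and
// the buffer is full, it gives up as soon as ctx is cancelled.
func report(ctx context.Context, results chan<- int, v int) bool {
	select {
	case results <- v:
		return true
	case <-ctx.Done():
		return false
	}
}

// runWorkers starts n workers that each report one value, reads only `read`
// of them (a consumer that stops listening early), then waits for all workers.
func runWorkers(n, read int) (delivered, dropped int) {
	ctx, cancel := context.WithCancel(context.Background())
	results := make(chan int, 1) // deliberately small buffer
	var (
		wg sync.WaitGroup
		mu sync.Mutex
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			ok := report(ctx, results, v)
			mu.Lock()
			defer mu.Unlock()
			if ok {
				delivered++
			} else {
				dropped++
			}
		}(i)
	}
	for i := 0; i < read; i++ {
		<-results
	}
	cancel()  // consumer is done; release any worker still trying to send
	wg.Wait() // returns because no worker can block on the channel anymore
	return delivered, dropped
}

func main() {
	delivered, dropped := runWorkers(5, 2)
	fmt.Println("workers finished:", delivered+dropped)
}
```

A plain `results <- v` in place of `report` would leave the remaining workers blocked once the buffer is full, and `wg.Wait()` would never return. The exact split between delivered and dropped values depends on scheduling, so only their sum is stable.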

Test Plan

  • system tests pass

TODO

  • Explain motivation or link existing issue(s)
  • Test changes and document test plan
  • Update documentation as needed
  • Update changelog as needed

@fasmat fasmat self-assigned this Nov 22, 2024

codecov bot commented Nov 22, 2024

Codecov Report

Attention: Patch coverage is 95.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 79.8%. Comparing base (40fec4d) to head (f8cb87d).
Report is 1 commit behind head on develop.

✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| api/grpcserver/globalstate_service.go | 0.0% | 1 Missing ⚠️ |
| sql/layers/layers.go | 95.0% | 1 Missing ⚠️ |

Additional details and impacted files
@@           Coverage Diff           @@
##           develop   #6486   +/-   ##
=======================================
  Coverage     79.8%   79.8%           
=======================================
  Files          353     353           
  Lines        46491   46512   +21     
=======================================
+ Hits         37138   37158   +20     
+ Misses        7245    7239    -6     
- Partials      2108    2115    +7     

☔ View full report in Codecov by Sentry.

@fasmat fasmat changed the title from "WiP: Debug instable system tests" to "Debug instable system tests" Nov 22, 2024
@fasmat fasmat changed the title from "Debug instable system tests" to "Debug unstable system tests" Nov 22, 2024
@fasmat fasmat marked this pull request as ready for review November 22, 2024 16:00
@fasmat
Member Author

fasmat commented Nov 22, 2024

bors merge

spacemesh-bors bot pushed a commit that referenced this pull request Nov 22, 2024
## Motivation

This PR tries to improve instable system tests
@spacemesh-bors

Build failed:

@fasmat
Member Author

fasmat commented Nov 22, 2024

bors merge

spacemesh-bors bot pushed a commit that referenced this pull request Nov 22, 2024
## Motivation

This PR tries to improve instable system tests
@spacemesh-bors

Build failed:

@fasmat
Member Author

fasmat commented Nov 25, 2024

bors try

spacemesh-bors bot added a commit that referenced this pull request Nov 25, 2024
@fasmat
Member Author

fasmat commented Nov 25, 2024

bors cancel

@fasmat
Member Author

fasmat commented Nov 25, 2024

bors try

@spacemesh-bors

try

Already running a review

@spacemesh-bors

Build failed (retrying...):

  • systest-status

spacemesh-bors bot pushed a commit that referenced this pull request Nov 28, 2024
## Motivation

This PR tries to improve unstable system tests
spacemesh-bors bot pushed a commit that referenced this pull request Nov 28, 2024
## Motivation

This PR tries to improve unstable system tests
@spacemesh-bors

Build failed:

  • systest-status

@fasmat
Member Author

fasmat commented Nov 28, 2024

bors merge

spacemesh-bors bot pushed a commit that referenced this pull request Nov 28, 2024
## Motivation

This PR tries to improve unstable system tests
@spacemesh-bors

Build failed (retrying...):

spacemesh-bors bot pushed a commit that referenced this pull request Nov 28, 2024
## Motivation

This PR tries to improve unstable system tests
@spacemesh-bors

Build failed:

@fasmat
Member Author

fasmat commented Nov 28, 2024

bors try

spacemesh-bors bot added a commit that referenced this pull request Nov 28, 2024
@spacemesh-bors

try

Build failed:

@fasmat
Member Author

fasmat commented Nov 28, 2024

bors try

spacemesh-bors bot added a commit that referenced this pull request Nov 28, 2024
@spacemesh-bors

try

Build failed:

@fasmat
Member Author

fasmat commented Nov 29, 2024

bors merge

spacemesh-bors bot pushed a commit that referenced this pull request Nov 29, 2024
## Motivation

This PR tries to improve unstable system tests
@spacemesh-bors

Pull request successfully merged into develop.

Build succeeded:

@spacemesh-bors spacemesh-bors bot changed the title from "Improve stability of system tests" to "[Merged by Bors] - Improve stability of system tests" Nov 29, 2024
@spacemesh-bors spacemesh-bors bot closed this Nov 29, 2024
@spacemesh-bors spacemesh-bors bot deleted the debug-test-partition branch November 29, 2024 18:08
Labels
None yet
Projects
None yet
3 participants