-
Notifications
You must be signed in to change notification settings - Fork 973
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support manual closing N ledgers #4040
Comments
can you describe better the actual test setup? Also as far as test environment, afaik integration tests spin up several nodes: one validator and one ingestion node (configured with captive core)/transaction submission node. Manual close right now only works with nodes configured with |
Horizon ingests asset balances which are written to storage by the stellar asset contract. There is code in horizon that accounts for when these asset balances expire and when they get restored. The expiration / restoration case is difficult to test currently because the stellar asset contract configures the lifetime of asset balances to span a month's worth of ledgers: If we had the capability to fast-forward stellar-core by an arbitrary amount of ledgers then this would no longer be an issue for us. |
that's not what I was asking wrt test setup: I was asking if you have 1 core node or multiple ones. If it's just 1 you should be able to use command line catchup instead of a live node? |
@MonsieurNicolas in our tests we have a core validator node running in its own docker container and horizon runs with a captive-core watcher node. I don't know if that counts as 1 node because although there is only 1 validator we do have the captive-core instance |
I see so some non-trivial topology. Manual close won't work in that context as it only works on a standalone node (manual closing would have to be triggered on all nodes to work properly). For most end to end tests, any reason you need this setup and can't just rely on command line catchup (which I think is supported from Horizon)? |
We need to submit transactions in order to populate the ledger with the ledger entries required by the test. For example, if we're testing the claimable balances endpoint, we need to submit a transaction to create a claimable balance in the ledger. I suppose we could submit a transaction directly to core without going through horizon but going through horizon is easier in terms of debugging because we can immediately check whether the transaction succeeded or failed. Also, in a lot of our tests, we submit a transaction, assert that the current state of a ledger entry matches our expectation, then we submit another transaction to modify the ledger entry and assert that the ledger entry changed according to expectation. This workflow where you test several changes to a ledger entry within a single test requires live ingestion. The command line catchup mode of ingestion in Horizon only ingests immutable time series data (e.g. transaction history or effects), it does not update the state of mutable ledger entries. |
sorry I missed your response -- yeah what you are describing is fine for legacy tests. What I am proposing is to potentially write new tests (like the ones that depend on potentially closing a lot of ledgers) to be decoupled into:
This approach would be similar to what core does today by attaching various footprints to meta data (and other host state) as to ensure that we catch issues as early as possible in core (as opposed to more complex integration tests). The benefits of 3 are that you get to write your own "testing protocol rules": Horizon/Soroban-RPC are supposed to be agnostic of protocol behavior (and only care about "meta"), and this allows you to test edge cases that are probably very hard to repro using core directly. Testing the actual protocol is not the best use of time for the platform team as the core team is supposed to do this for you. |
@MonsieurNicolas for step (1) wouldn't it take a really long time to generate the tx meta if we wanted to close a month's worth of ledgers (the TTL for asset balances is 1 month in the stellar asset contract), or is there a stellar-core command to close lots of ledgers ? Regarding point (3), we do have unit tests on the horizon ingestion system which run ingestion on mocked up tx meta. That allows us to test corner cases as you alluded to. However, I disagree with:
The horizon integration tests have found a decent amount of bugs in core and horizon (where horizon misconfigures core). Testing horizon with a real core is particularly useful when we're integrating with core builds for the upcoming protocol release. Here are several examples where our integration tests with core have found legitimate bugs / issues: https://stellarfoundation.slack.com/archives/C0367MGFL/p1663965045480269 https://stellarfoundation.slack.com/archives/C02B04RMK/p1691064220584809 |
bumping this as I know you're about to meet people from core.
manual close in a loop would do that. the time it takes to run this is not super relevant as long as it does not take too long as we could generate this when a new core build is produced for example. Note that this is pretty much the pattern you will have to follow when validating an architecture built on top of "data lake".
I think the cost paid by the platform team in this context is just not worth it:
In any case, I think you should sit down with the core team and discuss how you can together improve coverage and agility. |
Description
Explain in detail the additional functionality you would like to see in stellar-core.
Expand the capabilities of the
manualclose
http endpoint so that it accepts a parametercount
that defaults to1
if not provided, preserving existing behavior, but if another value is passed that the node closes a ledger and progresses the network the given number of ledgers.Explain why this feature is important
Applications need to integration test with the network.
Describe the solution you'd like
See above under the section titled "Explain in detail the additional functionality you would like to see in stellar-core.
Describe alternatives you've considered
Waiting around for ledgers to close manually, resulting in really long tests.
Mocking stellar-core and not running integration tests for functionality where closing many ledgers is required.
Using stellar-core's offline-close cli command to close many ledgers, but this requires stopping stellar-core which is difficult to do in an integration test environment.
Additional context
Because the network is adding state archival, which only runs after a large number of ledgers occur, it's becoming more critical that integration tests can control the progression of ledgers.
cc @tamirms
The text was updated successfully, but these errors were encountered: