Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core crash caused by LedgerCloseMeta during upgrade of protocol #3909

Closed
anupsdf opened this issue Sep 7, 2023 · 3 comments · Fixed by #3925
Closed

Core crash caused by LedgerCloseMeta during upgrade of protocol #3909

anupsdf opened this issue Sep 7, 2023 · 3 comments · Fixed by #3925
Assignees
Labels

Comments

@anupsdf
Copy link
Contributor

anupsdf commented Sep 7, 2023

Issue Description

Core crashes when an upgrading ledger is constructed with version N but is amended with version N+1 .
@marta-lokhova pointed out the problematic assert:

releaseAssert(mVersion == 2);

The problem is that on the upgrade ledger we bump ledger version, however we still emit old-style meta, which conflicts with the assert.

Steps to reproduce

Described by @tamirms in stellar/go#5035 (comment)

Detail description

Summary of the issue described by @graydon in stellar/go#5035 (comment)

@anupsdf anupsdf added the bug label Sep 7, 2023
@MonsieurNicolas
Copy link
Contributor

(From the linked issue)

There are a couple ways we can fix this:

  1. Since the LCM amending only happens on-or-after v20, and there will be nothing much to do on the v19->v20 boundary ledger, we can just relax the assertion that's tripping here and let the LCM-amending methods work in v19 as well. Make them no-ops -- you can call them but they just won't access the v20-related fields of the LCM. Put a big comment in explaining why this is ok and hope there's no other weird case that we're letting through by weakening that condition.
  2. Add a method to the LCM that fishes out the protocol version it was constructed with and use that as the condition in ledgerClosed (and its callees) to decide to call the LCM-amending methods, rather than the post-close ledger protocol number. Again, if we do this there had better be a big comment explaining it, and some tests checking, because it's very subtle!

I am not sure how 1 would work in this context: we really do not want to start evicting entries in that upgrade ledger: changing scanForEviction to deal with this special case seems pretty error prone.

So it seems that option 2 is the safest thing to do. Only thing we need to be sure of is that mApp.getBucketManager().addBatch (that will continue to take the "current protocol version") does not do weird things that depend on eviction.

@MonsieurNicolas
Copy link
Contributor

Actually option 2 is a bit ugly as observing the meta should not change protocol behavior, we may just need to add a new argument to ledgerClosed that tells it the protocol version used during apply

@marta-lokhova
Copy link
Contributor

I think @MonsieurNicolas alluded to this in his comment, but just to re-iterate here: we should ensure that no eviction logic (or any protocol 20 code for that matter) is called at all at the upgrade ledger, not just meta-related functionality.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants