-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[EFM] Recoverable Random Beacon State Machine follow up updates #6815
base: feature/efm-recovery
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## feature/efm-recovery #6815 +/- ##
========================================================
- Coverage 41.81% 41.74% -0.08%
========================================================
Files 1588 2033 +445
Lines 144098 181300 +37202
========================================================
+ Hits 60251 75677 +15426
- Misses 78887 99385 +20498
- Partials 4960 6238 +1278
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
|
||
// start up the engine | ||
unittest.AssertClosesBefore(suite.T(), suite.engine.Ready(), time.Second) | ||
unittest.AssertClosesBefore(suite.T(), suite.engine.Ready(), 100*time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really not starting up within a second? 100s seems like a lot.. Maybe 5s or 10s?
// verify that the key is part of the EpochCommit | ||
if err = ensureKeyIncludedInEpoch(epochCounter, key, commit); err != nil { | ||
return storage.NewInvalidDKGStateTransitionErrorf(currentState, flow.RandomBeaconKeyCommitted, | ||
"previously storred key has not been found in epoch commit event: %w", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"previously storred key has not been found in epoch commit event: %w", err) | |
"previously stored key has not been found in epoch commit event: %w", err) |
// CommitMyBeaconPrivateKey commits the previously inserted random beacon private key for an epoch. | ||
// Effectively, this method transitions the state machine into the [flow.RandomBeaconKeyCommitted] state if the current state is [flow.DKGStateCompleted]. | ||
// Caller needs to supply the [flow.EpochCommit] which is an evidence that the key has been indeed included for the given epoch. | ||
// No errors are expected during normal operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// No errors are expected during normal operations. | |
// If the current state is already [flow.RandomBeaconKeyCommitted], this function is a no-op regardless of input. | |
// No errors are expected during normal operations. |
Describing the behaviour on line 238.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
regarding Jordan's suggestion:
If the current state is already [flow.RandomBeaconKeyCommitted], this function is a no-op regardless of input.
This is probably the most minimal implementation, but I am not sure it would be the most robust. I think there are two non-happy path scenarios I think we should consider for the CommitMyBeaconPrivateKey
:
- A random beacon key was previously committed (i.e. we are in the
flow.RandomBeaconKeyCommitted
state). If there is a call toCommitMyBeaconPrivateKey
with anflow.EpochCommit
that is inconsistent with the committed key, we have a clear failure condition. - A random beacon key was previously committed (i.e. we are in the
flow.RandomBeaconKeyCommitted
state). The state machine receives a call toCommitMyBeaconPrivateKey
with anflow.EpochCommit
that is consistent with the committed key. Here, we could either return an exception and say that repeated calls are in principle not allowed. Alternatively, we can treat this call as confirming information that the state machine already has and ignore this call (as information is idempotent). The latter is my preference.
What makes me nervous is taking the response strategy for 2. and also applying it to 1., because we would be proceeding in a clear failure case. I think it would be comparatively easy to be more strict in this case.
@@ -344,9 +515,16 @@ func TestDKGState_RandomBeaconKeyCommittedState(t *testing.T) { | |||
store, err := NewRecoverableRandomBeaconStateMachine(metrics, db) | |||
require.NoError(t, err) | |||
|
|||
var evidence *flow.EpochCommit | |||
var pkey crypto.PrivateKey |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var pkey crypto.PrivateKey | |
var privKey crypto.PrivateKey |
Nitpick: avoiding just the "p" prefix (because public also starts with "p" 😄)
pkey = unittest.StakingPrivKeyFixture() | ||
evidence = unittest.EpochCommitFixture(func(commit *flow.EpochCommit) { | ||
commit.Counter = epochCounter | ||
commit.DKGParticipantKeys[0] = pkey.PublicKey() | ||
}) | ||
err = store.UpsertMyBeaconPrivateKey(epochCounter, pkey, evidence) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pkey = unittest.StakingPrivKeyFixture() | |
evidence = unittest.EpochCommitFixture(func(commit *flow.EpochCommit) { | |
commit.Counter = epochCounter | |
commit.DKGParticipantKeys[0] = pkey.PublicKey() | |
}) | |
err = store.UpsertMyBeaconPrivateKey(epochCounter, pkey, evidence) | |
privKey = unittest.StakingPrivKeyFixture() | |
evidence = unittest.EpochCommitFixture(func(commit *flow.EpochCommit) { | |
commit.Counter = epochCounter | |
commit.DKGParticipantKeys[0] = privKey.PublicKey() | |
}) | |
err = store.UpsertMyBeaconPrivateKey(epochCounter, privKey, evidence) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Thanks for the detailed work and solid tests. My comments mostly fall into two categories:
- how we represent the underlying logic in code - in particular, we want to make it as easy as possible for somebody reading the code to convince themself of correctness (and which parts can be modified / extended without breaking correctness)
- clearly documenting under which conditions operations are idempotent and structuring the code accordingly for clarity.
// CommitMyBeaconPrivateKey commits the previously inserted random beacon private key for an epoch. | ||
// Effectively, this method transitions the state machine into the [flow.RandomBeaconKeyCommitted] state if the current state is [flow.DKGStateCompleted]. | ||
// Caller needs to supply the [flow.EpochCommit] which is an evidence that the key has been indeed included for the given epoch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// CommitMyBeaconPrivateKey commits the previously inserted random beacon private key for an epoch. | |
// Effectively, this method transitions the state machine into the [flow.RandomBeaconKeyCommitted] state if the current state is [flow.DKGStateCompleted]. | |
// Caller needs to supply the [flow.EpochCommit] which is an evidence that the key has been indeed included for the given epoch. | |
// CommitMyBeaconPrivateKey commits the previously inserted random beacon private key for an epoch. Effectively, this method | |
// transitions the state machine into the [flow.RandomBeaconKeyCommitted] state if the current state is [flow.DKGStateCompleted]. | |
// The caller needs to supply the [flow.EpochCommit] as evidence that the stored key is valid for the specified epoch. Repeated | |
// calls for the same epoch are accepted (idempotent operation), if and only if the provided EpochCommit confirms the already | |
// committed key. |
// if we are in committed state then there is nothing to do | ||
if currentState == flow.RandomBeaconKeyCommitted { | ||
return nil | ||
} | ||
key, err := ds.keyCache.Get(epochCounter)(tx.DBTxn) | ||
if err != nil { | ||
return storage.NewInvalidDKGStateTransitionErrorf(currentState, flow.RandomBeaconKeyCommitted, "cannot transition without a valid random beacon key: %w", err) | ||
} | ||
|
||
// verify that the key is part of the EpochCommit | ||
if err = ensureKeyIncludedInEpoch(epochCounter, key, commit); err != nil { | ||
return storage.NewInvalidDKGStateTransitionErrorf(currentState, flow.RandomBeaconKeyCommitted, | ||
"previously storred key has not been found in epoch commit event: %w", err) | ||
} | ||
return ds.processStateTransition(epochCounter, currentState, flow.RandomBeaconKeyCommitted)(tx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here, I would suggest to always enforce that the stored random beacon key is consistent with the provided EpochCommit
. Only the state transition itself should be omitted, if this is a repeated call. In a nutshell, the block if currentState == flow.RandomBeaconKeyCommitted
would just move down.
// if we are in committed state then there is nothing to do | |
if currentState == flow.RandomBeaconKeyCommitted { | |
return nil | |
} | |
key, err := ds.keyCache.Get(epochCounter)(tx.DBTxn) | |
if err != nil { | |
return storage.NewInvalidDKGStateTransitionErrorf(currentState, flow.RandomBeaconKeyCommitted, "cannot transition without a valid random beacon key: %w", err) | |
} | |
// verify that the key is part of the EpochCommit | |
if err = ensureKeyIncludedInEpoch(epochCounter, key, commit); err != nil { | |
return storage.NewInvalidDKGStateTransitionErrorf(currentState, flow.RandomBeaconKeyCommitted, | |
"previously storred key has not been found in epoch commit event: %w", err) | |
} | |
return ds.processStateTransition(epochCounter, currentState, flow.RandomBeaconKeyCommitted)(tx) | |
// Repeated calls for same epoch are idempotent, but only if they consistently confirm the stored private key. We explicitly | |
// enforce consistency with the already committed key here. Repetitions are considered rare, so performance overhead is acceptable. | |
key, err := ds.keyCache.Get(epochCounter)(tx.DBTxn) | |
if err != nil { | |
return storage.NewInvalidDKGStateTransitionErrorf(currentState, flow.RandomBeaconKeyCommitted, "cannot transition without a valid random beacon key: %w", err) | |
} | |
// verify that the key is part of the EpochCommit | |
if err = ds.ensureKeyIncludedInEpoch(epochCounter, key, commit); err != nil { | |
return storage.NewInvalidDKGStateTransitionErrorf(currentState, flow.RandomBeaconKeyCommitted, | |
"according to EpochCommit event, my stored random beacon key is not valid for signing: %w", err) | |
} | |
// transition to RandomBeaconKeyCommitted, unless this is a repeated call, in which case there is nothing else to do | |
if currentState == flow.RandomBeaconKeyCommitted { | |
return nil | |
} | |
return ds.processStateTransition(epochCounter, currentState, flow.RandomBeaconKeyCommitted)(tx) |
// verify that the key is part of the EpochCommit | ||
if err = ensureKeyIncludedInEpoch(epochCounter, key, commit); err != nil { | ||
return storage.NewInvalidDKGStateTransitionErrorf(currentState, flow.RandomBeaconKeyCommitted, | ||
"previously storred key has not been found in epoch commit event: %w", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"previously storred key has not been found in epoch commit event: %w", err) | |
"according to EpochCommit event, the input random beacon key is not valid for signing: %w", err) |
err := operation.RetrieveDKGStateForEpoch(epochCounter, ¤tState)(txn) | ||
if err != nil && !errors.Is(err, storage.ErrNotFound) { | ||
return currentState, fmt.Errorf("could not retrieve current state for epoch %d: %w", epochCounter, err) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// if we are in committed state, we cannot overwrite the key, but we can ignore this input iff the provided key is the same | ||
if currentState == flow.RandomBeaconKeyCommitted { | ||
// check if the stored key is equal to the provided key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// if we are in committed state, we cannot overwrite the key, but we can ignore this input iff the provided key is the same | |
if currentState == flow.RandomBeaconKeyCommitted { | |
// check if the stored key is equal to the provided key | |
// Repeated calls for same epoch are idempotent, but only if they consistently confirm the stored private key. We explicitly | |
// enforce consistency with the already committed key here. Repetitions are considered rare, so performance overhead is acceptable. | |
if currentState == flow.RandomBeaconKeyCommitted { |
// CommitMyBeaconPrivateKey commits the previously inserted random beacon private key for an epoch. | ||
// Effectively, this method transitions the state machine into the [flow.RandomBeaconKeyCommitted] state if the current state is [flow.DKGStateCompleted]. | ||
// Caller needs to supply the [flow.EpochCommit] which is an evidence that the key has been indeed included for the given epoch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// CommitMyBeaconPrivateKey commits the previously inserted random beacon private key for an epoch. | |
// Effectively, this method transitions the state machine into the [flow.RandomBeaconKeyCommitted] state if the current state is [flow.DKGStateCompleted]. | |
// Caller needs to supply the [flow.EpochCommit] which is an evidence that the key has been indeed included for the given epoch. | |
// CommitMyBeaconPrivateKey commits the previously inserted random beacon private key for an epoch. Effectively, this method | |
// transitions the state machine into the [flow.RandomBeaconKeyCommitted] state if the current state is [flow.DKGStateCompleted]. | |
// The caller needs to supply the [flow.EpochCommit] as evidence that the stored key is valid for the specified epoch. Repeated | |
// calls for the same epoch are accepted (idempotent operation),if and only if the provided EpochCommit confirms the already | |
// committed key. |
// from Epoch Fallback Mode. State transitions are allowed if and only if the current state is not equal to | ||
// [flow.RandomBeaconKeyCommitted]. The resulting state of this method call is [flow.RandomBeaconKeyCommitted]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// from Epoch Fallback Mode. State transitions are allowed if and only if the current state is not equal to | |
// [flow.RandomBeaconKeyCommitted]. The resulting state of this method call is [flow.RandomBeaconKeyCommitted]. | |
// from Epoch Fallback Mode. The resulting state of this method call is [flow.RandomBeaconKeyCommitted]. | |
// State transitions are allowed if and only if the current state is not equal to [flow.RandomBeaconKeyCommitted]. | |
// Repeated calls for the same epoch are idempotent, if and only if the provided EpochCommit confirms the already | |
// committed key (error otherwise). |
// from Epoch Fallback Mode. State transitions are allowed if and only if the current state is not equal to | ||
// [flow.RandomBeaconKeyCommitted]. The resulting state of this method call is [flow.RandomBeaconKeyCommitted]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// from Epoch Fallback Mode. State transitions are allowed if and only if the current state is not equal to | |
// [flow.RandomBeaconKeyCommitted]. The resulting state of this method call is [flow.RandomBeaconKeyCommitted]. | |
// from Epoch Fallback Mode. The resulting state of this method call is [flow.RandomBeaconKeyCommitted]. | |
// State transitions are allowed if and only if the current state is not equal to [flow.RandomBeaconKeyCommitted]. | |
// Repeated calls for the same epoch are idempotent, if and only if the provided EpochCommit confirms the already | |
// committed key (error otherwise). |
@@ -316,7 +316,7 @@ func main() { | |||
return fmt.Errorf("could not load beacon key file: %w", err) | |||
} | |||
|
|||
rootEpoch := node.State.AtBlockID(node.FinalizedRootBlock.ID()).Epochs().Current() | |||
rootEpoch := rootSnapshot.Epochs().Current() | |||
epochCounter, err := rootEpoch.Counter() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for consistency, could you rename this variable
epochCounter, err := rootEpoch.Counter() | |
rootEpochCounter, err := rootEpoch.Counter() |
@@ -346,7 +346,11 @@ func main() { | |||
// perform this only if state machine is in initial state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would suggest documenting the logical block of lines 341 to 357 as a whole. Something like
// store my beacon key for the first epoch post-spork (only if we haven't run this logic before, i.e. state machine is in initial state)
started, err := myBeaconKeyStateMachine.IsDKGStarted(epochCounter)
if err != nil {
return fmt.Errorf("could not get DKG started flag for root epoch %d: %w", epochCounter, err)
}
if !started {
epochProtocolState, err := rootSnapshot.EpochProtocolState()
⋮
Context
This PR addresses some open comments from PR #6771.
Specifically:
By far the biggest change is how we insert/upsert keys, now caller needs to provide evidence in form of
EpochCommit
so state machine can sanity check if it's inserting a valid private key.