-
Notifications
You must be signed in to change notification settings - Fork 116
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing
1 changed file
with
69 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
## 5.0.6 | ||
|
||
This is an upgrade with patches mitigate an ongoing network incident. More details here: https://hackmd.io/KY7VklgAR72Bg7ETTQZAxA | ||
|
||
These changes do not affect Move code (no changes to system state or policies). It only upgrades `diem-node` | ||
|
||
#### TL;DR | ||
|
||
Just update binaries to use patches. | ||
|
||
``` | ||
# install new binaries | ||
cd ~/libra | ||
git checkout v5.0.6 -f | ||
make bins install | ||
<stop diem-node> | ||
<restart diem-node> | ||
``` | ||
|
||
### Summary | ||
|
||
TBD | ||
|
||
### Changes | ||
|
||
##### Move Changes | ||
None | ||
|
||
##### Rust Changes | ||
##### - prevent DiemAccount::1003 txs from remaining in mempool | ||
|
||
One line change to prevent the VM transaction verification on admission from allowing transactions from clients that are more advanced than the current state of validator. The change prevents a DiemAccount::1003 PROLOGUE_ESEQUENCE_NUMBER_TOO_NEW transaction error from remaining in the mempool. | ||
|
||
This happens when a TX has a sequence number newer than the validator has for that account. This means the validator is behind sync compared to the client. | ||
|
||
Possible fix to inundation of prologue errors DiemAccount::1003 during the Dec 16-18 network incident described here https://hackmd.io/KY7VklgAR72Bg7ETTQZAxA. | ||
|
||
It seems txs with prologue errors keep accumulating but they in mempool and then keep getting rebroadcast to other nodes that were disconnected (related PR: diem/diem#9437). This seems to contribute network to halt. But it's still unclear why this would cause transport errors, and eventually Validators from contacting each other. cc @gregnazario | ||
|
||
Tip from user shb | ||
https://discord.com/channels/903339070925721652/916100220855648296/921932021884928030 | ||
"One idea for a mitigation: try setting this flag to false (which should tell mempool to drop any txs with new sequence #'s on the floor instead of holding on to them) https://github.com/OLSF/libra/blob/main/language/diem-vm/src/diem_transaction_validator.rs#L91 | ||
|
||
https://github.com/OLSF/libra/pull/906 | ||
|
||
##### - disable mempool resends all txns after disconnect | ||
|
||
Validators don't resend txns after disconnect to ones that have already | ||
confirmed which could cause some issues when many nodes restart or get | ||
stuck. | ||
|
||
This patch already exists in diem Upstream in a newer version (1.4) than 0L's fork (1.3). The upstream PR is https://github.com/diem/diem/pull/9437 | ||
|
||
https://github.com/OLSF/libra/pull/905 | ||
|
||
|
||
##### Tests | ||
|
||
- All continuous integration tests passed. | ||
- The diem-code was tested on production nodes that were both synced and out of sync. | ||
|
||
##### Compatibility | ||
The Move framework changes are backwards compatible with `diem-node` from v5.0.1 | ||
|
||
### Rust changes | ||
#### Diem Node | ||
Changes to `diem-node` is compatible with on-chain state from `v5.0.5`. |