This section contains a set of dBFT formal specifications written in the TLA⁺ language. The models describe the core algorithm logic in a high-level way and can be used to illustrate some basic dBFT concepts and to validate the algorithm in terms of liveness and fairness. It should be noted that the presented models do not precisely follow the dBFT implementation presented in this repository and may omit some implementation details in favour of specification simplicity and the fundamental philosophy of TLA⁺. However, the presented models directly reflect some liveness problems dBFT 2.0 has; they can be (and are intended to be) used for dBFT 2.0 liveness evaluation and further algorithm improvements.
Any contributions, questions and discussions on the presented models are highly appreciated.
This specification is the basis that was taken for further algorithm investigation. We recommend starting your acquaintance with the dBFT models from this one.
The specification describes the process of a single block acceptance: a set of resource managers `RM` (which is effectively a set of consensus nodes) communicates via the shared consensus message pool `msgs` and takes the decision in a few consensus rounds (views). Each consensus node has its own state at each step of the behaviour. A consensus node may send a consensus message by adding it to the `msgs` pool. To perform a transition between states, the consensus node must either send a consensus message or observe a particular set of consensus messages in the shared message pool required for that transition.
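For orientation, here's a minimal TLA⁺ sketch of this setup; the identifiers follow the description above but may differ from the actual specification file:

```tla
---------------------------- MODULE dbftSketch ----------------------------
EXTENDS Integers, FiniteSets

CONSTANT RM                  \* the set of consensus nodes (resource managers)

VARIABLES rmState,           \* rmState[r] is the current state of node r
          msgs               \* the shared pool of sent consensus messages

N == Cardinality(RM)         \* overall number of consensus nodes
F == (N - 1) \div 3          \* maximum number of tolerated faulty nodes
M == N - F                   \* number of nodes required to reach agreement

\* Node r sends a consensus message by adding it to the shared pool; the
\* message carries its sender, its type and the sender's current view.
SendMsg(r, type) ==
    msgs' = msgs \cup {[rm |-> r, type |-> type, view |-> rmState[r].view]}
============================================================================
```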
Here's the scheme of transitions between consensus node states:
The specification also describes two kinds of malicious node behaviour that can be combined, i.e. enabled or disabled independently for each particular node:
- "Dead" nodes. A "dead" node is completely excluded from the consensus process: it is unable to send consensus messages or perform state transitions. A node may become "dead" at any step in the middle of the consensus process. Once the node becomes "dead", there's no way for it to rejoin the consensus process.
- "Faulty" nodes. A "faulty" node is allowed to send consensus messages of any type at any step and to change its view without regard to the dBFT view changing rules. A node may become "faulty" at any step in the middle of the consensus process. Once the node becomes "faulty", there's no way for it to become "good" again.
The specification contains several invariants and liveness properties to be checked by the TLC Model Checker. These formulas mostly describe two basic concepts that the dBFT algorithm is expected to guarantee:
- No fork must happen. There must be no situation such that two different blocks are accepted at two different consensus rounds (views).
- The block must always be accepted. There must be no situation such that nodes are stuck in the middle of the consensus process and can't take any further steps.
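Hedged sketches of these two properties, continuing the TLA⁺ sketch above (the real formulas in the specification file may be stated differently):

```tla
\* Safety: a block is identified here by the view it was accepted at, so a
\* fork shows up as two nodes accepting blocks at different views.
NoFork ==
    \A r1, r2 \in RM :
        (rmState[r1].type = "blockAccepted" /\ rmState[r2].type = "blockAccepted")
            => rmState[r1].view = rmState[r2].view

\* Liveness: some node eventually accepts a block, i.e. the consensus
\* process never gets permanently stuck.
TerminationRequirement == <>(\E r \in RM : rmState[r].type = "blockAccepted")
```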
The specification is written under several assumptions:
- All consensus messages are valid. In real life this is guaranteed by verifiable message signatures. If a malicious or corrupted message is received, it won't be handled by the node.
- Exact timeouts (e.g. the timeout on waiting for a particular consensus message) are not included in the model. However, the model covers timeouts in general: a timeout is simply the possibility to perform a particular state transition.
- All consensus messages must eventually be delivered to all nodes, but the exact order of delivery isn't guaranteed.
- The maximum number of consensus rounds (views) is restricted. This constraint was introduced to reduce the number of possible model states to be checked. The threshold may be specified via model configuration, and it is highly recommended to keep this setting less than or equal to the number of consensus nodes.
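The last constraint boils down to a state predicate of roughly the following shape (a sketch continuing the one above; names are illustrative):

```tla
CONSTANT MaxView             \* the maximum view number explored by TLC

\* TLC only explores states in which no node has advanced past view MaxView.
MaxViewConstraint == \A r \in RM : rmState[r].view <= MaxView
```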
Here you can find the specification file and the TLC Model Checker launch configuration:
This is an experimental dBFT 2.0 specification that extends the basic model in the following way: besides the shared pool of consensus messages `msgs`, each consensus node has its own local pool of received and handled messages. Decisions on transitions between node states are taken by the node based on the state of its local message pool. This approach allows creating a more accurate, lower-level model which is extremely close to the dBFT implementation presented in this repository. At the same time, such an approach significantly increases the number of considered model states, which leads to abnormally long TLC Model Checker runs. Thus, we do not recommend using this model in development and keep it here as an example of an alternative (and more detailed) dBFT specification. These two models are expected to be equivalent in terms of the liveness locks that can be discovered by both of them, and, speaking the TLA⁺ language, the Extended dBFT specification implements the basic one (which can be proven and written in TLA⁺, but stays out of the task scope).
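A minimal sketch of the extra machinery this adds, assuming the base definitions from the basic model sketch (names are illustrative):

```tla
VARIABLE localMsgs           \* localMsgs[r] is the set of messages handled by r

\* Node r picks a not-yet-handled message from the shared pool and moves it
\* into its local pool; state transitions then inspect localMsgs[r] only.
RMReceiveMsg(r) ==
    \E m \in msgs \ localMsgs[r] :
        /\ localMsgs' = [localMsgs EXCEPT ![r] = localMsgs[r] \cup {m}]
        /\ UNCHANGED <<rmState, msgs>>
```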
Except for this remark and a couple of minor differences, all the basic model description, constraints and assumptions are valid for the Extended specification as well. Thus, we highly recommend considering the basic model before going to the Extended one.
Here you can find the specification file and the TLC Model Checker launch configuration:
Based on the liveness lock scenarios found by the TLC Model Checker in the basic dBFT 2.0 model, we've developed two extensions of the dBFT 2.0 protocol that avoid the liveness lock problem and preserve the safety properties of the algorithm. The extensions currently don't have any code-level implementation and are presented as TLA⁺ specifications ready to be reviewed and discussed. The improved protocol presented in the extensions will be referred to below as dBFT 2.1.
We've checked both dBFT 2.1 models with the TLC Model Checker against the same set of launch configurations that was used to reveal the liveness problems of the basic dBFT 2.0 model. The improved models have a larger set of states, so the TLC Model Checker wasn't able to finish the liveness checks for all possible states. However, the checks passed for a state graph diameter large enough to give us confidence that the presented models solve the dBFT 2.0 liveness lock problems.
Here and below we assume that `Commit` messages bear preparation hashes from all nodes that have sent a preparation message (>= `M` of them, but not including a whole `PrepareRequest`). This quickly synchronizes nodes still at the preparation stage when someone else collects enough preparations. At the same time, it prevents malicious/byzantine nodes from sending spoofed `Commit` messages. The `Commit` message size becomes a little bigger, but since it's just hashes, it still fits into a single packet in the vast majority of cases, so it doesn't really matter.
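Continuing the earlier base-model sketch, the enriched `Commit` could look roughly as follows; the record shape is an assumption for illustration, with hashes modelled simply as sender identities:

```tla
\* Node r commits once it has collected at least M preparations of its view,
\* and the Commit it sends carries the set of preparation senders.
RMSendCommit(r) ==
    LET v     == rmState[r].view
        preps == {m \in msgs : /\ m.view = v
                               /\ m.type \in {"PrepareRequest", "PrepareResponse"}}
    IN  /\ Cardinality(preps) >= M
        /\ rmState' = [rmState EXCEPT ![r].type = "commitSent"]
        /\ msgs' = msgs \cup {[rm |-> r, type |-> "Commit", view |-> v,
                               preparations |-> {m.rm : m \in preps}]}
```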
The basic idea of this model is to split the consensus process into three subsequent stages marked as `I`, `II` and `III` on the scheme. To perform a transition between two subsequent stages, each consensus node should wait for a set of messages from at least `M` consensus nodes, so that it's possible to complete a full picture of the neighbours' decisions in the current consensus round. In other words, no transition can happen unless we have `M` messages from the subsequent round; timers are only set up after we have this number of messages, just to wait for (potentially) a whole set of them. At the same time, each of the stages has its own `ChangeView[1,2,3]` message to exit to the next consensus round (view) if something goes wrong in the current one and there's definitely no ability to continue the consensus process in the current view. Below is a short description of each stage. Please refer to the model scheme and specification for further details.
Once initialized, the consensus node has two ways:
- Send its `PrepareRequest`/`PrepareResponse` message (and transmit to the `prepareSent` state).
- Decide to go to the next view on timeout or for any other valid reason (like a transaction missing from the node's mempool or a wrong proposal) via sending a `ChangeView1` message (and transmit to the `cv1` state).
This scheme is quite similar to the basic dBFT 2.0 model except for the new type of `ChangeView` message. After that the node enters stage `I` and waits for stage `I` consensus messages (`PrepareRequest`, `PrepareResponse` or `ChangeView1`) from at least `M` neighbours, which is needed to decide on the next actions.
The set of received messages can be arranged in the following way:
- `M` messages of `ChangeView1` type denote that `M` nodes have decided to change their view directly after initialization due to an invalid/missing `PrepareRequest`, which leads to immediate view changing. This is a "fail fast" route that is the same as in dBFT 2.0 for the widespread case of a missing primary. No additional delay is added, everything works as usual.
- `M` preparation messages (of type `PrepareRequest` or `PrepareResponse`) with `ChangeView3` missing denote that the majority of nodes have decided to commit, so the safe transition to stage `II` can be performed and a `Commit` message can safely be sent even if there's a `ChangeView1` message in the network. Notice that the `ChangeView3` check is just a protection against a node seriously lagging behind.
- `M` messages each of type `PrepareRequest`, `PrepareResponse` or `ChangeView1`, where at least one message is of type `ChangeView1`, denote that at least `M` nodes have reached stage `I` and the node can safely take further steps. The additional `|Commit| ≤ F ∨ |CV3| > 0` condition requires the majority of nodes not to have their `Commit` message sent, so that it's still possible to collect enough `ChangeView[2,3]` messages to change the view in further stages. If so, then the safe transition to stage `II` can be performed and a `ChangeView2` message can safely be sent (see the sketch after this list).
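Here's a hedged sketch of the last arrangement, reusing the base-model definitions (the operator name and state labels are illustrative):

```tla
\* M stage-I messages with at least one ChangeView1 allow sending ChangeView2,
\* guarded by the |Commit| <= F \/ |CV3| > 0 condition.
RMSendChangeView2(r) ==
    LET v       == rmState[r].view
        stageI  == {m \in msgs : /\ m.view = v
                                 /\ m.type \in {"PrepareRequest",
                                                "PrepareResponse", "ChangeView1"}}
        commits == {m \in msgs : m.view = v /\ m.type = "Commit"}
        cv3s    == {m \in msgs : m.view = v /\ m.type = "ChangeView3"}
    IN  /\ Cardinality(stageI) >= M
        /\ \E m \in stageI : m.type = "ChangeView1"
        /\ \/ Cardinality(commits) <= F
           \/ Cardinality(cv3s) > 0
        /\ rmState' = [rmState EXCEPT ![r].type = "cv2"]
        /\ msgs' = msgs \cup {[rm |-> r, type |-> "ChangeView2", view |-> v]}
```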
Once the node has its `Commit` or `ChangeView2` message sent, it enters stage `II` of the consensus process and waits for at least `M` stage `II` messages (`Commit` or `ChangeView2`) to perform the transition to the next stage. The set of accepted messages can be arranged in the following way:
- `M` messages of `ChangeView2` type denote that `M` nodes have decided to change their view directly after entering stage `II` due to a timeout while waiting for `Commit` messages, which leads to immediate view changing.
- `M` messages of type `Commit` denote that the majority of nodes have decided to commit, so the block can be accepted immediately without entering stage `III`. Notice that this is the regular flow of normal dBFT 2.0 consensus; it hasn't been changed and proceeds the way it did before.
- `M` messages each of type `Commit` or `ChangeView2`, where no more than `F` messages are of type `Commit`, denote that the majority of nodes have decided to change their view after entering stage `II` and there are not enough `Commit` messages to create the block (and produce a fork); thus, the safe transition to stage `III` can be performed and a `ChangeView3` message can safely be sent even if there's a `Commit` message in the network (see the sketch after this list).
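A hedged sketch of the last arrangement, again reusing the base definitions (illustrative names):

```tla
\* M stage-II messages with at most F Commits make it safe to send ChangeView3.
RMSendChangeView3(r) ==
    LET v       == rmState[r].view
        stageII == {m \in msgs : /\ m.view = v
                                 /\ m.type \in {"Commit", "ChangeView2"}}
        commits == {m \in stageII : m.type = "Commit"}
    IN  /\ Cardinality(stageII) >= M
        /\ Cardinality(commits) <= F
        /\ rmState' = [rmState EXCEPT ![r].type = "cv3"]
        /\ msgs' = msgs \cup {[rm |-> r, type |-> "ChangeView3", view |-> v]}
```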
In addition, a direct transition from the `cv2` state to the `commitSent` state is added for the case when it's clear that more than `F` nodes have decided to commit and no `ChangeView3` message has been received, which means it's possible to produce a block in the current view. This path in fact handles a corner case of missing stage `I` messages: the `Commit` messages prove that at least `M` preparation messages exist, but the node went the `cv2` path just because it missed some of them.
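A sketch of this escape transition under the same assumptions as the previous ones:

```tla
\* More than F Commits with no ChangeView3 prove that at least M preparations
\* exist in this view, so the node commits despite having missed some of them.
RMCV2ToCommit(r) ==
    LET v       == rmState[r].view
        commits == {m \in msgs : m.view = v /\ m.type = "Commit"}
        cv3s    == {m \in msgs : m.view = v /\ m.type = "ChangeView3"}
    IN  /\ rmState[r].type = "cv2"
        /\ Cardinality(commits) > F
        /\ cv3s = {}
        /\ rmState' = [rmState EXCEPT ![r].type = "commitSent"]
        /\ SendMsg(r, "Commit")
```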
Unlike the basic dBFT 2.0 model, where a consensus node locks in the commit phase, stage `III` gives the ability to escape from the commit phase via collecting a set of `M` `ChangeView3` messages. This stage is reachable as soon as at least `M` nodes have reached stage `II` (have `ChangeView2` or `Commit` messages sent) and there are not enough `Commit` messages (<= `F`) to accept the block. This stage is added to avoid the situation when a node is locked in the `commitSent` state whereas the rest of the nodes (> `F`) are willing to go to the next view.
Here's the scheme of transitions between consensus node states for the improved dBFT 2.1 stages-based model:
Here you can find the specification file and the basic TLC Model Checker launch configuration:
The improvement taken as a base for this model comes from the pBFT algorithm and is as follows: the consensus process is split into two stages with the following meaning:
- Stage `I` holds the node states from which it's allowed to transmit to the subsequent view under the assumption that a new proposal will be generated (as the basic dBFT 2.0 model does).
- Stage `II` holds the node states from which it's allowed to perform a view change preserving the proposal from the previous view.
Another vital difference from the basic dBFT 2.0 model is that view changes are performed by a node on the `DoCV[1,2]` command (consensus message) sent by the leader of the target view specified via the `DoCV[1,2]` parameters. Aside from the target view parameter, a `DoCV[1,2]` message contains the set of all related pre-received consensus messages, so that the receivers of `DoCV[1,2]` are able to check its validity before the subsequent view change.
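A rough sketch of the leader's side of this mechanism, where `IsPrimary` is a hypothetical helper checking whether a node leads the given view (the record shape and names are illustrative):

```tla
\* The leader of newView collects M ChangeView1 messages for it and broadcasts
\* DoCV1 carrying the target view plus the justifying messages, so receivers
\* can validate the command before actually changing their view.
RMSendDoCV1(r, newView) ==
    LET cv1s == {m \in msgs : m.type = "ChangeView1" /\ m.view = newView}
    IN  /\ IsPrimary(r, newView)    \* hypothetical primary check for newView
        /\ Cardinality(cv1s) >= M
        /\ msgs' = msgs \cup {[rm |-> r, type |-> "DoCV1", view |-> newView,
                               justification |-> cv1s]}
        /\ UNCHANGED rmState
```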
Below is a short description of the proposed consensus process. Please refer to the model scheme and specification for further details.
Once initialized at view `v`, the consensus node has two ways:
- Send its `PrepareRequest`/`PrepareResponse` message (and transmit to the `prepareSent` state).
- Decide to go to the next view `v+1` on timeout or for any other valid reason (like a transaction missing from the node's mempool) via sending a `ChangeView1(v+1)` message (and transmitting to the `cv1 to v'=v+1` state).
After that the node enters stage `I` and performs as follows:

- If the node has its `PrepareRequest` or `PrepareResponse` sent:
  - If at least `M` preparation messages (including its own) are collected, then it's clear that the majority has accepted the proposal for view `v` as valid, and the node can safely send the `Commit` message and transmit to phase `II` (the `commitSent` state).
  - If there are not enough preparation payloads received from the neighbours for a long time, then the node is allowed to transmit to stage `II` via sending its `ChangeView2` message (and changing its state to `cv2 to v'=v+1`). It denotes the node's desire to change view to the next one with the current proposal preserved.
- If the node entered the `cv1 to v'=v+1` state:
  - If there's a majority (>= `M`) of `ChangeView1(v+1)` messages and the node is primary in view `v+1`, then it should send the signal (a `DoCV1(v+1)` message) to the rest of the group to change their view to `v+1` with a new proposal generated. The rest of the group (backups in view `v+1` that have sent their `ChangeView1(v+1)` messages) should change their view on receiving `DoCV1(v+1)`.
  - If there's a majority (>= `M`) of `ChangeView1(v+1)` messages collected, but `DoCV1(v+1)` is missing for a long time, then the node is able to "skip" view `v+1` and send the `ChangeView1(v+2)` message, hoping that the primary of `v+2` will be fast enough to send the `DoCV1(v+2)` signal. The process can be repeated on timeout for view `v+3`, etc. (see the sketch after this list).
  - If more than `F` nodes have sent their preparation messages (and, consequently, announced their desire to transmit to stage `II` of the current view rather than to change view), then it's clear that there won't be more than `F` messages of type `ChangeView1` to perform a transition to the next view from stage `II`. Thus, the node is allowed to send its `ChangeView2` message (and change its state to `cv2 to v'=v+1`). Such a situation may happen if the node hasn't received the proposal in time (consider a valid proposal).
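A hedged sketch of the view-skipping step referenced in the list above; the `targetView` record field is a hypothetical stand-in for the view the node is currently asking to change to:

```tla
\* If M ChangeView1 messages for the target view are collected but the DoCV1
\* command never arrives, the node gives up on that view and asks for the
\* next one.
RMSkipView(r) ==
    LET v1   == rmState[r].targetView
        cv1s == {m \in msgs : m.type = "ChangeView1" /\ m.view = v1}
        docv == {m \in msgs : m.type = "DoCV1" /\ m.view = v1}
    IN  /\ rmState[r].type = "cv1"
        /\ Cardinality(cv1s) >= M
        /\ docv = {}                \* DoCV1(v1) never arrived, give up on v1
        /\ rmState' = [rmState EXCEPT ![r].targetView = v1 + 1]
        /\ msgs' = msgs \cup {[rm |-> r, type |-> "ChangeView1", view |-> v1 + 1]}
```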
Once the node has entered stage `II`, the proposal of the current round is considered valid. Depending on the node's state, the following decisions are possible:

- If the node has its `Commit` sent and is in the `commitSent` state:
  - If the majority (>= `M`) of `Commit` messages has been received, then the block may be safely accepted for the current proposal.
  - If there are not enough `Commit` messages for a long time, then it's legal to send the `ChangeView2(v+1)` message, transmit to the `cv2 to v'=v+1` state and decide to go to the next view `v+1`, preserving the current proposal and hoping that it will be possible to collect enough `Commit` messages for it in view `v+1`.
- If the node is in the `cv2 to v'=v+1` state, then:
  - If there's a majority (>= `M`) of `ChangeView2(v+1)` messages and the node is primary in view `v+1`, then it should send the signal (a `DoCV2(v+1)` message) to the rest of the group to change their view to `v+1` with the old proposal of view `v` preserved. The rest of the group (backups in view `v+1` that have sent their `ChangeView2(v+1)` messages) should change their view on receiving `DoCV2(v+1)`.
  - If there's a majority (>= `M`) of `ChangeView2(v+1)` messages collected, but `DoCV2(v+1)` is missing for a long time, then the node is able to "skip" view `v+1` and send the `ChangeView2(v+2)` message, hoping that the primary of `v+2` will be fast enough to send the `DoCV2(v+2)` signal. The process can be repeated on timeout for view `v+3`, etc.
  - Finally, if the node receives at least `M` messages from stage `I`, at most `F` of which are preparations (the blue dotted arrow from the `cv2 to v'=v+1` state to the `cv1 to v'=v+1` state), it has the ability to go back to the `cv1` state to start a new consensus round with a new proposal (see the sketch after this list). This case is kind of special: it allows escaping from the deadlock situation when the node is locked in the `cv2` state, unable to perform any further steps, whereas the rest of the network is waiting in the `cv1` state. Consider the case of a four-node network where the first node is permanently "dead", the primary has sent its `PrepareRequest` and went to the `cv2` state on timeout, and the remaining two nodes are waiting in the `cv1` state, unable to move further.
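A hedged sketch of the fallback transition referenced in the last item above (illustrative names):

```tla
\* The cv2 -> cv1 fallback (the blue dotted arrow): at least M stage-I
\* messages with at most F preparations among them mean a majority still
\* wants a fresh proposal, so the node restarts the round instead of staying
\* locked in cv2.
RMCV2ToCV1(r) ==
    LET v      == rmState[r].view
        stageI == {m \in msgs : /\ m.view = v
                                /\ m.type \in {"PrepareRequest",
                                               "PrepareResponse", "ChangeView1"}}
        preps  == {m \in stageI :
                      m.type \in {"PrepareRequest", "PrepareResponse"}}
    IN  /\ rmState[r].type = "cv2"
        /\ Cardinality(stageI) >= M
        /\ Cardinality(preps) <= F
        /\ rmState' = [rmState EXCEPT ![r].type = "cv1"]
        /\ UNCHANGED msgs
```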
It should be noted that "preserving the proposal of view v
in view v+1
" means
that the primary of view v+1
broadcasts the PrepareRequest
message at view v+1
that contains the same set of block's fields (transactions, timestamp, primary, etc) as
the PrepareRequest
proposed in the view v
has.
Here's the scheme of transitions between consensus node states for the improved dBFT 2.1 model with the centralized view changes process:
Here you can find the specification file and the basic TLC Model Checker launch configuration:
The Neo X chain uses the dBFT 2.0 algorithm as its consensus engine. As a part of the Neo X anti-MEV feature implementation, a dBFT 2.0 extension was designed to provide single-block finality for encrypted transactions (a.k.a. envelope transactions). Compared to dBFT 2.0, the MEV-resistant dBFT algorithm includes an additional post-`Commit` phase that consensus nodes must pass through before every block acceptance. This phase allows consensus nodes to exchange some additional data related to encrypted transactions and to the final state of the accepted block using a new type of consensus messages. The improved protocol based on dBFT 2.0 with an additional phase will be referred to below as MEV-resistant dBFT.
We've checked the MEV-resistant dBFT model with the TLC Model Checker against the same set of launch configurations that was used to reveal the liveness problems of the basic dBFT 2.0 model. The MEV-resistant dBFT model brings no extra problems to the protocol, but it has been proved that this model has exactly the same liveness bug as the original dBFT 2.0 model, which is expected.
This specification is an extension of the basic dBFT 2.0 model. Compared to the base model, the MEV-resistant dBFT specification additionally includes:
- A new message type `CommitAck` aimed to reflect an additional protocol message that should be sent by a resource manager if at least `M` `Commit` messages were collected by the node (which confirms the final acceptance of the so-called "PreBlock").
- A new resource manager state `commitAckSent` aimed to reflect the additional phase of the protocol needed for consensus nodes to exchange some data that was not available at the time of the first commit. This RM state represents a consensus node that has sent these additional post-commit data but has not accepted the final block yet.
- A new specification step `RMSendCommitAck` describing the transition between the `commitSent` and `commitAckSent` phases of the protocol, or, which is the same, the corresponding resource manager states. This step allows the resource manager to send a `CommitAck` message if at least `M` valid `Commit` messages are collected (see the sketch after this list).
- Adjusted behaviour of the `RMAcceptBlock` step: block acceptance is possible iff the node has sent the `CommitAck` message and there are at least `M` `CommitAck` messages collected by the node.
- Adjusted behaviour of "faulty" resource managers: malicious nodes are allowed to send a `CommitAck` message via the `RMFaultySendCommitAck` step.
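The step names below are the ones listed above; their bodies are hedged reconstructions reusing the base-model sketch definitions:

```tla
\* A node that has sent its Commit may send CommitAck once M Commits of its
\* view are collected, entering the new commitAckSent state.
RMSendCommitAck(r) ==
    LET v       == rmState[r].view
        commits == {m \in msgs : m.view = v /\ m.type = "Commit"}
    IN  /\ rmState[r].type = "commitSent"
        /\ Cardinality(commits) >= M
        /\ rmState' = [rmState EXCEPT ![r].type = "commitAckSent"]
        /\ msgs' = msgs \cup {[rm |-> r, type |-> "CommitAck", view |-> v]}

\* Block acceptance then additionally requires M CommitAck messages.
RMAcceptBlock(r) ==
    LET v    == rmState[r].view
        acks == {m \in msgs : m.view = v /\ m.type = "CommitAck"}
    IN  /\ rmState[r].type = "commitAckSent"
        /\ Cardinality(acks) >= M
        /\ rmState' = [rmState EXCEPT ![r].type = "blockAccepted"]
        /\ UNCHANGED msgs
```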
It should be noted that, in comparison with the dBFT 2.0 protocol, where the node is locked in the `commitSent` state until block acceptance, MEV-resistant dBFT does not allow accepting the block right after the `commitSent` state. Instead, it allows the node to move from the `commitSent` phase further to the `commitAckSent` state and locks the node in this state until block acceptance. No view change may be initiated or accepted by a node that has entered the `commitAckSent` state.
Here's the scheme of transitions between consensus node states for MEV-resistant dBFT algorithm:
Here you can find the specification file and the basic MEV-resistant dBFT TLC Model Checker launch configuration for the four "honest" consensus nodes scenario:
- Download and install the TLA⁺ Toolbox following the official guide.
- Read the brief introduction to the TLA⁺ language and TLC Model Checker at the official site.
- Download and take a look at the TLA⁺ cheat sheet.
- For more in-depth learning, watch the TLA⁺ Video Course and read the Specifying Systems book.
- Clone the repository.
- Open the TLA⁺ Toolbox, create a new specification and provide the path to the desired `*.tla` file that contains the specification description.
- Create a model named `AllGoodModel` in the TLA⁺ Toolbox.
- Copy the corresponding `*___AllGoodModel.launch` file to the `*.toolbox` folder. Reload/refresh the model in the TLA⁺ Toolbox.
- Open the `Model Overview` window in the TLA⁺ Toolbox and check that the behaviour specification, declared constants, invariants and properties of the model are filled in with some values.
- Press the `Run TLC on the model` button to start the model checking process and explore the progress in the `Model Checking Results` window.
It should be noted that all TLA⁺ specifications provided in this repo can be perfectly checked with the `MaxView` model constraint set to 1 for the four-node network scenario. Larger `MaxView` values produce too many behaviours to be checked, so the TLC Model Checker is likely to fail with OOM during the checking process. However, `MaxView` set to 1 is enough to check the model liveness properties for the four-node scenario, as there are two views to be checked in this case (0 and 1).
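For reference, such a setup could be written down as a plain TLC `.cfg` fragment along these lines (a sketch reusing names from the earlier sketches; the repository actually ships Toolbox `.launch` files):

```
\* TLC model configuration sketch for the four-node, MaxView = 1 scenario.
SPECIFICATION Spec
CONSTANTS
    RM = {0, 1, 2, 3}
    MaxView = 1
CONSTRAINT MaxViewConstraint
INVARIANT NoFork
PROPERTY TerminationRequirement
```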