-
Notifications
You must be signed in to change notification settings - Fork 218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generate scenario from trace #5875
Generate scenario from trace #5875
Conversation
…amayou/CCF into generate_scenario_from_trace
generate_scenario_from_trace@80033 aka 20240109.7 vs main ewma over 20 builds from 79781 to 80028 Click to see tablemain
generate_scenario_from_trace
|
Some discoveries: Node with TLA+ state "None" never created in scenario causing
|
@lemmy apologies for editing the comment, but adding the collapsible blocks makes the PR much more readable. The first issue is that the mapping ignores
But The second issue is that we should use The third issue is that this performing complete replacement, not just a node addition, and that we don't have good way to do that in the driver now. |
Found another mismatch, before even getting to the reconfig: TLA Trace
The generated assert for the end of state 4 is for commit index 2 on n1:
But it is 4 in the implementation, since 4 is committable and the whole config ({n1}) is aware. Which now that I think about is exactly what #5798 uncovered. |
@lemmy I've fixed reconfiguration by tweaking |
Trace > execution > trace - spec to code and back :-)
root@codespaces-a47fa6:/workspaces/CCF/tla/consensus# TS=$(date +%s) && \
../tlc.sh -note SIMccfraft.tla -simulate num=1 -dumptrace json SIMccfraft-$TS.json > SIMccfraft-$TS.out ; \
python3 ../trace2scen.py SIMccfraft-$TS.json > SIMccfraft-$TS.rt ; \
python3 ../raft_scenarios_runner.py ../../build/raft_driver SIMccfraft-$TS.rt | tee >(grep '"tag":"raft_trace"' > SIMccraft-$TS.ndjson) && \
JSON=SIMccraft-$TS.ndjson ../tlc.sh -note Traceccfraft.tla diff --git a/tla/consensus/SIMccfraft.cfg b/tla/consensus/SIMccfraft.cfg
index b920aaba8..e4375a917 100644
--- a/tla/consensus/SIMccfraft.cfg
+++ b/tla/consensus/SIMccfraft.cfg
@@ -90,3 +90,4 @@ INVARIANTS
\* DebugInvSuccessfulCommitAfterReconfig
\* DebugInvAllMessagesProcessable
\* DebugInvRetirementReachable
+ DebugUpToDepth
diff --git a/tla/consensus/SIMccfraft.tla b/tla/consensus/SIMccfraft.tla
index bfacc4e7f..e14af7924 100644
--- a/tla/consensus/SIMccfraft.tla
+++ b/tla/consensus/SIMccfraft.tla
@@ -39,7 +39,7 @@ SIMNext ==
[] OTHER -> IF ENABLED Forward THEN Forward ELSE Next
SIMSpec ==
- Init /\ [][SIMNext]_vars
+ Init /\ [][Next]_vars \* Next and not SIMNext to get proper action names & params.
\* The state constraint StopAfter stops TLC after the alloted
\* time budget is up, unless TLC encounteres an error first.
@@ -48,6 +48,8 @@ StopAfter ==
(* The smoke test has a time budget of 20 minutes. *)
IN TLCSet("exit", TLCGet("duration") > timeout)
+DebugUpToDepth ==
+ TLCGet("level") < TLCGet("config").depth
=============================================================================
------------------------------- MODULE SIMPostCondition ------------------------------- diff --git a/tla/trace2scen.py b/tla/trace2scen.py
index d4e9ddb66..2811a5c09 100644
--- a/tla/trace2scen.py
+++ b/tla/trace2scen.py
@@ -32,6 +32,12 @@ MAP = {
"RejectAppendEntriesRequest": lambda _, __, ___: ["# Noop"],
"ReturnToFollowerState": lambda _, __, ___: ["# Noop"],
"AppendEntriesAlreadyDone": lambda _, __, ___: ["# Noop"],
+ "RcvUpdateTerm": lambda _, __, ___: ["# Noop"],
+ "RcvAppendEntriesRequest": lambda _, __, ___: ["# Noop"],
+ "RcvAppendEntriesResponse": lambda _, __, ___: ["# Noop"],
+ "RcvDropIgnoredMessage": lambda _, __, ___: ["# Noop"],
+ "RcvRequestVoteRequest": lambda _, __, ___: ["# Noop"],
+ "RcvRequestVoteResponse": lambda _, __, ___: ["# Noop"],
}
def post_commit(post): |
This (syntactic) change (now part of #5880) causes diff --git a/tla/consensus/ccfraft.tla b/tla/consensus/ccfraft.tla
index 7d5fae84c..af1eefd69 100644
--- a/tla/consensus/ccfraft.tla
+++ b/tla/consensus/ccfraft.tla
@@ -635,6 +635,8 @@ SignCommittableMessages(i) ==
ChangeConfigurationInt(i, newConfiguration) ==
\* Only leader can propose changes
/\ state[i] = Leader
+ \* Configuration is not a previously removed configuration.
+ /\ newConfiguration \notin removedFromConfiguration
\* Configuration is non empty
/\ newConfiguration /= {}
\* Configuration is a proper subset of the Servers
@@ -657,7 +659,7 @@ ChangeConfigurationInt(i, newConfiguration) ==
leaderVars, commitIndex, committableIndices>>
ChangeConfiguration(i) ==
- \E newConfiguration \in SUBSET(Servers \ removedFromConfiguration) :
+ \E newConfiguration \in SUBSET(Servers) :
ChangeConfigurationInt(i, newConfiguration)
\* Leader i advances its commitIndex to the next possible Index. |
What are the use cases for lockstep debugging (ccfraft.tla > raft.h > Traececcfraft.tla) of the spec and the impl (via
|
@lemmy thank you for making the change in #5880, I've update the conversion, and also added the missing actions you spotted when running the sim. It's a little annoying that the default SIM traces don't work, and that changing the Next is necessary, but the improved coverage is desirable.
raft_driver is single threaded, nodes are separate objects but all execution is serialised.
Agreed, though as discussed before the new year, making the driver able to reverse the state of each node is probably achievable: nearly all of it already supports trivial json ser/deser. We would also need to put the messages back on the queues, and restore time elapsed. I don't know that we want to go much further before we complete #5822 at least. For now, the intermediate result this gives us is: "A validation trace can (mostly) be fed back to the driver for investigation". |
Can we get traces from the existing (matching) scenarios, and see what scenarios this generates from them? |
We have only two scenarios compatible with the new bootstrapping logic. @achamayou does it make sense to port all deprecated scenarios? |
I've been trying to do that (#5839), but have not found enough time so far. The traces won't match, because all information related to message dispatch and periodic/time elapsed is lost right now. I don't think we can get them to match without substantially reworking either the spec or the driver, as discussed. |
Very basic script, using functionality introduced by tlaplus/tlaplus#853, try and translate a trace to a raft scenario, for the purpose of executing counter-examples on the implementation.
Example run, using the following spec break:
It's not straightforward to do this correctly, because the spec is set up differently from the driver: individual message delivery and consumption is contained in steps, whereas the driver has a separate dispatch action. This difference in granularity means that we probably can't reproduce every scenario precisely, but we can get quite close most of the time still.