Recipe: PMDB Coalesced Write Discard Old Leader Uncommitted Writes
- pumicedb-client-test
Create a scenario in which the PMDB client's write requests expire because the leader is paused.
Store the leader-uuid, term and commit-idx.
Select the leader to pause. Before pausing it, enable the coalesced_writes fault injection on that leader so that the write requests the client tries to send will expire. Apply the following fault to the leader that will be paused:
APPLY enabled@true
WHERE /fault_injection_points/name@coalesced_writes
OUTFILE /err.out
After applying the fault, ensure that /fault_injection_points/coalesced_writes/enabled == true
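Every command in this recipe uses the same three-line APPLY / WHERE / OUTFILE layout, so a recipe can build the text with one small helper. The following is a minimal sketch; `build_ctl_cmd` is a hypothetical name, and how the resulting text is delivered to the server or client process is outside the scope of this example.

```python
def build_ctl_cmd(apply: str, where: str, outfile: str) -> str:
    """Return command text in the APPLY / WHERE / OUTFILE layout used in this recipe.

    Hypothetical helper: it only formats the text and does not submit it.
    """
    return f"APPLY {apply}\nWHERE {where}\nOUTFILE {outfile}\n"


# Command that enables the coalesced_writes fault on the leader to be paused.
fault_cmd = build_ctl_cmd(
    "enabled@true",
    "/fault_injection_points/name@coalesced_writes",
    "/err.out",
)
print(fault_cmd, end="")
```

The same helper covers the timeout and write commands later in the recipe by changing only the three arguments.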
The pumicedb-client-test should be newly spawned in this recipe. The recipe should be willing to wait a few seconds for the client to report "leader-viable" : true.
{
"raft_client_root_entry" : [
{
"raft-uuid" : "3f266232-fde4-11ea-86f8-90324b2d1e89",
"client-uuid" : "3f28d148-fde4-11ea-9c5f-90324b2d1e89",
"leader-uuid" : "3f27d9fa-fde4-11ea-a172-90324b2d1e89",
"state" : "client",
"commit-latency-msec" : {},
"read-latency-msec" : {},
"leader-viable" : true,
"leader-alive-cnt" : 82,
"last-request-sent" : "Thu Jan 01 00:00:00 UTC 1970",
"last-request-ack" : "Thu Jan 01 00:00:00 UTC 1970",
"recent-ops-wr" : [],
"recent-ops-rd" : []
}
]
}
Leader-viable tells the recipe that the client has been in recent contact with the leader - it is a general indicator of health.
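The wait for "leader-viable" can be written as a bounded polling loop. This is a sketch under stated assumptions: `read_state` is a hypothetical callable that returns the client's JSON output (as in the sample above) already parsed into a dict, and the 5 second limit and 0.5 second interval are illustrative values, not recipe requirements.

```python
import time
from typing import Callable


def wait_for_leader_viable(
    read_state: Callable[[], dict],
    timeout_sec: float = 5.0,
    interval_sec: float = 0.5,
) -> bool:
    """Poll until the client reports "leader-viable": true or the timeout elapses.

    read_state is a hypothetical callable supplied by the recipe; it must
    return the parsed client JSON shown in the sample above.
    """
    deadline = time.monotonic() + timeout_sec
    while True:
        entry = read_state()["raft_client_root_entry"][0]
        if entry.get("leader-viable") is True:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_sec)
```

Returning False on timeout, rather than raising, leaves the recipe free to decide how to report the failure.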
After the client has been started, lower the request timeout from the default to a more test-friendly value, such as 1 second:
APPLY default-request-timeout-sec@1
WHERE /raft_client_root_entry/default-request-timeout-sec
OUTFILE /err.out
Verify that the default-request-timeout-sec key has been modified accordingly.
{
"raft_client_root_entry" : [
{
"raft-uuid" : "7cbcf2fc-f522-11ea-8890-90324b2d1e89",
"client-uuid" : "7cbeb470-f522-11ea-bfde-90324b2d1e89",
"leader-uuid" : "7cbe3810-f522-11ea-bae4-90324b2d1e89",
"state" : "client",
"default-request-timeout-sec" : 1,
"commit-latency-msec" : {},
"read-latency-msec" : {},
"leader-viable" : false,
"leader-alive-cnt" : 0,
"last-request-sent" : "Thu Jan 01 00:00:00 UTC 1970",
"last-request-ack" : "Thu Jan 01 00:00:00 UTC 1970",
"recent-ops-wr" : [],
"recent-ops-rd" : []
}
]
}
Issue a write command from the client (the recipe should generate its own UUID to replace the one below):
APPLY input@00000000-ffff-ffff-ffff-ffffffffffff:0:0:0:0.write:1
WHERE /pumice_db_test_client/input
OUTFILE /pmdb-write.out
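Since the recipe must supply its own UUID, the APPLY line for the write can be generated instead of hard-coded. The sketch below keeps the `:0:0:0:0.write:1` suffix exactly as it appears above; `build_write_input` is a hypothetical helper, and the use of `uuid.uuid1()` is an assumption based on the time-based look of the sample UUIDs in this recipe.

```python
import uuid


def build_write_input(app_uuid: uuid.UUID) -> str:
    """Return the APPLY argument for a write, using the suffix shown in the recipe."""
    return f"input@{app_uuid}:0:0:0:0.write:1"


# Generate a fresh application UUID for this write (assumption: any valid UUID works).
app_uuid1 = uuid.uuid1()
apply_arg = build_write_input(app_uuid1)
print(f"APPLY {apply_arg}")
```

Keeping `app_uuid1` in a variable lets later steps look up this write by UUID in recent-ops-wr.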
Wait at least 2 seconds, which should be enough time for the request to expire.
Set election-timeout-ms to 2:
APPLY election-timeout-ms@2
WHERE /raft_net_info/election-timeout-ms
OUTFILE /err.out
Verify that the write request has timed out:
/raft_client_root_entry/0/recent-ops-wr/*/status : "Connection timed out"
"raft_client_root_entry" : [
{
"raft-uuid" : "b4deecc6-1b82-11ec-bdbd-8761d6acdca6",
"client-uuid" : "b569722e-1b82-11ec-b8fa-cb98d01e6c03",
"leader-uuid" : "b54052ea-1b82-11ec-bb9d-139055d36d46",
"state" : "client",
"default-request-timeout-sec" : 1,
"commit-latency-msec" : {},
"read-latency-msec" : {},
"leader-viable" : true,
"leader-alive-cnt" : 7,
"last-request-sent" : "Wed Sep 22 08:56:19 UTC 2021",
"last-request-ack" : "Thu Jan 01 00:00:00 UTC 1970",
"recent-ops-wr" : [
{
"sub-app-user-id" : "cea81092-1b82-11ec-aaa3-5fc18afac25c:0:0:0:0",
"rpc-id" : 6468388818436227143,
"rpc-user-tag" : 1111036264,
"blocking" : false,
"status" : "Connection timed out",
"server" : "0.0.0.0:0",
"submitted" : "Wed Sep 22 08:56:19 UTC 2021",
"attempts" : 1,
"completion-time-ms" : 0,
"timeout-ms" : 1000,
"reply-size" : 0,
"op" : "write"
}
],
"recent-ops-rd" : [],
"pending-ops" : []
}
],
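The wildcard check on recent-ops-wr status can be expressed as a small predicate over the parsed client output. This is a sketch: `all_writes_timed_out` is a hypothetical name, and it assumes the client JSON has already been loaded into a dict with the structure shown above.

```python
def all_writes_timed_out(client_state: dict) -> bool:
    """Return True if every entry in recent-ops-wr has status "Connection timed out".

    An empty recent-ops-wr list returns False, since no write was recorded.
    """
    ops = client_state["raft_client_root_entry"][0]["recent-ops-wr"]
    return bool(ops) and all(op["status"] == "Connection timed out" for op in ops)


# Trimmed-down version of the sample output above.
sample = {
    "raft_client_root_entry": [
        {
            "recent-ops-wr": [
                {
                    "sub-app-user-id": "cea81092-1b82-11ec-aaa3-5fc18afac25c:0:0:0:0",
                    "status": "Connection timed out",
                }
            ]
        }
    ]
}
print(all_writes_timed_out(sample))
```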
/raft_root_entry/0/leader-uuid != <original-leader> (the leader that was paused in step 5)
Reset the client request timeout to its default of 60 seconds:
APPLY default-request-timeout-sec@60
WHERE /raft_client_root_entry/default-request-timeout-sec
OUTFILE /err.out
Issue a write command from the client (the recipe should generate a new UUID to replace the one below):
APPLY input@00000000-ffff-ffff-ffff-ffffffffffff:0:0:0:0.write:1
WHERE /pumice_db_test_client/input
OUTFILE /pmdb-write.out
The old leader (the one that was paused earlier) should now become a follower:
/raft_root_entry/0/state : "follower"
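The two server-side checks in this recipe (a leader-uuid that differs from the original leader, and the old leader reporting "follower") can be combined into one predicate. This is only a sketch: `old_leader_stepped_down` is a hypothetical name, and it assumes both keys are read from the old leader's own raft_root_entry output after it has been parsed into a dict.

```python
def old_leader_stepped_down(old_leader_state: dict, original_leader_uuid: str) -> bool:
    """Return True if the old leader is now a follower and sees a different leader.

    Assumption: old_leader_state is the parsed raft_root_entry output of the
    server that was paused earlier in the recipe.
    """
    entry = old_leader_state["raft_root_entry"][0]
    return entry["state"] == "follower" and entry["leader-uuid"] != original_leader_uuid
```

The `original_leader_uuid` argument is the leader-uuid stored at the start of the recipe.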
Verify that the write with app-uuid2 reports "Success" and the write with app-uuid1 reports "Connection timed out":
"raft_client_root_entry" : [
{
"raft-uuid" : "b4deecc6-1b82-11ec-bdbd-8761d6acdca6",
"client-uuid" : "b569722e-1b82-11ec-b8fa-cb98d01e6c03",
"leader-uuid" : "b54052ea-1b82-11ec-bb9d-139055d36d46",
"state" : "client",
"default-request-timeout-sec" : 60,
"commit-latency-msec" : {
"8" : 1
},
"read-latency-msec" : {},
"leader-viable" : true,
"leader-alive-cnt" : 5,
"last-request-sent" : "Wed Sep 22 08:56:29 UTC 2021",
"last-request-ack" : "Wed Sep 22 08:56:29 UTC 2021",
"recent-ops-wr" : [
{
"sub-app-user-id" : "f9a8cac0-1b82-11ec-a328-c7f169c767bd:0:0:0:0",
"rpc-id" : 6468388818436227172,
"rpc-user-tag" : 1227719982,
"blocking" : false,
"status" : "Success",
"server" : "127.0.0.1:12000",
"submitted" : "Wed Sep 22 08:56:29 UTC 2021",
"attempts" : 1,
"completion-time-ms" : 12,
"timeout-ms" : 60000,
"reply-size" : 88,
"op" : "write"
},
{
"sub-app-user-id" : "cea81092-1b82-11ec-aaa3-5fc18afac25c:0:0:0:0",
"rpc-id" : 6468388818436227143,
"rpc-user-tag" : 1111036264,
"blocking" : false,
"status" : "Connection timed out",
"server" : "0.0.0.0:0",
"submitted" : "Wed Sep 22 08:56:19 UTC 2021",
"attempts" : 1,
"completion-time-ms" : 0,
"timeout-ms" : 1000,
"reply-size" : 0,
"op" : "write"
}
],
"recent-ops-rd" : [],
"pending-ops" : []
}
],
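The final check pairs each application UUID with its expected status. The sketch below assumes the client output has been parsed into a dict and that the UUID is the part of sub-app-user-id before the first colon, as in the sample above; `status_by_app_uuid` and `verify_final_writes` are hypothetical names.

```python
def status_by_app_uuid(client_state: dict) -> dict:
    """Map the UUID prefix of each sub-app-user-id in recent-ops-wr to its status."""
    ops = client_state["raft_client_root_entry"][0]["recent-ops-wr"]
    return {op["sub-app-user-id"].split(":", 1)[0]: op["status"] for op in ops}


def verify_final_writes(client_state: dict, app_uuid1: str, app_uuid2: str) -> bool:
    """Return True if app-uuid1 timed out and app-uuid2 succeeded."""
    statuses = status_by_app_uuid(client_state)
    return (
        statuses.get(app_uuid1) == "Connection timed out"
        and statuses.get(app_uuid2) == "Success"
    )


# Trimmed-down version of the sample output above.
final_state = {
    "raft_client_root_entry": [
        {
            "recent-ops-wr": [
                {
                    "sub-app-user-id": "f9a8cac0-1b82-11ec-a328-c7f169c767bd:0:0:0:0",
                    "status": "Success",
                },
                {
                    "sub-app-user-id": "cea81092-1b82-11ec-aaa3-5fc18afac25c:0:0:0:0",
                    "status": "Connection timed out",
                },
            ]
        }
    ]
}
print(
    verify_final_writes(
        final_state,
        app_uuid1="cea81092-1b82-11ec-aaa3-5fc18afac25c",
        app_uuid2="f9a8cac0-1b82-11ec-a328-c7f169c767bd",
    )
)
```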
Reset the election timeout to its default of 300 ms:
APPLY election-timeout-ms@300
WHERE /raft_net_info/election-timeout-ms
OUTFILE /err.out