Add basic backpressure mechanism if primary has too many in-flight transactions #5692

Merged: 14 commits, Sep 28, 2023

achamayou (Member) commented Sep 27, 2023

Resolve #3871 and #5690.

Added a consensus.max_uncommitted_tx_count configuration option, which caps the number of uncommitted transactions a primary allows before pushing back with HTTP errors. The /node frontend is exempt, to avoid hampering diagnostic attempts on a primary that may be stuck at the cap (for example because more than f nodes have died).
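
A minimal Python sketch of the check described above, as a toy model rather than the C++ change in this PR. `PrimaryState`, `admit`, the `>=` comparison, the 503 status, and treating a cap of 0 as "no cap" are assumptions made for illustration; only the option name and the /node exemption come from the description.

```python
from dataclasses import dataclass
from http import HTTPStatus


@dataclass
class PrimaryState:
    """Hypothetical view of a primary's ledger progress."""

    last_idx: int  # index of the latest transaction appended locally
    commit_idx: int  # index of the latest committed transaction
    max_uncommitted_tx_count: int  # assumption: 0 means "no cap"

    @property
    def uncommitted(self) -> int:
        return self.last_idx - self.commit_idx


def admit(state: PrimaryState, path: str) -> HTTPStatus:
    """Return the status a request to `path` would get in this toy model."""
    # /node endpoints stay reachable so a stuck primary can still be diagnosed.
    if path == "/node" or path.startswith("/node/"):
        return HTTPStatus.OK
    cap = state.max_uncommitted_tx_count
    # Assumption: requests are refused once the count reaches the cap.
    if cap > 0 and state.uncommitted >= cap:
        return HTTPStatus.SERVICE_UNAVAILABLE
    return HTTPStatus.OK
```

In this model a primary with `last_idx=110`, `commit_idx=100` and a cap of 10 would refuse an application request but still answer one under `/node/`.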

Working on the tests (see comments), but the change itself is ready to review.

achamayou changed the title from "Add basic backpressure mechanism if primary has too many in-flight tr…" to "Add basic backpressure mechanism if primary has too many in-flight transactions" Sep 27, 2023

achamayou (Member Author) commented:

The basicperf test also needs improving: it should at least exclude 5xx responses from the timing results, or perhaps run more iterations until the target is reached. Options:

Report error rate, fail if non-zero

Pros: safe, provided an exception is made for stop_primary_after_s
Cons: can't run a test that's close to the cap in CI

Report error rate, don't fail if non-zero

Pros: can run tests close to the cap in CI
Cons: hides fluctuations in error rate from cimetrics

Report error rates and plot them

Pros: can run tests close to the cap in CI
Cons: that's a lot of plots now
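
To make the first two options concrete, here is a small Python sketch, not the code in tests/infra/basicperf.py. `Sample`, `summarise` and the `fail_on_errors` flag are hypothetical names. It reports the 5xx error rate, keeps 5xx responses out of the timing results, and optionally fails when any error is seen.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    status: int  # HTTP status code of the response
    latency_ms: float  # request round-trip time in milliseconds


def summarise(samples: list[Sample], fail_on_errors: bool) -> dict[str, float]:
    """Report the 5xx error rate and the mean latency of non-5xx responses."""
    if not samples:
        raise ValueError("no samples to summarise")
    errors = [s for s in samples if s.status >= 500]
    # 5xx responses are excluded so fast rejections do not skew the timings.
    timed = [s.latency_ms for s in samples if s.status < 500]
    if fail_on_errors and errors:
        raise AssertionError(f"{len(errors)} of {len(samples)} requests returned 5xx")
    return {
        "error_rate": len(errors) / len(samples),
        "mean_latency_ms": mean(timed) if timed else float("nan"),
    }
```

With `fail_on_errors=True` this behaves like the first option; with `False` it behaves like the second, and the returned `error_rate` is what the third option would plot.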

ghost commented Sep 27, 2023

committable_threshold_backpressure@76588 aka 20230928.29 vs main ewma over 20 builds from 76284 to 76583


main

build_id build_number Commit latency factor tpcc_sgx_cft^ tpcc_sgx_cft_mem pi_basic_mt_sgx_cft^ pi_basic_mt_sgx_cft_mem pi_basic_mt_virtual_cft^ ls_sgx_cft^ ls_sgx_cft_mem pi_ls_sgx_cft^ pi_ls_sgx_cft_mem pi_basic_sgx_cft^ pi_basic_sgx_cft_mem pi_basic_js_sgx_cft^ pi_basic_js_sgx_cft_mem ls_jwt_sgx_cft^ ls_jwt_sgx_cft_mem pi_ls_jwt_sgx_cft^ pi_ls_jwt_sgx_cft_mem ls_js_sgx_cft^ ls_js_sgx_cft_mem ls_full_js_sgx_cft^ ls_full_js_sgx_cft_mem ls_js_jwt_sgx_cft^ ls_js_jwt_sgx_cft_mem tpcc_virtual_cft^ hist_sgx_cft^ ls_virtual_cft^ RB put (/s)^ CHAMP put (/s)^ RB get (/s)^ CHAMP get (/s)^ pi_ls_virtual_cft^ pi_basic_virtual_cft^ pi_basic_js_virtual_cft^ ls_jwt_virtual_cft^ pi_ls_jwt_virtual_cft^ ls_js_virtual_cft^ ls_full_js_virtual_cft^ ls_js_jwt_virtual_cft^
76284 20230922.1 0.795209 5633.54 8.59996e+07 27961.4 2.30851e+07 87501.2 14010.7 1.67936e+07 14106.8 1.05021e+07 15551.8 1.25993e+07 1446.7 1.25993e+07 6827.79 1.88908e+07 6927.2 6.30784e+06 5776.51 1.67936e+07 5763.08 1.67936e+07 4000.17 1.67936e+07 17301 42409.1 45670.3 824925 1.18035e+06 8.1735e+06 3.1513e+07 48224.4 54547.6 4451.7 17379.9 19636.7 15047.1 14992.1 9793.49
76298 20230922.7 0.800525 5594.3 8.59996e+07 27722.6 2.30851e+07 74808.3 14004.8 1.67936e+07 14048.4 1.05021e+07 15518.1 1.25993e+07 1428.6 1.25993e+07 6899.81 1.88908e+07 7081.9 6.30784e+06 5779.93 1.67936e+07 5468 1.67936e+07 3981.43 1.67936e+07 17323.7 44172.3 45640.5 825413 1.17652e+06 8.15043e+06 3.07277e+07 41875.2 54379 4437.9 17134.8 19126.7 17461.3 14816.7 9779.41
76315 20230922.13 0.80863 5538.6 8.59996e+07 27876.9 2.30851e+07 69448.7 13943 1.67936e+07 14077.1 1.05021e+07 15400.6 1.25993e+07 1435.8 1.25993e+07 6800.32 1.88908e+07 6885.7 6.30784e+06 5765.87 1.67936e+07 5441.2 1.67936e+07 3999.43 1.67936e+07 17303.6 42190.1 45826.4 832221 1.17806e+06 8.15602e+06 3.16548e+07 48277 54985.6 4473.6 17227.3 19660.3 17271 14960 9777.67
76330 20230922.19 0.858037 5615.85 8.59996e+07 28049.9 2.30851e+07 89163.8 14027.7 1.67936e+07 14054.2 1.05021e+07 15590.5 1.25993e+07 1442.4 1.25993e+07 6816.46 1.88908e+07 6925.9 6.30784e+06 5810.87 1.67936e+07 5790.79 1.67936e+07 4004.39 1.67936e+07 17306.3 40339.6 43745.9 828854 1.18105e+06 8.17118e+06 3.07715e+07 48548.6 54489 4421.3 16991.2 19476.1 17415.7 15098.1 9979.65
76345 20230925.1 0.797862 5580.25 8.59996e+07 28121 2.30851e+07 71294.3 13980.5 1.88908e+07 14091.4 1.05021e+07 15590.5 1.25993e+07 1438 1.25993e+07 6912.6 1.88908e+07 6932.1 6.30784e+06 5804.88 1.67936e+07 5769.73 1.67936e+07 4001.2 1.67936e+07 17379.9 44686.4 45613 840036 1.17118e+06 8.13205e+06 3.07304e+07 48236.5 54857.8 4423.1 17220.9 19323.7 17336.5 15000.3 9841.31
76354 20230925.5 0.777537 5658.2 8.59996e+07 27781.6 2.30851e+07 66649 14047.8 1.88908e+07 14132 1.05021e+07 15677.6 1.25993e+07 1448.6 1.25993e+07 6864.84 1.88908e+07 7153.1 6.30784e+06 5772.5 1.67936e+07 5787.49 1.67936e+07 3983.73 1.67936e+07 17275 44715.7 45840.2 829194 1.17874e+06 8.13641e+06 3.2025e+07 47774.9 55252.5 4454.5 16896 19510.3 17301.6 14893.3 9953.05
76364 20230925.8 0.794293 5660.35 8.59996e+07 27926.4 2.30851e+07 66865.4 14006.7 1.88908e+07 14101.4 1.05021e+07 15631 1.46964e+07 1441.2 1.25993e+07 6848.56 1.88908e+07 6926.3 6.30784e+06 5813.72 1.67936e+07 5762.48 1.67936e+07 3997.44 1.67936e+07 17152.3 36232.5 45670.8 831674 1.18443e+06 8.1528e+06 3.20571e+07 47631.7 55394.6 4444.5 17025.6 19263.4 17562.5 14890 10304
76401 20230925.21 0.778057 5618.43 8.59996e+07 28098.9 2.30851e+07 90296.9 14012.6 1.88908e+07 14100.3 1.05021e+07 15619.1 1.46964e+07 1437.8 1.25993e+07 6854.04 1.88908e+07 7073.1 6.30784e+06 5762.99 1.67936e+07 5777.3 1.67936e+07 3997.84 1.67936e+07 17284.6 46217.4 45791.4 815297 1.18394e+06 8.13712e+06 3.11029e+07 48390.8 55563.4 4425.9 17183.4 19213 17522.5 15052.4 9862.43
76413 20230926.1 0.809525 5615.12 8.59996e+07 27525.8 2.51822e+07 74534.1 14001.8 1.67936e+07 14121.4 1.05021e+07 15666.9 1.25993e+07 1439.4 1.25993e+07 6880.56 1.88908e+07 6945 6.30784e+06 5807.19 1.67936e+07 5735.29 1.88908e+07 4005.07 1.67936e+07 17088.7 44374.2 43810.5 830904 1.18316e+06 8.1553e+06 3.0742e+07 47540.7 54719.5 4400.7 17075.2 19481 17178.2 14729.2 10266.5
76422 20230926.5 0.79564 5613.41 8.59996e+07 28064.8 2.51822e+07 72164.4 14068.8 1.88908e+07 14153.7 1.05021e+07 15715.8 1.25993e+07 1449 1.25993e+07 6867.21 1.88908e+07 6970.9 6.30784e+06 5808.85 1.67936e+07 5786.28 1.67936e+07 4006.72 1.67936e+07 17377.5 43158.3 43901.8 830821 1.17914e+06 8.15517e+06 3.07102e+07 48144.5 54917.7 4422.7 17099.4 19656.8 16945.3 16940.7 9833.1
76435 20230926.8 0.823415 5627.76 8.59996e+07 28080.9 2.51822e+07 64658.1 14013 1.88908e+07 14100.3 1.05021e+07 15574.6 1.46964e+07 1442.5 1.25993e+07 6882.71 1.88908e+07 6978.2 6.30784e+06 5807.38 1.67936e+07 5475.2 1.67936e+07 4007.11 1.67936e+07 17284.5 44819.6 43652 825950 1.1811e+06 8.15293e+06 3.06803e+07 47840.3 55021.8 4433.6 17344.7 16708.4 17266.6 14926.9 9789.57
76438 20230926.10 0.835597 5576.8 8.59996e+07 27931.9 2.51822e+07 75846.6 14014.7 1.67936e+07 14064.8 1.05021e+07 15539.7 1.46964e+07 1427.9 1.25993e+07 6862.04 1.67936e+07 7064.2 6.30784e+06 5809.27 1.67936e+07 5727.33 1.67936e+07 4009.57 1.67936e+07 17155.3 46415 45799.6 824332 1.18245e+06 8.15494e+06 3.18087e+07 47629.9 55604.4 4434.8 17463.2 19661 16901.6 14842 9853.62
76439 20230926.11 0.793031 5596.84 8.59996e+07 27916.3 2.30851e+07 87121.1 14028.1 1.88908e+07 14188.3 1.05021e+07 15644.5 1.25993e+07 1441.2 1.25993e+07 6830.18 1.67936e+07 6935.2 6.30784e+06 5810.09 1.67936e+07 5487.23 1.88908e+07 4011.23 1.67936e+07 17371.4 42833.5 45538.3 832485 1.18416e+06 8.17108e+06 3.081e+07 48330.9 55660.6 4427.4 17134.9 19796.1 16934 16935 9858.85
76454 20230926.16 0.815815 5553.02 8.59996e+07 28221 2.30851e+07 65869.4 14020.3 1.88908e+07 14156.6 1.05021e+07 15565.7 1.25993e+07 1421 1.25993e+07 6870.1 1.88908e+07 6932 6.30784e+06 5804.52 1.67936e+07 5713.04 1.67936e+07 4005.46 1.67936e+07 17322.7 47071.2 45821.2 792004 1.17574e+06 8.15394e+06 3.15533e+07 47714.7 53470.9 4436.9 17347.5 19578.3 17671.1 15008.7 9991.23
76474 20230927.1 0.766724 5608.81 8.59996e+07 27575.6 2.30851e+07 69514.1 14057 1.88908e+07 14233.1 1.05021e+07 15637.3 1.46964e+07 1437 1.25993e+07 6861.92 1.67936e+07 6967.8 6.30784e+06 5795.6 1.67936e+07 5771.67 1.67936e+07 3999.03 1.67936e+07 17281.2 45630.4 45879.6 831956 1.17776e+06 8.15585e+06 3.07203e+07 47971.1 53380.4 4463 17307.8 19704 17087.1 14716.8 9919.16
76488 20230927.7 0.790189 5550.33 8.59996e+07 28284.8 2.30851e+07 63452.7 13976.3 1.88908e+07 14123.5 1.05021e+07 15542.1 1.25993e+07 1436.7 1.25993e+07 7237.01 1.67936e+07 6956.7 6.30784e+06 5812.98 1.67936e+07 5448.81 1.67936e+07 3998.52 1.67936e+07 17354.7 41180.1 45823.4 831215 1.18125e+06 8.14751e+06 3.09361e+07 47430.7 54927.3 4447.6 17260.7 19870.8 17176.1 14914.4 9796.07
76520 20230928.1 0.811302 5528.73 8.59996e+07 27622.7 2.30851e+07 65695.6 13953 1.88908e+07 13992.5 1.05021e+07 15250 1.46964e+07 1419.2 1.25993e+07 6863.35 1.88908e+07 6818.3 6.30784e+06 5785.81 1.67936e+07 5461.2 1.67936e+07 3978.65 1.67936e+07 17264.1 42824 45732.8 831666 1.18348e+06 8.15342e+06 3.10661e+07 47735.2 54858.6 4433.4 17130.3 19066.5 16924.4 16923.5 9805.85
76543 20230928.11 0.803093 5559.9 8.59996e+07 27915.4 2.51822e+07 72582.5 14011.7 1.88908e+07 13946.2 1.05021e+07 15425.8 1.46964e+07 1423.5 1.25993e+07 6834.74 1.88908e+07 6929.7 6.30784e+06 5778.19 1.67936e+07 5724.81 1.88908e+07 3988.81 1.67936e+07 17380.6 43403 45620.7 836597 1.17951e+06 8.15283e+06 3.14936e+07 47866.8 54588.7 4463.5 17525.7 19217.1 16974.1 16888.9 9918.21
76558 20230928.18 0.830452 5604.7 8.59996e+07 27885.8 2.30851e+07 88575.4 14047.9 1.88908e+07 14110.8 1.05021e+07 15503.8 1.25993e+07 1430.7 1.25993e+07 6843.02 1.67936e+07 6962.7 6.30784e+06 5769.09 1.67936e+07 5491.12 1.67936e+07 3993.7 1.67936e+07 17234 42154.9 45454.2 831276 1.18056e+06 8.13202e+06 3.02641e+07 47870 54772.5 4433.7 17341.5 19225.2 17098.4 14922.3 9863.52
76583 20230928.28 0.801501 5537.92 8.59996e+07 27848.7 2.51822e+07 65434.1 13999.1 1.88908e+07 14007.6 1.05021e+07 15378.9 1.46964e+07 1419.9 1.25993e+07 6870.19 1.88908e+07 6977.8 6.30784e+06 5781.9 1.67936e+07 5471.22 1.67936e+07 3964.63 1.67936e+07 17308 44707.7 45863.7 831168 1.17972e+06 8.14965e+06 3.15285e+07 47715.4 54659.8 4426.7 17042.4 19021.5 17126.6 16809.9 9867.79

committable_threshold_backpressure

build_id build_number pi_basic_mt_sgx_cft^ pi_basic_mt_sgx_cft_mem pi_basic_mt_virtual_cft^ Commit latency factor tpcc_virtual_cft^ ls_virtual_cft^ tpcc_sgx_cft^ tpcc_sgx_cft_mem pi_ls_virtual_cft^ pi_basic_virtual_cft^ pi_basic_js_virtual_cft^ ls_jwt_virtual_cft^ pi_ls_jwt_virtual_cft^ ls_sgx_cft^ ls_sgx_cft_mem ls_js_virtual_cft^ pi_ls_sgx_cft^ pi_ls_sgx_cft_mem ls_full_js_virtual_cft^ pi_basic_sgx_cft^ pi_basic_sgx_cft_mem ls_js_jwt_virtual_cft^ pi_basic_js_sgx_cft^ pi_basic_js_sgx_cft_mem ls_jwt_sgx_cft^ ls_jwt_sgx_cft_mem hist_sgx_cft^ pi_ls_jwt_sgx_cft^ pi_ls_jwt_sgx_cft_mem ls_js_sgx_cft^ ls_js_sgx_cft_mem ls_full_js_sgx_cft^ ls_full_js_sgx_cft_mem ls_js_jwt_sgx_cft^ ls_js_jwt_sgx_cft_mem RB put (/s)^ CHAMP put (/s)^ RB get (/s)^ CHAMP get (/s)^
76563 20230928.20 27781.6 2.51822e+07 62031 0.815659 17148.3 45874.9 5515.48 8.59996e+07 48460.2 52891.8 4432.5 16969.2 19713.8 13974.8 1.88908e+07 17164.7 14114.8 1.05021e+07 16869.3 15576 1.25993e+07 9976.01 1428.4 1.25993e+07 6864.78 1.88908e+07 45292.4 6938.7 6.30784e+06 5796.05 1.67936e+07 5780.52 1.67936e+07 4010.14 1.67936e+07 835747 1.18411e+06 8.16483e+06 3.07517e+07
76571 20230928.23 27640.8 2.30851e+07 72829.3 0.819568 17113.8 43844.3 5578.87 8.59996e+07 47677.6 54130.9 4443.1 17032.8 19152.7 14006.8 1.88908e+07 17220.5 14090.5 1.05021e+07 14995.5 15615.8 1.25993e+07 9948.58 1433.1 1.25993e+07 6856.92 1.88908e+07 45130.4 7107.3 6.30784e+06 5815.07 1.67936e+07 5737.67 1.67936e+07 3967.1 1.67936e+07 828110 1.17529e+06 8.1732e+06 3.14946e+07
76588 20230928.29 27975.9 2.51822e+07 89069.5 0.816237 17410 45797.8 5618.03 8.59996e+07 48192.8 55093.1 4474 17143.1 19406.2 14050.1 1.88908e+07 17152.5 14161.3 1.05021e+07 14773.5 15643.5 1.25993e+07 9886.98 1439.3 1.25993e+07 6915.84 1.88908e+07 46176.4 6923.9 6.30784e+06 5769.39 1.67936e+07 5771.4 1.67936e+07 3975.64 1.67936e+07 829807 1.18489e+06 8.13509e+06 3.07461e+07
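
For readers unfamiliar with the baseline used in this comparison, an exponentially weighted moving average (EWMA) can be computed as in the Python sketch below. The function name and the smoothing factor `alpha` are assumptions; the parameters actually used by the bot are not stated here.

```python
def ewma(values: list[float], alpha: float = 0.2) -> float:
    """Exponentially weighted moving average, oldest value first."""
    if not values:
        raise ValueError("need at least one value")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    avg = values[0]
    for value in values[1:]:
        # Each newer build pulls the average towards itself by a factor alpha.
        avg = alpha * value + (1 - alpha) * avg
    return avg


# tpcc_sgx_cft^ on main for builds 76543, 76558 and 76583, from the table above
baseline = ewma([5559.9, 5604.7, 5537.92])
branch = 5618.03  # same metric for build 76588 on the branch
print(f"branch vs main ewma: {branch / baseline - 1:+.2%}")
```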


achamayou (Member Author) commented Sep 27, 2023

Summary of discussion with @eddyashton:

  • max_uncommitted_tx_count in the consensus configuration
  • do not apply on backups
  • do not apply on the /node frontend, probably through an endpoint registry flag
  • do not set on perf tests, and make them strict about errors

achamayou (Member Author) commented Sep 28, 2023

The change seems to work as intended in performance runs:

[image: results from the performance runs]

Still to do:

  • add an end-to-end test that hits the cap deterministically
  • add an equivalent test showing that the /node frontend does not hit the cap

achamayou marked this pull request as ready for review September 28, 2023 10:39
achamayou requested a review from a team September 28, 2023 10:39
Resolved review threads:

  • src/consensus/aft/raft.h (outdated)
  • include/ccf/service/consensus_config.h (outdated)
  • doc/host_config_schema/cchost_config.json (outdated)
  • src/consensus/consensus_types.h
  • CHANGELOG.md (outdated)
achamayou added the auto-backport (Automatically backport this PR to LTS branch) and 4.x-todo (PRs which should be backported to 4.x) labels Sep 28, 2023
achamayou merged commit d1d9d8e into microsoft:main Sep 28, 2023
ghost commented Sep 28, 2023

💔 All backports failed

Status Branch Result
release/4.x Backport failed because of merge conflicts

You might need to backport the following PRs to release/4.x:
- Add CLI argument for SNP context directory (#5686)
- Fix lts_compatibility: Bump version check for --enclave-file arg (#5681)
- Pass enclave path as CLI argument rather than in configuration (#5665)

Manual backport

To create the backport manually run:

backport --pr 5692

Questions ?

Please refer to the Backport tool documentation and see the Github Action logs for details

achamayou (Member Author) commented:

💔 All backports failed

Status Branch Result
release/4.x Conflict resolution was aborted by the user

Manual backport

To create the backport manually run:

backport --pr 5692

Questions ?

Please refer to the Backport tool documentation

achamayou added a commit to achamayou/CCF that referenced this pull request Oct 18, 2023
…ansactions (microsoft#5692)

(cherry picked from commit d1d9d8e)

# Conflicts:
#	CHANGELOG.md
#	tests/infra/basicperf.py
#	tests/infra/remote.py
achamayou added the backported (This PR was successfully backported to LTS branch) label Oct 18, 2023
achamayou added a commit that referenced this pull request Oct 19, 2023
Successfully merging this pull request may close these issues.

Ability to reject requests if too many transactions pending