Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

khepri_tx_adv: Prefer list prepending to build side effect lists #291

Merged
merged 2 commits into from
Aug 29, 2024

Conversation

the-mikedavis
Copy link
Member

@the-mikedavis the-mikedavis commented Aug 24, 2024

This is an optimization for functions in the transaction APIs. Specifically this should be very noticeable when any command from the khepri_tx or khepri_tx_adv API (delete, put, etc.) is used with a pattern that affects very many nodes (10,000 or more). The gist is that we can get surprisingly nice speed ups for some functions by building our side effect lists with [Effect | Effects] rather than ++/2.

For background: I have WIP changes to simplify the exchange deletion part of vhost deletion in RabbitMQ by using a transaction (commit). It deletes all exchanges and bindings in a single command without adding/removing runtime parameters and returns the combined rabbit_bindings:deletions() dict. It also changes the way we delete bindings in a transaction to use khepri_tx_adv:delete_many/1 (which returns a map of the deletions) rather than khepri_tx:get_many/1 plus khepri_tx:delete/1 for each binding. Against a stress test of deleting a vhost with 50,000 exchanges with no bindings or queues, I expected the deletion to be almost instant but rabbitmqctl delete_vhost myvhost took around 500s.

A flamegraph points to the BIF behind erlang:'++'/2 (and allocation functions associated with list appending) dominating the time taken to execute a khepri_tx_adv:delete_many/1 call in this block:

khepri/src/khepri_tx_adv.erl

Lines 997 to 1006 in be71afe

handle_state_for_call(Fun) ->
{State, SideEffects} = get_tx_state(),
case Fun(State) of
{NewState, Ret, NewSideEffects} ->
set_tx_state(NewState, SideEffects ++ NewSideEffects),
?raise_exception_if_any(Ret);
{NewState, Ret} ->
set_tx_state(NewState, SideEffects),
?raise_exception_if_any(Ret)
end.

Flamegraph...

Zoomed in on this function in particular:

flamegraph

perf.data.zip


That is most expensive when deleting bindings which is odd since this test vhost only has these 50,000 exchanges - no queues or bindings or anything else.

List ++ [] is not optimized in Erlang/OTP (see erlang/otp#8743): erlang:'++'/2 clones the left-hand-side unless it is [] which spends time and also introduces a lot of garbage. The reason that the reproduction case is slow is that we basically call List ++ [] twice for the non-existent bindings of each exchange we delete which is O(n^2) behavior (n being the number of exchanges) since cloning the list takes O(n) time and memory. List prepending is O(1) but even better we don't need to pay any cost when we don't end up prepending a new element.

Regardless of whether List ++ [] is optimized in Erlang/OTP we should switch to prepending: a large transaction that issues many individual commands will also cause many appends and the right-hand side of that append might be non-empty.

Reproduction steps in RabbitMQ...

With the RabbitMQ repository checked out at branch md/simplify-vhost-exchange-deletion or the linked commit above:

  1. Start a single-node broker with make run-broker.
  2. Add a blank vhost: sbin/rabbitmqctl add_vhost myvhost.
  3. Enable Khepri: sbin/rabbitmqctl enable_feature_flag --experimental khepri_db.
  4. Create many exchanges. For example [rabbit_exchange:declare({resource,<<"myvhost">>,exchange,erlang:list_to_binary(lists:flatten(io_lib:format("e~b", [N])))},topic,true,false,false,[],<<"rmq-internal">>) || N <- lists:seq(1,50_000)] in the the shell or an eval command.
  5. Delete the vhost: time sbin/rabbitmqctl delete_vhost myvhost --timeout 10000.

When the server is running with this branch the deletion takes around 12s. Before this change the deletion takes around 500s.


@the-mikedavis the-mikedavis added the enhancement New feature or request label Aug 24, 2024
@the-mikedavis the-mikedavis self-assigned this Aug 24, 2024
Copy link

codecov bot commented Aug 24, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.23%. Comparing base (be71afe) to head (b750381).
Report is 6 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #291      +/-   ##
==========================================
- Coverage   89.55%   89.23%   -0.33%     
==========================================
  Files          21       21              
  Lines        3130     3380     +250     
==========================================
+ Hits         2803     3016     +213     
- Misses        327      364      +37     
Flag Coverage Δ
erlang-25 88.49% <100.00%> (-0.20%) ⬇️
erlang-26 89.05% <100.00%> (-0.41%) ⬇️
erlang-27 88.99% <100.00%> (-0.44%) ⬇️
os-ubuntu-latest 89.20% <100.00%> (-0.36%) ⬇️
os-windows-latest 89.05% <100.00%> (-0.38%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@the-mikedavis the-mikedavis marked this pull request as ready for review August 24, 2024 14:28
@the-mikedavis the-mikedavis marked this pull request as draft August 25, 2024 20:49
Copy link
Member

@dumbbell dumbbell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great find!

I only have cosmetic comments.

src/khepri_machine.erl Outdated Show resolved Hide resolved
src/khepri_machine.erl Outdated Show resolved Hide resolved
src/khepri_machine.erl Outdated Show resolved Hide resolved
@the-mikedavis the-mikedavis force-pushed the md/optimize-transaction-functions branch from ea0ea26 to 23df91d Compare August 26, 2024 14:58
@the-mikedavis
Copy link
Member Author

Forgot to clean up and makes sure things were wrapped properly, whoops!

I've also added a test case to assert that we handle the effects in the proper order.

@the-mikedavis the-mikedavis marked this pull request as ready for review August 26, 2024 14:59
test/projections.erl Outdated Show resolved Hide resolved
src/khepri_machine.erl Outdated Show resolved Hide resolved
@the-mikedavis the-mikedavis force-pushed the md/optimize-transaction-functions branch from 23df91d to f565c8d Compare August 27, 2024 13:05
src/khepri_machine.erl Outdated Show resolved Hide resolved
In a transaction we build the set of `ra_machine:effect()`s for the
machine to handle by using `++/2` with the old and new effects for each
`khepri_tx`/`khepri_tx_adv` command. This is a noticeable bottleneck
though when there are many (thousands or more) commands used in a
transaction.

`++/2` works by prepending every element in the left-hand side list
(reversed) to the right-hand side list. As a transaction calls more
`khepri_tx`/ `khepri_tx_adv` commands, the left-hand side list (old side
effects) gets much longer than the right-hand side list (new side
effects) and time it takes to append increases.

We can avoid `++/2` by building the side effect list gradually and
prepend every effect. The ordering of effects within a command isn't
consequential: the effects update triggers and projections and each
update is disjoint from the others. The ordering of the full side effect
list is important though - a delete's effects should be handled before
a put for the same path pattern for example - so we need to reverse the
side effect list for the whole transaction before handling the effects.
The parent commit changes the way we build side effect lists within a
transaction to use prepends instead of '++'/2. This case asserts that
we still process the effects in order when it comes to projections. If
we process effects out of order projections might behave strangely. A
simple example is that a deletion of a path in a transaction followed by
a put with new data should result in a projection on that path pattern
reflecting the new data. If we handled side effects out of order then
the projection table would be missing the record for that path instead.
@the-mikedavis the-mikedavis force-pushed the md/optimize-transaction-functions branch from f565c8d to b750381 Compare August 29, 2024 13:45
@dumbbell dumbbell merged commit b802393 into main Aug 29, 2024
12 checks passed
@dumbbell dumbbell deleted the md/optimize-transaction-functions branch August 29, 2024 14:18
@dumbbell dumbbell added this to the v0.15.0 milestone Aug 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants