Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize lists:flatmap/2 for filtering #8749

Merged
merged 1 commit into from
Aug 28, 2024

Conversation

sabiwara
Copy link
Contributor

In case there is an interest, I'm proposing this optimization we added to elixir (elixir-lang/elixir#13793), since the implementation and the expected improvement are the same. Feel free to close if you feel it is unnecessary.

This could especially generate important speedups (1.5~3x) for cases where flatmap is used as a filtermap, like this or this, but should benefit any case which returns some amount of empty lists or size-1 lists.

Copy link
Contributor

github-actions bot commented Aug 27, 2024

CT Test Results

No tests were run for this PR. This is either because the build failed, or the PR is based on a branch without GH actions tests configured.

Results for commit 29bb2b4

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally.

Artifacts

// Erlang/OTP Github Action Bot

@bjorng bjorng self-assigned this Aug 27, 2024
@bjorng bjorng added team:VM Assigned to OTP team VM enhancement labels Aug 27, 2024
Copy link
Contributor

@bjorng bjorng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a worthwhile optimization.

I only have one minor comment; please amend the commit and force-push when you do the correction.

lib/stdlib/src/lists.erl Outdated Show resolved Hide resolved
@bjorng bjorng added the testing currently being tested, tag is used by OTP internal CI label Aug 27, 2024
@Maria-12648430
Copy link
Contributor

@bjorng @sabiwara could you explain to me why this change actually results in such a considerable (1.5-3x) speedup? The optimization cases, namely appending to an empty list vs passing through and appending to a singleton list vs consing the singleton, don't seem to explain this.

@jhogberg
Copy link
Contributor

BIF call overhead is considerable compared to consing or doing nothing at all.

@bjorng bjorng merged commit 5fa0be7 into erlang:master Aug 28, 2024
15 checks passed
@bjorng
Copy link
Contributor

bjorng commented Aug 28, 2024

I did some measurements using your version of append/1 compared to lists:append/1:

% erlperf --init_runner_all 'lists:duplicate(10_000, []).' 'r(L) -> lists:append(L).' 'r(L) -> t:append(L).'
Code                             ||        QPS       Time   Rel
r(L) -> t:append(L).              1      58535   17089 ns  100%
r(L) -> lists:append(L).          1       8179     122 us   14%
% erlperf --init_runner_all 'lists:duplicate(10_000, [x]).' 'r(L) -> lists:append(L).' 'r(L) -> t:append(L).'
Code                             ||        QPS       Time   Rel
r(L) -> t:append(L).              1      11444   87408 ns  100%
r(L) -> lists:append(L).          1       5311     188 us   46%
% erlperf --init_runner_all 'lists:duplicate(10_000, [x,y,z]).' 'r(L) -> lists:append(L).' 'r(L) -> t:append(L).'
Code                             ||        QPS       Time   Rel
r(L) -> lists:append(L).          1       3789     264 us  100%
r(L) -> t:append(L).              1       3716     269 us   98%

According to these measurements, your version is faster if all lists have 0 or 1 elements. As soon as most lists have more than one element, you will lose time by having those extra clauses.

@sabiwara sabiwara deleted the optimize-filtermap branch August 28, 2024 11:16
@Maria-12648430
Copy link
Contributor

Ah, I get it now 💡 The performance gain is most pronounced when the given Fun returns empty or singleton lists, and diminishes when it returns longer lists.

@Maria-12648430
Copy link
Contributor

@bjorng you beat me to it XD

@Maria-12648430
Copy link
Contributor

Maria-12648430 commented Aug 28, 2024

@bjorng that said... doesn't that mean that with the changes in this PR, lists:flatmap also got slower when the Fun returns lists with more than 1 element? 😅 That is, doesn't it thereby improve a special case (flatmap being used as filter/map/filtermap) at the expense of the general case?

@bjorng
Copy link
Contributor

bjorng commented Aug 28, 2024

@Maria-12648430

doesn't that mean that with the changes in this PR, lists:flatmap also got slower when the Fun returns lists with more than 1 element? 😅

Yes, it does. I think it's worth it because using flatmap for filtering is a common use case, and it only gets a little bit slower when the lists have more than one element. Another way to express that is that the gain for the filtering use case is much higher than the small loss for the general use case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement team:VM Assigned to OTP team VM testing currently being tested, tag is used by OTP internal CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants