-
Notifications
You must be signed in to change notification settings - Fork 179
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IntMap should offer lookup variant for likely successful lookup #794
Comments
This sounds like an excellent idea. Could you *briefly* raise the question
of naming on the libraries list? It's even possible we'd want to call it
`lookup`, and name the fast-fail version something else. Did you test for
how bad this version is in an often-failing situation?
…On Sun, Aug 15, 2021, 11:17 PM Callan McGill ***@***.***> wrote:
Currently lookup is defined to fail early by checking if the prefix mask
agrees with the key:
lookup :: Key -> IntMap a -> Maybe a
lookup !k = go
where
go (Bin p m l r) | nomatch k p m = Nothing
| zero k m = go l
| otherwise = go r
go (Tip kx x) | k == kx = Just x
| otherwise = Nothing
go Nil = Nothing
In the original paper they note that if we expect our lookups to be
successful then we should instead prioritize testing if the kth bit is zero
and failing only when we reach leaves. If I understand correctly then this
would just be the following code:
lookupSucc :: Key -> IntMap a -> Maybe a
lookupSucc !k = go
where
go (Bin _p m l r)
| zero k m = go l
| otherwise = go r
go (Tip kx x) | k == kx = Just x
| otherwise = Nothing
go Nil = Nothing
Is that correct? It seems to pass the tests in any case. This can
certainly lead to large speed ups when we know we are doing many successful
lookups. For example, in the lookup benchmark we get the following (this is
a best case scenario as the benchmark does not test a map with misses which
seems like a a bad design):
benchmarked lookup
time 172.2 μs (170.3 μs .. 174.1 μs)
0.998 R² (0.994 R² .. 1.000 R²)
mean 176.6 μs (175.3 μs .. 181.2 μs)
std dev 7.148 μs (2.786 μs .. 15.37 μs)
benchmarked lookupSucc
time 153.0 μs (151.6 μs .. 154.0 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 152.2 μs (151.7 μs .. 152.7 μs)
std dev 1.783 μs (1.306 μs .. 2.362 μs)
Even if this is not quite the correct implementation, containers should
consider offering this sort of function to allow users to tune their code
to their needs as both variants are worthwhile.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#794>, or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAOOF7IQ4E6T2EOIX6TK47TT5B7LNANCNFSM5CG35XSQ>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&utm_campaign=notification-email>
.
|
Here is how it looks on the original benchmark, the same benchmark with 50% hits, with 90% hits and with all failures:
I can definitely ask about naming on the libraries list tomorrow, yes! |
I can't see any good reason that |
Agreed regarding `find`, but we need to settle the safe versions too.
…On Sun, Aug 15, 2021, 11:45 PM Callan McGill ***@***.***> wrote:
I can't see any good reason that find should not be this variant since it
otherwise throws an error.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#794 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAOOF7LMAHWLUILV5BEHK3TT5CCVLANCNFSM5CG35XSQ>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&utm_campaign=notification-email>
.
|
A name that does not suggest success is P.S.: In Agda, we do not use |
|
Could y'all also please comment on whether you think |
Maybe lookup should get a |
The usual dilemma: (1) some apps out there might have been tested to perform well with the present fast-failing In the absence of real-world data, I can only recommend the conservative path: Leave |
We might be overthinking this. The performance difference isn't so great that this change would be worse to some than other likely-optimizations that in the past were simply introduced in prior versions. So making a change should be fine I believe. This does not help us in deciding whether there to make that change or not. Gut feeling says yes: it makes the run time of the function more uniform in general. |
How did you set up the “all failures” test? I expect that an “all failures (but uniform distribution of keys)” will behave very different from “all failures (but existing keys are far away from the queries)”. In fact, I expect for usages where the keys of the failed lookups are “within” the range of the existing values, the new code is actually faster for the failed lookups too. That makes me wonder if it is justifiable to just use new code always? |
@nomeata I did not do anything sensible. The current benchmark has keys |
@nomeata Your suspicion was correct, if we use lookup on lots of near-misses then the new version is faster. |
@sjakobi When making these changes it would be nice to know how it impacts GHC as a real-world consumer of IntMap, do you have any advice on how to do that? |
JFTR, conversation has continued at #800 |
in issue haskell/containers#794 and PR haskell/containers#800 an alternative implementation of `lookup` is proposed which checks if the key is as expected only when reaching the leaf. This means less branching in the success case or when the lookup key shares long prefixes with existing values, but may do more work when the lookup key is far away from existing keys. Not saying that it’s particularly doubtful that the changed code is correct, but since we have the machinery, why not verify it? So here we go, and yes, it goes through. I don’t think we need to merge the PR like this, and can probably wait for the code to reach a containers release, and then include it there. Also, this PR currently contains a bunch of local hacks that I used to get going again (now that I use NixOS), which need to be removed, merged separately, or made obsolete by better changes to the setup.
Yes, that would be very good to check! Unfortunately I'm not aware of any good documentation on doing this – although it should definitely exist, either in the GHC wiki or even in this repo. The basic process is that you update the I've previously done a tiny bit of testing for #340: See #340 (comment) @doyougnu has done more testing and may have better advice: #340 (comment) |
allocations will be useless here, I expect, as (laziness aside) both functions should allocate the same (the return Just/Nothing). Also make sure to change |
Ben Gamari gave me some ideas for what to do in order to measure a cabal build, just waiting on GHC to re-build. |
You can read through the commits in this branch to get a step-by-step guide on altering the boot libs (although I'm adding two boot libs in that branch). Instead of adding a boot lib you would just be changing its You'll want to mimic this commit: https://gitlab.haskell.org/doyougnu/ghc/-/commit/297fe09edbfab7c5a358b0485090a4ed8121f427 And these docs will be useful: https://gitlab.haskell.org/ghc/ghc/-/wikis/working-conventions/git/submodules If you run into trouble I can test it out sometime next week or continue to offer advice here. I'm +1 for this change and would be interested in observing of |
Currently lookup is defined to fail early by checking if the prefix mask agrees with the key:
In the original paper they note that if we expect our lookups to be successful then we should instead prioritize testing if the kth bit is zero and failing only when we reach leaves. If I understand correctly, then this would just be the following code:
Is that correct? It seems to pass the tests in any case. This can certainly lead to large speed ups when we know we are doing many successful lookups. For example, in the lookup benchmark we get the following (this is a best case scenario as the benchmark does not test a map with misses which seems like a bad design for a benchmark):
Even if this is not quite the correct implementation, containers should probably offer this sort of function to allow users to tune their code to their needs as both variants are worthwhile.
The text was updated successfully, but these errors were encountered: