Optimisation of subsequence #490

Kleidukos · 2023-05-31T22:56:06Z

Hi @moodmosaic! I would like to upstream an optimisation to subsequence.

At present time, subsequence is implement as such:

filterM (const bool_) xs

Which we can interpret as "Traverse the whole list and at each element, perform a random boolean generation". This is quite expensive as random generation happens for every element, and is thus unguarded.

Here is an alternative:

We can generate a random Word64 that will act as a bitfield, and iterate on it with our list. Each time we encounter a 1, then we keep the element at this index in our list. If we encounter one or multiple 0, we drop them (which is quite efficient, as for instance five successive 0s are dropped once and not five times).

Then we advance either one step (if we found a 1) or as many 0s we had found, and keep going until there are no bits left in our list.

If the Word64 was not enough, we generate one when to retrieve another random list of 1s and 0s.
Considering the average size of random subsequences, this becomes very cheap because we get 1 random operation, perhaps 2, in most cases.

Here are exerpts of the benchmark:

All
  Subsequence
    Hedgehog subsequence, native, 10 items:       OK (6.13s)
      5.87 μs ±  21 ns
    Hedgehog subsequence64, optimised, 10 items:  OK (1.41s)
      1.32 μs ±  21 ns, 0.22x

    Hedgehog subsequence, native, 20 items:       OK (100.85s)
      12.1 μs ± 1.2 μs
    Hedgehog subsequence64, optimised, 20 items:  OK (58.55s)
      1.74 μs ±  17 ns, 0.14x

    Hedgehog subsequence, native, 30 items:       OK (2.34s)
      17.8 μs ± 299 ns
    Hedgehog subsequence64, optimised, 30 items:  OK (17.73s)
      2.11 μs ±  18 ns, 0.12x

	[…]

    Hedgehog subsequence, native, 90 items:       OK (3.59s)
      54.4 μs ± 471 ns
    Hedgehog subsequence64, optimised, 90 items:  OK (1.37s)
      5.11 μs ±  64 ns, 0.09x

    Hedgehog subsequence, native, 100 items:      OK (7.90s)
      60.6 μs ± 1.0 μs
    Hedgehog subsequence64, optimised, 100 items: OK (1.47s)
      5.61 μs ± 103 ns, 0.09x

You can run the benchmarks manually at: https://github.com/Kleidukos/subsequence-bench/. They also contain comparisons with QuickCheck, both baseline and improved.

The text was updated successfully, but these errors were encountered:

ChickenProp · 2023-06-01T10:15:27Z

(As a note - your implementation has Gen.word64 (Range.constant 0 10000) in a couple places. Should that be Range.constant minBound maxBound?)

Kleidukos · 2023-06-01T10:35:30Z

Good catch, I've updated the repo and the description of the ticket

moodmosaic · 2023-06-05T15:14:20Z

This looks impressive! Shall we keep both or replace the existing one? /cc @ChickenProp, @ocharles (and any other who wants to vote, feel free!)

ocharles · 2023-06-05T15:15:31Z

It certainly makes sense to me!

ChickenProp · 2023-06-05T15:28:16Z

Replace makes sense to me, I can't immediately think of a downside.

Oh, but I'd suggest double-checking the shrink behavior. Right now I suspect it can shrink to include things that weren't in the original sublist (e.g. if the original Gen.word64 gives us 8 = 0b1000, then shrinks to 7 = 0b0111, we're including things we previously weren't), but that it could easily be fixed with src -> prune $ Gen.word64 .... But I haven't checked on either count. I'm not sure if hedgehog currently has tests for how subsequence shrinks.

(Quick edit: oh, and shrink Shrink.list $ Gen.element [[], [x]] might be off too, I think if it generates [x] it might try [] as a shrink twice.)

Kleidukos · 2023-06-07T22:03:04Z

@ChickenProp Thank you very much for these suggestions, I'll implement them over the week-end. If you have more ideas regarding how to test the behaviour of subsequence I am all ears.

ChickenProp · 2023-06-08T08:45:52Z

So the test I think I'd write would probably be to take both a generated value and its shrink tree from Gen.subsequence someList (I think I saw a way to do that but don't remember how). Then check its shrink tree is equal to that of Gen.shrink Shrink.list $ pure generatedValue.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimisation of subsequence #490

Optimisation of subsequence #490

Kleidukos commented May 31, 2023 •

edited

Loading

ChickenProp commented Jun 1, 2023 •

edited

Loading

Kleidukos commented Jun 1, 2023

moodmosaic commented Jun 5, 2023

ocharles commented Jun 5, 2023

ChickenProp commented Jun 5, 2023 •

edited

Loading

Kleidukos commented Jun 7, 2023

ChickenProp commented Jun 8, 2023

Optimisation of subsequence #490

Optimisation of subsequence #490

Comments

Kleidukos commented May 31, 2023 • edited Loading

ChickenProp commented Jun 1, 2023 • edited Loading

Kleidukos commented Jun 1, 2023

moodmosaic commented Jun 5, 2023

ocharles commented Jun 5, 2023

ChickenProp commented Jun 5, 2023 • edited Loading

Kleidukos commented Jun 7, 2023

ChickenProp commented Jun 8, 2023

Kleidukos commented May 31, 2023 •

edited

Loading

ChickenProp commented Jun 1, 2023 •

edited

Loading

ChickenProp commented Jun 5, 2023 •

edited

Loading