Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize IntSet.Bin #998

Merged
merged 8 commits into from
Jun 4, 2024
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions containers-tests/benchmarks/LookupGE/LookupGE_IntMap.hs
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@
module LookupGE_IntMap where

import Prelude hiding (null)
import Data.IntSet.Internal.IntTreeCommons
(Key, Prefix(..), nomatch, signBranch, left)
import Data.IntMap.Internal

lookupGE1 :: Key -> IntMap a -> Maybe (Key,a)
Expand Down
1 change: 1 addition & 0 deletions containers-tests/containers-tests.cabal
Original file line number Diff line number Diff line change
Expand Up @@ -96,6 +96,7 @@ library
Data.IntMap.Strict.Internal
Data.IntSet
Data.IntSet.Internal
Data.IntSet.Internal.IntTreeCommons
Data.Map
Data.Map.Internal
Data.Map.Internal.Debug
Expand Down
1 change: 1 addition & 0 deletions containers-tests/tests/IntMapValidity.hs
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ module IntMapValidity

import Data.Bits (finiteBitSize, testBit, xor, (.&.))
import Data.List (intercalate, elemIndex)
import Data.IntSet.Internal.IntTreeCommons (Prefix(..), nomatch)
import Data.IntMap.Internal
import Numeric (showHex)
import Test.Tasty.QuickCheck (Property, counterexample, property, (.&&.))
Expand Down
75 changes: 36 additions & 39 deletions containers-tests/tests/IntSetValidity.hs
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,10 @@
module IntSetValidity (valid) where

import Data.Bits (xor, (.&.))
import Data.IntSet.Internal.IntTreeCommons (Prefix(..), nomatch)
import Data.IntSet.Internal
import Data.List (intercalate)
import Numeric (showHex)
import Test.Tasty.QuickCheck (Property, counterexample, property, (.&&.))
import Utils.Containers.Internal.BitUtil (bitcount)

Expand All @@ -13,9 +16,7 @@ import Utils.Containers.Internal.BitUtil (bitcount)
valid :: IntSet -> Property
valid t =
counterexample "nilNeverChildOfBin" (nilNeverChildOfBin t) .&&.
counterexample "maskPowerOfTwo" (maskPowerOfTwo t) .&&.
counterexample "commonPrefix" (commonPrefix t) .&&.
counterexample "markRespected" (maskRespected t) .&&.
counterexample "prefixOk" (prefixOk t) .&&.
counterexample "tipsValid" (tipsValid t)

-- Invariant: Nil is never found as a child of Bin.
Expand All @@ -24,48 +25,41 @@ nilNeverChildOfBin t =
case t of
Nil -> True
Tip _ _ -> True
Bin _ _ l r -> noNilInSet l && noNilInSet r
Bin _ l r -> noNilInSet l && noNilInSet r
where
noNilInSet t' =
case t' of
Nil -> False
Tip _ _ -> True
Bin _ _ l' r' -> noNilInSet l' && noNilInSet r'
Bin _ l' r' -> noNilInSet l' && noNilInSet r'

-- Invariant: The Mask is a power of 2. It is the largest bit position at which
-- two elements of the set differ.
maskPowerOfTwo :: IntSet -> Bool
maskPowerOfTwo t =
-- Invariants:
-- * All keys in a Bin start with the Bin's shared prefix.
-- * All keys in the Bin's left child have the Prefix's mask bit unset.
-- * All keys in the Bin's right child have the Prefix's mask bit set.
prefixOk :: IntSet -> Property
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anything checking the prefixes of Bins other than the root. Shouldn't this be called prefixesOk, and perform a recursive test?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For test performance reasons, I suspect we want to consider validity of the prefix of a Bin relative to that of its parent when making the test recursive, treating the (peculiarly simple) case of the root specially.

Copy link
Contributor Author

@meooow25 meooow25 Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 🤦

For test performance reasons, I suspect we want to consider validity of the prefix of a Bin relative to that of its parent when making the test recursive, treating the (peculiarly simple) case of the root specially.

Is the trouble really worth it?

$ cabal run intset-properties
...
All 60 tests passed (0.21s)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the 0.21s, no. But why does it run for such a short time? Can we make it run more tests? Larger tests? Ideally I want to use more efficient tests and also get that testing time up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separately, I suspect that a relative test is likely to help give more useful information in case of failure, though I don't know that for sure. Regardless, I won't block up this PR for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make it run more tests?

This would be the easiest, we can increase the number of quickcheck runs.

Ideally I want to use more efficient tests and also get that testing time up.

Well, if it helps catch errors. Don't know how one would judge that, except maybe coverage. Otherwise it is a waste.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Data.Sequence, that's helpful for dealing with "special casing" in some parts of the code. Maybe it's overkill here.

prefixOk t =
case t of
Nil -> True
Tip _ _ -> True
Bin _ m l r ->
bitcount 0 (fromIntegral m) == 1 && maskPowerOfTwo l && maskPowerOfTwo r
Nil -> property ()
Tip _ _ -> property ()
Bin p l r ->
let px = unPrefix p
m = px .&. (-px)
keysl = elems l
keysr = elems r
debugStr = concat
[ "px=" ++ showIntHex px
, ", keysl=[" ++ intercalate "," (fmap showIntHex keysl) ++ "]"
, ", keysr=[" ++ intercalate "," (fmap showIntHex keysr) ++ "]"
]
in counterexample debugStr $
counterexample "mask bit absent" (px /= 0) .&&.
counterexample "prefix not shared" (all (`hasPrefix` p) (keysl ++ keysr)) .&&.
counterexample "left child, mask found set" (all (\x -> x .&. m == 0) keysl) .&&.
counterexample "right child, mask found unset" (all (\x -> x .&. m /= 0) keysr)

-- Invariant: Prefix is the common high-order bits that all elements share to
-- the left of the Mask bit.
commonPrefix :: IntSet -> Bool
commonPrefix t =
case t of
Nil -> True
Tip _ _ -> True
b@(Bin p _ l r) -> all (sharedPrefix p) (elems b) && commonPrefix l && commonPrefix r
where
sharedPrefix :: Prefix -> Int -> Bool
sharedPrefix p a = p == p .&. a

-- Invariant: In Bin prefix mask left right, left consists of the elements that
-- don't have the mask bit set; right is all the elements that do.
maskRespected :: IntSet -> Bool
maskRespected t =
case t of
Nil -> True
Tip _ _ -> True
Bin _ binMask l r ->
all (\x -> zero x binMask) (elems l) &&
all (\x -> not (zero x binMask)) (elems r) &&
maskRespected l &&
maskRespected r
hasPrefix :: Int -> Prefix -> Bool
hasPrefix i p = not (nomatch i p)

-- Invariant: The Prefix is zero for the last 5 (on 32 bit arches) or 6 bits
-- (on 64 bit arches). The values of the set represented by a tip
Expand All @@ -77,13 +71,16 @@ tipsValid t =
case t of
Nil -> True
tip@(Tip p b) -> validTipPrefix p
Bin _ _ l r -> tipsValid l && tipsValid r
Bin _ l r -> tipsValid l && tipsValid r

validTipPrefix :: Prefix -> Bool
validTipPrefix :: Int -> Bool
#if WORD_SIZE_IN_BITS==32
-- Last 5 bits of the prefix must be zero for 32 bit arches.
validTipPrefix p = (0x0000001F .&. p) == 0
#else
-- Last 6 bits of the prefix must be zero for 64 bit arches.
validTipPrefix p = (0x000000000000003F .&. p) == 0
#endif

showIntHex :: Int -> String
showIntHex x = "0x" ++ showHex (fromIntegral x :: Word) ""
2 changes: 1 addition & 1 deletion containers-tests/tests/intmap-properties.hs
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ import Data.IntMap.Internal (traverseMaybeWithKey)
import Data.IntMap.Merge.Lazy
#endif
import Data.IntMap.Internal.Debug (showTree)
import Data.IntMap.Internal (Prefix(..))
import Data.IntSet.Internal.IntTreeCommons (Prefix(..), nomatch)
import IntMapValidity (hasPrefix, hasPrefixSimple, valid)

import Control.Applicative (Applicative(..))
Expand Down
25 changes: 0 additions & 25 deletions containers-tests/tests/intset-properties.hs
Original file line number Diff line number Diff line change
Expand Up @@ -46,9 +46,6 @@ main = defaultMain $ testGroup "intset-properties"
, testProperty "prop_AscDescList" prop_AscDescList
, testProperty "prop_fromList" prop_fromList
, testProperty "prop_fromRange" prop_fromRange
, testProperty "prop_MaskPow2" prop_MaskPow2
, testProperty "prop_Prefix" prop_Prefix
, testProperty "prop_LeftRight" prop_LeftRight
, testProperty "prop_isProperSubsetOf" prop_isProperSubsetOf
, testProperty "prop_isProperSubsetOf2" prop_isProperSubsetOf2
, testProperty "prop_isSubsetOf" prop_isSubsetOf
Expand Down Expand Up @@ -284,28 +281,6 @@ prop_fromRange = forAll (scale (*100) arbitrary) go
go (l,h) = valid t .&&. t === fromAscList [l..h]
where t = fromRange (l,h)

{--------------------------------------------------------------------
Bin invariants
--------------------------------------------------------------------}
powersOf2 :: IntSet
powersOf2 = fromList [2^i | i <- [0..63]]

-- Check the invariant that the mask is a power of 2.
prop_MaskPow2 :: IntSet -> Bool
prop_MaskPow2 (Bin _ msk left right) = member msk powersOf2 && prop_MaskPow2 left && prop_MaskPow2 right
prop_MaskPow2 _ = True

-- Check that the prefix satisfies its invariant.
prop_Prefix :: IntSet -> Bool
prop_Prefix s@(Bin prefix msk left right) = all (\elem -> match elem prefix msk) (toList s) && prop_Prefix left && prop_Prefix right
prop_Prefix _ = True

-- Check that the left elements don't have the mask bit set, and the right
-- ones do.
prop_LeftRight :: IntSet -> Bool
prop_LeftRight (Bin _ msk left right) = and [x .&. msk == 0 | x <- toList left] && and [x .&. msk == msk | x <- toList right]
prop_LeftRight _ = True

{--------------------------------------------------------------------
IntSet operations are like Set operations
--------------------------------------------------------------------}
Expand Down
1 change: 1 addition & 0 deletions containers/containers.cabal
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,7 @@ Library
Data.IntMap.Merge.Lazy
Data.IntMap.Merge.Strict
Data.IntSet.Internal
Data.IntSet.Internal.IntTreeCommons
Data.IntSet
Data.Map
Data.Map.Lazy
Expand Down
Loading
Loading