Saturated bloomfilter size #666
base: develop
Thanks, this looks good.
Could we possibly revert the style changes for code you are not touching?
Sorry, that was IntelliJ's code cleanup, which does not work with the project's .scalafmt.conf. I applied algebird's scalafmt by hand.
This reverts commit 9180862.
From the paper linked for the Bloom filter size estimation, we have an inequality involving t = number of bits set to 1 in the filter, m = width of the filter, and k = numHash:

$$n_l \le S^{-1}(t - 1), \qquad n_r \ge S^{-1}(t + 1)$$

$S^{-1}$ is defined as:

$$S^{-1}(t) = \frac{\ln(1 - t/m)}{k \cdot \ln(1 - 1/m)}$$

The problem occurs when _t = m_:

$$S^{-1}(m) = \frac{\ln(0)}{k \cdot \ln(1 - 1/m)}$$

but *ln(-1/m)* is not correct. The problem is that $\ln(1 - 1/m) < 0$, so $\ln(0)$ divided by it gives positive infinity.

The Scala math library gives:

    scala> val infinity = scala.math.log(0) / -42
    infinity: Double = Infinity
    scala> infinity.round.toInt
    res24: Int = -1

So just add a special case to the $S^{-1}$ function.
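As a rough illustration of the special case described in this commit message, here is a minimal sketch of $S^{-1}$ with a guard for the saturated filter. The object name `SInverseSketch`, the method names, and the choice to clamp the estimate to `Int.MaxValue` are assumptions made for this example; they are not taken from the patch itself.

```scala
object SInverseSketch {

  /** S^-1(t) = ln(1 - t/m) / (k * ln(1 - 1/m)).
    * t = number of bits set to 1, m = filter width, k = number of hashes.
    * Assumption: a saturated filter (t >= m) is reported as +Infinity
    * instead of evaluating ln(0) or the log of a negative number.
    */
  def sInverse(t: Int, m: Int, k: Int): Double =
    if (t >= m) Double.PositiveInfinity
    else java.lang.Math.log1p(-t.toDouble / m) / (k * java.lang.Math.log1p(-1.0 / m))

  /** Rounded estimate, clamped so that an infinite value becomes
    * Int.MaxValue rather than wrapping to -1 through Long-to-Int truncation.
    */
  def estimate(t: Int, m: Int, k: Int): Int =
    math.min(sInverse(t, m, k).round, Int.MaxValue.toLong).toInt
}
```

With this guard, `SInverseSketch.estimate(100, 100, 1)` yields `Int.MaxValue`, whereas the unguarded `round.toInt` path shown in the REPL session yields `-1`.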
Codecov Report
Coverage Diff

| | develop | #666 | +/- |
|---|---|---|---|
| Coverage | 89.31% | 89.52% | +0.21% |
| Files | 113 | 113 | |
| Lines | 8944 | 8945 | +1 |
| Branches | 490 | 494 | +4 |
| Hits | 7988 | 8008 | +20 |
| Misses | 956 | 937 | -19 |
Continue to review full report at Codecov.
I reproduced the error from issue #632:
which throws:
I think this is due to the #632 (comment) we discussed before.
It comes from the comment (L.135 to 138) in the BloomFilter file:
But there is a problem with the result of:
which returns:
2147483647 is a special number: it is the largest value a 32-bit signed integer can hold.
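For context, the sketch below shows two ways an infinite `Double` can end up as an `Int` on the JVM: a direct `toInt` saturates at 2147483647, while going through `round` (which returns a `Long`) and then `toInt` truncates to -1. The object and value names are hypothetical, and this is only one plausible route to such values, not a statement about which conversion the code under review performs.

```scala
object DoubleToIntSketch {
  // ln(0) is -Infinity; dividing by a negative number gives +Infinity.
  val inf: Double = math.log(0) / -42

  // Direct Double-to-Int conversion saturates at the largest Int.
  val saturated: Int = inf.toInt

  // round yields Long.MaxValue; truncating that Long keeps only the
  // low 32 bits, which are all ones, i.e. -1.
  val wrapped: Int = inf.round.toInt
}
```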