-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add range-based iteration #290
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Kerollmops, do we really need to keep the size_hint in our iterators? It seems wasteful that bitmap.iter().next()
has to compute the cardinality of the whole bitmap.
I think other than not keeping an accurate size_hint, I don't see any reason why this would need a seperate iterator type (if we implement our own Flatten like we do here and #287).
This and #287 feel like two sides of the same coin: To me, this is easily and efficiently built on top of an advance_to
& advance_back_to
, though again, only if we could drop the need for an accurate size_hint.
Thank you for your PR and feedback on this. I would like to know if dropping the I love the interface of this PR much more than #287. What do you think @Dr-Emann? Do you feel it could be possible? |
I think this change could re-use existing store iterators (it should be possible to construct a |
pub struct BlockRangeIter<'a> { | ||
first: BlockPartIter, | ||
between: BlockSeqIter<'a>, | ||
last: BlockPartIter, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need a separate iterator type for this: you should be able to construct a BitmapIter
over a range by setting key
/key_back
, and masking value
/value_back
.
let mut current = BlockPartIter::new(self.start_key, self.block_range[0]); | ||
if let c @ Some(_) = current.next() { | ||
return c; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this always return the same value from next
if there's a bit set in the "current" block, since we always reload self.block_range[0]
? self
is not modified in any way if we return
here, right?
Hi @Kerollmops and @Dr-Emann, thank you very much for the helpful feedback. I will plan to revise to accommodate any recent changes, and to merge the range-iterator type with the existing iterator type as Dr-Emann suggested. That and any other suggestions are most welcome. It may be one or more weeks until I get to doing this, since I am currently swamped with other urgent priorities, but I would love to implement the change and plan to do so. Thanks again for the feedback. |
Let me know if your plans change, I'd also be happy to pick this up, I have a pretty good idea of what I'd want the implementation to look like. I think ideally, this would be a very thin wrapper around pub fn iter_range(&self, range: R) -> Iter<'_> {
let mut iter = self.iter();
iter.advance_to(start);
iter.advance_back_to(end);
iter
} So the complexity would all be in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, @Kerollmops, I opened #298, which includes |
This change adds range-based iteration, meaning a consecutive subsequence of a RoaringBitmap can now be iterated over efficiently.
The time complexity to iterate over a range of length k in a RoaringBitmap of size n is: log(n) + k.
The log(n) term comes from a binary search to find the first element, and the k term comes from iterating over the following k-length range.
Note: The
convert_range_to_inclusive
function was refactored usingnum
traits, in order to handleu16
as well asu32
.Unit and property-based tests are included to verify the correctness of range iteration over both arrays and bitmaps.