-
Notifications
You must be signed in to change notification settings - Fork 221
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
message: Handle message deletion event. #766
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Small comments below; they're small enough that I'll go ahead and mark this for @gnprice's review too.
lib/model/message_list.dart
Outdated
bool _removeMessagesById(Iterable<int> messageIds) { | ||
bool hasRemovedMessage = false; | ||
for (final messageId in messageIds) { | ||
final messageIndex = _findMessageWithId(messageId); | ||
if (messageIndex == -1) continue; | ||
|
||
assert(contents.length == messages.length); | ||
messages.removeAt(messageIndex); | ||
contents.removeAt(messageIndex); | ||
assert(contents.length == messages.length); | ||
hasRemovedMessage = true; | ||
} | ||
if (hasRemovedMessage) _reprocessAll(); | ||
return hasRemovedMessage; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I notice in this implementation is that the List.removeAt
calls can be expensive, if messageIndex
is low and the list's length is large. From the doc; this part in particular:
[…] and moves all later objects down by one position.
There are ways we could be more efficient in that scenario (low messageIndex
, large length), but I think they'd all come with tradeoffs in other scenarios. Also, bulk-delete events probably aren't very common anyway. But mentioning this in case you or Greg see something that's obviously an improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the cost of this loop is quadratic, which is enough to make me nervous — with _findMessageWithId
lookups.)
In particular if you're looking at a conversation of
An algorithm that would fix that is:
- First do the lookups, to determine if there are any affected messages in this list at all. That costs
$\Theta(k \log n)$ . (And saves us from spending$\Theta(n)$ on the next step if$d = 0$ . If$d > 1$ , we're resigned to spending that much time anyway.) - Then, if there are any affected messages, use
List.removeWhere
to remove them from each of the lists all at once. (Build aSet
out ofmessageIds
first, to make that efficient.) This costs$\Theta(k + n)$ . (The first term is for building theSet
; the second for theremoveWhere
calls.)
That makes the total cost
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the time complexity analysis! So List.removeWhere
performs in linear time? I couldn't find reference for how it is implemented. My guess is that they just make a copy of the list with the items removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like removing content with List.removeWhere
wouldn't be quite straightforward because it is a list parallel to messages that don't have the message ID. A solution is to keep on using removeWhere but create a set of ZulipContent
beforehand. This is based on the assumption that hashing ZulipContent
should be cheap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So
List.removeWhere
performs in linear time? I couldn't find reference for how it is implemented. My guess is that they just make a copy of the list with the items removed.
My guess is it's a bit more efficient than that in that it reuses the existing array, and just moves items down to lower indexes to cover gaps. But yeah, should definitely be linear time.
The docs on List are a bit lacking in spelling out the performance characteristics, I think in part because there isn't a great place to put that information; List
itself is actually an interface, and implementations can have different performance characteristics. But on List
it does say:
The default growable list, as created by [], keeps an internal buffer, and grows that buffer when necessary. This guarantees that a sequence of add operations will each execute in amortized constant time. Setting the length directly may take time proportional to the new length, and may change the internal capacity so that a following add operation will need to immediately increase the buffer capacity. Other list implementations may have different performance behavior.
So basically for each method you should think of it as doing whatever is the natural most efficient thing it could be doing for that method, given that data structure.
That data structure is essentially the same as a Python list or a JS array, if that helps in thinking about it.
(btw that's an old Dart version in that docs link; if you delete the "2.9.3/" for the version so it just says "stable", it'll give you the current version.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like removing content with
List.removeWhere
wouldn't be quite straightforward because it is a list parallel to messages that don't have the message ID.
Hmm, I see.
In that case probably the best thing is to write the loop ourselves. Essentially do this:
reuses the existing array, and just moves items down to lower indexes to cover gaps.
but do the testing only on messages
, and the moving on both messages
and content
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @PIG208, and thanks @chrisbobbe for the initial review!
Comments below.
Because message deletion can affect what items are to be seen (date
separator, recipient header, etc.), if the view has any message deleted, a
`_reprocessAll` call is triggered. It would be ideal to just removed the
affected items for better performance.
Yeah, I'm content to have deletion take linear time in the length of the existing list, which is what _reprocessAll
does. We're already doomed to linear time just for updating messages
itself, so long as it's a plain list.
(And as Chris pointed to above and I go into more detail about below, this current revision can actually spend quadratic time updating messages
— so _reprocessAll
is cheap by comparison with that.)
lib/model/message_list.dart
Outdated
bool _removeMessagesById(Iterable<int> messageIds) { | ||
bool hasRemovedMessage = false; | ||
for (final messageId in messageIds) { | ||
final messageIndex = _findMessageWithId(messageId); | ||
if (messageIndex == -1) continue; | ||
|
||
assert(contents.length == messages.length); | ||
messages.removeAt(messageIndex); | ||
contents.removeAt(messageIndex); | ||
assert(contents.length == messages.length); | ||
hasRemovedMessage = true; | ||
} | ||
if (hasRemovedMessage) _reprocessAll(); | ||
return hasRemovedMessage; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the cost of this loop is quadratic, which is enough to make me nervous — with _findMessageWithId
lookups.)
In particular if you're looking at a conversation of
An algorithm that would fix that is:
- First do the lookups, to determine if there are any affected messages in this list at all. That costs
$\Theta(k \log n)$ . (And saves us from spending$\Theta(n)$ on the next step if$d = 0$ . If$d > 1$ , we're resigned to spending that much time anyway.) - Then, if there are any affected messages, use
List.removeWhere
to remove them from each of the lists all at once. (Build aSet
out ofmessageIds
first, to make that efficient.) This costs$\Theta(k + n)$ . (The first term is for building theSet
; the second for theremoveWhere
calls.)
That makes the total cost
@@ -375,6 +375,40 @@ void main() { | |||
}); | |||
}); | |||
|
|||
group('handleDeleteMessageEvent', () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will also need tests for the behavior of MessageListView
, as well as that of MessageStoreImpl
. After all, the former is where most of the code this change introduces is. 🙂
Those tests can go in model/message_list_test.dart
; search for MessageEvent there for examples to follow. I think, given that we're using _reprocessAll
rather than trying to do something more clever (as handleMessageEvent
does), the only tests needed here are analogous to the tests with "MessageEvent" in their names.
Note one key thing those tests do is they call that checkInvariants
function, in particular on every notifyListeners
. It conducts a rather fierce set of consistency checks on the model, in particular recomputing everything that derives from messages
and confirming it all has the right values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note one key thing those tests do is they call that checkInvariants function, in particular on every notifyListeners. It conducts a rather fierce set of consistency checks on the model, in particular recomputing everything that derives from messages and confirming it all has the right values.
I see that in the 'MessageEvent, before fetch' test, but not the other 'MessageEvent' tests. Is that intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's there 🙂 Via this part:
in particular on every
notifyListeners
.
See prepare
for that call site.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Quite possibly we should have checkNotNotified
call checkInvariants
too, and then the individual test cases will be more uniform in not needing to explicitly do so)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see! So the call is added in a listener. Then the post event invariant check of 'MessageEvent, not in narrow' seems to be missing because it isn't expecting a notification. Yeah it makes sense to add it to checkNotNotified
too.
6dc2eca
to
90b2e6a
Compare
The PR has been updated to address the reviews.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the revision! All looks good modulo a couple of small comments.
lib/model/message_list.dart
Outdated
/// If none of [messageIds] are found, this is a no-op. | ||
bool _removeMessagesById(Iterable<int> messageIds) { | ||
final messagesToRemoveById = <int>{}; | ||
final contentToRemove = <ZulipContent>{}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final contentToRemove = <ZulipContent>{}; | |
final contentToRemove = Set<ZulipContent>.identity(); |
That has no direct effect, because ZulipContent
doesn't override ==
or hashCode
. But it's not totally obvious that it wouldn't — indeed some of the other ContentNode
subclasses do — so best to be explicit.
And comparing for identity, rather than general equality, is crucial to this algorithm's behavior both for performance and, in edge cases, for correctness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah that's interesting. So an identity set uses identical to distinguish elements, which compares the object references instead.
messages.removeWhere((message) => messagesToRemoveById.contains(message.id)); | ||
contents.removeWhere((content) => contentToRemove.contains(content)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat solution. I think this algorithm works fine.
I might replace it with a direct loop if I find that convenient in the course of building the other features (like #421) that will reuse a version of this method, but I'll cross that bridge if I come to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Leaving it as-is for this revision.
The PR has been updated. Thanks for the review @gnprice ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the revision! Remaining comments are all in the tests; one flaky test (at least in principle), and a few nits.
When handling a message deletion event, we just simply remove the messages from the per account store and wipe them from the message list view if they are present. Because message deletion can affect what items are to be seen (date separator, recipient header, etc.), if the view has any message deleted, a `_reprocessAll` call is triggered. It would be ideal to just removed the affected items for better performance. Fixes zulip#120. Signed-off-by: Zixuan James Li <[email protected]>
Thanks for the review! I have updated the PR and also removed an outdated |
The `length` check isn't needed because it's covered by the `deepEquals`. (It was needed in a previous draft of this test). Then we can also inline `expected`, and reformat slightly.
Thanks! All looks great; merging, with one added commit for a couple of nits: |
When handling a message deletion event, we just simply remove the messages from the per account store and wipe them from the message list view if they are present.
Because message deletion can affect what items are to be seen (date separator, recipient header, etc.), if the view has any message deleted, a
_reprocessAll
call is triggered. It would be ideal to just removed the affected items for better performance.