-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Recategorize messages #977
base: master
Are you sure you want to change the base?
Conversation
556d9b8
to
ebd959d
Compare
update_message_metadata( | ||
session, message.account, message, message.is_draft | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the function that was changed in a broken a way on the PRs that we reverted. So if we capture categories
before, run the function after the reverts, and then read categories - and then are different we found a miscategorized email.
def session_factory(): | ||
return global_session_scope() if dry_run else session_scope(None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is sync-engine curiosity. sync-engine has two ways of managing transactions... global_session_scope
is for reads and session_scope
is mainly meant for writes. session_scope
will automatically commit at the end. And it also automatically behind the scenes serializes changes to transaction log, and that transaction log is what CRM consumes. session_scope
in original sync-engine implementation also allowed to shard to different databases connected to the same cluster, but we never used that feature so passing None
will just route to the only database we have.
The naming of those and the exact intentions of Nylas people were always a mystery to me.
bin/recategorize-messages.py
Outdated
only_inbox: bool, | ||
only_types: set[AccountType] = ALL_ACCOUNT_TYPES, | ||
) -> Iterable[int, list[int]]: | ||
discriminators = {account_type + "account" for account_type in only_types} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This piece of code for obtaining the filtered namespace_query
is almost the same as in get_total_namespace_count
. Should we factor it out into a separate private function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bin/recategorize-messages.py
Outdated
if date_start: | ||
query = query.filter(Message.created_at >= date_start) | ||
if date_end: | ||
query = query.filter(Message.created_at < date_end) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make it inclusive? Given a date range 2024-10-15 - 2024-10-16, I would imagine it's a 2 day interval.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I understand your thinking. I tend to make it exclusive on one side because this way if you run two times [1, 2), [2, 3) you don't reprocess the intersection twice, but in this case we have a single interval.
@neob91-close I also added this commit that lets me only propagate changes for emails that are missing "inbox" category as most escalations are about this. This will also speed things up and ensure we don't overload CRM with delta changes. |
709b62b
to
d8c2160
Compare
@neob91-close I also changed |
459517d
to
a0dfa3c
Compare
This will basically find all the messages in the database where "categories" on the email message don't match the ones from imapuids. Once we run this the right categories will get propagated to the CRM, and this - in turn - will create inbox notifications.