Idea for new query: PersistenceQuery "EventsByDate" #66
Comments
Ah, colleagues gave me a hint that the Should even work with the I will have a deeper look into this :)
The problem I see is that this timestamp needs to be generated using some reference clock. In a distributed system with multiple writer nodes, the nodes' clocks can be desynchronized, which can lead to inconsistency issues. Can you describe your business requirements in more detail?
Yep, the reference clock issue was the immediate question that jumped to my mind as well on seeing this ticket. You could maybe use mongo as a reference clock, but I don't know what (if any) assurances are given that the clocks of a mongo cluster (e.g. a replica set) are synchronized. It's an interesting concept, and I have thought about this before in the context of whether InfluxDB was viable as a persistence journal. The clock issue was what stopped me from getting too serious about it.
Anyway, this query does not seem generic enough to me to deserve an API-level implementation. It looks like a business-level feature.
Do the replica set's clocks even need to be in sync? Inserts are always done on the primary, aren't they? I must admit that I'm not that familiar with MongoDB. The business requirement is that a search service wants to get the most recently modified persistence ids plus their sequence numbers, so that it can check whether it missed some events and then manually resync and update its search index for those.
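To make that requirement concrete, here is a minimal in-memory sketch in Python of what such a query would return: for every persistence id with an event at or after a cutoff, the highest sequence number seen. The event shape (`pid`, `sn`, `ts`) and the function name are assumptions for illustration only, not the plugin's schema or API.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, NamedTuple


class Event(NamedTuple):
    pid: str       # persistence id (hypothetical field name)
    sn: int        # sequence number (hypothetical field name)
    ts: datetime   # write timestamp (hypothetical field name)


def last_modified_since(events: Iterable[Event], cutoff: datetime) -> Dict[str, int]:
    """Map each persistence id touched at or after `cutoff` to its highest sequence number."""
    latest: Dict[str, int] = {}
    for ev in events:
        if ev.ts >= cutoff:
            latest[ev.pid] = max(ev.sn, latest.get(ev.pid, 0))
    return latest


# Made-up sample data: only order-1 and order-2 changed in the last 5 minutes.
now = datetime(2017, 1, 1, 12, 0, tzinfo=timezone.utc)
journal = [
    Event("order-1", 1, now - timedelta(minutes=30)),
    Event("order-1", 2, now - timedelta(minutes=3)),
    Event("order-2", 1, now - timedelta(minutes=1)),
    Event("order-3", 1, now - timedelta(minutes=10)),
]
recent = last_modified_since(journal, now - timedelta(minutes=5))
```

The search service could then compare `recent` against the sequence numbers it has already indexed and resync only the ids that lag behind.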
Yes, for replica sets inserts are done on the primary; it's a single-master system. So everything is fine and good until there is a primary re-election, say because the primary crashes. Inserts then proceed on the new primary, which could introduce a consistency issue should the clocks be misaligned. That's the simple cluster case. If mongo sharding is used, different mongod instances handle the inserts for each shard. At that point, you're probably guaranteed consistency issues, at least across
I understand that under these conditions this won't make it into the plugin. Thanks for the discussion. :)
I think this should be doable, with caveats, as of the merge of #150: adding timestamps to the events will make this kind of query straightforward. The resolution of the query would be limited by the differences in internal clock time of the nodes inserting records. For slower-moving persistent actors (say < 1 msg/50 ms), I'd guess it would be quite accurate.
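A small simulation of that caveat, assuming two writer nodes whose clocks differ by a fixed offset (all names and numbers here are made up): a consumer that polls for "everything stamped after the last timestamp I saw" can skip an event stamped by the slower clock, while widening the window by an assumed maximum skew and de-duplicating on (pid, sn) recovers it.

```python
from typing import List, NamedTuple, Set, Tuple


class Event(NamedTuple):
    pid: str
    sn: int
    ts: int  # milliseconds, as stamped by the writing node's local clock


def events_after(journal: List[Event], since_ms: int) -> List[Event]:
    """Naive 'events by date' query: everything stamped strictly after since_ms."""
    return [e for e in journal if e.ts > since_ms]


def events_after_with_margin(
    journal: List[Event],
    since_ms: int,
    max_skew_ms: int,
    seen: Set[Tuple[str, int]],
) -> List[Event]:
    """Widen the window by the assumed clock skew and drop events already processed."""
    candidates = events_after(journal, since_ms - max_skew_ms)
    return [e for e in candidates if (e.pid, e.sn) not in seen]


# Node A's clock is correct; node B's clock runs 40 ms behind (assumed numbers).
journal = [Event("a", 1, 1000)]           # written by node A at true time 1000
seen = {("a", 1)}                         # the consumer has already processed it
watermark = 1000                          # highest timestamp the consumer has seen
journal.append(Event("b", 1, 1010 - 40))  # written by node B at true time 1010, stamped 970

missed = events_after(journal, watermark)  # the new event is not returned
recovered = events_after_with_margin(journal, watermark, max_skew_ms=50, seen=seen)
```

The margin only helps if it is at least as large as the real skew between writers, which is exactly the quantity nobody can guarantee without synchronized clocks.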
I would also find this quite useful in combination with the auto-expire feature of MongoDB, to get rid of old persistence journals of crashed actors. See https://www.ekito.fr/people/auto-expire-documents-mongodb-collections/
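For reference, MongoDB's auto-expire feature is configured as a TTL index, i.e. an index created with `expireAfterSeconds` on a field that holds a BSON date. Below is a hedged sketch using a PyMongo-style `create_index` call. The `ts` field name, the helper, and the one-week default are assumptions, and `collection` is expected to be a PyMongo collection (any object exposing a compatible `create_index` works).

```python
from typing import Any


def ensure_ttl_index(collection: Any, field: str = "ts", ttl_seconds: int = 7 * 24 * 3600) -> str:
    """Create a TTL index so MongoDB deletes documents once `field` is older than ttl_seconds.

    `field` must hold a BSON date; documents without it are never expired.
    Returns whatever the driver returns (PyMongo returns the index name).
    """
    return collection.create_index([(field, 1)], expireAfterSeconds=ttl_seconds)


# Usage against a real server (requires pymongo and a running mongod;
# the database and collection names below are hypothetical):
#   from pymongo import MongoClient
#   ensure_ttl_index(MongoClient()["akka"]["journal"])
```

Note that a TTL index expires individual documents by age, so it would remove old events of live actors as well as those of crashed ones.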
Hi @scullxbones
I cannot thank you enough for this great library which even gives some joy working with MongoDB :)
We currently face the challenge of retrieving events from the events-store which were created in the last "X TimeUnit" (e.g. in the last 5 minutes) in order to implement a sophisticated sync mechanism for a search service (another microservice) which sometimes does not get all our emitted events.
What do you think about a PersistenceQuery which retrieves all events before or after a given Date/Timestamp?
I thought about using the _id field for that (which includes a timestamp), but that would have at least 2 drawbacks:
So I guess each journal entry would then have to get a "ts" field in order to be queryable by time (e.g. also with an index on that field).
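As background on the `_id` idea: a MongoDB ObjectId stores its creation time in its first four bytes, as a big-endian count of seconds since the Unix epoch, so a range query on `_id` is possible but only at one-second resolution. A small standard-library sketch follows; the function names are illustrative, not a driver API (PyMongo's `bson.ObjectId.from_datetime` covers the lower-bound case).

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex: str) -> datetime:
    """Decode the creation time embedded in a 24-character hex ObjectId."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    seconds = int(oid_hex[:8], 16)  # first 4 bytes: seconds since the epoch
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def objectid_lower_bound(moment: datetime) -> str:
    """Smallest ObjectId (as hex) with embedded time `moment`, usable as an `_id >= bound` filter.

    `moment` should be timezone-aware; sub-second precision is discarded.
    """
    return format(int(moment.timestamp()), "08x") + "0" * 16


created = objectid_timestamp("507f1f77bcf86cd799439011")
bound = objectid_lower_bound(created)
```

A dedicated "ts" field avoids the one-second granularity and can be indexed independently of `_id`.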
Is that something you would want to have in your library or do you think that does not fit?
I could then see what I could contribute as a pull request.
Regards Thomas