Coordination between multiple CS instances #29

Open
habig opened this issue Aug 5, 2022 · 2 comments
Labels
enhancement New feature or request

Comments


habig commented Aug 5, 2022

So, it would be nice to have SNEWS servers running redundantly in different places, to dodge network or power problems.

The way it is right now, this would work out of the box for the decision-making algorithms: each server can subscribe to the same hopskotch stream and would make the same decisions independently.

But, we don't want multiple copies of the alerts going out and confusing people, be they emails or slack alerts or whatever. So, at any point in time, only one server should push alerts.

How best to do this? I could imagine something where the first server to reach a decision puts out its alert. Other servers, if they see that an alert matching the one they are about to push has already gone out, would not send a duplicate.
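The "skip it if a matching alert is already out" idea could be sketched roughly as below. This is only an illustration: the field names `detectors` and `neutrino_times` and the helpers `alert_fingerprint` and `should_publish` are hypothetical and not taken from the coincidence system. On its own this check-then-publish step is not atomic across servers.

```python
import hashlib
import json


def alert_fingerprint(alert: dict) -> str:
    """Return a stable hash of the fields assumed to identify an alert.

    The field names are hypothetical; a real alert may need other keys.
    Sorting makes the hash independent of the order in which a server
    happened to collect the detectors and times.
    """
    key = {
        "detectors": sorted(alert["detectors"]),
        "neutrino_times": sorted(alert["neutrino_times"]),
    }
    encoded = json.dumps(key, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def should_publish(alert: dict, seen_fingerprints: set) -> bool:
    """Return True only the first time a given alert is offered.

    `seen_fingerprints` stands in for "alerts this server has already
    seen go out", however that ends up being tracked.
    """
    fingerprint = alert_fingerprint(alert)
    if fingerprint in seen_fingerprints:
        return False
    seen_fingerprints.add(fingerprint)
    return True
```

Two servers that reach the same decision from the same stream would compute the same fingerprint, so the second one to check would stay quiet.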

While adding the logic to do this, we would need to be careful about making the operation an "atomic" one, to avoid race conditions. That will take some thought.
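To make the "atomic" requirement concrete, here is a minimal single-process sketch in which the check and the claim happen under one lock, so only one caller can win. The class `AlertClaims` and the helper `race` are hypothetical, and threads stand in for servers; across real machines the same check-and-set would have to live in some shared, consistent store, which this sketch does not provide.

```python
import threading


class AlertClaims:
    """In-memory stand-in for a shared record of who published which alert."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owners = {}

    def claim(self, fingerprint: str, server_id: str) -> bool:
        """Atomically claim an alert; True only for the first caller."""
        # The membership test and the write share one critical section,
        # so no second caller can slip in between them.
        with self._lock:
            if fingerprint in self._owners:
                return False
            self._owners[fingerprint] = server_id
            return True


def race(n_servers: int = 8) -> list:
    """Let several simulated servers race to claim the same alert."""
    claims = AlertClaims()
    winners = []

    def worker(server_id: str) -> None:
        if claims.claim("alert-1", server_id):
            winners.append(server_id)

    threads = [
        threading.Thread(target=worker, args=(f"server-{i}",))
        for i in range(n_servers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return winners
```

However the threads are scheduled, `race()` returns a list with exactly one server in it; which server that is can vary from run to run.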

@habig habig added the enhancement New feature or request label Aug 5, 2022
@mlinvill
Contributor

I have made some progress on this issue. I've been looking into implementations of the Raft consensus algorithm. There is a Python implementation called pysyncobj. As a starting point, I have a toy implementation of a distributed lock using pysyncobj running stand-alone. However, its reliability needs to be tested more thoroughly under real-world failure scenarios before it is considered for incorporation into the coincidence system.
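For readers unfamiliar with the idea, the behavior wanted from such a lock can be sketched without any networking. The following is not pysyncobj and not the toy implementation mentioned above; it is a hypothetical in-memory `LeaseLock` whose only purpose is to show why an expiry time matters: if the server holding the lock dies, another server can take over once the lease runs out. In a Raft-based version, the holder and expiry would be replicated state rather than local attributes.

```python
import time


class LeaseLock:
    """Single-process illustration of a lock that expires if not renewed."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so failure scenarios can be simulated
        self._holder = None
        self._expires_at = 0.0

    def try_acquire(self, server_id: str) -> bool:
        """Acquire or renew the lease; True if `server_id` now holds it."""
        now = self._clock()
        free = self._holder is None or now >= self._expires_at
        if free or self._holder == server_id:
            self._holder = server_id
            self._expires_at = now + self._ttl
            return True
        return False

    def release(self, server_id: str) -> None:
        """Give up the lease; a no-op unless `server_id` is the holder."""
        if self._holder == server_id:
            self._holder = None
```

Only the server for which `try_acquire` returns True would push alerts; the others would keep making decisions and retry periodically so that one of them takes over after a failure.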

@KaraMelih
Collaborator

Just to keep track: I think this is being worked on here:
https://github.com/SNEWS2/SNEWS_Coincidence_System/tree/mlinvill_feature_redundancy-in-alerting
