RFC: Remote Hooks #72

Open
RonjaPonja opened this issue Jul 23, 2019 · 1 comment

Comments

@RonjaPonja
Contributor

The Problem

Currently we use plain Django signals. These have a few problems:

The signals are completely best effort. We don't record which signal receivers actually reacted to a signal or whether their actions succeeded. We don't retry anything either.

Further, we block the commit from going through and the HTTP request from returning until the signals have finished. The rationale is that we want people to be sure their DNS change is already live before their commit is reported as successful. The downside is that we block the Django worker for seconds at a time while it executes the flush by connecting to places via SSH (our preferred method of RPC for some reason), and that's the success case. In the failure case a hook takes a really long time, and we may have to wait until the request times out at 60 seconds. If enough requests making such a change are in the queue, all workers become busy. Subsequent requests queue up until we reach net.core.somaxconn. Once that happens the health check endpoint starts failing, and the web server gets kicked out of the load balancer.
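To make the worker exhaustion concrete, here is a back-of-the-envelope sketch using Little's law (average busy workers = arrival rate × time each request holds a worker). The worker count, the arrival rates and the 3 second success case are made-up numbers for illustration; only the 60 second timeout comes from the description above.

```python
def busy_workers(arrival_rate_per_s: float, hook_seconds: float) -> float:
    """Average number of workers held by blocking hooks (Little's law)."""
    return arrival_rate_per_s * hook_seconds


def is_saturated(workers: int, arrival_rate_per_s: float, hook_seconds: float) -> bool:
    """True if blocking hooks alone keep every worker busy on average."""
    return busy_workers(arrival_rate_per_s, hook_seconds) >= workers


# Hypothetical setup: 8 workers per web server.
WORKERS = 8

# Success case (assumed 3 s per flush), one such commit every 10 s:
# well below one worker busy on average.
healthy = busy_workers(0.1, 3)

# Failure case (60 s timeout), same rate: roughly 6 of 8 workers stuck.
degraded = busy_workers(0.1, 60)

# Failure case at one commit every 5 s: more demand than workers exist,
# so requests start piling up in the listen backlog.
overloaded = is_saturated(WORKERS, 0.2, 60)
```

The point is that the failure case, not the request rate, is what pushes a web server over the edge: the same traffic that is harmless at a few seconds per hook ties up most workers at 60 seconds per hook.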

My Proposed Solution

To solve this, we should move the signal handlers out of the serveradmin Django application. Signals should have no external dependencies and only check data integrity. Serveradmin already keeps track of commits. When a commit is made via the API, we should return a handle to that commit. If a client expects a certain hook to be performed, the commit object should offer an API to wait until that hook has finished.
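A minimal in-process sketch of what such a commit handle could look like. `CommitHandle`, `wait_for_hook` and `report_finished` are hypothetical names, not existing serveradmin or adminapi API, and a real client would wait over HTTP rather than on a `threading.Event`; the sketch only illustrates the "return immediately, wait on demand" shape.

```python
import threading
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CommitHandle:
    """Hypothetical handle returned to the client once a commit is stored."""

    commit_id: int
    _events: Dict[str, threading.Event] = field(default_factory=dict)
    _results: Dict[str, bool] = field(default_factory=dict)
    _lock: threading.Lock = field(default_factory=threading.Lock)

    def _event(self, hook: str) -> threading.Event:
        # One event per hook name, created lazily and shared by all waiters.
        with self._lock:
            return self._events.setdefault(hook, threading.Event())

    def report_finished(self, hook: str, success: bool) -> None:
        """Called on behalf of the external hook worker when it is done."""
        with self._lock:
            self._results[hook] = success
        self._event(hook).set()

    def wait_for_hook(self, hook: str, timeout: Optional[float] = None) -> Optional[bool]:
        """Block until the hook finished; return None if the timeout expires."""
        if not self._event(hook).wait(timeout):
            return None
        with self._lock:
            return self._results[hook]
```

A client that cares about DNS being live would call something like `handle.wait_for_hook("dns_flush", timeout=30)` (the hook name is made up), while clients that don't care never block at all.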

Further, we need to extend the API to allow retrieving commits that have happened. Using this API, an external DNS flusher, reusing the code currently living in serveradmin, could notify serveradmin when it started and finished working on a hook for a commit, and even report errors back. This hook information should then be transmitted back to the client that made the commit and is waiting for this hook. The hook information could also be persisted in the database to have a record of which hooks were performed.
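The bookkeeping described here could be sketched roughly as follows. This is an in-memory stand-in for what would be database tables and API endpoints; `CommitLog`, `HookRecord`, `HookStatus` and the method names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional, Tuple


class HookStatus(Enum):
    STARTED = "started"
    FINISHED = "finished"
    FAILED = "failed"


@dataclass
class HookRecord:
    commit_id: int
    hook: str
    status: HookStatus
    error: Optional[str] = None


class CommitLog:
    """Hypothetical server-side view: known commits plus reported hook state."""

    def __init__(self) -> None:
        self._commit_ids: List[int] = []
        self._hooks: Dict[Tuple[int, str], HookRecord] = {}

    def add_commit(self, commit_id: int) -> None:
        self._commit_ids.append(commit_id)

    def commits_since(self, last_seen_id: int) -> List[int]:
        """What an external worker would poll to find commits it has not handled."""
        return [c for c in self._commit_ids if c > last_seen_id]

    def report(self, commit_id: int, hook: str, status: HookStatus,
               error: Optional[str] = None) -> HookRecord:
        """Record that a worker started, finished or failed a hook for a commit."""
        if commit_id not in self._commit_ids:
            raise KeyError(f"unknown commit {commit_id}")
        record = HookRecord(commit_id, hook, status, error)
        self._hooks[(commit_id, hook)] = record  # latest report wins
        return record

    def hook_status(self, commit_id: int, hook: str) -> Optional[HookRecord]:
        """What a waiting client would be told; None if nobody picked it up yet."""
        return self._hooks.get((commit_id, hook))
```

Keeping only the latest report per (commit, hook) pair is the simplest choice; an audit trail or retry counter would need a list of reports instead.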

Another upside of this design is that it allows for very long-running hooks. Even building a VM would be a perfectly plausible hook to implement this way.

TODO:

  • I'm deliberately vague about the API here, especially on the side working on the hook. The simplest implementation would offer all the commits that are made on serveradmin, but we could also be smarter. Emre already proposed[0] extending the adminapi Query API to subscribe to changes made on serveradmin after the initial retrieval of a Query. That proposal would complement this one well.
  • Django signals only happen within one web server. We have multiple web servers though. If somebody commits a change on web server A but I'm subscribed via web server B, I wouldn't know about it. Hence either B needs to constantly requery the database, or we need to inform all web servers about a commit that just happened. I suggest looking into PostgreSQL's NOTIFY/LISTEN features for this. There's even already a Django plugin for it.
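As a rough sketch of the sending side of the NOTIFY/LISTEN idea: the web server that stores a commit publishes its ID on a channel, and every other web server that has issued `LISTEN` on that channel gets told. The channel name, the JSON payload layout and the helper names are assumptions; `pg_notify(channel, payload)` is PostgreSQL's function form of `NOTIFY`. The cursor is expected to be a DB-API cursor on a PostgreSQL connection using `%s` placeholders (as psycopg2 does).

```python
import json

# Hypothetical channel name; each web server would run "LISTEN serveradmin_commits".
CHANNEL = "serveradmin_commits"


def encode_commit_payload(commit_id: int) -> str:
    """Keep the payload tiny: listeners can fetch details via the commit ID."""
    return json.dumps({"commit_id": commit_id})


def decode_commit_payload(payload: str) -> int:
    """Inverse of encode_commit_payload, used on the listening web servers."""
    return int(json.loads(payload)["commit_id"])


def notify_commit(cursor, commit_id: int) -> None:
    """Publish a commit on the channel using the given DB-API cursor.

    If this runs inside the transaction that stores the commit, PostgreSQL
    only delivers the notification once that transaction commits.
    """
    cursor.execute(
        "SELECT pg_notify(%s, %s)",
        (CHANNEL, encode_commit_payload(commit_id)),
    )
```

Sending only the commit ID keeps the payload well within PostgreSQL's notification size limit and leaves the database as the single source of truth. How the listening side receives notifications depends on the database driver and is left out here.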

[0] https://github.com/innogames/serveradmin/pull/52

@RonjaPonja
Contributor Author

Emre agrees that this might be a good idea.
