-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
close django connection in REST API #4741
Conversation
The REST API has no need for the django connection, since it is only using the QueryBuilder so far. I've tested that even storing ORM entities still works inside the REST API, i.e. if needed the django connection is simply reestablished on the fly.
3a7cddd
to
81edfcf
Compare
331bf1b
to
74fb15a
Compare
@ltalirz do you think that this PR could be merged? I'm setting up the new aiida servers for Materials Cloud with |
thanks for the note @elsapassaro - these changes were a quick hack meant for trying things out but if they help for materials cloud perhaps we can figure out a way to get them in. I've just updated the description of the PR to explain the reasoning here in some detail. The reason these changes should not be merged as-is is that they can affect regular AiiDA operation:
I propose to add a What is your perspective @sphuber ? |
If this change can be made without affecting normal use of AiiDA, by making it configurable (preferably through the REST API config, instead of the global one, if this is purely for the REST API) then we could add this. It might be worth having a look at #5054 as a potential alternative though. If I understand correctly, the reason of the problems is that you are using a single REST API process per profile and so are running out of connections, but you are only doing that since it is currently not possible to switch profiles. PR #5054 adds that functionality and so would remove your original problem therefore not needing this workaround. @elsapassaro there are already other people testing that branch and if that works we will try to include it with the next release. If you could also test this with the MC that would help speed up the process. |
I guess it would in principle be possible to limit the scope to the REST API but I believe it would require some changes since the explicit creation of the postgresql connections currently does not happen inside the REST API code.
Fully agree that if #5054 turns out to be performant enough that would be great and would remove the need for this PR. In this PR (#4741) we just talking about opening/closing a postgresql connection per request (<10ms as mentioned above); in #5054 we are talking about potentially switching profiles per request which is likely orders of magnitude slower. @elsapassaro Since testing #5054 requires a migration to the new file repository, which will take a long time for MC, I think we should start testing performance locally. Furthermore, it would be super useful to have a look at the request patterns to the AiiDA REST APIs on Materials Cloud.
|
Since we are getting rid of Django for v2.0, I am closing this for now. If you feel this needs a backport fix on |
Note: This is a draft meant for testing e.g. on Materials Cloud.
In regular operation, AiiDA keeps several connections to the postgresql database persistently open.
One set of connections (up to 5 by default) comes from the SQLA connection pool, which is present on both sqla and django profiles (since both use the QueryBuilder).
For django profiles, there is an additional persistent connection opened by django.
See here for a short summary.
The default connection limit of postgresql is 100; see here for why it is a good idea to keep the number of connections from exploding.
As long as AiiDA users don't use more than 100/6=16 daemon runners (that number would need to include AiiDA shells + REST APIs), they will not hit that limit in regular use.
For Materials Cloud, the situation is different, however.
Currently, Materials Cloud runs one independent AiiDA REST API process per profile, all of which connect to the same postgresql database, and some profiles include further AiiDA-based services that also make use of the REST API.
We've therefore hit the limit of 100 connections (and even more) some time ago.
The Materials Cloud use case is very different from "operational" AiiDA use: on Materials Cloud, database access is needed during user-initiated queries but there are long periods of inactivity between those (there are no running jobs that would access the database with high frequency).
The benefit from keeping postgresql connections open is therefore less relevant (as shown here, opening a postgresql connection on materials cloud takes ~6-10ms).
This PR
QueuePool
of sqla by aNullPool
that will release a connection as soon as it is no longer neededI've tested that even storing ORM entities still works inside the REST
API, i.e. if needed the django connection is simply reestablished on the
fly.