refresh site creds on file fetcher processes #613
0bb58ba to 1e64714
Since those Queues are based on pipes, we will pipe a lot of redundant data on each IPC, which may hurt performance because it all needs to be serialized / deserialized.
Also, I'm afraid we will have errors this way: tasks that are already queued will still use the old config.
Couldn't we signal those processes to reload their config from disk whenever our own config changes? Or use another IPC mechanism to send the new config?
Is this really a problem? I thought of the same myself, but the full config should be in the range of a few kilobytes, and the cost of just sending it all over does not seem especially high unless I'm missing some big inefficiency here.
Looks like a default config on a fresh service, once pickled, is in the 5 kB range. For minimal changes, we could use a separate queue for config update events: checking it without blocking means we could update the config at any time, and all subsequent tasks would use the new config.
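A minimal sketch of the separate-queue idea, assuming each worker owns a task queue plus a second queue that carries only config updates. The function names and the `None` stop sentinel are hypothetical, not pghoard's actual API. The worker drains the update queue with a non-blocking `get_nowait()` before handling each task, so any task picked up after an update uses the new config.

```python
import queue


def latest_config(config_updates, current_config):
    """Drain pending config updates without blocking and return the newest one."""
    while True:
        try:
            # get_nowait() raises queue.Empty for both queue.Queue and
            # multiprocessing.Queue when nothing is waiting.
            current_config = config_updates.get_nowait()
        except queue.Empty:
            return current_config


def worker_loop(tasks, config_updates, config, handle_task):
    """Process tasks until a None sentinel, refreshing the config before each task."""
    while True:
        task = tasks.get()
        if task is None:  # hypothetical stop sentinel
            break
        config = latest_config(config_updates, config)
        handle_task(task, config)
```

The same loop works with `multiprocessing.Queue` objects in place of `queue.Queue`; tasks keep their small payload and the config only crosses the pipe when it actually changes.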
The acute problem being fixed here is that pghoard does not refresh keys at all, so key rotation cannot be completed without restarting the application. For that particular problem it is irrelevant if the old key is used a bit longer, as there is a long grace period of key inactivity anyway before the old key is disabled. For cases where the storage location actually changes, this could be more relevant, though in that case too the exact point when data starts being read from / written to the other location is arbitrary, and touching the queued events does not feel like it would have a big impact one way or the other. A separate queue would work, but I'd check the actual performance impact first. My guess is that the overhead is in the low single-digit milliseconds, which would probably be acceptable.
I'm fine with both of those points, as long as they are considered :-)
Is there a plan to work on this? The problem is still very valid.
Hi! Sorry, I had to pause this task for a bit. I agreed with @rdunklau that I'll measure how much this affects restoration; I will give it some priority.
1e64714 to 137f1cf
@rdunklau I did multiple test runs and measured restoration times. I saw no major difference after including these changes. DB sizes considered (MB): 100, 300, 600, 1000. The dataset was not super big, but I don't think a larger one would show a bigger impact.
It was unlikely that the performance would be so much worse that it would be visible unless you used a very heavy stress test, but in this case a synthetic test should be quite sufficient, because you can easily simulate the config being passed as part of the tasks or not being passed there. For example the following test app should work:
This consistently gives me 0.11 ms processing time per task when not passing the full config and 0.13 ms when passing it. So the overhead isn't even whole milliseconds but rather 0.02 milliseconds, which is completely negligible given that the actual task processing is way heavier than the 0.11 ms no-op time. @rdunklau does this validation seem sufficient to you?
Sorry I missed the previous comments.
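A minimal sketch of this kind of synthetic comparison; this is not the test app referred to above. It assumes a single process that pushes each task through a `multiprocessing.Pipe` and reads it straight back, which exercises pickling and the pipe but not a real worker. The config contents, task fields and task count are hypothetical stand-ins.

```python
import multiprocessing
import time


def per_task_ms(task, n_tasks=10_000):
    """Average time in ms to pickle a task into a pipe and read it back."""
    receiver, sender = multiprocessing.Pipe(duplex=False)
    try:
        start = time.perf_counter()
        for _ in range(n_tasks):
            sender.send(task)  # pickles the task and writes it to the pipe
            receiver.recv()    # reads it back and unpickles it
        elapsed = time.perf_counter() - start
    finally:
        sender.close()
        receiver.close()
    return elapsed * 1000 / n_tasks


# Hypothetical stand-in for a config of a few kilobytes once pickled.
fake_config = {f"setting_{i}": f"value_{i:040d}" for i in range(100)}
bare_task = {"type": "DOWNLOAD", "key": "some/object/key"}
task_with_config = dict(bare_task, config=fake_config)

if __name__ == "__main__":
    print(f"without config: {per_task_ms(bare_task):.4f} ms/task")
    print(f"with config:    {per_task_ms(task_with_config):.4f} ms/task")
```

Absolute numbers depend on the machine; the difference between the two lines is the figure of interest.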
About this change - What it does
pghoard dispatches processes in charge of fetching files from sites. When starting such processes, pghoard provides its config as an argument. This means that if pghoard gets reloaded with a different config (e.g. the object storage got new credentials), the running file fetcher processes won't pick this up and will keep using the old config.
To change this behavior, it's better to provide the current config with each task instead of to the process itself. This way the process can update its transfer to the object storage if the site's config changed.
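A minimal sketch of how a fetcher process might react to the config arriving with each task, assuming a task is a dict with `site` and `config` entries and that the storage settings live under `backup_sites -> <site> -> object_storage`. The helper names and the `make_transfer` factory are hypothetical, not the actual pghoard code. The transfer is cached per site and rebuilt only when that site's storage config differs from the one it was built with.

```python
def get_transfer(cache, site, site_config, make_transfer):
    """Return the cached transfer for a site, rebuilding it if its config changed."""
    cached = cache.get(site)
    if cached is None or cached[0] != site_config:
        # First use of this site, or its credentials/location changed:
        # build a fresh transfer with the config from the current task.
        cache[site] = (site_config, make_transfer(site_config))
    return cache[site][1]


def handle_task(task, cache, make_transfer):
    """Pick the transfer for a task using the config carried by that task."""
    site = task["site"]
    # Assumed config layout; adjust to the real structure.
    site_config = task["config"]["backup_sites"][site]["object_storage"]
    return get_transfer(cache, site, site_config, make_transfer)
```

Comparing against the config the transfer was built with keeps the common case cheap: when nothing changed, the existing transfer is reused and no new connection is set up.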
Resolves: #BF-2385