fix discovery overwriting dynamic resources #44316
There are no plans to extend optimistic locking to presence-related resources. Conditional writes are more expensive than plain writes and, at scale, can take down a backend (and have). Optimistic locking is intended only for cluster configuration (roles, users, auth preference, auth connectors, etc.) that humans may accidentally update simultaneously.
How come we opted not to proceed with your suggestion from #29828? I think the Create, Get, Update sequence introduced here could pose other issues since none of the operations are atomic and we may be operating on outdated state by the time the Get on creation failure occurs.
I opted not to do that because the reconciler would have to be changed to avoid deleting existing resources that are in a different discovery_group or have a different origin. That behavior seemed too specific to put in reconciler code, and it would affect a lot of things that use the reconciler. Even if I had done that, we would still be operating on potentially outdated state, right?
When the resource already exists, check its origin as well as discovery group. If it's not of discovery origin, then don't update it. If it's not in the same discovery group, and its discovery group is not blank, then don't update it.
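The rule above can be sketched as a small predicate. This is only an illustration: the `resource` struct, the `originDiscovered` value, and the `canOverwrite` name are hypothetical stand-ins, not Teleport's actual types, labels, or functions.

```go
package main

import "fmt"

// originDiscovered is a placeholder value for "this resource was created by
// discovery". The real label value used by Teleport is not shown here.
const originDiscovered = "discovered"

// resource is a hypothetical, simplified view of a db/kube_cluster/app resource.
type resource struct {
	Name           string
	Origin         string
	DiscoveryGroup string
}

// canOverwrite reports whether a discovery service running in the given
// discovery group may update an already existing resource.
func canOverwrite(existing resource, group string) bool {
	// Resources that were not created by discovery are never updated.
	if existing.Origin != originDiscovered {
		return false
	}
	// A non-blank discovery group that differs from ours belongs to someone else.
	if existing.DiscoveryGroup != "" && existing.DiscoveryGroup != group {
		return false
	}
	return true
}

func main() {
	dynamic := resource{Name: "db1", Origin: "dynamic"}
	otherGroup := resource{Name: "db1", Origin: originDiscovered, DiscoveryGroup: "g2"}
	blankGroup := resource{Name: "db1", Origin: originDiscovered}

	fmt.Println(canOverwrite(dynamic, "g1"))
	fmt.Println(canOverwrite(otherGroup, "g1"))
	fmt.Println(canOverwrite(blankGroup, "g1"))
}
```

Under these assumptions, only the resource with discovery origin and a blank or matching discovery group is eligible for an update.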
f1dd572 to 1190fed
@GavinFrazar See the table below for backport results.
Example error if someone stumbles upon this with v14 or earlier
Solution: upgrade to at least v15.4.12
Changelog: Prevent discovery service from overwriting Teleport dynamic resources that have the same name as discovered resources.
Closes #29828
This fix relies on checking the existing resource to see whether we can update it. Once optimistic locking is added for db/kube_cluster/app, we can do a conditional update instead. It's very unlikely that a user will update the resource during such a small time window anyway. I also consider conditional updates out of scope for fixing the linked issue, because that is a fundamentally different problem.
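A minimal sketch of the create, get, then update flow discussed in this thread, using an in-memory map in place of the real backend. Every name here (`store`, `upsertDiscovered`, `errAlreadyExists`, the `resource` fields) is hypothetical and chosen for illustration only. The three steps are not atomic, so the gap between the get and the update is the small window mentioned above.

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyExists is a hypothetical stand-in for the backend's "already exists" error.
var errAlreadyExists = errors.New("already exists")

// resource is a hypothetical, simplified resource.
type resource struct {
	Name, Origin, DiscoveryGroup, URI string
}

// store is an in-memory stand-in for the backend.
type store struct {
	items map[string]resource
}

func (s *store) create(r resource) error {
	if _, ok := s.items[r.Name]; ok {
		return errAlreadyExists
	}
	s.items[r.Name] = r
	return nil
}

func (s *store) get(name string) (resource, bool) {
	r, ok := s.items[name]
	return r, ok
}

func (s *store) update(r resource) { s.items[r.Name] = r }

// upsertDiscovered tries to create r. If a resource with that name already
// exists, it is updated only when it has the same origin as r and a blank or
// matching discovery group. It reports whether r was written.
func upsertDiscovered(s *store, r resource) (bool, error) {
	err := s.create(r)
	if err == nil {
		return true, nil
	}
	if !errors.Is(err, errAlreadyExists) {
		return false, err
	}
	existing, ok := s.get(r.Name)
	if !ok {
		// Deleted between create and get; leave it for the next pass.
		return false, nil
	}
	if existing.Origin != r.Origin {
		return false, nil
	}
	if existing.DiscoveryGroup != "" && existing.DiscoveryGroup != r.DiscoveryGroup {
		return false, nil
	}
	// Plain (unconditional) write: a concurrent change made after the get is lost.
	s.update(r)
	return true, nil
}

func main() {
	s := &store{items: map[string]resource{
		"db1": {Name: "db1", Origin: "dynamic", URI: "user-defined:5432"},
	}}
	discovered := resource{Name: "db1", Origin: "discovered", DiscoveryGroup: "g1", URI: "cloud:5432"}

	written, err := upsertDiscovered(s, discovered)
	fmt.Println(written, err, s.items["db1"].URI)
}
```

In this sketch the user-defined `db1` is left untouched because its origin differs from that of the discovered resource with the same name.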