-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: don't log at error level in lease timeout #2151
Conversation
a0dfc60
to
cfd52fc
Compare
backend/controller/leader/leader.go
Outdated
return false | ||
} | ||
return true | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like it'd be neater to do if !ok
and then return early
common/configuration/asm_follower.go
Outdated
@@ -51,7 +53,14 @@ func (f *asmFollower) sync(ctx context.Context, values *xsync.MapOf[Ref, SyncedV | |||
IncludeValues: &includeValues, | |||
})) | |||
if err != nil { | |||
return fmt.Errorf("error getting secrets list from leader: %w", err) | |||
if f.errorFilter.ReportLeaseError() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think this is too general, like was mentioned on this morning's call. What @alecthomas said about responses that include errors vs network errors feels like a good way to differentiate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have an example somewhere of how to determine the type of error? The library docs mention connect.Error but that does not seem to match what is in the datadog logs.
I never thought I would miss Java checked exceptions, but they do make situations like this easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not useful for this case, but here's an example of checking connect errors: https://github.com/TBD54566975/ftl/blob/main/backend/controller/ingress/handler.go#L55-L62
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I have updated it to use the return codes to decide on if it should attempt to filter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a good direction @stuartwdouglas as this does feel like something that would be useful for more things that end up using leader/follower.
I think you mentioned on the call this morning that you thought it might be useful for anything that used leases? I'm less sure about that but maybe I havent thought of use cases, or maybe I'm misremembering. Either way my feel is what you've done in the code here is the right place, but I'm interested to hear if you had other ideas?
726d3ef
to
8dd27a1
Compare
We don't want error alerts if a controller is just failing over, this PR adds some infrasturcture to allow for a lower level of error reporting when the errors all occur within the lease TTL. fixes #2133
8dd27a1
to
dc6903d
Compare
@matt2e are you happy with this now? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏 looks good
We don't want error alerts if a controller is just failing over, this PR adds some infrasturcture to allow for a lower level of error reporting when the errors all occur within the lease TTL.
fixes #2133