-
Notifications
You must be signed in to change notification settings - Fork 214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add renewal for scanner to prevent lease timeout #277
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't normally work on this, so just a few general nits and a request to run gofmt
ef48bd3
to
8a503e4
Compare
67a91e0
to
8df1eda
Compare
scanner.go
Outdated
panic("Renew timer was not stopped before starting again") | ||
} | ||
if s.rpc.RenewInterval() > 0 { | ||
s.renewTimer = time.AfterFunc(s.rpc.RenewInterval(), s.renewScanner) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A different way to do this would be to start a goroutine that sleeps on a ticker and context.
renewCtx, cancel := context.WithCancel(s.Context())
s.renewCancel = cancel
go s.renewLoop(renewCtx)
And renewLoop
would look like:
func (s *scanner) renewLoop() {
t := time.Ticker(s.rpc.RenewInterval()
defer t.Stop()
for {
select {
case <-t.C:
if err := s.renew(); err != nil {
return
}
case <-ctx.Done():
return
}
}
}
The advantage here is that the renewLoop routine will clean itself up immediately after the context is canceled and doesn't have to wait for the timer to go off to notice the context is done. And I think it's slightly simpler to have a loop rather than renewScanner()
needing to start a new renew timer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another advantage here is that it provides a place for us to add a metric to track the number of live renewals. It can be done in a follow on change, but we will want to add a metric that increments when this function starts and decrements when it exits. That way we can track if we are leaking scanners.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this definitely cleans up the code and makes it easier to add metrics in the future. I've implemented this, feel free to take a look!
Just my 2c: Also, please make sure to not have this behavior by default as it might be unexpected for some users, as now their "periods of starvation" in client layer will cause leaked snanners within regionservers. |
Hey Andrey! |
Hey Aaron! |
0289f9e
to
e00cce6
Compare
e00cce6
to
440f87c
Compare
This change adds an option to Scan (Renew) that allows for renewal of scanners during periods of starvation to prevent lease timeout. Currently, we are renewing every lease timeout / 2 seconds (10s currently).