
Duplicated tokens when using timeouts #10

Open
dv opened this issue Apr 3, 2013 · 3 comments

Comments

dv (Owner) commented Apr 3, 2013

There's a situation in which tokens can get pushed onto the available list twice when using stale client timeouts.

The release_stale_locks method can take quite a bit of time. If a token has timed out but the original locking process is actually still running, and release_stale_locks runs while that process finishes, both the original process and release_stale_locks will push the token onto the available list.
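
To make the interleaving concrete, here is a minimal sketch of the race using redis-rb directly. The key names and the list/hash layout below are illustrative assumptions, not the gem's actual internals; the point is only the ordering of the two pushes.

require "redis"

redis = Redis.new(:host => "localhost")
redis.del("sem:available", "sem:grabbed")
redis.lpush("sem:available", "0")          # a semaphore with one token

# The holder takes the token and records when it grabbed it.
token = redis.lpop("sem:available")
redis.hset("sem:grabbed", token, Time.now.to_f)

# ... the holder keeps working past the stale timeout ...

# The sweeper (release_stale_locks) decides the token is stale and returns it.
redis.hdel("sem:grabbed", token)
redis.lpush("sem:available", token)        # first push

# The holder finally finishes and signals, unaware of the cleanup.
redis.lpush("sem:available", token)        # second push: duplicated token

puts redis.llen("sem:available")           # => 2 instead of 1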

FooBarWidget commented

This can be reproduced with a much simpler case.

In client 1:

require "redis-semaphore"

sem = Redis::Semaphore.new(:semaphore_name, :connection => "localhost", :stale_client_timeout => 10)
sem.lock { STDIN.readline }

In client 2:

require "redis-semaphore"

sem = Redis::Semaphore.new(:semaphore_name, :connection => "localhost", :stale_client_timeout => 10)
sem.lock { puts "hi" }

Wait 10 seconds, then start client 3:

require "redis-semaphore"

sem = Redis::Semaphore.new(:semaphore_name, :connection => "localhost", :stale_client_timeout => 10)
sem.release_stale_locks!

Now press Enter in client 1. It will push an extra token to the list. Your semaphore's value has now changed from 1 to 2.

There should be some way for #signal to detect that its lock was forcefully cleaned up. Perhaps you can do this with a key that stores the time of the last cleanup. #signal should then compare that cleanup time against the time it acquired the lock. If the cleanup time is later, a cleanup occurred while the lock was held, and #signal should not push the token back onto the list.
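
A rough sketch of that idea, again using redis-rb directly; the key names and method shapes here are hypothetical, not the gem's API:

require "redis"

# Sweeper: record when the cleanup ran before returning stale tokens.
def release_stale_locks!(redis)
  redis.set("sem:last_cleanup", Time.now.to_f)
  # ... scan the grabbed tokens and push the stale ones back, as before ...
end

# Holder: refuse to push the token back if a cleanup ran after we locked.
def signal(redis, token, locked_at)
  last_cleanup = redis.get("sem:last_cleanup").to_f   # 0.0 if never cleaned
  return if last_cleanup > locked_at                  # token already returned
  redis.lpush("sem:available", token)
end

Note that the check and the push in #signal would themselves have to be atomic (for example via WATCH/MULTI or a Lua script) to be fully race-free; otherwise a cleanup can still slip in between the GET and the LPUSH.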

stephankaag commented

Anyone?

mikeryz-rosie (Contributor) commented

I just had a lot of fun with this particular failure case - it definitely reinforces the need to make sure your timeout is longer than the expected runtime.

I would enjoy writing a solution to this, but the question to @dv is whether the added complexity is really worth it (i.e. whether it would get merged in).
