Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IOSimPOR fails to find a race under specific circumstances #183

Open
jasagredo opened this issue Oct 10, 2024 · 5 comments
Open

IOSimPOR fails to find a race under specific circumstances #183

jasagredo opened this issue Oct 10, 2024 · 5 comments
Labels
bug Something isn't working IOSimPOR Issues / PRs related to IOSimPOR

Comments

@jasagredo
Copy link
Contributor

jasagredo commented Oct 10, 2024

Describe the bug
This test case:

prop_race :: Property
prop_race = exploreSimTrace id action $ \_ trace ->
  case traceResult False trace of
    Left (FailureDeadlock err) -> counterexample (ppTrace trace) $ property False
    _ -> property True
  where
    action :: IOSim s ()
    action = do
      exploreRaces
      ref <- newMVar 0
      tq <- atomically newTQueue
      let f = do
            atomically $ writeTQueue tq ()
            bracketOnError
              (takeMVar ref)
              (tryPutMVar ref)
              (putMVar ref . (+1))
      t1 <- async f
      t2 <- async f
      async (cancel t1) >>= wait
      wait t2
      wait t1

will succeed. However removing the atomically $ writeTQueue tq () will make it fail because of a Deadlock. The deadlock is indeed present in the implementation: t1 takes, t1 puts, t1 receives exception, t2 takes, t1 tryPut succesfully, t2 blocks on put.

Writing to that tqueue before running the bracketOnError results in the race not being found by IOSimPOR. This looks like a direct bug.

Desktop (please complete the following information):

  • GHC version: 9.6.6
  • io-sim version: 1.6.0.0
  • io-classes version: 1.7.0.0

Additional context
Add any other context about the problem here. Attach io-sim or io-sim-por trace if possible.

A trace with the deadlock if removing the mentioned line:
fail.txt

@jasagredo jasagredo added the bug Something isn't working label Oct 10, 2024
@coot
Copy link
Collaborator

coot commented Oct 11, 2024

The attached trace contains:

      0s - Thread [].2      main - TxCommitted [] [TMVarId 2] Effect {  }

while the code snippet doesn't mentions any TMVars, is this a bug in the recent feature or have you used TMVars in some version?

@coot
Copy link
Collaborator

coot commented Oct 11, 2024

ok, the TMVars come from async calls.

@coot
Copy link
Collaborator

coot commented Oct 11, 2024

For me the property fails with a Deadlock when I comment out the writeTQueue action.

@coot coot added the IOSimPOR Issues / PRs related to IOSimPOR label Oct 11, 2024
@bolt12
Copy link
Contributor

bolt12 commented Oct 11, 2024

Yes, the problem is that the code with the writeTQueue still exhibits the race condition and IOSimPOR suddenly stops being able to find it

@bolt12
Copy link
Contributor

bolt12 commented Oct 16, 2024

After chatting with @rjmh this might be a symptom of our implementation of MVars in terms of STM might be too deterministic and hides races.

Here's a direct quote:

Getting the race detection and reversal right between STM transactions is already pretty complex. Needing to handle MVars as primitives too would add a lot of complexity to IOSimPOR internals. If there is an implementation in terms of TVars--which has the same races--then using it will be simpler. I suppose there might be specific race reversals that needn't be explored with MVars... say a takeMVar which blocks, followed by a putMVar which unblocks it... but which the TVar implementation might classify as races anyway. If there are such cases, and they can't be fixed by rewriting the implementation-in-terms-of-TVars, then it might be worth making them atomic I suppose.

The present implementation keeps a list of blocked threads in the MVar (or rather, TVars which, when written, unblock a thread). But this means that even a takeMVar which blocks, or a putMVar which blocks, is implemented using a write to the TVar. That means that two blocking takeVars, for example, will be considered to race, because they write to the same TVar. Better would be for takeMVar to use STM retry to block the caller if the MVar is empty, thus turning blocking takeMVar calls (and similarly, blocking putMVar calls) into read-only transactions, that at least would not race with each other.

Although this situation is not ideal, i.e. IOSimPOR should still be able to find the races, this might be not a bug in IOSimPOR itself but rather a consequence of the MVar implementation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working IOSimPOR Issues / PRs related to IOSimPOR
Projects
None yet
Development

No branches or pull requests

3 participants