This repository has been archived by the owner on Dec 17, 2018. It is now read-only.

KayVee can hang on shutdown #37

Open
allengeorge opened this issue Feb 14, 2014 · 3 comments

@allengeorge
Owner

Once in a blue moon, KayVee appears to hang on shutdown. The problem has been traced to the underlying Netty NioWorkerPool failing to shut down cleanly: it appears to be waiting for a CountDownLatch to reach 0, a condition that, for some reason, never happens.

The full stack is at: KayVee 0.1.1 Shutdown Hang Stack

@allengeorge allengeorge self-assigned this Feb 14, 2014
@allengeorge
Owner Author

Another source of the hang: deadlock on KayVee 0.1.1 shutdown

@allengeorge
Owner Author

This is caused by a System.exit() call within RaftAlgorithm, which can trigger a deadlock. See: http://docs.oracle.com/javase/7/docs/api/java/lang/Runtime.html#exit(int). The fix is to either:

  1. Detect that I'm in the middle of a shutdown and not run System.exit()
  2. Avoid System.exit() entirely
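
A minimal sketch of option 1, assuming a shared flag that the normal stop path sets before it starts waiting on anything. `ShutdownGuard`, `markShuttingDown` and `exitUnlessShuttingDown` are hypothetical names, not part of KayVee, and the exit action is injected so the guard can be exercised without killing the JVM. Note that this only covers the case where a shutdown is already under way when the exit is requested.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper (not KayVee code): skips the exit call once a
// shutdown has been flagged as in progress.
final class ShutdownGuard {
    private final AtomicBoolean shuttingDown = new AtomicBoolean(false);
    private final Runnable exitAction;

    // In production the action might be () -> System.exit(1);
    // tests can pass a counter instead.
    ShutdownGuard(Runnable exitAction) {
        this.exitAction = exitAction;
    }

    // Call this at the start of the regular stop path.
    void markShuttingDown() {
        shuttingDown.set(true);
    }

    boolean isShuttingDown() {
        return shuttingDown.get();
    }

    // Call this where the code would otherwise call System.exit() directly.
    // Returns true if the exit action ran, false if it was skipped.
    boolean exitUnlessShuttingDown() {
        if (shuttingDown.get()) {
            return false;
        }
        exitAction.run();
        return true;
    }
}
```

The check and the exit are not atomic, so a shutdown that begins between the two can still slip through; this is a sketch of the idea rather than a complete fix.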

@allengeorge allengeorge added this to the 0.2.0 Release milestone Feb 25, 2014
@allengeorge
Owner Author

OK. I think I have the exact cause here.

If I call System.exit() in a Netty I/O thread, the following happens:

  1. The thread locks the Shutdown.class object
  2. The Jetty ShutdownThread is invoked, which starts running the shutdown tasks we've registered
  3. One of the shutdown tasks is RaftAgent.stop(), which waits for all I/O threads to complete

And... deadlock. The Netty I/O thread is waiting for all the shutdown tasks to run, but they can't complete, because one of those tasks is waiting for that same I/O thread to shut down.
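
The cycle above can be reproduced in miniature with plain threads. This is a simulation under stated assumptions, not KayVee or Netty code: `ShutdownDeadlockSketch` and its methods are hypothetical names, a `CountDownLatch` stands in for "the I/O thread has finished", and the wait is bounded by a timeout so the sketch terminates instead of hanging the way a real `System.exit()` would. The second method shows one possible workaround (handing the exit off to a separate thread), which is my assumption and not something proposed in this thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class ShutdownDeadlockSketch {

    // Stand-in for a shutdown task such as RaftAgent.stop(): waits for the
    // I/O thread to finish. The timeout exists only so the sketch ends.
    private static boolean awaitIoThread(CountDownLatch ioThreadDone, long timeoutMillis) {
        try {
            return ioThreadDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Simulates calling System.exit() on the I/O thread itself: the shutdown
    // task runs on the very thread it is waiting for, so the wait times out.
    static boolean exitOnIoThread(long timeoutMillis) {
        CountDownLatch ioThreadDone = new CountDownLatch(1);
        boolean[] stopped = new boolean[1];
        Thread ioThread = new Thread(() -> {
            try {
                stopped[0] = awaitIoThread(ioThreadDone, timeoutMillis);
            } finally {
                ioThreadDone.countDown(); // only reached after the wait gives up
            }
        }, "simulated-io-thread");
        ioThread.start();
        join(ioThread);
        return stopped[0];
    }

    // Simulates handing the exit off to another thread: the I/O thread
    // returns right away, so the shutdown task sees it finish.
    static boolean exitOnSeparateThread(long timeoutMillis) {
        CountDownLatch ioThreadDone = new CountDownLatch(1);
        boolean[] stopped = new boolean[1];
        Thread exitThread = new Thread(
                () -> stopped[0] = awaitIoThread(ioThreadDone, timeoutMillis),
                "simulated-exit-thread");
        Thread ioThread = new Thread(() -> {
            try {
                exitThread.start();
            } finally {
                ioThreadDone.countDown();
            }
        }, "simulated-io-thread");
        ioThread.start();
        join(ioThread);
        join(exitThread);
        return stopped[0];
    }
}
```

In this simulation `exitOnIoThread` reports that the shutdown task never saw the I/O thread finish, while `exitOnSeparateThread` reports that it did. With a real `System.exit()` there is no timeout, so the first case hangs indefinitely.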
