Skip to content
This repository has been archived by the owner on Dec 17, 2018. It is now read-only.

Observing transient timeouts during KayVee distributed-store healthcheck #14

Closed
allengeorge opened this issue Nov 21, 2013 · 2 comments
Closed
Assignees
Milestone

Comments

@allengeorge
Copy link
Owner

The KayVee distributed-store health-check has a command-commit timeout of 500ms. This should be more than enough to get consensus and commit log entries to disk for a 3-machine cluster. I notice, however, transient timeouts in running a health-check loop (1 every second). There are sometimes spikes when the check is being run against the leader and the leader has to deal with a faulty machine (see #11).

@ghost ghost assigned allengeorge Nov 21, 2013
@allengeorge
Copy link
Owner Author

This may be due to transient load spikes on my local machine. I notice that my dev machine experiences multi-second pauses (presumably paging something to/from disk) during day-to-day work.

@allengeorge allengeorge added this to the 0.2.1 Release milestone Mar 25, 2014
@allengeorge
Copy link
Owner Author

I can't reproduce this with the set of long-term low-load soak tests I've run on GCE. I'm going to chalk it up to running n nodes on my iMac with lowish amounts of RAM.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

1 participant