occasional ETIMEDOUT errors #35

Closed
kevinburkeshyp opened this issue Mar 31, 2015 · 6 comments

@kevinburkeshyp

Occasionally we see ETIMEDOUT errors with a very vague stack trace:

Error: connect ETIMEDOUT
at errnoException (net.js:904:11)
at Object.afterConnect [as oncomplete] (net.js:895:19)
{ [Error: connect ETIMEDOUT] code: 'ETIMEDOUT', errno: 'ETIMEDOUT', syscall: 'connect' }

I am pretty sure these are coming from librato-node for the following reasons:

  • we disabled Librato after the major service outage and didn't see any of these errors for as long as it was turned off
  • now that it's back on and we've added error handling, we're still seeing ETIMEDOUT errors, but they no longer crash our app

Do you see this on your end? It makes me wonder what the timeout is being set to (or how long it's taking to fail) since the client doesn't set it.

It would be super swell to do fancier stuff like setting separate connect/request timeouts and/or failing over to a different DNS entry when a connect fails, but my guess is that request doesn't support those features.
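
Just to illustrate what I mean (this is not librato-node code, and the endpoint and numbers are made up), here's roughly what separate connect vs. response timeouts would look like with Node's core https module:

```js
// Illustration only: a connect timeout that is separate from a response timeout.
var https = require('https');

var req = https.request({
  host: 'metrics-api.librato.com', // illustrative host
  path: '/v1/metrics',
  method: 'POST'
});

req.on('socket', function (socket) {
  socket.setTimeout(5000); // connect timeout: give the TCP handshake 5 seconds
  socket.on('timeout', function () { req.abort(); });
  socket.on('connect', function () {
    socket.setTimeout(30000); // once connected, allow 30 seconds of idle for the response
  });
});

req.on('error', function (err) {
  // network errors like ETIMEDOUT surface here instead of as uncaught exceptions
  console.error('librato flush failed:', err.code);
});

req.end();
```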

@bobzoller
Contributor

from https://github.com/request/request/tree/v2.45.0

timeout - Integer containing the number of milliseconds to wait for a request to respond before aborting the request

Grokking the code, request sets up both a global setTimeout and a res.setTimeout, but only if you pass it a timeout option.
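
For example (the URL and numbers here are just illustrative), passing timeout arms both of those timers:

```js
var request = require('request');

// with `timeout` set, both a slow connect and a slow response fail fast
request.post({
  url: 'https://metrics-api.librato.com/v1/metrics', // illustrative URL
  json: {gauges: []},
  timeout: 10000 // milliseconds
}, function (err, res, body) {
  if (err) {
    console.error('librato flush failed:', err.code); // e.g. ETIMEDOUT
  }
});
```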

this is also related to #32 and #33


What if we accepted a requestOptions parameter on librato-node.configure and passed it through to the underlying request calls? That way you could specify a timeout, proxy info, whatever.
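
Rough sketch of what that could look like (requestOptions is just the proposed name, nothing librato-node supports yet; assuming the usual email/token configure setup):

```js
var librato = require('librato-node');

librato.configure({
  email: '[email protected]',
  token: 'your-api-token',
  // proposed: passed straight through to request on every flush
  requestOptions: {
    timeout: 10000,
    proxy: 'http://proxy.example.com:3128'
  }
});
librato.start();
```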

Additionally, we could add an option to enable a configurable number of retries, if folks thought that was valuable.

@kevinburkeshyp
Author

That would be neat; my one worry about retries is that Librato might double-count the incoming metrics.

Oddly enough, we haven't seen this recently. I can't tell whether anything changed on our end or on Librato's end, because the stack trace gave me so little to go on in the first place.

@bobzoller
Contributor

Librato was down for an extended period around the time you opened this issue. We experienced similar things.

Good question about double-counting. In our own usage I doubt we'd enable retry, so I'll let that go for the time being. PR for requestOptions coming shortly.

@kevinburkeshyp
Author

I remember that, but for about two weeks after we introduced Librato we were seeing ETIMEDOUT errors maybe three times a day, sporadically, with no other stack trace attached.

@bobzoller
Contributor

Ah, got it. Yeah, I suppose transient network issues then...

@kevinburkeshyp
Author

Thanks for this!
