occasional ETIMEDOUT errors #35

Closed
kevinburkeshyp opened this issue Mar 31, 2015 · 6 comments

@kevinburkeshyp

Occasionally we see ETIMEDOUT errors with a very vague stack trace:

Error: connect ETIMEDOUT
at errnoException (net.js:904:11)
at Object.afterConnect [as oncomplete] (net.js:895:19)
{ [Error: connect ETIMEDOUT] code: 'ETIMEDOUT', errno: 'ETIMEDOUT', syscall: 'connect' }

I am pretty sure these are coming from librato-node for the following reasons:

  • we disabled Librato after the major service outage and didn't see any of these errors for as long as it was turned off
  • now that it's back on and we've added error handling, we're still seeing ETIMEDOUT errors, but they no longer crash our app

Do you see this on your end? It makes me wonder what the timeout is being set to (or how long it's taking to fail) since the client doesn't set it.

It would be super swell to do fancier stuff like setting separate connect/request timeouts and/or failing over to a different DNS entry when a connect fails, but my guess is that request doesn't support those features.
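
Just to illustrate what I mean (this is not librato-node code, and the endpoint and numbers are made up), here's roughly what separate connect vs. response timeouts would look like with Node's core https module:

```js
// Illustration only: a connect timeout that is separate from a response timeout.
var https = require('https');

var req = https.request({
  host: 'metrics-api.librato.com', // illustrative host
  path: '/v1/metrics',
  method: 'POST'
});

req.on('socket', function (socket) {
  socket.setTimeout(5000); // connect timeout: give the TCP handshake 5 seconds
  socket.on('timeout', function () { req.abort(); });
  socket.on('connect', function () {
    socket.setTimeout(30000); // once connected, allow 30 seconds of idle for the response
  });
});

req.on('error', function (err) {
  // network errors like ETIMEDOUT surface here instead of as uncaught exceptions
  console.error('librato flush failed:', err.code);
});

req.end();
```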

@bobzoller
Contributor

from https://github.com/request/request/tree/v2.45.0

timeout - Integer containing the number of milliseconds to wait for a request to respond before aborting the request

Grokking the code, request sets up both a global setTimeout and a res.setTimeout, but only if you pass it a timeout option.
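
For example (the URL and numbers here are just illustrative), passing timeout arms both of those timers:

```js
var request = require('request');

// with `timeout` set, both a slow connect and a slow response fail fast
request.post({
  url: 'https://metrics-api.librato.com/v1/metrics', // illustrative URL
  json: {gauges: []},
  timeout: 10000 // milliseconds
}, function (err, res, body) {
  if (err) {
    console.error('librato flush failed:', err.code); // e.g. ETIMEDOUT
  }
});
```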

this is also related to #32 and #33


What if we accepted a requestOptions parameter on librato-node.configure and passed it through to the underlying request calls? That way you could specify a timeout, proxy info, whatever.
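
Rough sketch of what that could look like (requestOptions is just the proposed name, nothing librato-node supports yet; assuming the usual email/token configure setup):

```js
var librato = require('librato-node');

librato.configure({
  email: '[email protected]',
  token: 'your-api-token',
  // proposed: passed straight through to request on every flush
  requestOptions: {
    timeout: 10000,
    proxy: 'http://proxy.example.com:3128'
  }
});
librato.start();
```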

Additionally, we could add an option to enable a configurable number of retries, if folks thought that was valuable.

@kevinburkeshyp
Author

That would be neat; my one worry about retries is that Librato might double-count the incoming metrics.

Oddly enough, we haven't seen this recently. I can't tell whether anything changed on our end or on Librato's end, because the stack trace gave me so little to go on in the first place.

@bobzoller
Contributor

Librato was down for an extended period around the time you opened this issue. We experienced similar things.

Good question about double-counting. In our own usage I doubt we'd enable retry, so I'll let that go for the time being. PR for requestOptions coming shortly.

@kevinburkeshyp
Author

I remember that, but for about two weeks after we introduced Librato we were seeing ETIMEDOUT errors maybe three times a day, sporadically, with no other stack trace attached.

@bobzoller
Contributor

Ah, got it. Yeah, I suppose transient network issues then...

@kevinburkeshyp
Author

Thanks for this!
