Reduce tooBusy false-negative #17

sauvainr · 2016-11-08T06:20:05Z

At my company we found that toobusy.js generates a large number of false positives (our settings are highWater = 80 and smoothingFactor = 1/5).
Even with this conservative factor, a sudden load increase on the server causes currentLag to jump from 0 to 200+ ms, and all subsequent requests are rejected even though the server has ample resources to handle them.

This commit proposes a fix for this situation:

1. Limit the maximum lag value:
   Since highWater * 2 == (100% too busy), we want to prevent the current lag from suddenly jumping to a state where all requests are rejected.
   Limiting the lag metric to highWater * 2 ensures a smooth and coherent increase of currentLag.
2. Invert the smoothing factor when decrementing the currentLag value:
   After a brief, isolated overload of the system, recovery should be fast to avoid false-positive rejections once the resources are available again.
   Inverting the smoothing factor when the measured lag is smaller than the current lag ensures full resource usage.
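
The two changes above can be sketched as a single update step. This is a minimal illustration, not the actual toobusy.js or fork source: the function name `updateLag` and its parameters are hypothetical, and it assumes that "inverting" the smoothing factor means using `1 - smoothingFactor` when the measured lag is below the current lag.

```javascript
// Hypothetical helper illustrating the proposal; not the library's real code.
// currentLag:      the smoothed lag value kept between measurements (ms)
// measuredLag:     the raw event-loop lag just observed (ms)
// highWater:       lag threshold above which requests start being rejected (ms)
// smoothingFactor: weight given to a new measurement, between 0 and 1
function updateLag(currentLag, measuredLag, highWater, smoothingFactor) {
  // 1. Clamp the raw measurement so one spike cannot push the
  //    smoothed value straight into the "reject everything" range.
  const lag = Math.min(measuredLag, highWater * 2);

  // 2. Assumption: when lag is falling, use the inverted factor
  //    (1 - smoothingFactor) so the smoothed value recovers quickly.
  const factor = lag < currentLag ? 1 - smoothingFactor : smoothingFactor;

  // Standard exponential smoothing with the chosen factor.
  return factor * lag + (1 - factor) * currentLag;
}

// Example with the settings mentioned above (highWater = 80, factor = 1/5):
const afterSpike = updateLag(0, 250, 80, 0.2);       // spike is clamped to 160 first
const afterRecovery = updateLag(afterSpike, 0, 80, 0.2); // falls with weight 0.8
console.log(afterSpike, afterRecovery);
```

With these settings, a 250 ms spike from an idle state moves the smoothed value to about 32 ms instead of 50 ms, and a single idle measurement afterwards removes roughly 80% of it instead of 20%.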


asilvas · 2016-11-08T16:07:52Z

I've used this pattern for quite some time, and agree it has its limitations. It's especially flaky with apps that can have bursts of CPU-intensive workloads, which easily spike lag times even during relatively low traffic. If your entire app/API is reliably lightweight, the toobusy pattern works quite well.

For more complex apps, I recommend concurrency monitoring: cap the number of concurrently processed requests and respond "too busy" if the threshold is exceeded. Alternatively, monitor the request/response timings averaged over the last X requests and respond "too busy" if the average exceeds Y.
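
As a rough sketch of the first option (capping concurrent requests), something like the following could wrap a plain Node `(req, res)` handler. The names `ConcurrencyGate` and `withConcurrencyLimit` are hypothetical and are not part of toobusy.js; the 503 status and the message are illustrative choices.

```javascript
// Hypothetical concurrency cap; not part of toobusy.js.
class ConcurrencyGate {
  constructor(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.active = 0; // requests currently being processed
  }

  // Returns true and takes a slot if one is free, false otherwise.
  tryEnter() {
    if (this.active >= this.maxConcurrent) return false;
    this.active += 1;
    return true;
  }

  // Gives a slot back; never drops below zero.
  leave() {
    if (this.active > 0) this.active -= 1;
  }
}

// Wraps a Node-style (req, res) handler so that requests beyond the
// cap are answered with 503 instead of being processed.
function withConcurrencyLimit(gate, handler) {
  return function limitedHandler(req, res) {
    if (!gate.tryEnter()) {
      res.statusCode = 503;
      res.end('Server too busy');
      return;
    }
    // Release the slot exactly once, whether the response finishes
    // normally ('finish') or the connection is torn down ('close').
    let released = false;
    const release = () => {
      if (!released) {
        released = true;
        gate.leave();
      }
    };
    res.once('finish', release);
    res.once('close', release);
    try {
      handler(req, res);
    } catch (err) {
      release(); // do not leak the slot if the handler throws synchronously
      throw err;
    }
  };
}
```

A handler wrapped this way can be passed to `http.createServer` as usual. Unlike event-loop lag, the count of in-flight requests does not spike because of one CPU-heavy request, which is the weakness described above; the right cap is application-specific and would need tuning.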

knoxcard · 2018-01-28T10:54:18Z

Sounds critical. Has this been merged in and pushed up?

sauvainr · 2018-01-29T08:52:04Z

Hi @knoxcard, as the thread didn't get any updates, I am still using our fork at my company -> https://github.com/exosite/node-toobusy. It has been in production for more than a year now and has provided the expected behavior for us.

@asilvas may want to have another look and decide whether he wants to integrate it or document the limitation.
