Health monitor
TempestaFW can monitor the health of backend servers in terms of HTTP availability. If the health monitor is enabled for a server and that server produces more "bad" responses (responses with undesirable HTTP statuses) than a given limit within a given time frame, it is excluded from the scheduling of client requests until new "good" responses (defined by separately specified conditions) are received from it. Limits, time frames and "bad" HTTP statuses are set via a special directive. The syntax is as follows:
server_failover_http <status> <count> <timeout>;
- <status> is the HTTP status (or wildcard pattern) of the response.
- <count> is the limit on the number of responses.
- <timeout> is the time frame in seconds.
This directive applies to all servers for which the monitor is enabled. The directive may be repeated to configure monitoring of several HTTP statuses.
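As a small illustration of repeating the directive, the sketch below tracks all 5xx statuses with a single wildcard rule and 404 with a separate rule. The numeric limits and time frames are arbitrary values assumed for the example, and the 5* pattern follows the wildcard form shown for health_stat later on this page:

```
# Assumed values: at most 100 responses with any 5xx status within 10 seconds.
server_failover_http 5* 100 10;
# Assumed values: at most 300 responses with status 404 within 15 seconds.
server_failover_http 404 300 15;
```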
The health monitor itself is configured via the following section:
health_check <name> {
request <req_string>;
request_url <url_string>;
resp_code <codes>;
resp_crc32 <crc32>;
timeout <timeout>;
}
- <name> is a unique identifier of the health monitor.
- <req_string> is a string containing the health monitoring request; the default value is "GET / HTTP/1.0\r\n\r\n".
- <url_string> is a string with a URL; client requests with this URL are used as health monitoring requests; the default value is "/".
- <codes> is a list of space-separated HTTP statuses.
- <crc32> is a hex number: the calculated CRC32 checksum of the expected response body. Together with the <codes> value, it defines the conditions for "good" responses, which signal that the server is alive. The keyword auto can be specified instead of a hex number; this means that no crc32 verification is required (the same as the absence of the resp_crc32 directive). The CRC32 checksum can be generated with the Linux utility crc32, which is part of the libarchive-zip-perl package.
- <timeout> is a timeout in seconds after which a new health monitoring request (specified in the request directive) is sent to the backend server, if there were no client requests satisfying the condition given in the request_url directive.
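The parameters above can be combined into a monitor that relies on status codes alone. The following is a minimal sketch, not a recommended configuration: the monitor name h_codes_only, the accepted statuses and the timeout are hypothetical values chosen for illustration, and resp_crc32 is omitted so that no checksum verification takes place:

```
health_check h_codes_only {
    # Probe request sent when no matching client request was seen in time.
    request "GET / HTTP/1.0\r\n\r\n";
    # Client requests with this URL also count as health monitoring requests.
    request_url "/";
    # Space-separated list: either status marks a response as "good".
    resp_code 200 204;
    # Send a new probe after 15 seconds (assumed value).
    timeout 15;
}
```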
The administrator must configure either resp_code or resp_crc32 (or both directives) with explicit values (not auto). The health_check section may be repeated to configure several health monitors in TempestaFW. A default health monitor named auto is always present in TempestaFW. Its configuration is given below:
health_check auto {
request "GET / HTTP/1.0\r\n\r\n";
request_url "/";
resp_code 200;
resp_crc32 auto;
timeout 10;
}
The auto monitor can be explicitly redefined (with the name auto) by the administrator with custom settings; in this case the default auto monitor is not created. Note also that the keyword auto in the resp_crc32 directive has a special meaning for the auto monitor (implicitly or explicitly defined): the crc32 value is generated on the fly from the first received response and used to verify the crc32 values of subsequent responses.
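For instance, an explicit redefinition of the auto monitor might look like the sketch below. The /health path and the 5-second timeout are hypothetical values assumed for the example; only the section name auto and the directive names come from this page:

```
health_check auto {
    # Hypothetical probe endpoint; replace with a URL the backend really serves.
    request "GET /health HTTP/1.0\r\n\r\n";
    request_url "/health";
    resp_code 200;
    # For the auto monitor, the checksum is taken from the first received
    # response and used to verify subsequent responses.
    resp_crc32 auto;
    timeout 5;
}
```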
A health monitor is specified per server group (explicit or implicit); a monitor with a specific ID is enabled for each such group. This means that all server_failover_http directives and the health_check section with the corresponding ID are applied to the servers of such groups.
To specify a particular health monitor for a server group, use a special directive inside the srv_group section:
health <id>;
- <id> is the health monitor identifier.
The following example demonstrates how to apply the health monitor h_monitor1 to the server group main with monitoring of several HTTP statuses:
server_failover_http 404 300 15;
server_failover_http 500 300 10;
server_failover_http 502 100 5;
health_check h_monitor1 {
request "GET / HTTP/1.0\r\n\r\n";
request_url "/root/";
resp_code 200;
resp_crc32 0x71f21b41;
timeout 10;
}
srv_group main {
server 10.10.0.1:8080;
server 10.10.0.2:8080;
health h_monitor1;
}
health_stat <statuses>;
- <statuses> is a list of space-separated HTTP statuses (or wildcard patterns).
Example:
health_stat 400 5*;
The directive enables counting of the total number of responses from Tempesta for each specified HTTP status. The count includes responses directly from servers as well as from the cache. It is displayed in Performance statistics.
health_stat_server <statuses>;
- <statuses> is a list of space-separated HTTP statuses (or wildcard patterns).
Example:
health_stat_server 400 5*;
The directive enables counting of the total number of responses from servers for each specified HTTP status. Only responses directly from servers are considered; responses from the cache are ignored. The count is displayed in Servers statistics.
The 200 HTTP status is always monitored, regardless of whether it is specified in the directive.
Also note that enabling the server_failover_http directive automatically enables counting of responses from servers. Therefore, there is no need to enable health_stat_server for the HTTP statuses for which server_failover_http is enabled.
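The interplay of the three directives can be sketched as follows. The statuses and limits are arbitrary values assumed for illustration:

```
# Count 400 and all 5xx responses sent by Tempesta (from servers and cache).
health_stat 400 5*;
# Responses with status 502 from servers are counted because of this rule.
server_failover_http 502 100 5;
# 502 is therefore not repeated here; only additional statuses are listed.
health_stat_server 400 503;
```

In this sketch the server statistics would cover 200 (always monitored), 400, 502 and 503.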