Description of the problem including expected versus actual behavior:
Rally will stop retrying to collect stats telemetry once it has failed too many times.
At the time of the last stats collection attempt, the benchmark showed a steady and prolonged increase in average bulk indexing latency.
Rally recorded 0 bulk indexing failures, though indexing throughput dropped significantly.
Subsequent manual stats calls to the cluster were successful.
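For illustration, here is a minimal sketch of the behavior I would expect instead: a sampling loop that logs a failed sample, backs off, and keeps trying rather than giving up. The names `ResilientSampler`, `record_fn`, and `max_backoff` are hypothetical and not part of Rally, and this is not how `esrally.telemetry` is actually implemented.

```python
import logging
import threading

logger = logging.getLogger(__name__)


class ResilientSampler(threading.Thread):
    """Hypothetical sampler: calls record_fn periodically and never stops on failure."""

    def __init__(self, record_fn, interval=1.0, max_backoff=30.0):
        super().__init__(daemon=True)
        self.record_fn = record_fn      # callable that collects one sample
        self.interval = interval        # base sampling interval in seconds
        self.max_backoff = max_backoff  # upper bound for the wait between attempts
        self.failures = 0               # consecutive failed samples
        self._stop_event = threading.Event()

    def next_delay(self):
        # Double the wait for each consecutive failure, capped at max_backoff.
        exponent = min(self.failures, 10)
        return min(self.interval * (2 ** exponent), self.max_backoff)

    def sample_once(self):
        try:
            self.record_fn()
            self.failures = 0
        except Exception:
            # Log and continue; a transient 503 should not end sampling.
            self.failures += 1
            logger.exception("Could not record sample (consecutive failures: %d)", self.failures)

    def run(self):
        while not self._stop_event.is_set():
            self.sample_once()
            self._stop_event.wait(self.next_delay())

    def stop(self):
        self._stop_event.set()
```

With this shape, a burst of 503 responses only widens the gap between samples, and sampling returns to the base interval after the first successful call.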
Provide logs (if relevant):
2023-08-25 16:30:33,699 ActorAddr-(T|:45481)/PID:7942 esrally.telemetry ERROR Could not determine master node stats
Traceback (most recent call last):
  File "~/rally/esrally/telemetry.py", line 172, in run
    self.recorder.record()
  File "~/rally/esrally/telemetry.py", line 2249, in record
    info = self.client.nodes.info(node_id=state["master_node"], metric="os")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.11/site-packages/elasticsearch/_sync/client/utils.py", line 414, in wrapped
    return api(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.11/site-packages/elasticsearch/_sync/client/nodes.py", line 249, in info
    return self.perform_request(  # type: ignore[return-value]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.11/site-packages/elasticsearch/_sync/client/_base.py", line 390, in perform_request
    return self._client.perform_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/rally/esrally/client/synchronous.py", line 226, in perform_request
    raise HTTP_EXCEPTIONS.get(meta.status, ApiError)(message=message, meta=meta, body=resp_body)
elasticsearch.ApiError: ApiError(503, "{'ok': False, 'message': 'The requested resource is currently unavailable.'}")
The benchmark was using the default node-stats-sample-interval of 1s. One second seems aggressive, and I will try with a value of 10s. We might consider a new default.
Rally version (get with esrally --version): esrally 2.9.0.dev0 (git revision: 50ebcb68d9f09de545a1bfb217fc9840b97a367e)