Track retries in the crawler's stats #186

Open
curita opened this issue Apr 8, 2024 · 0 comments
Background

Retries issued by zyte_api.aio.retry.RetryFactory are somewhat hidden. They are logged as DEBUG messages (so they are not visible by default in new projects, which use LOG_LEVEL: INFO) and, I believe, only counted in aggregate under the scrapy-zyte-api/attempts stat. Only after all retries are exhausted are they logged as errors and tracked under scrapy-zyte-api/fatal_errors.
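As a workaround today, those DEBUG retry messages can be surfaced by lowering the log level in the project's Scrapy settings (this only makes the existing logs visible; it does not add any stats):

```python
# settings.py
# Lower the log level so the DEBUG messages emitted for each retry
# by zyte_api.aio.retry.RetryFactory become visible in the crawl log.
LOG_LEVEL = "DEBUG"
```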

scrapy-zyte-api/error_types/* also tracks the kinds of errors and how many of each we've seen, but that stat doesn't indicate which of them were retried.

Suggestion

Would it be possible to explicitly track in the stats the retries issued by the retry policy, broken down by error type? And likewise how many of those end up as fatal_errors, also broken down by error type? That way, we could see what's going on behind the scenes and use those stats to debug unusual behavior.

This could be especially helpful with custom retry policies, where we might not know which errors are retried unless we read the policy's code or try to infer it from the existing stats.
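A minimal sketch of what the proposed tracking could look like. The stat key names, the RetryStats helper, and the "429" error type below are all illustrative assumptions, not part of scrapy-zyte-api; RetryStats just mimics the inc_value interface of Scrapy's stats collector:

```python
from collections import Counter


class RetryStats:
    """Hypothetical stand-in for Scrapy's stats collector (inc_value API)."""

    def __init__(self):
        self.stats = Counter()

    def inc_value(self, key, count=1):
        self.stats[key] += count


def record_retry(stats, error_type):
    # Proposed: count every retry issued by the retry policy,
    # both in total and broken down by error type.
    stats.inc_value("scrapy-zyte-api/retries")
    stats.inc_value(f"scrapy-zyte-api/retries/error_types/{error_type}")


def record_fatal_error(stats, error_type):
    # Proposed: count requests whose retries were exhausted,
    # also broken down by error type.
    stats.inc_value("scrapy-zyte-api/fatal_errors")
    stats.inc_value(f"scrapy-zyte-api/fatal_errors/error_types/{error_type}")


# Example: a request retried twice on an (illustrative) "429" error,
# then finally given up on.
stats = RetryStats()
record_retry(stats, "429")
record_retry(stats, "429")
record_fatal_error(stats, "429")
print(stats.stats["scrapy-zyte-api/retries/error_types/429"])  # 2
print(stats.stats["scrapy-zyte-api/fatal_errors/error_types/429"])  # 1
```

With stats shaped like this, the ratio of retries to fatal errors per error type would be visible directly in the crawl's final stats dump, even for custom retry policies.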
