Track retries in the crawler's stats #186

Open
curita opened this issue Apr 8, 2024 · 0 comments
Background

Retries issued by zyte_api.aio.retry.RetryFactory are somewhat hidden. They are logged as DEBUG messages (so they are not visible by default in new projects, which use LOG_LEVEL: INFO) and, I believe, only counted in aggregate under the scrapy-zyte-api/attempts stat. Only after all retries are exhausted are they logged as errors and tracked under scrapy-zyte-api/fatal_errors.
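As a workaround today, those DEBUG retry messages can be surfaced by lowering the log level in the project's Scrapy settings (this only makes the existing logs visible; it does not add any stats):

```python
# settings.py
# Lower the log level so the DEBUG messages emitted for each retry
# by zyte_api.aio.retry.RetryFactory become visible in the crawl log.
LOG_LEVEL = "DEBUG"
```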

scrapy-zyte-api/error_types/* also tracks the kinds of errors and how many of each we've seen, but that stat doesn't indicate which of them were retried.

Suggestion

Would it be possible to explicitly track in the stats the retries issued by the retry policy, broken down by error type? And likewise how many of those end up as fatal_errors, also broken down by error type? That way, we could see what's going on behind the scenes and use those stats to debug unusual behavior.

This could be especially helpful with custom retry policies, where we might not know which errors are retried unless we read the policy's code or try to infer it from the existing stats.
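A minimal sketch of what the proposed tracking could look like. The stat key names, the RetryStats helper, and the "429" error type below are all illustrative assumptions, not part of scrapy-zyte-api; RetryStats just mimics the inc_value interface of Scrapy's stats collector:

```python
from collections import Counter


class RetryStats:
    """Hypothetical stand-in for Scrapy's stats collector (inc_value API)."""

    def __init__(self):
        self.stats = Counter()

    def inc_value(self, key, count=1):
        self.stats[key] += count


def record_retry(stats, error_type):
    # Proposed: count every retry issued by the retry policy,
    # both in total and broken down by error type.
    stats.inc_value("scrapy-zyte-api/retries")
    stats.inc_value(f"scrapy-zyte-api/retries/error_types/{error_type}")


def record_fatal_error(stats, error_type):
    # Proposed: count requests whose retries were exhausted,
    # also broken down by error type.
    stats.inc_value("scrapy-zyte-api/fatal_errors")
    stats.inc_value(f"scrapy-zyte-api/fatal_errors/error_types/{error_type}")


# Example: a request retried twice on an (illustrative) "429" error,
# then finally given up on.
stats = RetryStats()
record_retry(stats, "429")
record_retry(stats, "429")
record_fatal_error(stats, "429")
print(stats.stats["scrapy-zyte-api/retries/error_types/429"])  # 2
print(stats.stats["scrapy-zyte-api/fatal_errors/error_types/429"])  # 1
```

With stats shaped like this, the ratio of retries to fatal errors per error type would be visible directly in the crawl's final stats dump, even for custom retry policies.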
