fix(gardener): avoid issues caused by stale databases #1516
We use a database to store measurement results so that we can interrupt and resume the work, given that we have thousands of URLs to vet. However, after a week the database is most likely stale, so let's stop trusting it and start afresh in that case. It's reasonable to assume that the database is stale after such an interval because the status of the URLs may have changed (e.g., some URLs could have stopped working).
Additionally, every time we run `gardener sync` we most likely have new URLs, so we need to do one of the following:

1. merge new URLs into the database;
2. just zap the database and start afresh.
Because implementing option 1 would be somewhat time-consuming, for now I am opting for option 2; we'll implement option 1 if needed.
Closes ooni/probe#2684.