Cache "live" results from crates.io #48
Comments
Is the data you need cached locally somewhere already, e.g. either in the crates.io index itself (which can be consumed using the
Good question! Unfortunately it is not. We need the data about the crates.io publishers, which is present in neither of those places.
Aah, unfortunate. Perhaps it'd be worth opening an upstream issue to include that information in the index?
I don't think it's a good idea to include it in the index, actually. This info is not needed for most uses - that's why it's not in the index!
If we choose to use a granular cache, it makes sense to store it on-disk in JSON since it's basically a map and we already have a dependency on And we already have the cache directory created for storing the crates.io dump.
I'm not sure they ever made a conscious decision whether or not to include it in the index. It's a feature that was added to crates.io quite a while after the index was created. It's also (somewhat) low-cardinality data that would compress well. I think the nice part about having it in the index is that the index provides a timestamped/append-only(-ish) cryptographic(-ish, with the unfortunate problem of SHA-1 collisions) log, so including audit info would commit to that, as opposed to it potentially being retroactively modified by an attacker in the event of a crates.io compromise.
https://crates.io/crates/structsy sounds like a better way to store data on disk than JSON files. |
When downloading data via the crates.io API, we could cache it for later reuse. This would help if the user wants to run both the `crates` and `publishers` commands for their crate, or adjust the cargo-metadata parameters (e.g. target platform).

The timestamp of when the data was downloaded should be preserved; the cached data should be used only if the `--cache-max-age` configuration allows it.

If there are any cache entries with a timestamp from the future, they should be discarded.
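
To make the freshness rules above concrete, here is a minimal sketch using only the Rust standard library. The names `CacheEntry`, `is_usable`, and `discard_future_entries` are hypothetical and not taken from this project, and `max_age` is assumed to be whatever duration `--cache-max-age` resolves to. How entries are serialized to disk is left out.

```rust
use std::collections::HashMap;
use std::time::{Duration, SystemTime};

/// Hypothetical cache entry: the raw API response plus its download time.
pub struct CacheEntry {
    pub downloaded_at: SystemTime,
    pub body: String,
}

/// Returns true if the entry may be reused: its timestamp is not in the
/// future and its age does not exceed `max_age` (assumed to come from
/// `--cache-max-age`).
pub fn is_usable(entry: &CacheEntry, now: SystemTime, max_age: Duration) -> bool {
    match now.duration_since(entry.downloaded_at) {
        // `downloaded_at` is at or before `now`: compare the age to the limit.
        Ok(age) => age <= max_age,
        // `duration_since` fails when `downloaded_at` is later than `now`,
        // i.e. the timestamp is from the future.
        Err(_) => false,
    }
}

/// Removes every entry whose timestamp is later than `now`.
/// The map key is assumed to identify the cached request, e.g. a crate name.
pub fn discard_future_entries(cache: &mut HashMap<String, CacheEntry>, now: SystemTime) {
    cache.retain(|_, entry| entry.downloaded_at <= now);
}
```

Passing `now` in explicitly, rather than calling `SystemTime::now()` inside the functions, keeps the check deterministic and easy to test. Stale entries are only skipped by `is_usable`, so a later run with a larger maximum age could still reuse them, while future-dated entries are dropped outright.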