
Revalidate cache based on source digest #468

Merged 1 commit into main from revalidate-mtime on Jan 30, 2024

@etiennebarrie etiennebarrie commented Jan 29, 2024

Ref: #336

Bootsnap was initially designed to improve boot time in development, so it was logical to use mtime to detect changes, given that it is reliable on a given machine.

But it is just as useful in production and CI environments. There, however, its hit rate can vary a lot: depending on how the source code and caches are saved and restored, many if not all mtimes will have changed.

To improve this, we can first try to revalidate using the mtime and, if that fails, fall back to comparing a digest of the file content. Digesting a file, even with fnv1a_64, is of course an overhead, but the assumption is that true misses should be relatively rare and that digesting the file will always be faster than compiling it. So even if it only improves the hit rate marginally, it should be faster overall.

Also, we only recompute the digest if the file's mtime changed but its size remained the same, which should skip the digest for the overwhelming majority of legitimate source file changes.
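
To make the order of checks concrete, here is a minimal Ruby sketch of the idea. It is not Bootsnap's actual implementation: `CacheKey`, `cache_key_for` and `revalidate` are hypothetical names used only for illustration, and reading the whole file to digest it is a simplification.

```ruby
require "tempfile"

# Standard 64-bit FNV-1a parameters.
FNV_OFFSET_BASIS = 0xcbf29ce484222325
FNV_PRIME = 0x100000001b3
MASK_64 = 0xFFFFFFFFFFFFFFFF

# FNV-1a: XOR each byte into the hash, then multiply by the prime (mod 2**64).
def fnv1a_64(data)
  data.each_byte.reduce(FNV_OFFSET_BASIS) do |hash, byte|
    ((hash ^ byte) * FNV_PRIME) & MASK_64
  end
end

# Hypothetical cache key; the real key layout is an assumption here.
CacheKey = Struct.new(:mtime, :size, :digest)

def cache_key_for(path)
  stat = File.stat(path)
  CacheKey.new(stat.mtime.to_i, stat.size, fnv1a_64(File.binread(path)))
end

# Returns :hit, :revalidated or :miss for a previously stored key.
def revalidate(path, key)
  stat = File.stat(path)
  # Fast path: mtime and size both match, no digest needed.
  return :hit if stat.mtime.to_i == key.mtime && stat.size == key.size
  # A different size means the content changed, so skip the digest.
  return :miss if stat.size != key.size

  # mtime changed but size did not: compare content digests.
  if fnv1a_64(File.binread(path)) == key.digest
    key.mtime = stat.mtime.to_i # remember the new mtime for the next check
    :revalidated
  else
    :miss
  end
end

Tempfile.create("example") do |file|
  file.write("puts 1\n")
  file.flush
  key = cache_key_for(file.path)
  # Simulate a cache restore that resets the mtime but keeps the content.
  File.utime(Time.at(0), Time.at(0), file.path)
  puts revalidate(file.path, key)
end
```

Updating the stored mtime after a successful digest comparison means the next check of the same file can take the fast path again.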

@casperisfine casperisfine marked this pull request as ready for review January 29, 2024 15:26
@casperisfine casperisfine marked this pull request as draft January 29, 2024 15:26
Base automatically changed from remove-warning to main January 29, 2024 15:27
@casperisfine casperisfine force-pushed the revalidate-mtime branch 3 times, most recently from 4ea8870 to 83f69e5 Compare January 30, 2024 10:19
@casperisfine casperisfine marked this pull request as ready for review January 30, 2024 10:22
Co-authored-by: Jean Boussier <[email protected]>
@casperisfine casperisfine merged commit 4a91add into main Jan 30, 2024
16 checks passed
@etiennebarrie etiennebarrie deleted the revalidate-mtime branch January 30, 2024 13:11