Bootsnap compile cache isn't as useful as it could be on CI and production #336
Comments
For problem 1 I think we should make the cache invalidation configurable so we can keep the current `mtime` based behavior. For problem 2 I think solution 2.2 is the least invasive, least prone to problems, and least likely to add a runtime penalty.
I think we should first measure how much slower the digest based cache would be.
The two proposed solutions weren't mutually exclusive. My issue with 2.2 is that it is harder to integrate.
For Problem 1 could we use something like https://github.com/rosylilly to restore the `mtime`s?
I suppose you mean: https://github.com/rosylilly/git-set-mtime
That's what I'd expect, yes.
I don't think there is any subset: Bootsnap has one entry for every single Ruby source file. You might save a bit by skipping various text files etc., but it's not as reliable. Also note that this issue is not just a Shopify thing; the idea is to improve the situation for all users, not just us.
It could be configurable to use the
Does it sound like this issue is the same as heroku/heroku-buildpack-ruby#979? I originally thought that issue was due to a different dir at boot and runtime, but I tried to reproduce it locally and it looks like moving an app after the cache is generated works just fine. I'm not sure what conditions reproduce it other than deploying on Heroku.
Not quite. The issue I'm describing is not supposed to make the cache grow. But the issue you link is another interesting thing: since Bootsnap uses realpaths, with buildpacks moving the code around I suppose even the cache generated by `assets:precompile` can't be re-used.
Ref: #336

Bootsnap was initially designed to improve boot time in development, so it was logical to use `mtime` to detect changes, given that it is reliable on a given machine. But Bootsnap is just as useful in production and CI environments; there, however, its hit rate can vary a lot because, depending on how the source code and caches are saved and restored, many if not all `mtime`s will have changed.

To improve this, we can first try to revalidate using the `mtime`, and if that fails, fall back to comparing a digest of the file content. Digesting a file, even with `fnv1a_64`, is of course an overhead, but the assumption is that true misses should be relatively rare and that digesting a file will always be faster than compiling it. So even if it only improves the hit rate marginally, it should be faster overall. Also, we only recompute the digest if the file's `mtime` changed but its size remained the same, which should discard the overwhelming majority of legitimate source file changes.

Co-authored-by: Jean Boussier <[email protected]>
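To make the revalidation order concrete, here is a rough Ruby sketch of the idea described above. The `CacheEntry` struct is hypothetical and `SHA256` stands in for the much faster `fnv1a_64` mentioned in the change; Bootsnap's real cache validation lives in its C extension and differs in detail.

```ruby
require "digest"

# Hypothetical cache metadata, not Bootsnap's actual on-disk format.
CacheEntry = Struct.new(:mtime, :size, :digest, keyword_init: true)

def cache_still_valid?(source_path, entry)
  stat = File.stat(source_path)

  # Fast path: the mtime matches, assume the file is unchanged.
  return true if stat.mtime.to_i == entry.mtime

  # The mtime changed; if the size changed too it is almost certainly a real
  # edit, so treat it as a miss without paying for a digest.
  return false unless stat.size == entry.size

  # Same size but different mtime (e.g. a fresh git clone on CI): fall back
  # to comparing a digest of the file content.
  Digest::SHA256.file(source_path).hexdigest == entry.digest
end
```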
Problem 1: cache keys are `mtime` based

Bootsnap's compile cache is very efficient for development workflows, but on CI or production it becomes almost unusable.
The main reason is that git (and most other VCSs) don't store `mtime`, so on CI or production, unless your setup manages to preserve `mtime`, all of the compile cache will be invalidated. And most CI / production systems start from a fresh clone of the repository.

The solution to this would be to use file digests instead of `mtime`. Of course hashing a source file is slower than just accessing the `mtime`, but compared to parsing the Ruby source file, fast hash functions would still offer a major speed up.

Problem 2: the cache isn't self cleaning
The compile cache entries are stored based on the path of the source file, e.g. the cache for `path/to.rb` will be stored in `<cache-dir>/<fnv1a_64(path)>`. So if you keep persisting the cache between CI builds or production deploys, over time as you delete some source files, update gems, etc., new entries will be created but outdated ones won't be removed, which might lead to a very bloated cache. Hence why we have a note in the README about regularly flushing the cache.
And the problem can be even worse with some deploy methods like Capistrano, with which the real path of the source files changes on every deploy.
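To illustrate the keying scheme described above, here is a rough sketch; the real layout in Bootsnap differs in details such as directory sharding, and `CACHE_DIR` is just a placeholder.

```ruby
# FNV-1a 64-bit, the fast non-cryptographic hash mentioned above.
def fnv1a_64(data)
  hash = 0xcbf29ce484222325
  data.each_byte do |byte|
    hash ^= byte
    hash = (hash * 0x100000001b3) & 0xFFFF_FFFF_FFFF_FFFF
  end
  hash
end

CACHE_DIR = "tmp/cache/bootsnap" # placeholder location

# Entries are addressed purely by the hash of the source path, so a deleted
# or renamed source file simply leaves its old entry behind in the cache dir.
def cache_path_for(source_path)
  File.join(CACHE_DIR, fnv1a_64(File.realpath(source_path)).to_s(16))
end
```

Because the key is derived from the realpath, a Capistrano-style `releases/<timestamp>` layout changes every key on every deploy, compounding both the miss rate and the garbage accumulation.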
So even if we were to fix the `mtime` issue, we'd need to address cache GC, otherwise users would run into big trouble.

Here I'm not too sure what the best solution could be, but I have a few ideas.
Solution 2.1: Splitting the cache

Assuming the biggest source of cache garbage is gem upgrades, we could have one compile cache directory per gem. E.g. we could store the cache for `$GEM_ROOT/gems/my-gem-1.2.3/lib/my-gem.rb` in `$GEM_ROOT/gems/my-gem-1.2.3/.bootsnap/<fnv1a_64(path)>`, or even `$GEM_ROOT/gems/my-gem-1.2.3/lib/my-gem.rb.bootsnap`. This way when you upgrade or remove a gem you automatically get rid of the old cache.
However:
I think that if we were to implement this, the vast majority of the GC problem would be solved, as path changes inside the application are much less likely to be frequent enough to produce the problem unless you keep the cache for a very long time.
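As a rough sketch of solution 2.1 (all names here are made up for illustration, this is not an existing Bootsnap API):

```ruby
require "rubygems"

# Route the cache entry for a file that lives inside an installed gem into
# that gem's own .bootsnap directory; `key` would be e.g. fnv1a_64(path).
def gem_local_cache_path(source_path, key)
  gems_root = File.join(Gem.dir, "gems") # e.g. $GEM_ROOT/gems
  return nil unless source_path.start_with?(gems_root + File::SEPARATOR)

  # The first component after gems/ is the versioned gem dir, e.g. my-gem-1.2.3,
  # so upgrading or removing the gem takes its cache away with it.
  relative = source_path.delete_prefix(gems_root + File::SEPARATOR)
  gem_dir = relative.split(File::SEPARATOR).first
  File.join(gems_root, gem_dir, ".bootsnap", key)
end

gem_local_cache_path(File.join(Gem.dir, "gems", "my-gem-1.2.3", "lib", "my-gem.rb"), "deadbeef")
# => "#{Gem.dir}/gems/my-gem-1.2.3/.bootsnap/deadbeef"
```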
Solution 2.2: `bootsnap precompile --clean`

This is much less of a general solution, as I don't think it is likely that a large portion of users would integrate `bootsnap precompile` into their workflow, but in theory we could have it clean outdated cache entries. Since it goes over all the source files to precompile them, it can make a list of up-to-date cache entries and delete the rest.
Thoughts

These two changes aren't necessarily that hard to implement, but they are quite an important change, likely justifying a major version bump. So rather than starting to write PRs head on, I'd like to get some feedback on the idea.
@burke I saw you removed yourself from the CODEOWNERS, but if you have a bit of time your insights here would be more than welcome.
@rafaelfranca @DazWorrall I think you may have opinions or insights on this.