Bootsnap cache incorrect or not being used? #979
Comments
I've always been a bit fuzzy on buildpacks, but IIRC If so then I believe your guess is correct. Bootsnap cache keys are a hash of the source file realpath. So if you move your app, all cache keys change. So yeah, your cache pretty much exactly doubling in size is indicative of that. Unfortunately I don't think there's a way to really handle that. In Shopify/bootsnap#336 I suggest we could try to store parts of the cache inside the gem's own directory, so this could help, but for the app's code cache, the move would still invalidate everything. Something that you might be interested in integrating though is the it could allow you to generate a clean bootsnap cache during the build, so that later the app would boot faster once you distribute it. |
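To make the invalidation concrete, here is a minimal sketch of why a path-derived key changes when the app moves. It assumes a SHA-256 digest as a stand-in for the key function (bootsnap's real key generation lives in its C extension), and both paths and the `cache_key` helper are hypothetical.

```ruby
require "digest"

# Hypothetical stand-in for a path-derived cache key. This is NOT bootsnap's
# actual hash function; SHA-256 is used purely for illustration.
def cache_key(absolute_path)
  Digest::SHA256.hexdigest(absolute_path)[0, 16]
end

# Hypothetical build-time and runtime locations of the same source file.
build_path   = "/tmp/build_abc123/app/models/user.rb"
runtime_path = "/app/app/models/user.rb"

build_key   = cache_key(build_path)
runtime_key = cache_key(runtime_path)

puts "build key:   #{build_key}"
puts "runtime key: #{runtime_key}"
puts "cache hit after move? #{build_key == runtime_key}"
```

Because the file content is identical but the key differs, the runtime process would write a second entry next to the unused build-time one, which is consistent with the cache roughly doubling in size.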
Short story is that Heroku builds apps in one dire The full story is that to be able to deploy an app, we have to have a client on a machine that can receive build information (such as source files) and kick off a build. As an implementation detail, this client is built using Heroku which means the output directory of the client is running at the path Once the build is done, then at runtime, the app will live in
I do see that this is where the cache keys are generated in bootsnap https://github.com/Shopify/bootsnap/blob/eeed78c5e07f842b61de86870e50e65939f09828/ext/bootsnap/bootsnap.c#L338-L362 and that they're based on absolute paths.
Sprockets had the same problem. I worked to fix it there https://www.schneems.com/blogs/2016-02-18-speeding-up-sprockets. Essentially we split the absolute path into a base path and a relative path. The sass library stores a "base dir" and then from that dir, a relative path inside of the cache file.
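The base-path/relative-path split described above might look roughly like this. It is a sketch only: `split_path` and `relative_cache_key` are hypothetical helpers, the directories are made up, and SHA-256 is an assumed stand-in for whatever digest a real cache would use.

```ruby
require "digest"
require "pathname"

# Hypothetical helper: split an absolute path into [base_dir, relative_path].
def split_path(absolute_path, base_dir)
  relative = Pathname.new(absolute_path).relative_path_from(Pathname.new(base_dir))
  [base_dir, relative.to_s]
end

# Hypothetical key that depends only on the relative part, so it survives
# the base directory being moved.
def relative_cache_key(absolute_path, base_dir)
  _base, relative = split_path(absolute_path, base_dir)
  Digest::SHA256.hexdigest(relative)[0, 16]
end

# Made-up build-time and runtime roots for the same application.
build_key   = relative_cache_key("/tmp/build_abc123/app/models/user.rb", "/tmp/build_abc123")
runtime_key = relative_cache_key("/app/app/models/user.rb", "/app")

puts "same key after move? #{build_key == runtime_key}"
```

The cost is that every lookup has to know, or infer, the base directory before it can compute a key.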
That's super interesting. I wouldn't mind adding an explicit precompile step. I took a look at
I think this proposal would help a lot. The only area I think it might fail is if a dev is doing something like From my experience with Heroku customer support tickets, the vast majority of boot time is spent in requiring dependencies. After that, app code is small-ish. Parsing and generating routes. Maybe being able to handle relative paths would allow app code caching. Do you think there would ever be a world in which bootsnap would support using relative paths for cache keys? |
Right sorry, I was suggesting to call it once moved to
We can still revalidate the cache with
Yeah, we have a bunch of outliers here, but what you say likely holds true for the vast majority of apps out there.
It's tricky for a few reasons. And I was about to say no, but I had an idea last night: I think we could do it for gems, e.g. if we were to transform the key such as For app code however I don't think it's possible (at least for now). We could do the same trick with To be able to use relative paths for all caches, we'd need to be able to split the path like you did for sprockets. It's worth trying but I'm afraid it would be too slow. A Sprockets cache hit can save seconds if not minutes, but Ruby's ISeq caching, while helpful, doesn't save enough time to allow for slow cache key generation. I'll see if I can find some time to experiment with the |
Awesome! Thanks
It's on the list of things we would like to have. Right now we're on v2 of the Buildpack spec. The next version v3 is an open spec from the Cloud Native Computing Foundation (CNCF) and known as Cloud Native Buildpacks (CNB). Under that spec, the build and "launch" as it's called use the same dir structure. The catch is that while we've only just begun re-writing buildpacks, there's no forecast for when our infra will be able to support that spec on Heroku. It's a big unknown.
Totally understand. I was also wondering if we could push the AOT generation upstream to something like RubyGems to avoid the cold-cache case, but I'm guessing that the bytecode isn't stable between versions and that's not straightforward. |
I tried in MRI without success: https://bugs.ruby-lang.org/issues/16847 Maybe rubygems/bundler would be more open to it. 👋 @deivid-rodriguez was ISeq caching as part of rubygems/bundler ever considered? |
Ref: heroku/heroku-buildpack-ruby#979 Heroku buildpacks build the application in one location and then move it elsewhere. This causes all the paths to change, hence all bootsnap cache keys to be invalidated. By replacing $BUNDLE_PATH with a constant string in the cache keys, we allow the bundler directory to be moved without flushing the cache. Ideally we'd use a similar substitution for the "app root", but I need to put more thought into it, as I'm not too sure how best to infer it.
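A rough sketch of the substitution idea, under loud assumptions: the placeholder string, the `normalized_key` helper, the SHA-256 digest, and all paths below are hypothetical and are not taken from the bootsnap change itself.

```ruby
require "digest"

# Hypothetical placeholder; the actual constant string is not given here.
BUNDLE_PLACEHOLDER = "$BUNDLE_PATH"

# Hypothetical key function: files under the bundle directory get a
# location-independent key, everything else keeps its absolute path.
def normalized_key(absolute_path, bundle_path)
  prefix = bundle_path.end_with?("/") ? bundle_path : "#{bundle_path}/"
  portable =
    if absolute_path.start_with?(prefix)
      "#{BUNDLE_PLACEHOLDER}/#{absolute_path.delete_prefix(prefix)}"
    else
      absolute_path # app code: still tied to where the app lives
    end
  Digest::SHA256.hexdigest(portable)[0, 16]
end

# Made-up build-time and runtime bundle directories.
build_bundle   = "/tmp/build_abc123/vendor/bundle"
runtime_bundle = "/app/vendor/bundle"
gem_file       = "ruby/3.0.0/gems/rack-2.2.3/lib/rack.rb"

gem_build   = normalized_key("#{build_bundle}/#{gem_file}", build_bundle)
gem_runtime = normalized_key("#{runtime_bundle}/#{gem_file}", runtime_bundle)
app_build   = normalized_key("/tmp/build_abc123/app/models/user.rb", build_bundle)
app_runtime = normalized_key("/app/app/models/user.rb", runtime_bundle)

puts "gem key survives move? #{gem_build == gem_runtime}"
puts "app key survives move? #{app_build == app_runtime}"
```

In this sketch only the gem entries stay valid after the move, which matches the limitation stated above: app code would need its own "app root" substitution.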
Hei @casperisfine! No, I think it has never been considered as far as I recall. I think it should not be considered. In my opinion, it fits ruby-core much better so the way to get this inside the language would be to try to persuade Matz harder (not sure how though). I don't think it fits nicely inside bundler & rubygems area, and I would not be happy with the maintenance overhead. The way I see it, introducing this inside bundler & rubygems would be "a hack" to take advantage of rubygems being used by default to override Matz's decision, which doesn't seem nice. |
On some points like invalidation I agree, however on some others, since rubygems/bundler is already quite opinionated and already changes Ruby's require behavior quite a lot, IMHO it fits. It would even allow doing things such as precompiling the gem during install etc.
I've tried this for a while, failed, so now I'm a bit tired TBH. But yeah people are welcome to try again. |
I've just added support for a beta opt-in labs This can be enabled via:
I'd be very interested to hear how much of a difference this makes to app boot times at runtime if anyone tries it out :-) |
@edmorley My boot time was cut in half! No issues so far. Thank you!
Before:
$ heroku run bash
~ $ time rails runner puts "done"
real 0m13.937s
user 0m8.328s
sys 0m0.988s
~ $ time rails runner puts "done"
real 0m6.512s
user 0m4.772s
sys 0m0.460s
~ $ time rails runner puts "done"
real 0m6.482s
user 0m4.860s
sys 0m0.432s
~ $ du -hs tmp/cache
94M tmp/cache
After enabling the labs feature and redeploying:
$ heroku run bash
~ $ time rails runner puts "done"
real 0m7.150s
user 0m5.028s
sys 0m0.592s
~ $ time rails runner puts "done"
real 0m6.110s
user 0m4.672s
sys 0m0.388s
~ $ time rails runner puts "done"
real 0m6.171s
user 0m4.612s
sys 0m0.448s
~ $ du -hs tmp/cache
49M tmp/cache |
We have what appears to be a much more sprawling application than @sergiopantoja does and we saw a similar proportion of change: before:
after:
|
@sergiopantoja @geoffharcourt Thank you for providing those timings - glad to hear it helped! It will be a bit longer before we can make this new behaviour the default, but that flag is fine to use in the meantime :-) |
In our Rails app we reduced boot time from ~16 seconds to ~8 seconds after enabling Are there any caveats to be aware of? Any progress to report regarding making this the default? |
I just tried |
After enable
Before
# First time
~ $ bundle exec rake environment
real 0m9.968s
user 0m6.032s
sys 0m0.768s
# Second time
~ $ bundle exec rake environment
real 0m6.142s
user 0m2.640s
sys 0m0.368s
# Third time
~ $ bundle exec rake environment
real 0m6.045s
user 0m2.536s
sys 0m0.396s
After
~ $ time bundle exec rake environment
real 0m6.367s
user 0m2.748s
sys 0m0.432s
~ $ time bundle exec rake environment
real 0m5.997s
user 0m2.500s
sys 0m0.364s
~ $ time bundle exec rake environment
real 0m6.679s
user 0m2.704s
sys 0m0.328s |
I have also tried in my Rails app! Hope this is helpful for others, and thanks for the great feature! 😆✨ (coderdojo-japan/coderdojo.jp#1423)
Before: App Boot Time
╭─○ yasulab ‹2.7.3› ~/coderdojo.jp
╰─○ heroku run bash
# 1st try
$ time NEW_RELIC_AGENT_ENABLED=false rails runner puts "Done"
real 0m10.353s
user 0m6.244s
sys 0m0.904s
# 2nd try
$ time NEW_RELIC_AGENT_ENABLED=false rails runner puts "Done"
real 0m5.248s
user 0m2.272s
sys 0m0.360s
# 3rd try
$ time NEW_RELIC_AGENT_ENABLED=false rails runner puts "Done"
real 0m4.031s
user 0m2.204s
sys 0m0.296s
After: App Boot Time
╭─○ yasulab ‹2.7.3› ~/coderdojo.jp
╰─○ heroku run bash
# 1st try
$ time NEW_RELIC_AGENT_ENABLED=false rails runner puts "Done"
real 0m2.416s
user 0m1.224s
sys 0m0.244s
# 2nd try
$ time NEW_RELIC_AGENT_ENABLED=false rails runner puts "Done"
real 0m2.618s
user 0m1.364s
sys 0m0.180s
# 3rd try
$ time NEW_RELIC_AGENT_ENABLED=false rails runner puts "Done"
real 0m2.397s
user 0m1.292s
sys 0m0.224s |
@spartchou do note that when loading the Rails environment via
|
I just tried the build-in-app-dir feature as well (executed
Before
time rails runner puts "Done":
After
time rails runner puts "Done":
Might we be doing something wrong? Or is our app simply not large enough to really benefit from this yet? We don't use NEW_RELIC, but we do use some other monitoring tools such as Sentry and Scout. Could that be influencing the timings? |
Bootsnap will only affect the performance of requiring files. If you have wildly varying boot times that would suggest that something else in your initialization process is taking time. Yes, could be related to Sentry and Scout, but you'd have to dig in to your gems / initializers to figure out the details. The bumbler gem sets out to help profile Rails boot time but I've never used it so can't vouch for how helpful it would be. |
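As a starting point for that kind of digging, one option is to time individual `require` calls by hand. The sketch below is an assumption-laden illustration, not the bumbler gem's API: `time_requires` is a hypothetical helper, and the standard-library names stand in for whatever gems your app loads.

```ruby
require "benchmark"

# Hypothetical helper: time each require and return [name, seconds] pairs,
# slowest first. A library that is already loaded will report close to zero.
def time_requires(libraries)
  libraries.map { |name|
    seconds = Benchmark.realtime { require name }
    [name, seconds]
  }.sort_by { |_name, seconds| -seconds }
end

# Standard-library names used as placeholders for real dependencies.
timings = time_requires(%w[json set time])
timings.each { |name, seconds| puts format("%-10s %.4fs", name, seconds) }
```

Time spent outside `require` (for example, network calls made by an initializer) will not show up here and would need separate measurement.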
If anyone else has this problem, I solved it by adding bin/yarn to my .slugignore. Thanks to @schneems for pointing me toward the binstubs. |
There's a recent article about this as well that I want to link https://dev.to/dbackeus/cut-your-rails-boot-times-on-heroku-in-half-with-a-single-command-514d |
Just want to chime in that we are also seeing great results on our production app after enabling
Before:
$ heroku run bash
$ time rails runner puts "done"
real 0m6.828s
user 0m5.844s
sys 0m0.904s
$ time rails runner puts "done"
real 0m3.072s
user 0m2.688s
sys 0m0.320s
$ time rails runner puts "done"
real 0m2.919s
user 0m2.500s
sys 0m0.372s
$ du -hs tmp/cache
32M tmp/cache
After:
$ heroku run bash
$ time rails runner puts "done"
real 0m3.066s
user 0m2.664s
sys 0m0.336s
$ time rails runner puts "done"
real 0m3.132s
user 0m2.640s
sys 0m0.432s
$ time rails runner puts "done"
real 0m2.887s
user 0m2.512s
sys 0m0.304s
One problem we ran into, and were able to address after identifying the issue, is our app's use of a third-party buildpack which cleans the slug after build (https://elements.heroku.com/buildpacks/devforce/heroku-buildpack-cleanup). We had — |
@edmorley Sorry to resurrect this old thread, but is there any way to have |
Unfortunately labs can't currently be enabled via |
You can use a
|
Yes, though unfortunately since |
Linked internal tickets
Issue
The bootsnap cache is present after a deploy:
However, if a rails process runs, the cache size doubles:
And running the same command twice makes it faster, which makes me think that bootsnap isn't being used:
We need to investigate why. My first guess would be that the difference between build and runtime file structure prevents the original cache from being used. But that's just a guess.