-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Canonicalize $BUNDLE_PATH in cache keys #343
Conversation
Ref: heroku/heroku-buildpack-ruby#979 Heroku buildpacks build the application in one location and then move it elsewhere. This cause all the paths to change, hence all bootsnap cache keys to be invalidated. By replacing $BUNDLE_PATH by a constant string in the cache keys, we allow the bundler directory to be moved without flushing the cache. Ideally we'd use a similar substitution for the "app root", but I need to put more thoughts into it, as I'm not too sure how best to infer it.
So after implementing this, I wonder if we could do the same with app paths, e.g. record |
😍 I gave it a shot, and it looks like there might be a bug or two:
Note that it's still looking for |
Hum, that's looks like a problem with the LOAD_PATH cache, which I haven't touched. Thanks for trying it out, I'll experiment some more as soon as I find time for it. |
It's weird, I can't repro locally:
From afar it seem to be working as intended. |
So I figured why it would repro locally. The problem occur when the BUNDLE_PATH is inside the app root |
It seems to break on
The backtrace suggest that |
Looks like it indeed: $ strings /tmp/dump.isq
YARB
x86_64-darwin19
<main>
/tmp/iseq.rb
)/private/tmp/iseq.rb |
Yep: File.write('/tmp/iseq.rb', 'p __FILE__')
File.write('/tmp/a.rb', 'p :a')
File.write('/tmp/b.rb', 'p :b')
$iseq = RubyVM::InstructionSequence.compile_file('/tmp/iseq.rb')
class RubyVM::InstructionSequence
def self.load_iseq(feature)
$iseq
end
end
require '/tmp/a.rb'
require '/tmp/b.rb' $ ruby /tmp/iseq_debug.rb
"/tmp/iseq.rb"
"/tmp/iseq.rb" I'll dig in Ruby to see what's doable, but this might require an MRI change, in which case it wouldn't be doable until the next minor release, unless we find a hack in the meantime. |
Yeah, so I'm afraid I'll start a PR/ticket upstream to get |
Ok, so that's going to be tricky. I did a quick & dirty patch to overwrite the ISeq location object: Shopify/ruby@192f5b4 With the following test script: system('mkdir', '-p', '/tmp/build', '/tmp/app')
File.write('/tmp/app/a.rb', 'p ["app/a", __FILE__]')
File.write('/tmp/app/b.rb', 'p ["app/b", __FILE__]')
File.write('/tmp/build/a.rb', 'p ["build/a", __FILE__]; require_relative "b"')
$iseq = RubyVM::InstructionSequence.compile_file('/tmp/build/a.rb')
class RubyVM::InstructionSequence
def self.load_iseq(feature)
if feature == "/tmp/app/a.rb"
$iseq
end
end
end
require '/tmp/app/a.rb' Which partly works:
The iseq cache is loaded correctly, and the However you can notice that the From what I gathered Also that patch is super hacky, as it changes the location in place. A proper patch would need to work on a copy or something. Or ideally the I'll chat a bit with our resident MRI people, see what they thing about it, but I doubt it will be an easy fix, so I wouldn't hold my breath. Ultimately that raise the question of the usefulness of bootsnap on Heroku. It might be worth disabling it entirely there :/ |
system('mkdir', '-p', '/tmp/build', '/tmp/app')
File.write('/tmp/app/a.rb', 'p ["app/a", __FILE__, __dir__]')
File.write('/tmp/app/b.rb', 'p ["app/b", __FILE__, __dir__]')
File.write('/tmp/build/a.rb', 'p ["build/a", __FILE__, __dir__]; require_relative "b"')
$iseq = RubyVM::InstructionSequence.compile_file('/tmp/build/a.rb')
class RubyVM::InstructionSequence
def self.load_iseq(feature)
if feature == "/tmp/app/a.rb"
$iseq
end
end
end
require '/tmp/app/a.rb'
So |
Wow. Thanks for digging in. Previously we talked about storing the cache files in the same location as the gem. Would this new information change the viability of that technique? |
Yep, since Heroku use So unless I'm missing something, I'm afraid there no way to make Bootsnap usable on Heroku. The best thing we could do for now I think, would be to automatically turn it off. Is there some kind of env var or something we could use to detect we're running on heroku? |
There's still some benefit for running on deploy as there's several commands that are executed that will use the same cache. (We run But having the cache files stored in the app's "slug" (tar file copy of the app generated after build time) it's just adding extra space. I could remove them after the build. I think instead of you turning it off for Heroku, I can manage these concerns where they are an issue. Eventually (someday?) we'll be able to build and run on the same path and I don't want to have to handle toggling this off on bootsnap's side and then warning people they need to upgrade. Is there a way to disable cache generation for bootsnap at runtime similar to |
Good point.
That would make sense I think. Right now they're actually slowing things down.
Not yet, but that would be trivial to add. |
Actually, now that I think about it, you only want to disable the CompileCache. The LoadPathCache should still bring you a major speedup. |
I can test this on CodeTriage. Looking now. |
With build load-path-cache:
With that cache removed:
It would be better to have more than one sample, but this first experiment does seem to show that the load path cache from build time helps. |
Yes and no. Your load path cache from build time will be fully invalidated anyway, since your entire But the nice thing with the load path cache, is that even on a cold cache (e.g. just removed), it still does speedup boot. So I think from the ruby-buildpack POV, the best course of action for now is:
|
|
Let me know if that does it for you: b941bad Also I'll try to followup upstream on that ISeq path issue. I think it wouldn't be that hard to fix, but of course at best it will be available in a year. |
That logic looks good for me. |
Yeah I got lazy here. I'll make sure to add some on Monday. Probably cut a new release as well. |
😉 I figured. I like to prototype and play around before cementing with tests. Also FWIW I talked with a co-worker today about this issue. It seems that work has been done to move the deploy infrastructure out of |
That would be a very good workaround yes. However keep in mind that Ruby loves to call Also I opened an issue upstream: https://bugs.ruby-lang.org/issues/17593 |
Done. |
Awesome, thanks! FWIW we're looking into launching a change to building in |
Blocked by an upstream issue, no point keeping this open right now. I'll resume the work if the necessary changes are made in Ruby. |
Ref: heroku/heroku-buildpack-ruby#979
Heroku buildpacks build the application in one location
and then move it elsewhere. This cause all the paths to change,
hence all bootsnap cache keys to be invalidated.
By replacing $BUNDLE_PATH by a constant string in the cache keys,
we allow the bundler directory to be moved without flushing the cache.
Ideally we'd use a similar substitution for the "app root", but
I need to put more thoughts into it, as I'm not too sure how
best to infer it.
cc @schneems: If you have time to give this branch a try.