Use overlayfs or FUSE filesystem to speed up sandboxed Linux builds #78

Open
kylewlacy opened this issue Jul 6, 2024 · 0 comments

When a sandboxed Linux process runs, all of its inputs are written to disk first so that they can be bind-mounted into the directory used as the rootfs for the process. Even though we use hardlinks as much as possible, the process of creating these local directory structures can take a significant amount of time.

Here's a test case today:

// project.bri
import * as std from "std";

export default () => {
  return std.runBash`
    echo "$large_input"
    touch "$BRIOCHE_OUTPUT"
  `.env({
    large_input: std.directory({
      a: std.toolchain(),
      b: std.toolchain(),
    }),
  });
};

Building this project locally takes about 20 seconds for me when uncached (this assumes std.toolchain() already exists locally but large_input does not). The majority of that time is spent just creating the directory structure for large_input, even though all files within that directory are hardlinks to existing blobs.

We definitely need to work on optimizing this, and there is probably a lot of low-hanging fruit in how we create these directory structures.

But, one future optimization we can do is to use a FUSE filesystem to avoid needing to create this directory structure entirely. That is, we have a small FUSE filesystem (likely a custom implementation) where, when reading a path like ${large_input}/a/bin/gcc, it forwards that request to ~/.local/share/brioche/blobs/SOME_HASH. This has several advantages:

  • The directory structure never needs to get written to disk
  • Avoid creating executable/non-executable copies of blobs (e.g. identical blobs that differ only by the .x prefix and the executable permission)
  • As a future improvement, we can even map other paths not in the blobs directory into the filesystem, e.g. mapping Brioche.includeFile files directly without copying into a blob
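To make the forwarding idea concrete, here is a minimal sketch of the lookup such a filesystem might do when a path is opened. Everything in it is an assumption for illustration: the `Entry` type, the `resolve` function, the hash strings, and the permission modes are hypothetical and are not Brioche's actual data structures or a real FUSE API. It only shows how a virtual path could be walked through an in-memory tree and mapped to a blob path, with the executable bit coming from the tree entry rather than from a separate copy of the blob.

```typescript
// Hypothetical in-memory model of an artifact tree (not Brioche's real types).
type Entry =
  | { kind: "file"; blobHash: string; executable: boolean }
  | { kind: "directory"; entries: Map<string, Entry> };

interface Resolved {
  blobPath: string; // where the read would be forwarded to
  mode: number; // permissions reported to the caller (assumed values)
}

// Walk `virtualPath` through the tree and return the backing blob, or
// undefined if the path does not exist or does not name a file.
function resolve(
  root: Entry,
  virtualPath: string,
  blobsDir: string,
): Resolved | undefined {
  let current: Entry = root;
  for (const part of virtualPath.split("/").filter((p) => p.length > 0)) {
    if (current.kind !== "directory") return undefined;
    const next = current.entries.get(part);
    if (next === undefined) return undefined;
    current = next;
  }
  if (current.kind !== "file") return undefined;
  return {
    blobPath: `${blobsDir}/${current.blobHash}`,
    // The same blob can back both executable and non-executable entries.
    mode: current.executable ? 0o555 : 0o444,
  };
}

// Example tree: two entries share one blob but differ in the executable bit.
const exampleTree: Entry = {
  kind: "directory",
  entries: new Map<string, Entry>([
    [
      "a",
      {
        kind: "directory",
        entries: new Map<string, Entry>([
          ["run", { kind: "file", blobHash: "HASH_1", executable: true }],
          ["data", { kind: "file", blobHash: "HASH_1", executable: false }],
        ]),
      },
    ],
  ]),
};

const exampleRun = resolve(exampleTree, "a/run", "/blobs");
const exampleData = resolve(exampleTree, "a/data", "/blobs");
```

Here `exampleRun` and `exampleData` point at the same blob path and differ only in the reported mode, which is the second advantage in the list above. A real implementation would also need to handle symlinks, directory listings, and file attributes.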

FUSE has some issues, namely that I/O performance is worse than a normal filesystem, it's Linux-only (effectively), and it's not enabled by default in all Linux distros. For these reasons, if/when we implement FUSE, we'll still need to fall back to the current "create then bind-mount a local directory" implementation that we use today.


Update: I recently learned that overlayfs is enabled by default on Linux 5.11+. This is another option, which would save a lot of the legwork needed for a custom FUSE implementation, and I think it should work out of the box in more distros (plus, FUSE is relatively slow IIRC, so using something provided by the kernel might be faster than anything we could do ourselves).
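As a rough sketch of how the fallback could be chosen, the snippet below picks a strategy from some basic host information. The names (`Strategy`, `HostInfo`, `chooseStrategy`) and the preference order (overlayfs, then FUSE, then bind-mount) are assumptions for illustration, not a settled design. The kernel version check is only a heuristic based on the 5.11 figure above; a real implementation would probably just attempt the mount and fall back if it fails.

```typescript
type Strategy = "overlayfs" | "fuse" | "bind-mount";

// Hypothetical description of the host; how these values get collected
// (e.g. from `uname -r` or by probing /dev/fuse) is left out of this sketch.
interface HostInfo {
  kernelRelease: string;
  fuseAvailable: boolean;
}

// Parse the leading "major.minor" from a kernel release string.
function parseKernelVersion(release: string): [number, number] | undefined {
  const match = /^(\d+)\.(\d+)/.exec(release);
  if (match === null) return undefined;
  return [Number(match[1]), Number(match[2])];
}

// Prefer overlayfs on 5.11+ kernels, then FUSE, then the current
// "create then bind-mount a local directory" approach.
function chooseStrategy(host: HostInfo): Strategy {
  const version = parseKernelVersion(host.kernelRelease);
  if (version !== undefined) {
    const [major, minor] = version;
    if (major > 5 || (major === 5 && minor >= 11)) return "overlayfs";
  }
  if (host.fuseAvailable) return "fuse";
  return "bind-mount";
}
```

For example, `chooseStrategy({ kernelRelease: "5.4.0-generic", fuseAvailable: false })` falls all the way through to `"bind-mount"`, which is the behavior we have today.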

@kylewlacy kylewlacy changed the title Use FUSE filesystem to speed up sandboxed Linux builds Use overlayfs or FUSE filesystem to speed up sandboxed Linux builds Oct 19, 2024