I am pleased to announce that the first release of sandboxfs, 0.1.0, is finally here! You can download the sources and prebuilt binaries from the 0.1.0 release page and you can read the installation instructions for more details.

The journey to this first release has been a long one. sandboxfs was first conceived over two years ago, was first announced in August 2017, showed its first promising results in April 2018, and has since been rewritten from Go to Rust. (And by the way, this has been my 20% project at Google, so rest assured that they are still possible!)

Alright. With that basic information laid out, let’s take a look at what this is.

What is sandboxfs?

sandboxfs is a FUSE file system that exposes an arbitrary view of the host’s file system. In other words: it exposes directories or files from your file system in a completely different layout without having to copy or symlink them. You can think of sandboxfs as an advanced version of bindfs (or mount --bind or mount_null(8) depending on your system) in which you can combine and nest files.

To put this in a concrete example, let’s say you want to create a container to run a compiler in. You could do something like the following:

$ sandboxfs \
    --mapping=rw:/:/tmp/scratch \
    --mapping=ro:/usr/bin:/usr/bin \
    --mapping=ro:/usr/include:/usr/include \
    --mapping=ro:/usr/lib:/usr/lib \
    /tmp/mnt

which would create a hierarchy under /tmp/mnt. Under that tree, all writes would go to /tmp/scratch except for the /usr/bin, /usr/include and /usr/lib subdirectories which would be read-only and correspond to the real directories at root. With this, you could constrain a compiler under /tmp/mnt (say, using chroot) and have it see only the things you allowed it to see.
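To make the effect of those mappings concrete, here is a minimal sketch of how a path inside the mount point could resolve to an underlying path, assuming that the most specific (longest) matching mapping wins. This is only a conceptual model of the semantics described above, not the actual sandboxfs implementation; the `Mapping` type and the `resolve` and `example_mappings` functions are hypothetical names made up for illustration.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical mapping entry: a path inside the mount point, the underlying
/// path it exposes, and whether writes are allowed through it.
struct Mapping {
    path: PathBuf,
    target: PathBuf,
    writable: bool,
}

/// Resolves `path` (expressed relative to the root of the mount point) to the
/// underlying path and its writability, using the longest matching mapping.
fn resolve(mappings: &[Mapping], path: &Path) -> Option<(PathBuf, bool)> {
    mappings
        .iter()
        .filter(|m| path.starts_with(&m.path))
        .max_by_key(|m| m.path.components().count())
        .map(|m| {
            // `strip_prefix` cannot fail because `starts_with` held above.
            let rest = path.strip_prefix(&m.path).unwrap();
            (m.target.join(rest), m.writable)
        })
}

/// The same mappings as in the command line above.
fn example_mappings() -> Vec<Mapping> {
    let m = |path: &str, target: &str, writable: bool| Mapping {
        path: PathBuf::from(path),
        target: PathBuf::from(target),
        writable,
    };
    vec![
        m("/", "/tmp/scratch", true),
        m("/usr/bin", "/usr/bin", false),
        m("/usr/include", "/usr/include", false),
        m("/usr/lib", "/usr/lib", false),
    ]
}

fn main() {
    let mappings = example_mappings();
    for p in ["/usr/bin/cc", "/home/user/out.o"] {
        let (target, writable) = resolve(&mappings, Path::new(p)).unwrap();
        let mode = if writable { "rw" } else { "ro" };
        println!("{} -> {} ({})", p, target.display(), mode);
    }
}
```

In this model, a path such as /usr/bin/cc inside the mount point lands on the real, read-only /usr/bin/cc, while anything outside of the three read-only subtrees falls through to the writable /tmp/scratch.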

So far so good. But the real key feature of sandboxfs is that it provides a “side channel” to reconfigure the mappings at runtime without having to remount the file system. This is critical for the Bazel use case described below, as it is what allows us to achieve good performance.

Use cases

sandboxfs was primarily targeted at making Bazel’s sandboxing mechanism faster. The Bazel build system allows each build action to be isolated from the rest of the system and, to do so, it currently constructs a sandbox using symlinks. This approach has two problems: symlinks do not scale because they require a system call for each input, and symlinks are “unsafe” because some tools like to use their real paths (possibly “escaping” the sandbox). You can read much more about this topic in the Introducing sandboxfs post.
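To illustrate why the symlink approach scales poorly, here is a rough sketch of what populating a sandbox with symlinks involves. This is not Bazel’s code: the `populate_with_symlinks` function and the input paths are hypothetical, and the sketch assumes a Unix-like system.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::{Path, PathBuf};

/// Populates `sandbox_root` with one symlink per input and returns how many
/// links were created. Each entry in `inputs` pairs a path relative to the
/// sandbox with the real file it should point at.
fn populate_with_symlinks(
    sandbox_root: &Path,
    inputs: &[(PathBuf, PathBuf)],
) -> io::Result<usize> {
    let mut created = 0;
    for (relative, real) in inputs {
        let link = sandbox_root.join(relative);
        if let Some(parent) = link.parent() {
            // Extra system calls to create any missing directories.
            fs::create_dir_all(parent)?;
        }
        // At least one symlink(2) call per input, for every single action.
        symlink(real, &link)?;
        created += 1;
    }
    Ok(created)
}

fn main() -> io::Result<()> {
    let root = std::env::temp_dir().join(format!("symlink-sandbox-{}", std::process::id()));
    fs::create_dir_all(&root)?;
    // Made-up inputs; the targets do not need to exist for the links to be created.
    let inputs = vec![
        (PathBuf::from("src/main.c"), PathBuf::from("/tmp/workspace/src/main.c")),
        (PathBuf::from("include/util.h"), PathBuf::from("/tmp/workspace/include/util.h")),
    ];
    let n = populate_with_symlinks(&root, &inputs)?;
    println!("created {} symlinks under {}", n, root.display());
    // Any tool can read a link and learn the real path outside of the sandbox.
    println!("{}", fs::read_link(root.join("src/main.c"))?.display());
    fs::remove_dir_all(&root)
}
```

The cost grows with the number of inputs of each action, and the final `read_link` call shows how a tool can discover the real location of a file and step outside of the sandbox.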

But there are more use cases than this. The one that I’d like to get to at some point, and the one that really sparked the idea for this project even before Bazel, is pkg_comp. pkg_comp uses chroot sandboxes to build pkgsrc packages in and, at the moment, constructs the contents of the chroot using either bindfs or mount_null depending on the platform. Instantiating many mount points is both problematic (e.g. each mount operation can be slow and OSXFUSE has a limit on the number of mount points) and fragile. Using sandboxfs would let pkg_comp instantiate a single mount point, resolving these issues.

Lastly, you can imagine this being useful for cases where you have to generate a file system that resembles a “real system” in which to run contained commands or services – i.e. the root file system of a container. Using sandboxfs, you can create such a view very cheaply, both in time and in disk space.

Rust rewrite

The original implementation of sandboxfs was in Go, which was great to make something functional and relatively robust. Unfortunately, I reached a point where that implementation was showing performance problems that were hard to overcome and, at that time, I was getting interested in Rust.

If you remember from my Rust review: Introduction post from almost a year ago, I decided to learn Rust and, as part of the process, rewrite sandboxfs in it. Well, the sandboxfs 0.1.0 release is the culmination of this experiment!

The journey has been mostly good. I’m much more confident about the new codebase than the old one due to the robustness enforced by the language, and I feel we can obtain all possible performance out of it. Profiling the Go version showed bottlenecks in Go’s runtime, and Rust doesn’t have a runtime. That might have been fixable… but after having spotted a bunch of concurrency bugs in the Go code while writing the Rust version (because Rust wouldn’t just let me transcribe the old code structure “as is”), I can’t say I want to go back to the old version.

At this point, however, the major problem is the lack of maturity in critical dependencies. I’m particularly concerned about the FUSE library, for example, as it feels mostly abandoned and in poor shape (which is the same thing that happened with Go, by the way). I mean, it works, but it lacks some important features, like kernel cache invalidations, and their absence prevents implementing sandboxfs completely. Upgrading the library to support a more recent FUSE protocol version is certainly possible, but that’d likely mean forking and maintaining the bindings.

Performance

The million-dollar question: is the new Rust version faster than the old Go version? Well… it should be, and original prototyping showed that it was… but I’m having trouble getting good measurements with Bazel builds. In fact, even the Go version of a year ago is behaving much worse than it did!?

I think a bug deep in our build stack is responsible for these regressions. We recently diagnosed what the problem was and fixed it, so I’m hopeful that I’ll be able to get good measurements once the fix rolls out. At that point I’ll benchmark large builds again and publish a post in the Bazel blog.

To be honest, the inability to measure great performance is the reason why I’ve postponed this release for so long. I wanted to be able to post impressive numbers at release time… but delaying the release any further is harmful. Given that sandboxfs now works well enough in combination with Bazel, having a simple installation mechanism from crates.io and prebuilt binaries will be very helpful to let people try things out.

Parting words

Anyway, time for you to try this out!

If you have Cargo installed, you should be up and running with a simple cargo install sandboxfs (assuming you have pkg-config and the FUSE development libraries on your system). Otherwise, please head to the installation instructions for more details and alternative installation methods.

Make sure to read the sandboxfs(1) manual page once installed as it contains a lot more information on usage and caveats. Unfortunately I cannot easily serve it preformatted in HTML (manview looked very promising… but it fails to render this page correctly).

NOTE: If you are a Bazel user… simply pass --experimental_use_sandboxfs to Bazel to get faster sandboxed builds! But, as the name implies, this is experimental so expect the possibility of breakage.

And I need help in many areas! Roughly:

  • Addressing To-Dos in the codebase: there are plenty of “little” problems annotated with TODO which would deliver more features, fix some obvious corner cases, and improve performance.

  • Figuring out what to do with the FUSE library. I would like to bring it up to a current kernel protocol version and make it multithreaded, but these changes are very intrusive and would likely require a fork. Not sure I have the time to maintain that…

  • Adding more features to OSXFUSE. In particular, the thing I would really like is to add getattrlist(2) support to the kernel extension (dunno if we’d need to change the protocol). Without this, Xcode tools do not run within a chroot… which prevents the pkg_comp use case described above.

Either way, let me know how it goes or if you would like to help 😊
