Today marks the 10th anniversary of Bazel’s public announcement, so this is the perfect moment to reflect on what the next generation of build systems in the Bazel ecosystem may look like.

I write this with the inspiration that comes from attending the first-ever conference on Buildbarn, one of the many remote execution systems for Bazel. At the conference, Ed Schouten, the creator of Buildbarn, presented Bonanza: a skunkworks reimagination of Bazel for truly large builds.

In this article, I want to dive into what Bonanza is and which similar attempts to “replace Bazel” have existed. To get there, though, we first need a critique of the current implementation of Bazel.

Problems scaling up

The predecessor to Bazel, Blaze, is a build system designed at Google for Google’s monorepo scale. Blaze grew over the years assuming:

  • that every engineer had a beefy workstation under their desk;
  • that remote execution would be used by default;
  • that the remote execution cluster was reachable through a fast and low latency network; and
  • that each office had physical hardware hosting a local cache of remote build artifacts.

These assumptions allowed Blaze to “scale up” to the very large codebase that Google builds, but they came with some downsides.

One consequence of these assumptions is that the Bazel process that runs on your machine (confusingly named the “Bazel server”) is very resource hungry. The reason is that this process has to scan the source tree to understand the dependency graph and has to coordinate thousands of RPCs against a remote cluster, two operations that aren’t cheap. What’s worse is that the Bazel server process is stateful: at startup, Bazel goes through the expensive steps of computing the analysis graph from disk and, to prevent redoing this work in every build, Bazel keeps this graph in its in-memory “analysis cache”.

The analysis cache is fragile. The Bazel server process may auto-restart, and certain flags used to control the build cause the cache to be discarded. These are not rare flags, no: they include basic flags like -c, which switches the compilation mode between debug and release, among many others. Cache discards are very intrusive to user workflows because an iterative build that would have taken a second can suddenly take thirty, for example.

This user experience degradation makes Bazel’s front-page claim of being fast hard to believe. Bazel is really fast at running gigantic builds from scratch and it is really efficient when executing incremental builds. But the problem is that “truly incremental builds” are a rarity, so you end up paying the re-analysis cost many more times than is necessary. If you run Bazel in a CI environment, you know that these costs are far from negligible because every single Bazel process invocation on a fresh CI node can take minutes to “warm up”.
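One small way teams cope with this today is to keep the flag set stable across invocations, for example via .bazelrc configs, so that routine commands never flip the flags that invalidate the analysis cache. It is a workaround rather than a fix, but it shows how directly the flag set maps onto cache retention. A minimal, illustrative sketch (the config name is made up):

    # Keep interactive builds on one stable compilation mode so that routine
    # invocations never change -c and thus never discard the analysis cache
    # for that reason.
    build --compilation_mode=fastbuild

    # Make release builds an explicit opt-in (bazel build --config=release),
    # accepting that switching to them will drop the in-memory state.
    build:release --compilation_mode=opt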

Problems scaling down

There is also the other side of the coin, which is that Bazel does not scale down very well. This was one of my original critiques when Bazel went open source in 2015: at that time, I really wished for a declarative build system like Bazel to replace the mess that was and is Make plus the GNU Autotools, but the fact that Bazel was written in Java meant that it would never do this (mostly for non-technical reasons).

Regardless, I did join the Blaze team after that, and I spent most of my time coercing Blaze and Bazel to run nicely on “small” laptop computers. I succeeded in some areas, but it was a losing battle: Java has certain deficiencies that prevent implementing lean software. Project Loom and Project Valhalla promise to bring the necessary features to Java, but these features aren’t quite there yet, and even when they land, retrofitting them into Bazel will be a very hard feat.

In any case: Bazel works, and it works nicely for many use cases, but it lives in a limbo state where it isn’t awesome for very large builds and it isn’t awesome for very small builds either. So, let’s look at the former: how do we make Bazel awesome for humongous builds? By lifting it in its entirety to the cloud.

Enter Bonanza

Bonanza is Ed’s playground for a new build system: a build system that takes remote execution to the limit. Where Bazel is only capable of shipping individual build actions to the cloud, Bonanza uses the cloud for everything, including external dependency handling, build graph construction and iteration, and more.

To understand what Bonanza brings to the table in the context of hugely scalable builds, let’s distill the salient points that Ed highlighted in his “Bonanza in a nutshell” slide from his conference presentation (recording coming soon):

  • Bonanza can only execute build actions remotely. There is no support for local execution, which makes the build driver (the process that runs on your machine) simpler and eliminates all sorts of inconsistencies that show up when mixing local and remote execution. Bazel’s local execution strategies try to enforce hermeticity, but they don’t always succeed because of sandboxing limitations, which is exactly how those inconsistencies creep in.

  • Bonanza performs analysis remotely. When traditional Bazel is configured to execute all actions remotely, the Bazel server process is essentially a driver that constructs and walks a graph of nodes. This in-memory graph is known as Skyframe and is used to represent and execute a Bazel build. Bonanza lifts the same graph machinery out of the Bazel server process, puts it into a remote cluster, and relies on a distributed persistent cache to store the graph’s nodes. The consequence of storing the graph in a distributed storage system is that, all of a sudden, all builds become incremental. There is no more “cold build” effect like the one you see with Bazel when you lose the analysis cache.

  • Bonanza runs repo rules remotely. Repo rules are what Bazel uses to interact with out-of-tree dependencies: they can download Git repositories or toolchain binaries, or detect which compiler exists on the system (a sketch follows this list). What you should know is that Blaze did not and still does not have repo rules or workspace support because Google uses a strict monorepo. Both repo rules and the workspace were bolted-on additions to Blaze when it was open-sourced as Bazel, and it shows: these features do not integrate cleanly with the rest of Bazel’s build model, and they have been clunky for years. Bonanza fixes these issues with a cleaner design.

  • Bonanza encrypts data in transit and at rest. Bonanza brings to life some of the features discussed for the Remote Execution v3 protocol, which never saw the light of day, and encryption is one of them. By encrypting all data that flows through the system, Bonanza can enforce provenance guarantees if you control the action executors. This is important because it allows security-conscious companies to more easily trust remote build service providers.

  • Bonanza only supports rules written in Starlark. When Bazel launched, it included support for Starlark: a new extensibility language in which to write build logic. Unfortunately, for historical reasons, Bazel’s core still included Java-native implementations of the most important and complex rules: namely C++, Java, and protobuf. Google has been chasing the dream of externalizing all rule implementations into Starlark for the last 10 years, and only with Bazel 8 has it mostly achieved this goal. Bonanza starts with a clean design that requires build logic to be written in Starlark (a sketch of such a rule also follows this list), and it pushes this to the limit: almost everything, including flags, is Starlark.

  • Bonanza aims to be Bazel-compatible. Of the modern build systems that use a functional evaluation model like Bazel’s, only Bazel has been able to grow a significant community. This means that the ecosystem of tools and rules, as well as critical features like good IDE support, is thriving in Bazel, which cannot be said of other systems. Bonanza makes the right choice of being Bazel-compatible so that it can reuse this huge ecosystem. Anyone willing to evaluate Bonanza will be able to do so with relative ease.
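To make the repo rules point above more concrete, here is a minimal sketch of what one looks like in today’s Bazel, written against the standard repository_rule API in Starlark. The rule name and URL are made up for illustration:

    def _prebuilt_toolchain_impl(repository_ctx):
        # Repo rules run arbitrary logic outside the build graph proper: they
        # can probe the host (e.g. repository_ctx.which("cc") to find a system
        # compiler) and fetch external artifacts. Here we just fetch an
        # archive; the URL is made up and a real rule would pin a sha256.
        repository_ctx.download_and_extract(
            url = "https://example.com/toolchain-1.2.3.tar.gz",
        )

        # Generate a BUILD file so that regular targets can depend on the
        # contents of this external repository.
        repository_ctx.file(
            "BUILD.bazel",
            'filegroup(name = "all", srcs = glob(["**"]), visibility = ["//visibility:public"])\n',
        )

    prebuilt_toolchain = repository_rule(
        implementation = _prebuilt_toolchain_impl,
    )

Logic like this touches the network and the host machine outside the regular action graph, which is part of why it has historically fit so poorly into the rest of Bazel’s model.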
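As for the Starlark-only point, this is roughly what a trivial rule looks like when its logic lives entirely in Starlark, using Bazel’s public rule and actions APIs. The concat rule itself is a made-up example that merely concatenates its sources:

    def _concat_impl(ctx):
        # Declare the output file and register the single action that
        # produces it from the rule's sources.
        out = ctx.actions.declare_file(ctx.label.name + ".txt")
        ctx.actions.run_shell(
            inputs = ctx.files.srcs,
            outputs = [out],
            command = "cat {} > {}".format(
                " ".join([src.path for src in ctx.files.srcs]),
                out.path,
            ),
        )
        return [DefaultInfo(files = depset([out]))]

    concat = rule(
        implementation = _concat_impl,
        attrs = {
            "srcs": attr.label_list(allow_files = True),
        },
    )

An engine that only ever sees definitions like this one, with no Java-native rules hiding underneath, is what makes swapping the engine itself feasible.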

When you combine all of these points, you have a build system where the client process on your development machine or on CI is thin: all the client has to do is upload the project state to the remote execution cluster, which in the common case will involve just uploading modified source files. From there, the remote cluster computes the delta of what you uploaded versus any previously-built artifacts and reevaluates the minimum set of graph nodes to produce the desired results.

Are your hopes up yet? Hopefully so! But beware that Bonanza is just a proof of concept for now. The current implementation shows that all of the ideas above are feasible and it can fully evaluate the complex bb-storage project from scratch—but it doesn’t yet provide the necessary features to execute the build. Ed appreciates PRs though!

Meanwhile, at Google

My time at Google is now long behind me, as I left almost five years ago, but back then there were two distinct efforts that attempted to tackle the scalability issues described in this article. (I can’t name names because they were never public, but if you probe ChatGPT to see if it knows about these efforts, it somehow knows specific details.)

One such effort was to make Blaze “scale up” to handle even larger builds by treating them all as incremental. The idea was to persist the analysis graph in a distributed storage system so that Blaze would never have to recompute it from scratch. This design still kept a relatively fat Blaze process on the client machine, but otherwise it is quite similar to what Bonanza does. Google had advantages over Bonanza in terms of simplicity because, as I mentioned earlier, Blaze works in a pure monorepo and does not have to worry about repo rules.

The other such effort was to make Blaze “scale down” by exploring a rewrite in Go. This rewrite was carefully crafted to optimize memory layouts and avoid the pointer chasing that is unavoidable in Java. The results of this experiment proved that Blaze could be made to analyze large portions of Google’s build graph in just a fraction of the time, without any sort of analysis caching or significant startup penalties. Unfortunately, this rewrite was way ahead of its time: the important rulesets required to build Google’s codebase were still implemented inside of Blaze as Java code, so this experimental build system couldn’t do anything useful outside of Go builds.

Meanwhile, at Meta

My knowledge of Meta’s build system is limited, but almost two years ago, Meta released Buck 2, a complete reimplementation of their original Bazel-inspired build system. This reimplementation checked most of the boxes for what I thought a “Bazel done right” would look like:

  • Buck 2 is written in a real systems language (Rust). As explained earlier, I had originally criticized the choice of Java as one of Bazel’s weaknesses. It turns out Meta realized the same thing: Buck 1 had also been written in Java, and they chose to go the risky full-rewrite route to fix it. (To be fair, you need to understand that Java had been a reasonable choice back then: when both Blaze and Buck 1 were originally designed, C++11, possibly the only reasonable edition of C++, didn’t even exist.)

  • Buck 2 is completely language agnostic. Its core build engine does not have any knowledge of the languages it can build, and all language support is provided via Starlark extensions. This stems from learning about earlier design mistakes of both Blaze and Buck 1. Meta chose to address this as part of the Rust rewrite, whereas Google has been addressing it incrementally.

  • Buck 2 has first-class support for virtual file systems. These are a necessity when supporting very large codebases and when integrating with remote build systems, yet they remain completely optional for those who don’t need them. Blaze also had support for these, but Bazel does not.

At launch, I was excited and eager to give Buck 2 a try, but then the disappointment came: as much as it walks and quacks like Bazel due to its use of Starlark… the API that Buck 2 exposes to define rules is not compatible with Bazel’s. This means that Buck 2 cannot be used in existing Bazel codebases, so the barrier to evaluating its merits in a real codebase is… insurmountable. In my mind, this made Buck 2 dead on arrival, and it remains to be seen whether Meta will be able to grow a significant public ecosystem around it.

In any case, I do not have any experience with Buck 2 for the reasons above, so I cannot speak to its ability to scale either up or down. And this is why I wrote this section: to highlight that being Bazel-compatible is critical in this day and age if you want to have a chance at replacing a modern system like Bazel. Bonanza is Bazel-compatible, so it has a chance of demonstrating its value with ease.

What lies ahead

If you ask me, it seems impossible to come up with a single build system that can satisfy both the wishes of tiny open-source projects, which long for a lean and clean build system, and the versatility and scale requirements of vast corporate codebases.

Bazel-the-implementation tries to appeal to both and falls short, yet Bazel-the-ecosystem provides the lingua franca that any such implementation needs to support.

My personal belief is that we need two build systems that speak the same protocol (Starlark and Bazel’s build API) so that users can interchangeably choose whichever one works best for their use case:

  • On the one hand, we need a massively scalable build system that does all of the work in the cloud. This is to support building monorepos, to support super-efficient CI runs, and to support “headless” builds like those offered by hosted VSCode instances. Bonanza seems to have the right ideas and the right people behind it to fulfill this niche.

  • On the other hand, we need a tiny build system that does all of the work locally and that can be used by the myriad of open-source projects that the industry relies on. This system has to be written in Rust (oops, I said it) with minimal dependencies and be kept lean and fast so that IDEs can communicate with it quickly. This is a niche that nobody fulfills right now and that my mind keeps coming back to; it’d be fun to create this project, given my last failed attempt.
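To make the “same protocol” idea tangible: both systems would consume the exact same BUILD files and Starlark rules. A trivial illustration, using the standard rules_cc ruleset (the target is made up):

    load("@rules_cc//cc:defs.bzl", "cc_binary")

    cc_binary(
        name = "hello",
        srcs = ["hello.cc"],
    )

Whether this target gets analyzed by a thin client talking to a cluster or by a lean local binary should be an implementation detail, not something the BUILD file has to know about.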

The time for these next-generation Bazel-compatible build systems is now. Google has spent the last 10 years Starlark-ifying Bazel, making the core execution engine replaceable. We are reaching a point where the vast majority of the build logic can be written in Starlark, as Bonanza proves, and thus we should be able to have different build tools that implement the same build system for different use cases.