Bazel Dynamic Execution - Julio Merino (jmmv.dev)

Tree artifacts and transient files

To conclude the deep dive into Bazel’s dynamic spawn strategy, let’s look at the nightmare that tree artifacts have been with the local lock-free feature. And, yes, I’m double-posting today because I really want to finish this series before the end of the decade¹!

Tree artifacts are a fancy name for action outputs that are directories, not files. What’s special about them is that Bazel does not know a priori what the directory contents are: the rule behind the action just specifies that there will be a directory with files, and Bazel has to treat that as the unit of output from the action. Other than that, tree artifacts are “just” a different kind of output².

December 31, 2019 · Tags: bazel
Continue reading (about 4 minutes)

Lifting the local lock for dynamic execution

In the previous post, we saw how accounting for artifact download times makes the dynamic strategy live up to its promise of delivering the best of local and remote build times.

Or does it? If you think about it closely, that change made it so that builds that were purely local couldn’t be made worse by enabling the dynamic scheduler: the dynamic strategy would always favor the local branch of a spawn if the remote one took a long time. But for builds that were better off when they were fully remote (think of a fully-cached build with great networking), this is not true: the dynamic strategy might hurt them because we may discard some of those remote cache hits.

December 31, 2019 · Tags: bazel
Continue reading (about 4 minutes)

Artifact downloads and dynamic execution

In the previous post of this series, we looked at how the now-legacy implementation of the dynamic strategy uses a per-spawn lock to guard accesses to the output tree. This lock is problematic for a variety of reasons and we are going to peek into one of those here.

To recap, the remote strategy does the following:

1. Send spawn execution RPC to the remote service.
2. Wait for successful execution (which can come quickly from a cache hit).
3. Lock the output tree (only when run within the dynamic strategy).
4. Download the spawn’s outputs directly into the output tree.

Note how we lock the output tree before we have downloaded any outputs, and taking the lock means that the local branch of the same spawn cannot start or complete even if there are plenty of local resources available to run it.
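The ordering problem in the steps above can be sketched in a few lines of Python. This is only an illustration and not Bazel's code: `OutputTree`, `run_remote_branch`, and the `execute_remotely` and `download` callables are hypothetical names invented for this example. The point to notice is that the lock is already held during the whole download loop.

```python
import threading


class OutputTree:
    """Hypothetical stand-in for the output tree and its per-spawn lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.files = {}


def run_remote_branch(tree, execute_remotely, download, events):
    """Mimics the legacy ordering: lock first, then download the outputs."""
    events.append("rpc-sent")
    output_names = execute_remotely()  # Steps 1 and 2: send the RPC and wait.
    events.append("executed")
    with tree.lock:  # Step 3: lock the output tree.
        events.append("locked")
        for name in output_names:  # Step 4: download while holding the lock.
            tree.files[name] = download(name)
        events.append("downloaded")
    return events
```

While the `with tree.lock` block runs, anything else that needs the same lock (the local branch, in this analogy) has to wait, no matter how slow the downloads are.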

December 30, 2019 · Tags: bazel
Continue reading (about 5 minutes)

Output conflicts and dynamic execution

When the dynamic scheduler is active, Bazel runs the same spawn (aka command line) remotely and locally at the same time via two separate strategies. These two strategies want to write to the same output files (e.g. object files, archives, or final binaries) on the local disk. In computing, two things trying to affect the same thing require some kind of coördination.

You might think, however, that because we assume that both strategies are equivalent and will write the same contents to disk¹, this is not problematic. But, in fact, it can be, because file creations/writes are not atomic. So we need some form of mutual exclusion in place to avoid races.
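As a rough illustration of that mutual exclusion, here is a minimal Python sketch in which two branches try to produce the same output file and a shared lock lets only the first one write it. The `write_output` helper and its `winner` list are hypothetical and exist only for this example; they do not reflect how Bazel implements its lock.

```python
import os
import tempfile
import threading


def write_output(path, contents, lock, winner, branch):
    """Writes the output only if no other branch has already done so."""
    with lock:  # Only one branch at a time may touch the output file.
        if winner:
            # The other branch already wrote the file; leave it alone.
            return False
        with open(path, "wb") as f:
            f.write(contents)
        winner.append(branch)
        return True


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "foo.o")
    lock, winner = threading.Lock(), []
    threads = [
        threading.Thread(
            target=write_output, args=(path, b"object code", lock, winner, b)
        )
        for b in ("local", "remote")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("branch that wrote the file:", winner[0])
```

Which branch wins depends on thread scheduling, but the file is written exactly once and never by both branches at the same time.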

December 27, 2019 · Tags: bazel, sandboxfs
Continue reading (about 4 minutes)

Bazel's dynamic strategy

After introducing Bazel’s dynamic execution a couple of posts ago, it’s time to dive into its actual implementation details as promised. But pardon the interruption in the last post, as I had to take a little detour to cover a necessary topic (local resources) for today’s article.

Simply put, dynamic execution is implemented as “just” one more strategy called dynamic. The dynamic strategy, however, is different from all others because it does not have a corresponding spawn runner. Instead, the dynamic strategy wraps two different strategies: one for local execution and one for remote execution.
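To make the wrapping idea concrete, here is a minimal Python sketch, assuming each strategy is just a callable that takes a spawn and returns a result. The `dynamic_strategy` function and its parameters are hypothetical and do not mirror Bazel's Java classes. The sketch runs both wrapped strategies at once and returns whichever finishes first.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def dynamic_strategy(spawn, local_strategy, remote_strategy):
    """Runs both wrapped strategies and returns (branch_name, result)."""
    pool = ThreadPoolExecutor(max_workers=2)
    futures = {
        pool.submit(local_strategy, spawn): "local",
        pool.submit(remote_strategy, spawn): "remote",
    }
    done, pending = wait(futures, return_when=FIRST_COMPLETED)
    for future in pending:
        # Best effort only: Python cannot interrupt a thread that is already
        # running, so the losing branch simply has its result ignored.
        future.cancel()
    pool.shutdown(wait=False)
    first = next(iter(done))
    return futures[first], first.result()
```

Note that the sketch ignores everything that makes the real problem hard, such as stopping the losing branch promptly and keeping the two branches from clobbering each other's outputs.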

December 26, 2019 · Tags: bazel
Continue reading (about 3 minutes)

Introduction to Bazel's dynamic execution

Bazel’s dynamic execution is a feature that makes your builds faster by using remote and local resources, transparently and at the same time. We launched this feature in Bazel 0.21 back in February 2019 along with an introductory blog post and have been hard at work since then to improve it.

The reason dynamic execution makes builds faster is two-fold:

first, because we can hide hiccups in the connectivity to the remote build service; and,
second, because we can take advantage of things like persistent workers, which are designed to offer super-fast edit/build/test cycles.

Put in numbers, here is what dynamic execution looks like for a relatively large iOS build I measured at Google a few months ago:

December 20, 2019 · Tags: bazel
Continue reading (about 3 minutes)
