A key feature of Bazel is its ability to produce fast, reliable builds by caching the output of actions. This system, however, relies on a fundamental principle: build actions must be deterministic. For the most part, Bazel helps ensure that they are, but in the odd cases when they aren’t, builds can fail in subtle and frustrating ways, eroding trust in the build system.

This article is the first in a series on Bazel’s execution model. Having explained these concepts many times, I want to provide a detailed reference before presenting a problem from work and the cool solution I recently developed for it. We will start with action non-determinism, then cover remote caching and execution, and finally explore the security implications of these features.

This first article explains what non-determinism is, how it manifests, and how you can diagnose and prevent it in your own builds. Let’s begin.

Bazel execution basics

Consider the following example build file:

cc_library(
    name = "bs5-lib",
    srcs = ["lib1.c", "lib2.c"],
)

cc_binary(
    name = "bs5-bin",
    srcs = ["main.c"],
    deps = [":bs5-lib"],
)

This build file specifies two targets: the bs5-lib target, which builds a C library from two source files, and the bs5-bin target, which builds a C binary from one source file and links it against the bs5-lib library.

These two targets instantiate the cc_library and cc_binary rules by binding them to specific attributes (the values of srcs and deps).

Processing these rules during dependency analysis yields a collection of actions:

  • The cc_library rule used to define the bs5-lib target generates:

    • A CppCompile action to compile the lib1.c file into the lib1.o object file.

      Its command line may be: cc -o lib1.o lib1.c

    • A CppCompile action to compile the lib2.c file into the lib2.o object file.

      Its command line may be: cc -o lib2.o lib2.c

    • A CppArchive action to link lib1.o and lib2.o together into the bs5-lib.a archive.

      Its command line may be: ar rcsD bs5-lib.a lib1.o lib2.o

  • The cc_binary rule used to define the bs5-bin target generates:

    • A CppCompile action to compile the main.c file into the main.o object file.

      Its command line may be: cc -o main.o main.c

    • A CppLink action to link main.o and bs5-lib.a together into the bs5-bin executable.

      Its command line may be: cc -o bs5-bin main.o bs5-lib.a

Note that nowhere in the list above do you see target names. Actions work with file-level dependencies, not target-level dependencies. If you need to visualize this, think of the target dependency graph and the action dependency graph as two disjoint entities. (Skyframe tracks them as just one graph but we can ignore that fact here.)

Actions, not targets, are the atomic unit of execution in Bazel. Once Bazel is done with its loading and analysis phases, it enters the execution phase. During execution, the “only” thing that Bazel does is dispatch actions via its execution strategies, trying to maximize parallelism within the constraints of the action dependency graph.
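
To make the file-level graph concrete, here is a minimal Python sketch that groups the five actions from the example into “waves” that could run in parallel. The dictionary layout and the execution_waves helper are my own illustration, not how Bazel or Skyframe represent the graph internally.

```python
from graphlib import TopologicalSorter

# Hypothetical model: each generated file maps to the files it is built from.
# File names come from the example above; the layout is illustrative only.
ACTION_INPUTS = {
    "lib1.o": {"lib1.c"},
    "lib2.o": {"lib2.c"},
    "bs5-lib.a": {"lib1.o", "lib2.o"},
    "main.o": {"main.c"},
    "bs5-bin": {"main.o", "bs5-lib.a"},
}


def execution_waves(action_inputs):
    """Group generated files into waves whose actions could run in parallel."""
    sorter = TopologicalSorter(action_inputs)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        # Source files have no producing action, so keep generated files only.
        actions = [name for name in ready if name in action_inputs]
        if actions:
            waves.append(actions)
        sorter.done(*ready)
    return waves


waves = execution_waves(ACTION_INPUTS)
# The three compiles come first, then the archive, then the final link.
```

Note that no target name appears anywhere in the sketch: the ordering falls out of the file dependencies alone.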

Build actions anatomy

To break down an action into its parts, let’s examine what goes into defining the CppLink action above. We start with its simple command line, which produces the bs5-bin binary from the main.o object file and the bs5-lib.a static library:

cc -o bs5-bin main.o bs5-lib.a

Bazel tracks the command line as part of the action, but things are a bit more complex than that. And to explain the “complexity”, let’s try to understand what problems Bazel is trying to solve compared to a more rudimentary build tool like Make.

If you have used (or still use) Make, you would have likely expressed the corresponding build rule as:

bs5-main: main.o bs5-lib.a
    $(CC) $(LDFLAGS) -o bs5-main main.o bs5-lib.a

which looks… OK, I guess. But what happens if you do this?

# Build the binary.
make bs5-main

# Build the binary _again_ in stripped mode.
make bs5-main LDFLAGS=-s

An inconsistent build! The bs5-main binary is not stripped as you would expect because the second make invocation does nothing! make has no idea that the LDFLAGS variable is involved in the target definition so it doesn’t know that the target has to be rebuilt to honor the variable change.

This type of scenario is why Make-based build systems need a make clean from time to time: the outputs that Make produces get out of sync with environmental changes. The reason is that, to determine whether a target needs to be rebuilt, make tracks only the timestamps of the inputs that are explicitly listed in the rule (main.o and bs5-lib.a in this example).
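
The decision rule that bites us here fits in a few lines. This is a deliberately simplified Python model of Make’s staleness check (real Make has more cases, such as phony targets), and the make_needs_rebuild name and the integer timestamps are made up for illustration.

```python
def make_needs_rebuild(target_mtime, prerequisite_mtimes):
    """Return True if a Make-style tool would rebuild the target.

    target_mtime is None when the target does not exist yet.
    """
    if target_mtime is None:
        return True
    # Rebuild only if some listed prerequisite is newer than the target.
    return any(mtime > target_mtime for mtime in prerequisite_mtimes)


# Made-up timestamps: main.o and bs5-lib.a were written at t=100 and t=101,
# and the first link produced bs5-main at t=102.
first_run = make_needs_rebuild(None, [100, 101])  # Target is missing: rebuild.
second_run = make_needs_rebuild(102, [100, 101])  # LDFLAGS=-s is never consulted.
```

The value of LDFLAGS is not even a parameter of the function, which is exactly the problem.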

Now you’d say: but you can fix it! “Just” do:

# An unconditional rule that captures the content of LDFLAGS.
.PHONY: ldflags.stamp.2
ldflags.stamp.2:
    @echo $(LDFLAGS) >ldflags.stamp.2

# A conditional rule that only updates the timestamp of the stamp file if the
# content captured by the unconditional rule changes.
ldflags.stamp: ldflags.stamp.2
    @cmp -s ldflags.stamp ldflags.stamp.2 || cp ldflags.stamp.2 ldflags.stamp

# The rule we had before, but with an extra dependency on the stamp file.
bs5-main: main.o bs5-lib.a ldflags.stamp
    $(CC) $(LDFLAGS) -o bs5-main main.o bs5-lib.a

And indeed this ensures that the bs5-main target gets re-linked if LDFLAGS changes. But I hope you’ll agree that this is awful and that nobody does it because: one, most folks writing Makefiles aren’t aware of the problem; and, two, even if they are, it’s too hard to get it right (see… we forgot about the value of CC and whatever other environment variables might influence CC’s behavior like, you know, the PATH?).

I didn’t come here to bash Make. OK, maybe I did because folks out there often say “Make works just fine and it’s much simpler than Bazel!” when in reality they are oblivious to a bunch of very real problems that later waste other people’s time when their build environment subtly breaks. </rant>

Bazel and other next-generation build systems solve this specific problem and more by being comprehensive about what they track at the action level, and using that information to determine whether an action needs to be rebuilt or not. In particular, a Bazel action is defined by these parts:

  • The command line to execute, which in this case is cc -o bs5-bin main.o bs5-lib.a.

  • Hashes of the input files required to execute the command. These include “obvious” inputs like the source files specified in the targets but also the files required to execute the tools of the action (e.g. the compiler’s own files). In this case, the list of input files could look like: main.o, bs5-lib.a, and cc.

  • The configuration of the environment in which the action runs. This includes environment variables, the host and target platforms, and things like that. In this case, the configuration could include the value of PATH and whether we are building in debug or optimized mode. Configurations are expressed as a hash, though, because of the many details that go into computing them.

These three properties define quite precisely the relation between the context of an action and the outputs it produces, and this is the main technique that Bazel uses to avoid clean builds at large scale.
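
As a rough mental model, you can think of those three parts as being folded into a single key. The Python sketch below does that with SHA-256 over a JSON encoding; the action_key function, the encoding, and the placeholder digests are all assumptions of mine, and Bazel’s real computation is more involved.

```python
import hashlib
import json


def action_key(command_line, input_digests, environment):
    """Fold the three parts of an action into one hex digest (illustrative)."""
    payload = json.dumps(
        {
            "command_line": command_line,
            "inputs": sorted(input_digests.items()),
            "environment": sorted(environment.items()),
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Placeholder digests; real ones would be content hashes of each file.
inputs = {
    "main.o": "digest-of-main.o",
    "bs5-lib.a": "digest-of-bs5-lib.a",
    "cc": "digest-of-cc",
}
env = {"PATH": "/usr/bin:/bin"}

plain = action_key(["cc", "-o", "bs5-bin", "main.o", "bs5-lib.a"], inputs, env)
stripped = action_key(["cc", "-s", "-o", "bs5-bin", "main.o", "bs5-lib.a"], inputs, env)
# plain != stripped: adding -s yields a different action, so it must be rerun.
```

Unlike the Make rule from before, a change to the flags, to the compiler binary, or to PATH changes the key, so there is no stale output to clean up by hand.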

Hello non-determinism

But pay attention to the word “quite” in “quite precisely” right above. I did not say “perfectly” because there are still ways for non-deterministic behavior to leak into a Bazel build, meaning that bazel clean and unexpected rebuilds (e.g. due to cache expiration) could still change the behavior of a build.

Consider this innocuous example:

genrule(
    name = "date",
    outs = ["date.txt"],
    cmd = "date >$@",
)

This rule says: stick the output of the date command, which prints the current date, into the date.txt file. Obviously, “current date” varies over time so we should expect the above to give us trouble. And indeed it does: look at this sequence of commands where I’ve removed all irrelevant Bazel console noise:

$ bazel build //:date
INFO: 2 processes: 1 internal, 1 linux-sandbox.
$ cat bazel-bin/date.txt
Mon Jul  7 17:59:29 PDT 2025

$ bazel build //:date
INFO: 1 process: 1 internal.
$ cat bazel-bin/date.txt
Mon Jul  7 17:59:29 PDT 2025

$ bazel clean

$ bazel build //:date
INFO: 2 processes: 1 internal, 1 linux-sandbox.
$ cat bazel-bin/date.txt
Mon Jul  7 18:02:15 PDT 2025

$ █

The first Bazel build claims to have executed 1 action in the sandbox and the date.txt file shows us the date when that happened. The second Bazel build does nothing and date.txt remains unmodified. But if we later follow that by a Bazel clean and a third Bazel build, we see that the content of date.txt is now different. Non-determinism has leaked into the build, and… that’s problematic.

How bad is non-determinism really?

Non-determinism is a problem because it prevents achieving reproducible builds. On the one hand, this voids the security guarantees that come from being able to reproduce builds in different environments: if the output of a build is not bit-for-bit identical every time it is built from the same inputs, you can’t verify that a binary that’s being used in production actually comes from the sources it claims to have been built from. On the other hand, this leads to situations where developers get different behavior depending on when/where they build the code: you do not want to hear the “works on my machine” excuse when troubleshooting a bug. So, it is bad.

But one interesting property of Bazel’s action model is that a single non-deterministic action does not necessarily poison the whole build. Take a look at this build file that defines a chain of actions:

# date target: Writes a non-deterministic date to its output.
genrule(
    name = "date",
    outs = ["date.txt"],
    cmd = "date >$@",
)

# copy target: Consumes the output of "date" and copies it to
# its output.
genrule(
    name = "copy",
    outs = ["copy.txt"],
    cmd = "cp $< $@",
    srcs = [":date"],
)

# count target: Consumes the output of "copy" and produces an
# output that does not vary due to the input non-determinism.
genrule(
    name = "count",
    outs = ["count.txt"],
    cmd = "wc -l $< >$@",
    srcs = [":copy"],
)

# copy2 target: Consumes the output of "count" and copies it to
# its output.
genrule(
    name = "copy2",
    outs = ["copy2.txt"],
    cmd = "cp $< $@",
    srcs = [":count"],
)

The interesting bit here is in the count target, which counts the lines in its input and writes the resulting number to its output. While this target consumes a non-deterministic input, its output is deterministic because the number of lines in the input is constant: date writes a different timestamp each time, but it always produces one line.

The fact that the target produces a deterministic output allows Bazel to stop “propagating” non-determinism across the build. Remember that actions track input hashes, not input timestamps. Once count is re-executed after changes to date, the output of count will have the same hash as it did before, and Bazel will conclude that copy2 doesn’t need to be rerun.

Let’s try it:

$ bazel build //:copy2
INFO: 5 processes: 1 internal, 4 linux-sandbox.

$ rm -f bazel-bin/date.txt

$ bazel build //:copy2 --explain log
INFO: 4 processes: 1 action cache hit, 1 internal, 3 linux-sandbox.

$ cat log
Build options: --explain=log
Executing action 'BazelWorkspaceStatusAction stable-status.txt': unconditional execution is requested.
Executing action 'Executing genrule //:date': One of the files has changed.
Executing action 'Executing genrule //:copy': One of the files has changed.
Executing action 'Executing genrule //:count': One of the files has changed.

$ █

The sequence of commands above proves the point: the first build of the copy2 target tells us that Bazel executed 4 sandboxed actions (one for each target). If we then remove the non-deterministic file from the output tree and ask Bazel to rebuild the copy2 target, we see that it re-executed only 3 actions and that the fourth one scored an action cache hit. And by inspecting the log we asked Bazel to produce, we can confirm that it rebuilt date, copy, and count, but it didn’t have to rebuild copy2 because the non-determinism didn’t propagate further.
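
If it helps, here is a toy Python simulation of the same experiment. It is only a sketch: the build function, the fake clock, and the bookkeeping dictionaries are inventions of mine that mimic the “compare input hashes” idea, not Bazel’s actual action cache.

```python
import hashlib


def digest(content):
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def build(steps, outputs, input_digests):
    """Run a linear chain of actions and return the names that were executed.

    steps: list of (name, source_name_or_None, function) in dependency order.
    outputs: name -> content produced by that action (the output tree).
    input_digests: name -> digest of the input the action last ran with.
    """
    executed = []
    for name, source, run in steps:
        source_content = outputs[source] if source else ""
        current = digest(source_content)
        if name in outputs and input_digests.get(name) == current:
            continue  # Output exists and its input hash is unchanged: skip.
        outputs[name] = run(source_content)
        input_digests[name] = current
        executed.append(name)
    return executed


# Stand-in for the date command: a different value on every call.
fake_clock = iter(["Mon Jul  7 17:59:29 PDT 2025\n", "Mon Jul  7 18:02:15 PDT 2025\n"])

steps = [
    ("date", None, lambda _: next(fake_clock)),
    ("copy", "date", lambda text: text),
    ("count", "copy", lambda text: str(text.count("\n"))),
    ("copy2", "count", lambda text: text),
]

outputs, input_digests = {}, {}
first = build(steps, outputs, input_digests)   # All four actions run.
del outputs["date"]                            # Like: rm -f bazel-bin/date.txt
second = build(steps, outputs, input_digests)  # copy2 is skipped.
```

The second build reruns date, copy, and count, but count produces the same content as before, so the input hash that copy2 sees is unchanged and the chain stops there.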

In a Make world, the above sequence of commands would have invalidated the whole build because Make just checks timestamps, and targets almost always update the timestamps of their outputs unless we go to great lengths to prevent it (like I did earlier in the stamp file rule with its call to cmp -s).

Possible non-determinism causes

In the previous example, it was rather obvious that a call to date could be problematic. But this is not the only source of non-determinism, and oftentimes the reason behind the non-determinism isn’t as obvious. Here is a more comprehensive list of possible causes:

  • Date and time. You might not be calling date, but build tools—especially code generators and archivers like zip—love injecting timestamps in their output files. These may be obvious, like comments in generated files, or subtle, like values written in binary metadata headers.

  • System identifiers. Similarly to “current date”, there are tools that query the current PID, UID, GID, etc. and inject those values in their outputs.

  • Sort ordering. Hash tables are the star data structure in computer science and they are everywhere. Unfortunately, there are tools that leak their internal use of hash tables into output files by, for example, emitting unsorted lists.

  • Accessing the network. Just don’t.

  • Unexpected/unknown dependencies on host tools. Calling a tool from the system means introducing hidden dependencies on whatever the tool itself depends on. For example, the tool might read a configuration file that alters its behavior.

  • Dynamic execution. This powerful feature that helps improve incremental build times in interactive scenarios can easily lead to non-determinism if the remote execution environment and the local execution environment aren’t equivalent (where equivalent is tricky to define).

  • Foreign CC rules. Bazel tries to enforce action determinism as we saw earlier, but other build systems make little effort to do so. If you end up nesting build systems, as is the case when using this ruleset, it’s very likely that you are introducing non-determinism.

  • Randomness. Tools can decide to read from /dev/random and do something with that value, in which case you definitely have non-determinism.
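
To illustrate the first and third causes, here is a small Python sketch of what a well-behaved archiver has to do to stay deterministic: pin the timestamps and sort the entries. The deterministic_zip helper and its fixed metadata values are my own choices for the example, not something Bazel provides.

```python
import io
import zipfile

# Fixed timestamp for every entry; 1980-01-01 is the earliest date that the
# ZIP format can represent.
FIXED_TIME = (1980, 1, 1, 0, 0, 0)


def deterministic_zip(files):
    """Build a ZIP whose bytes depend only on the names and contents given."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in sorted(files):  # Stable order, whatever the dict order is.
            info = zipfile.ZipInfo(name, date_time=FIXED_TIME)
            info.compress_type = zipfile.ZIP_STORED  # No compressor involved.
            info.create_system = 3  # Pin the "Unix" marker on every platform.
            info.external_attr = 0o644 << 16  # Fixed permissions, not the host's.
            archive.writestr(info, files[name])
    return buffer.getvalue()


one = deterministic_zip({"b.txt": b"world", "a.txt": b"hello"})
two = deterministic_zip({"a.txt": b"hello", "b.txt": b"world"})
# one == two: neither the clock nor the insertion order leaks into the bytes.
```

A tool that skips either step (say, by stamping entries with the current time) produces a different file on every run even when its inputs have not changed.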

Doesn’t sandboxing fix it all?

The list above is long, and there is this assumption, especially from newcomers to Bazel, that Bazel’s sandboxing ensures that build behavior is deterministic.

In a theoretical world, that would be true: Bazel would execute each action in a precisely controlled environment to ensure that actions behaved exactly the same from run to run. This would require using a cycle-accurate virtual machine to precisely control instruction scheduling (multithreading can also introduce non-determinism) and entropy sources, but as you can imagine, this would make build execution extremely slow.

In a practical world, sandboxing has to grant some concessions in the name of performance: otherwise, people will end up disabling sandboxing, nullifying all of its benefits.

Furthermore, sandboxing isn’t something magical you can “do” from userspace (unless you write a full machine emulator). Sandboxing requires kernel support, and different kernels offer different sandboxing technologies. In turn, this means that what Bazel can sandbox or not depends on the machine that Bazel is running on. For example: Bazel’s sandbox on Linux is able to restrict file accesses, offer stable PIDs, and forbid network accesses—but the macOS sandbox, based on the deprecated sandbox-exec, cannot mangle the PID namespace.

Diagnosing non-determinism

So. We know non-deterministic actions can exist in a Bazel build and that sandboxing isn’t going to protect us from them. In that case, how can we tell if such actions have leaked into our build? We can use the “execution log” feature in Bazel to write a detailed log of all the actions that Bazel executes. Then, we can diff the logs of two separate builds and see if they differ.

Looking back to our chain of actions from the last example, we could capture two fresh execution logs by doing this:

$ bazel clean
$ bazel build --noremote_accept_cached --execution_log_json_file=before //:copy2

$ bazel clean
$ bazel build --noremote_accept_cached --execution_log_json_file=after //:copy2

$ █

Note: it is important to start from a clean build and to tell Bazel to not reuse remotely-cached actions. In this way, we force Bazel to reexecute the whole build, which should uncover non-determinism if it exists. Also, make sure to keep --execution_log_sort enabled (the default).

Once we have run the above, we can proceed to diff the logs. I like doing diff -u before after | cdiff, but you can use whichever file diffing UI you prefer:

--- before 2025-07-15 18:02:46.838524511 -0700
+++ after  2025-07-15 18:02:51.870531304 -0700
@@ -25,7 +25,7 @@
   "actualOutputs": [{
     "path": "bazel-out/k8-fastbuild/bin/date.txt",
     "digest": {
-      "hash": "d6e32f2792db61b80e67707fa24bc3f3704d65267b871809cd9d2969fe80d39d",
+      "hash": "c2ae04c2b2ff29fd70372a9e399a8f8d5f18ea7395d45d9a34fdbc9decac854b",
       "sizeBytes": "29",
       "hashFunctionName": "SHA-256"
     },
@@ -63,7 +63,7 @@
   "inputs": [{
     "path": "bazel-out/k8-fastbuild/bin/date.txt",
     "digest": {
-      "hash": "d6e32f2792db61b80e67707fa24bc3f3704d65267b871809cd9d2969fe80d39d",
+      "hash": "c2ae04c2b2ff29fd70372a9e399a8f8d5f18ea7395d45d9a34fdbc9decac854b",
       "sizeBytes": "29",
       "hashFunctionName": "SHA-256"
     },

Voilà. The first chunk of the diff tells us that the first non-deterministic action is the one that writes the date.txt file, and the second chunk tells us that there is another action that consumes said file as an input.
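
For larger builds, eyeballing a diff gets old quickly, so you may prefer to script the comparison. The Python sketch below lists the outputs whose hashes differ between two logs. It relies only on the actualOutputs, path, digest, and hash fields visible in the diff above, and it assumes that the JSON log is a stream of objects written one after another; the function names are mine.

```python
import json


def parse_execution_log(text):
    """Parse a log made of JSON objects written one after another."""
    decoder = json.JSONDecoder()
    entries, position = [], 0
    while position < len(text):
        # raw_decode does not skip leading whitespace, so do it by hand.
        if text[position].isspace():
            position += 1
            continue
        entry, position = decoder.raw_decode(text, position)
        entries.append(entry)
    return entries


def output_hashes(entries):
    """Map each output path to the hash recorded for it."""
    hashes = {}
    for entry in entries:
        for output in entry.get("actualOutputs", []):
            hashes[output["path"]] = output.get("digest", {}).get("hash")
    return hashes


def nondeterministic_outputs(before_text, after_text):
    """Return the sorted paths whose hashes differ between the two logs."""
    before = output_hashes(parse_execution_log(before_text))
    after = output_hashes(parse_execution_log(after_text))
    common = before.keys() & after.keys()
    return sorted(path for path in common if before[path] != after[path])
```

With the two logs from the example, calling nondeterministic_outputs(open("before").read(), open("after").read()) should report the paths that need attention, starting with date.txt.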

Keeping on top of non-determinism

Let’s finish the article by giving you some practical tips to remove non-determinism from the build and to make sure it doesn’t come back:

  • Set up a CI pipeline that identifies new instances of non-determinism. Unless you are proactive about it, non-determinism will creep back in because neither the local sandbox nor remote execution can fully prevent it.

  • Keep the local sandbox enabled. It may not be perfect but it’s much better than nothing. Also, pass --nosandbox_default_allow_network explicitly because, for historical reasons, the sandbox allows network access by default and that default hasn’t been flipped yet.

  • Rely on hermetic toolchains. Do not use the system-provided ones because they tend to have dependencies on system-provided files that are invisible to the Bazel action definitions. (E.g. if you use the host-provided gcc, the compiler will happily embed /usr/lib/gcc/x86_64-linux-gnu/15/crtbegin.o into the final binary and this will be invisible to Bazel.)

  • Force remote execution. Sometimes, non-determinism is inevitable or really hard to avoid (e.g. if you use the Foreign CC ruleset). Under these conditions, your best bet is to force the problematic actions to run remotely under a strictly controlled environment and to provision the remote cache so that the results of such actions “never” get evicted. If done correctly, this will “hide” the non-determinism because, once an action has been built, it will never be rebuilt again until its known inputs actually change.

  • Sanitize the action’s environment. Use --action_env and --host_action_env to keep settings like the PATH consistent across machines, and use --incompatible_strict_action_env to minimize the environment variables that leak into action execution.

  • Think about network access. If you must, do it from repo rules and always verify that whatever you downloaded matches known checksums. If you are strict about checksum validation, you’ll still have a non-hermetic build, but at least, you’ll have a deterministic one. If you have such behavior in a test, don’t allow its results to be cached by means of the no-remote-cache tag.
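
On that last point, the checksum validation itself is conceptually tiny. Here is a minimal Python sketch of the idea, with a hypothetical verify_sha256 helper; it is not the code that Bazel’s repo rules run, just the shape of the check.

```python
import hashlib


def verify_sha256(data, expected_hex):
    """Return data unchanged if its SHA-256 matches, else raise ValueError."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_hex.lower():
        raise ValueError(
            f"checksum mismatch: expected {expected_hex}, got {actual}"
        )
    return data
```

Whatever bytes the network hands back, the build either proceeds with exactly the content you pinned or fails loudly, which is what keeps a non-hermetic fetch deterministic.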

And with that, it’s time to conclude until the next episode on remote caching.