As strange as it may sound, a very important job of any build tool is to orchestrate the execution of lots of other programs—and Bazel is no exception.
Once Bazel has finished loading and analyzing the build graph, it enters the execution phase. In this phase, Bazel's primary job is to walk the graph looking for actions to execute. For each action, Bazel invokes the action's commands (compiler and linker invocations, for example) as subprocesses. Under the default configuration[1], all of these commands run on your machine.
Running a subprocess is easy, but running thousands of them in an efficient and controlled manner is not. Yet every build tool has to do this, and every build tool has to implement some form of process control to prevent processes from going rogue. In particular, Bazel ensures that its direct child processes do not leave grandchild processes behind when terminated (especially on Ctrl+C[2]), and it imposes a deadline on their execution.
On Unix systems, we have been made to believe that `fork` and `exec` are the pinnacle of simplicity and flexibility for a process-management API… and they might be so for trivial cases. But they definitely come with a lot of problems, as explained in the critical *A fork() in the road* paper. Unfortunately, some features are only available via this mechanism (not via the alternative `posix_spawn`), so we must use these two primitives.
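To make the pattern concrete, here is a minimal sketch of the fork/exec dance using Python's thin wrappers over these same primitives (`os.fork` and `os.execv`). The point is the flexibility: arbitrary setup code, such as moving the child into its own process group, can run between the two calls. The `spawn` helper name is my own invention for illustration.

```python
import os


def spawn(argv):
    """fork/exec: run argv as a child, with arbitrary setup between the calls."""
    pid = os.fork()
    if pid == 0:
        # Child: any setup can happen here before exec -- for example,
        # becoming the leader of a new process group.
        os.setpgid(0, 0)
        try:
            os.execv(argv[0], argv)  # never returns on success
        except OSError:
            os._exit(127)  # exec failed; bail out without running parent code
    # Parent: wait for the child and report its exit status.
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    print(spawn(["/bin/echo", "hello"]))
```

`posix_spawn` can express some of this setup via its attribute objects, but not arbitrary code, which is why tools that need fine-grained control keep reaching for the raw primitives.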
But Bazel is written in Java, and the Java APIs for process management only support the basics to start and stop subprocesses. These APIs do not allow handling signals or dealing with process groups, both of which are crucial for the process control features required by Bazel.
You might say that JNI should be sufficient to fix this, but that's not the case. If we were to implement these features via JNI, we'd need code that forks subprocesses from JNI… and that would be very problematic. By the time Bazel starts the execution phase, the Bazel process is massively multi-threaded, and forking a multi-threaded program is not safe. We can trust that Java's own `ProcessBuilder` does the right thing, but we cannot safely implement our own.
So how does Bazel solve this problem? By using separate helper binaries[3] written in C++ (well, an ugly hybrid of C and C++). These helper binaries wrap the execution of the arbitrary commands that Bazel really wants to run and perform the necessary low-level process manipulation. Bazel can then use the simpler `ProcessBuilder` API to spawn the wrappers, trusting that they are well-behaved and do not require careful process control.
There are two of these binaries: the process wrapper and the Linux sandbox. I'm going to focus on the former for the rest of this post.
The process wrapper, located at `src/main/tools/process-wrapper.cc`, is a very simple tool with the following functionality:
- runs a subprocess with an optional timeout,
- redirects stdout and stderr to optionally-provided files,
- ensures the subprocess and all of its children are terminated on exit,
- and optionally collects runtime metrics from the subprocess.
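The timeout behavior can be sketched roughly as follows. This is not the process wrapper's actual code, just a Python approximation of the escalation logic its `--timeout` and `--kill_delay` flags describe: ask nicely with SIGTERM when the deadline passes, then SIGKILL the whole group if it lingers. The function name and defaults are my own.

```python
import os
import signal
import subprocess


def run_with_timeout(argv, timeout, kill_delay=2.0):
    """Run argv in its own session/process group; SIGTERM the whole group on
    timeout, then SIGKILL it if it outlives kill_delay. Returns the exit code
    (negative for a signal-induced death, as subprocess reports it)."""
    # start_new_session makes the child a session (and thus group) leader,
    # so its pid doubles as the pgid we can signal.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        os.killpg(proc.pid, signal.SIGTERM)  # polite request to the whole group
        try:
            return proc.wait(timeout=kill_delay)
        except subprocess.TimeoutExpired:
            os.killpg(proc.pid, signal.SIGKILL)  # no more politeness
            return proc.wait()
```

The two-step escalation matters: SIGTERM gives well-behaved programs a chance to clean up temporary files, while SIGKILL guarantees the deadline is actually enforced.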
Once built, the tool is bundled into the Bazel executable and is placed in the install base at startup time. You can invoke it by hand if you wish to inspect its features:
.../some/workspace$ "$(bazel info install_base)"/_embedded_binaries/process-wrapper
No command specified.
Usage: /var/tmp/_bazel_jmmv/install/154077aea4bc476f9086f48e48456396/_embedded_binaries/process-wrapper -- command arg1 @args
-t/--timeout <timeout> timeout after which the child process will be terminated with SIGTERM
-k/--kill_delay <timeout> in case timeout occurs, how long to wait before killing the child with SIGKILL
-o/--stdout <file> redirect stdout to a file
-e/--stderr <file> redirect stderr to a file
-s/--stats <file> if set, write stats in protobuf format to a file
-d/--debug if set, debug info will be printed
-- command to run inside sandbox, followed by arguments
Feel free to play around with the flags for a little bit.
Going back to the list of features, the most interesting one is terminating the subprocess and all of its (transitive) children; in my opinion, this is also the most important one. For example: if Bazel runs a test program and that test program spawns a second program, Bazel must ensure that both terminate once the test completes (whether successfully, due to a failure, or on an interrupt).
For example, if you run this command:
.../some/workspace$ /bin/sh -c '/bin/sh -c "sleep 5; echo 2" & echo 1'
you will see that the number `2` appears after a 5-second delay, even though the shell has already returned control to you. But if you run the exact same command via the process wrapper:
.../some/workspace$ "$(bazel info install_base)"/_embedded_binaries/process-wrapper /bin/sh -c '/bin/sh -c "sleep 5; echo 2" & echo 1'
the `2` won't appear because the process wrapper will have cleaned up the stray subprocess on exit.
The way this works is by placing the first subprocess in a new process group and then having the process wrapper terminate that whole group in unison. Yes, process groups are not infallible: a process can trivially escape them (e.g. by creating its own process group or by starting a new session). But the process wrapper does not intend to be bullet-proof in this regard: first, it's not a security mechanism; and, second, it just can't be one in a portable manner.
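Here is a rough Python sketch of that group-termination idea (again, not the wrapper's actual code; `start_new_session` starts a whole new session, a slightly stronger variant of creating a process group, and the helper name is mine):

```python
import os
import signal
import subprocess


def run_and_reap(argv):
    """Run argv as the leader of a fresh process group and, once it exits,
    forcibly tear down anything it left behind in that group."""
    proc = subprocess.Popen(argv, start_new_session=True)
    code = proc.wait()  # the direct child is done...
    try:
        os.killpg(proc.pid, signal.SIGKILL)  # ...but stragglers may remain
    except ProcessLookupError:
        pass  # nothing was left behind
    return code


if __name__ == "__main__":
    # The backgrounded inner shell would normally outlive the outer one;
    # here it is killed as soon as the outer shell exits.
    run_and_reap(["/bin/sh", "-c", '/bin/sh -c "sleep 5; echo 2" & echo 1'])
```

Because the grandchild shell inherits the group unless it deliberately escapes, the single `killpg` sweeps up the whole tree, which is exactly the behavior the earlier shell demo showed.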
But we can do better if we are willing to introduce system-specific code. On Linux, we can handle this with PID namespaces (which the aforementioned Linux sandbox wrapper uses) and with the child subreaper feature. On macOS, we can do some tricks by walking the process table. And I hear the native process-management APIs on Windows are much better in this regard.
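For a taste of the Linux-specific route, here is a small `ctypes` sketch of opting into the child subreaper feature from a supervisor process. The `PR_SET_CHILD_SUBREAPER` constant (36) comes from `<sys/prctl.h>` and requires Linux 3.4 or newer; the helper name is my own:

```python
import ctypes
import sys

PR_SET_CHILD_SUBREAPER = 36  # from <sys/prctl.h>, Linux >= 3.4


def become_subreaper():
    """Mark the calling process as a 'child subreaper': orphaned descendants
    get reparented to us instead of to init, so a supervisor can still wait
    on grandchildren that try to escape by double-forking."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_CHILD_SUBREAPER) failed")


if __name__ == "__main__" and sys.platform.startswith("linux"):
    become_subreaper()
```

Unlike process groups, a subreaper catches descendants even after they re-group or re-session themselves, as long as they stay inside the supervisor's subtree.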
Stay tuned; more on these topics in upcoming posts!
[1] Local-only builds are the default in Bazel—but not in Blaze, the Google-internal variant of Bazel. Blaze has relied on the Google-internal distributed build system for many years and, as a result, the local execution aspects of Bazel have been "neglected" for quite a while. But don't panic. Local actions are something that I'm actively trying to improve because they are making a comeback due to dynamic scheduling. ↩︎
[2] You might have experienced cases where you Ctrl+C a `make` invocation and the build commands are left behind, which is definitely not what you want to happen. As far as I know, this is still a problem with `bmake` on macOS. ↩︎
[3] Trust me, the helper binaries approach is infinitely easier to implement and reason about than trying to multiplex subprocess handling within a single process. I did the latter in Kyua and it was… complicated… ↩︎