Julio Merino's personal blog
Quite some time ago, I wrote a handful of guest posts for O’Reilly’s ONLamp.com online publication. Unfortunately, that site seems now gone from the interwebs and I only noticed by chance upon realizing that some of my links to those articles were now broken. Fortunately, I was able to find copies of those articles via archive.org’s WayBack Machine. So I have now taken the chance to import those articles into this site.
Over the summer, I prototyped a bunch of web apps whose ideas had been floating in my mind for a long time. I spent quite a bit of time learning about REST APIs and, as part of these exercises, implemented skeletons of REST servers in both Go and Rust. (Just for context, the last time I wrote a web app was in high school… and it involved PHP, MySQL, and I think IE6?
The current working directory, or CWD for short, is a process-wide property. It is good practice to treat the CWD as read-only because it is essentially global state: if you change the CWD of your process at any point, any relative paths you might have stored in memory1 will stop working. I learned this first many years ago when using the Boost.Filesystem library: I could not find a function to change the CWD and that was very much intentional for this reason.
Bazel likes creating very deep and large trees on disk during a build. One example is the output tree, which naturally contains all the artifacts of your build. Another, more problematic example is the symlink forest trees created for every action when sandboxing is enabled. As garbage gets created, it must be deleted. It turns out, however, that deleting file system trees can be very expensive—and especially so on macOS. In fact, calls to our deleteTree algorithm routinely showed up in my profiling runs when trying to diagnose slowdowns using the dynamic scheduler.
Since the publication of Bazel a few years ago, users have reported (and I myself have experienced) general slowdowns when Bazel is running on Macs: things like the window manager stutter and others like the web browser cannot load new pages. Similarly, after the introduction of the dynamic spawn scheduler, some users reported slower builds than pure remote or pure local builds, which made no sense. All along we guessed that these problems were caused by Bazel’s abuse of system threads, as it used to spawn 200 runnable threads during analysis and used to run 200 concurrent compiler subprocesses.
This is the tale of yet another Bazel bug, this time involving environment variables, global state, and gRPC. Through it, I'll argue that you should never use setenv within a program unless you are doing so to execute something else.
The point of this post is simple and I’ll spoil it from the get go: every time you make an assumption in a piece of code, make such assumption explicit in the form of an assertion or error check. If you cannot do that (are you sure?), then write a detailed comment. In fact, I’m exceedingly convinced that the amount of assertion-like checks in a piece of code is a good indicator of the programmer’s expertise.