Showing 16 posts
“Fast machines, slow machines”… ah, the post that spawned these series. As I frantically typed that article while replying to angry tweets, the thought came to mind: software engineering as a whole is hyper-focused on lowering the costs to write new code, yet there is a disregard for the costs that these improvements bring to other disciplines in a company on even to end users. So, in this series finale, I want to compare how some choices that apparently lower development costs actually increase costs elsewhere. I also want to highlight how, if we made different decisions during development, we could possibly expose those extra costs early on. This is beneficial because exposing costs upfront allows us to make tough choices when there is still a chance of changing course. To make things specific, I will look at how the use of modern frameworks that facilitate development can end up hurting performance, reliability, and usability. So let’s start with a three-part rant first (sorry) and then let’s look at what we might do.
In the previous post, I proposed that certain engineering practices expose systemic costs and help with planning while other practices hide those same costs and disturb ongoing plans. The idea I’m trying to convey is hard to communicate in the abstract so, in that post, I used the differences between a monorepo and a multirepo setup as an example. Today, I’ll expore a different scenario to support the same idea. I’m going to talk about how certain ticket assignment practices during on-call operations can expose service support costs vs. how other practices hide them. Keep in mind that, just like in the previous post, I do not want to compare the general merits of one approach vs. the other. The only thing I want to compare is whether one approach centralizes toil and allows management to quantify its cost vs. how another approach hides toil by smearing it over the whole team in hard-to-quantify ways. Whether management actually does something to correct the situation once the costs are exposed is a different story.
In software engineering organizations, there are certain practices that keep costs under control even if those seem more expensive at first. Unfortunately, because such practices feel more expensive, teams choose to keep their status quo even when they know it is suboptimal. This choice ends up hurting productivity and morale because planned work is continuously interrupted, which in turn drags project completion. The reason I say seem and not are is because the alternatives to these cost-exposing practices also suffer from costs. The difference is that, while the former surface costs, leading to the need to allocate time and people to infrastructure work, the latter keeps the costs smeared over teams and individuals in ways that are difficult to account and plan for. To illustrate what I’m trying to say, I’ll present three different scenarios in which this opinion applies. All of these case studies come from past personal experiences while working in different teams and projects. The first one covered in this post is about the adoption of a monorepo vs. the use of multiple different repositories. The other two will come in follow-up articles.
Well, that was unexpected. I recorded a couple of crappy videos in 5 minutes, posted them on a Twitter thread, and went viral with 8.8K likes at this point. I really could not have predicted that, given that I’ve been posting what-I-believe-is interesting content for years and… nothing, almost-zero interest. Now that things have cooled down, it’s time to stir the pot and elaborate on those thoughts a bit more rationally. To summarize, the Twitter thread shows two videos: one of an old computer running Windows NT 3.51 and one of a new computer running Windows 11. In each video, I opened and closed a command prompt, File Explorer, Notepad, and Paint. You can clearly see how apps on the old computer open up instantly whereas apps on the new computer show significant lag as they load. I questioned how computers are actually getting better when trivial things like this have regressed. And boom, the likes and reshares started coming in. Obviously some people had issues with my claims, but there seems to be an overwhelming majority of people that agree we have a problem. To open up, I’ll stand my ground: latency in modern computer interfaces, with modern OSes and modern applications, is terrible and getting worse. This applies to smartphones as well. At the same time, while UIs were much more responsible on computers of the past, those computers were also awful in many ways: new systems have changed our lives substantially. So, what gives?
In MVC isn’t MVC, which hit the Hacker News front page overnight, Collin Donnell describes how the MVC design pattern that we use today isn’t really what was originally envisioned in 1979 by Tyrgve Reenskaug. This prompted me to think about how this architecture, if tweaked even further, maps pretty well to today’s designs of other kinds of programs, and I want to explore two cases in this post: web services and CLI apps.
Rust is infamous for having a steep learning curve. The borrow checker is the first boss you must defeat, but with a good mental model of how memory works, how objects move, and the rules that the borrow checker enforces, it becomes second nature rather quickly. These rules may sound complicated, but really, they are about understanding the fundamentals of how a computer works. That said… the difficulties don’t stop there. Oh no. As you continue to learn about the language and start dealing with things like concurrency—or, God forbid, Unix signals—things can get tricky very quickly. To make matters worse, mastering idiomatic Rust and the purpose of core traits takes a lot of time. I’ve had to throw my arms up in frustration a few times so far and, while I’ve emerged from those exercises as a better programmer, I have to concede that they were exhausting experiences. And I am certainly not an expert yet. So, yes, there is no denying in saying that Rust is harder than other languages. But… does it matter in practical terms? Betteridge’s law of headlines says that we should conclude the post right here with a “no”—and I think that’s the right answer. But let’s see why.
Earlier this week, a 2-year old post titled I want off Mr. Golang’s wild ride by @fasterthanlime made the news rounds again. This post raises a bunch of concerns on the Go language and is posted from the perspective of someone who prefers Rust. And, just yesterday, I noticed a comment on Twitter by @FiloSottile that, paraphrased, reads “Why is there so much hatred towards Go, especially from Rust developers?”. I wish I could answer this question with a “no, there isn’t”, but that would be a lie: in any large community, there will certainly be hateful people/opinions. If you have encountered such flamebait, I’m sorry, and I’m not here to defend it. What I’m here to do is look at the possible truth behind the claim that Rust developers dislike Go, and I wanted to elaborate on this based on my personal experience.
The GNU project is the source of the Unix userland utilities used on most Linux distributions. Its compatibility with standards and other Unix systems, or lack thereof, directly impacts the overall portability of any piece of software developed from GNU/Linux installations. Unfortunately, the GNU userland does not closely adhere to standards, and its widespread usage causes little incompatibilities to creep into any software created on GNU/Linux systems. Read on for why this is a problem and the pitfalls you will encounter.
We have all seen discussions go like this: someone first complains that an application like Google Chrome is wasteful because it uses multiple GBs of memory. Someone else comes along and says that memory is there to be used for speed and therefore this is the correct behavior: if the computer has multiple GBs of free memory, an application such as Chrome should make use of all the available memory in the form of a cache to be as responsive as possible. Makes sense, right? Yes, it does makes sense—as long as Chrome is the only application running. Let’s explore why this is not a great idea.
A recent tweet that caught my attention read: “principal engineers should be on-call”. Of course they should be! I’m “surprised” they aren’t everywhere, but I can imagine some reasons to justify their situation. Let’s change that in this thread. 🧵 👇
A good philosophy to live by at work is to “always be quitting”. No, don’t be constantly thinking of leaving your job 😱. But act as if you might leave on short notice 😎. Counterintuitively, this will make you a better engineer and open up growth opportunities. A thread 👇.
Monorepos are an interesting beast. If mended properly, they enable a level of uniformity and code quality that is hard to achieve otherwise. If left unattended, however, they become unmanageable monsters of tangled dependencies, slow builds, and frustrating developer experiences. Whether you have a good or bad experience directly depends on the level of engineering support behind the monorepo. Simply put, monorepos require dedicated teams and tools to run nicely. In this post, I will look at how almost-perfect caching plays a key role in keeping build times manageable under such an environment.
During my 11 years at Google, I can confidently count the number of times I had to do a “clean build” with one hand: their build system is so robust that incremental builds always work. Phrases like “clean everything and try building from scratch” are unheard of. So… you can color me skeptical when someone says that incremental build problems are due to bugs in the build files and not due to a suboptimal build system. The answer lies in having a robust build system, and in this post I’ll examine the common causes behind incremental build breakages, what the build system can do to avoid them, and how Bazel accomplishes most of them.
If you have followed Windows 10 at all during the last few years, you know that the Windows Subsystem for Linux, or WSL for short, is the hot topic among developers. You can finally run your Linux tooling on Windows as a first class citizen, which means you no longer have to learn PowerShell or, god forbid, suffer through the ancient CMD.EXE console. Unfortunately, not everything is as rosy as it sounds. I now have to do development on Windows for Windows as part of my new role within Azure… and the fact that WSL continues to be separate from the native Windows environment shows. Even though I was quite hopeful, I cannot use WSL as my daily driver because I need to interact with “native” Windows tooling. I believe things needn’t be this way, but with the recent push for WSL 2, I think that the potential of an alternate world is now gone. But what do I mean with this? For that, we must first understand the differences between WSL 1 and WSL 2 and how the push for WSL 2 may shut some interesting paths.
The pkgsrc package database, which by default lives in /var/db/pkg/, should not be there. Instead, it should be under /usr/pkg/libdata/pkgdb/. The same applies to FreeBSD’s and OpenBSD’s ports and also Debian’s dpkg, but I’ll focus on pkgsrc because it’s the system I know best. Let’s see why the current default is suboptimal and why libdata is a good alternative.
The scripts that live under /etc/rc.d/ in FreeBSD, NetBSD, and OpenBSD are in the wrong place. They all should live in /libexec/rc.d/ because they are code, not configuration. Let’s look at the history of these systems to see how we got here, why this is problematic, and how things would look like in a better world.