“Why do you know so much shell?” is a question I’m getting a lot at work lately. So yeah, why? And how can you learn it too? There is no secret here: I know the shell well because I was “forced” to write tools in it for a while and, because of that, I made a conscious effort to learn the language and get better at it.

You see, most people that write shell don’t want to deal with it. They stitch together whatever works into a script and call it a day, making a bunch of spaghetti even if it goes against the coding best practices they already know. And when they encounter some odd syntax they don’t recognize, their reaction is to say “this has to be rewritten in Python!” instead of taking a breath and trying to really understand what’s going on. It doesn’t help that plenty of senior engineers scoff at shell scripts.

And it is true: the shell is arcane and has many flaws as a programming language. I don’t want to convince you to start writing new tools in it. But the shell is also an incredible rapid prototyping language, and you can use it to solve business problems really quickly and with surprisingly little code. If you pause for a second to learn it, you’ll realize that you can bend tradition and write maintainable shell code too. Hear me out on how I got into writing so much shell and how you can get better at it.

The constraints of the BSD systems

In the late 1990s, I discovered Linux and, soon after, the BSDs. I had a brief stint with OpenBSD and FreeBSD at first, but by the early 2000s, I had settled on NetBSD as my daily driver. My dream had always been to create my own operating system, but the more I learned and tried to write one, the more I realized I wasn’t up to the task yet. Thus NetBSD was the perfect fit for me: all my hardware worked on it, but the system had enough rough edges that I saw the opportunity to become a contributor to a real operating system.

NetBSD—and all the BSDs really—are full operating system distributions. Unlike Linux, the source code for their kernel, user space tools, and documentation lives in a single source tree (monorepo!) maintained by a single group of developers. This source tree is known as the base system and every other third-party app comes via the ports system—or pkgsrc in NetBSD-specific parlance. If this is hard to imagine, visualize your typical Windows installation: when you perform a fresh install of Windows 7 (not 10 or 11 because these get random junk auto-added), what you get is a collection of software that Microsoft has itself developed and chosen to be the basis to form Windows; everything you add to it later on, be it from Microsoft or other vendors, is not part of that base installation.

A constraint of this arrangement is that the code in a BSD base system is self-hosting: i.e. the base system must be able to build itself so it must include the compilers and interpreters required to build and run its code. In NetBSD during the early 2000s, this meant choosing between C, C++, and shell. Lua has been added as a fourth choice since.

It is of course possible to write tools for a BSD system in any language that’s not in the base system, but doing so means that the tool is relegated to live in the ports system. To make matters worse, the common practice in the BSDs was to build everything from source—pre-built binary packages existed but were inflexible and usually stale—and thus users frowned upon heavy dependencies. If your tiny tool required Perl or Python, for example, it would be dead on arrival because of the heavy tax imposed by the interpreter: if I recall correctly, building Perl on my Pentium II took something like 15 minutes, and building it on a 68k Mac I had took hours.

Contributing tools to NetBSD

See where this is going? I was the primary maintainer of Gnome 2.x on NetBSD and, as part of this work, I ended up writing all sorts of tools to simplify the maintenance of the packages and the system as a whole. I wrote things like sysbuild, pkg_comp, pkg_alternatives, dfdisk, autoswc, etcutils… and even my own build system.

And to write such tools… which language could I use? I wanted my tools to feel part of the base system and I didn’t want to have to pay the heavy price of Perl or Python. I could have used C, but… well, let’s just say that C is a terrible choice for automation tools. I could have used C++, but people also hated it for its long compile times and the fact that, back in the pre-C++11 era, it wasn’t much better than C and compilers were really bad at supporting standards. And I could use the shell which, as ugly as it was, made programs immediately installable under any of the tens of hardware platforms that NetBSD supported, no matter how slow they were.

So that’s how I ended up writing shell. The shell was the only realistic option I had to write the tools I wanted to write. And you know what? My scripts were bad at first, full of the problems I opened this article with. But with practice, a principled approach to writing shell programs, and an open mind to see the shell as “yet another programming language”, it turned out not to be a terrible idea in retrospect.

My biggest shell program today is probably pkg_comp2. If I count the lines of its source code plus its two dependencies (sandboxctl and shtk), it comes to about 15,000 SLOC. More than half of those are unit and integration tests, just as commonly happens in “real software”, which shows that shell programs can mimic the good development practices of other languages. Just take a moment and skim through pkg_comp.sh. Does this look like your regular spaghetti shell code to you?

How you can get better at the shell

I could probably write a whole book on this topic—and I’ve thought about doing so… would you read it?—but all I can do right now is give you some ideas:

  1. Read about the language. The shell is small. Once I decided I wanted to get better at the shell, I just opened the sh(1) manual page and read it. It will take you less than 1 hour to go through the whole document. You might choose to also read the Bash manual page—and you probably should, particularly to become aware of its many unnecessary non-standard features.

  2. Familiarize yourself with the Unix toolchain. Yes, the shell language is really simple, but that comes at a price: many of the things you want to do will require invoking tools like grep, sed, find… Which is fine because that’s the core idea behind the Unix toolchain—small, composable tools—but that means you need to know those tools too. The more tools you know about, the better your scripts will be. Think of these tools as the “standard library” for the shell. Manual pages are not in fashion… but getting comfortable in navigating them will prove to be a useful skill.

  3. Understand how process creation (fork vs. exec) and argument passing work in Unix. The shell is primarily designed to interact with subprocesses, so knowing these topics in detail is crucial to truly understand how quoting, globs, redirections, and pipelines work, and also to understand the difference between built-in and external commands. For example, do you know how test, [ and [[ differ?

  4. Write shell scripts following the good programming practices you already know. Avoid global variables. Factor code into functions. Minimize side-effects. Write unit and/or integration tests. And be unconditionally strict: e.g. double-quote all variable expansions to correctly handle whitespace characters, even if in most cases you may not need to do so.

  5. Think in terms of data flow. The shell is about combining tools as pipelines, not writing your usual imperative for loops. The more you can reason about solving problems with pipelines, the simpler and more performant your scripts will be. Functional programming FTW!

  6. Read Google’s Shell Style Guide. While I don’t necessarily agree with everything it has to say, especially around stylistic details, the “Features and bugs” and “Calling commands” sections are particularly interesting.

  7. Use ShellCheck.

  8. And finally, take a look at my short readability series on the shell from 2013.
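
To make points 3 to 5 a bit more concrete, here is a minimal sketch in POSIX sh. The helper names count_args and top_words are hypothetical, made up for this illustration, and are not taken from any of the tools mentioned above. The first helper shows how the shell splits words before a command ever sees its arguments; the second shows a small problem solved as a pipeline of standard tools inside a function, with no global variables and no explicit loops.

```shell
#!/bin/sh
# Illustrative sketch only: count_args and top_words are hypothetical helpers.

# Print the number of arguments received. Handy to observe how the shell
# splits words *before* the command runs.
count_args() {
    echo "$#"
}

# Read text on stdin and print the N most frequent words, most frequent first.
# Data flows through the pipeline; the only input besides stdin is $1.
top_words() {
    tr -cs '[:alpha:]' '\n' |          # one word per line
        sed '/^$/d' |                  # drop the empty line a leading separator leaves
        tr '[:upper:]' '[:lower:]' |   # count case-insensitively
        sort | uniq -c |               # count occurrences of each word
        sort -k1,1nr -k2 |             # highest count first, ties by word
        head -n "$1" |
        awk '{ print $2 }'             # keep the word, drop the count
}

file="my report.txt"
count_args $file      # unquoted: the value is split into two words, prints 2
count_args "$file"    # quoted: the value stays one argument, prints 1

# test and [ are two spellings of the same command, so their operands go
# through ordinary word splitting too. Quote them.
test "$file" = "my report.txt" && echo "test: match"
[ "$file" = "my report.txt" ] && echo "[: match"

printf 'the cat and the dog and the bird\n' | top_words 2
```

Note that [[ is deliberately left out of the sketch: it is a keyword in shells such as Bash and ksh rather than a command, it is not part of POSIX sh, and word splitting does not apply to variables inside it. That difference is exactly why understanding argument passing pays off.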