If you use Bazel, your project is of moderate size. And because your project is of moderate size, it almost certainly builds one or more binaries, at least one of which is a CLI tool. But let’s face it: you don’t have end-to-end testing for those tools, do you?

I’m sure you have split the binary’s main function into its own file so that the rest of the tool can be put in a library, and I’m extra-sure that you have unit tests for that library. But… those tests do little to verify the functionality and quality of the tool as experienced by the end user. Consider: What exactly does the tool print to the console on success? Does it show errors nicely when they happen, or does it dump internal stack traces? How does it handle unknown flags or bad arguments? Is the built-in help message nicely rendered when your terminal is really wide? What if the terminal is narrow?

You must write end-to-end tests for your tools but, usually, that isn’t easy to do. Until today. Combining shtk with Bazel via the new rules_shtk ruleset makes it trivial to write tests that verify the behavior of your CLI tools—no matter what language they are written in—and in this article I’m going to show you how.


To put things in perspective, we’ll be adding tests to a trivial demo tool written in C (not shell) that simply adds two numbers and prints the result to the standard output. You can find all of the code for this scenario under rules_shtk/examples/test.

Here is how the demo tool behaves after we build the //:adder target:

$ ./bazel-bin/adder 123 456
The sum of 123 and 456 is 579

$ ./bazel-bin/adder
adder: Requires two integer arguments

$ ./bazel-bin/adder 10000000 345
adder: Invalid first operand: out of range

Our mission is to add an //:adder_test target that depends on //:adder and that exercises the tool exactly as when a user runs it by hand. We’ll create an adder_test.sh file that uses the shtk testing library and hook it into the build as an shtk_test target.

Depending on rules_shtk

rules_shtk is a shiny new ruleset (released just yesterday) and because Bazel 7 is around the corner, I’ve opted to go all in for bzlmod. Thanks to bzlmod, setting up a project to use these rules is trivial. All you need is to add the following line to your MODULE.bazel file:

# Add this to MODULE.bazel.
bazel_dep(name = "rules_shtk", version = "1.7")

… but you’ll need to wait until bazelbuild/bazel-central-registry#1095 is reviewed and merged. (I don’t understand why this choke point exists in the new bzlmod ecosystem; Rust’s crates.io doesn’t have it, for example.)

In the meantime, or if you are not yet ready to upgrade to bzlmod, you can obviously use the old-fashioned and yucky WORKSPACE.bazel file to pull rules_shtk in:

# Add this to WORKSPACE or WORKSPACE.bazel if you don't use bzlmod yet.

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "bazel_skylib",
    sha256 = "66ffd9315665bfaafc96b52278f57c7e2dd09f5ede279ea6d39b2be471e7e3aa",
    urls = [
        # ... bazel_skylib release tarball URLs ...
    ],
)

load("@bazel_skylib//:workspace.bzl", "bazel_skylib_workspace")

bazel_skylib_workspace()

http_archive(
    name = "rules_shtk",
    sha256 = "fa47891f27d8d59609732b34dc88020331b81b9767cbd72094fec1be8af4adfc",
    urls = [
        # ... rules_shtk 1.7.0 release tarball URL ...
    ],
    strip_prefix = "rules_shtk-1.7.0",
)

Requesting an shtk toolchain

Once you have added rules_shtk to your project, either via bzlmod or the WORKSPACE.bazel, you need to tell Bazel which shtk toolchain to use:

# Add this to WORKSPACE or WORKSPACE.bazel.

load("@rules_shtk//:repositories.bzl", "shtk_dist")

shtk_dist()

The parameterless shtk_dist macro makes Bazel download shtk 1.7 (because we requested the 1.7.x rules) and puts it to use for all tests that we later build. You cannot configure which version of shtk to download because, so far, all shtk versions are backwards-compatible and this won’t change in the 1.x series.

There is also an shtk_system macro that makes Bazel discover the shtk toolchain installed in the system, say via your local package manager. This does not make sense for our use case (because we are only running tests within Bazel), but it can be useful if you want to build a script with shtk_binary that can later be taken out of bazel-bin and installed into the host system.

Writing the test rule

We now have the rules and a toolchain in place so we can proceed to add an shtk_test target. Our goal is to test adder.c with a new adder_test.sh sibling file, so we can put the test target next to the binary target:

load("@rules_shtk//:rules.bzl", "shtk_test")

cc_binary(
    name = "adder",
    srcs = ["adder.c"],
)

shtk_test(
    name = "adder_test",
    src = "adder_test.sh",
    data = [":adder"],
)

There are different opinions on where tests should live but, personally, I want them as close to the code they test as possible. This makes tests discoverable when editing code, which increases the odds that developers will remember to update them. Also, nobody likes dealing with parallel deep directory hierarchies when working on a piece of code… thank you, java and javatests.

Writing the test program

All that’s left to do is an SMOP (a “small matter of programming”). We need to write the tests in adder_test.sh. Here are just a couple of examples:

# This is a portion of adder_test.sh.

shtk_import unittest

shtk_unittest_add_test addition_works
addition_works_test() {
    expect_command \
        -o inline:"The sum of 2 and 3 is 5\n" \
        ../adder 2 3
}

shtk_unittest_add_test bad_arguments
bad_arguments_test() {
    expect_command \
        -s 1 \
        -e inline:"adder: Requires two integer arguments\n" \
        ../adder
}

I don’t want to dive into the shtk APIs too much, but I want to highlight the expect_command calls in here. This assertion is the salient feature of shtk’s unit-testing module and is what makes testing tools a breeze. Take a look at the assert_command and assert_file documentation to demystify the -s, -o, and -e flags shown above.
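To make those three checks concrete, here is a rough plain-shell sketch of the kind of verification that expect_command spares you from writing by hand. It is not shtk’s implementation: check_command is a hypothetical helper invented for this illustration, and the adder shell function is only a stand-in for the real C binary so that the snippet runs on its own.

```shell
#!/bin/sh

# Hypothetical stand-in for the real C binary so that this sketch is runnable.
adder() {
    if [ "$#" -ne 2 ]; then
        echo "adder: Requires two integer arguments" >&2
        return 1
    fi
    echo "The sum of $1 and $2 is $(($1 + $2))"
}

# Hypothetical helper (NOT part of shtk): runs a command and compares its exit
# status, stdout, and stderr against the expected values.
#
# Usage: check_command <status> <stdout> <stderr> <command> [args...]
check_command() {
    exp_status="$1"; exp_stdout="$2"; exp_stderr="$3"; shift 3

    out_file="$(mktemp)"
    err_file="$(mktemp)"

    # Capture the exit status without aborting on failure.
    status=0
    "$@" >"$out_file" 2>"$err_file" || status=$?

    # Command substitution strips trailing newlines from the captured output.
    act_stdout="$(cat "$out_file")"
    act_stderr="$(cat "$err_file")"
    rm -f "$out_file" "$err_file"

    [ "$status" -eq "$exp_status" ] || { echo "FAIL: exit status was $status"; return 1; }
    [ "$act_stdout" = "$exp_stdout" ] || { echo "FAIL: unexpected stdout"; return 1; }
    [ "$act_stderr" = "$exp_stderr" ] || { echo "FAIL: unexpected stderr"; return 1; }
    echo "PASS: $*"
}

# Rough counterparts of the two test cases shown earlier.
check_command 0 "The sum of 2 and 3 is 5" "" adder 2 3
check_command 1 "" "adder: Requires two integer arguments" adder
```

Because the command substitutions drop trailing newlines, this sketch is less strict than the inline: checks in the real test, which spell out the final \n explicitly.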

One thing that needs explaining is the ../adder reference to the tool. If we look at the runfiles tree that Bazel creates when building the //:adder_test target, we see:

$ ls -l bazel-bin/adder_test.runfiles/_main/
total 8K
lrwxrwxrwx. 1 jmmv jmmv 116 Nov  4 06:17 adder -> /home/jmmv/.cache/bazel/_bazel_jmmv/916b9cbf147e00ff01b88519fdb0f294/execroot/_main/bazel-out/k8-fastbuild/bin/adder
lrwxrwxrwx. 1 jmmv jmmv 121 Nov  4 06:17 adder_test -> /home/jmmv/.cache/bazel/_bazel_jmmv/916b9cbf147e00ff01b88519fdb0f294/execroot/_main/bazel-out/k8-fastbuild/bin/adder_test

There are no directories in there. So, where does that .. come from in the ../adder reference? Well… shtk’s unit-testing execution engine creates a disposable subdirectory for each test it runs. The reason for this is that, when writing shell tests, it is extremely common to have to create auxiliary files, and shtk accounts for that by cleaning up whichever files you create in the current directory of a test. This is a little oddity you’ll have to keep in mind.
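If the relative path still feels like magic, the following plain-shell sketch mimics the layout described above. Everything in it is an assumption made for illustration: the temporary directory stands in for the runfiles tree, the work subdirectory name is made up, and the adder script is a placeholder for the real binary. shtk’s actual directory naming and cleanup logic may differ.

```shell
#!/bin/sh

# Stand-in for the runfiles directory that Bazel creates for the test.
runfiles="$(mktemp -d)"

# Hypothetical placeholder for the real adder binary.
cat >"$runfiles/adder" <<'EOF'
#!/bin/sh
echo "The sum of $1 and $2 is $(($1 + $2))"
EOF
chmod +x "$runfiles/adder"

# Mimic the behavior described above: each test case runs inside its own
# disposable subdirectory. The "work" name is invented for this sketch.
mkdir "$runfiles/work"
cd "$runfiles/work"

# Scratch files created by a test land in the disposable directory and do
# not pollute the directory that holds the tool.
touch scratch.txt

# From inside the subdirectory, the tool lives one level up.
result="$(../adder 2 3)"
echo "$result"

# Throw everything away, scratch files included.
cd /
rm -rf "$runfiles"
```

The upside of this arrangement is that a test can create whatever auxiliary files it needs without worrying about leaving them behind; the price is the extra .. in every reference to the tool.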

Running the test

All done. We can finally run the end-to-end test for our demo tool:

$ bazel test --nocache_test_results --test_output=streamed //:adder_test
WARNING: Streamed test output requested. All tests will be run locally, without sharding, one at a time
INFO: Analyzed target //:adder_test (0 packages loaded, 139 targets configured).
INFO: Found 1 test target...
adder_test: I: Testing addition_works...
Running checked command: ../adder 2 3
Running checked command: ../adder -12345 12345
adder_test: I: Testing addition_works... PASSED
adder_test: I: Testing bad_first_operand...
Running checked command: ../adder 123456789 0
Running checked command: ../adder  0
Running checked command: ../adder 123x 0
Running checked command: ../adder -123456789 0
adder_test: I: Testing bad_first_operand... PASSED
adder_test: I: Testing bad_second_operand...
Running checked command: ../adder 0 123456789
Running checked command: ../adder 0 
Running checked command: ../adder 0 123x
Running checked command: ../adder 0 -123456789
adder_test: I: Testing bad_second_operand... PASSED
adder_test: I: Testing bad_arguments...
Running checked command: ../adder
Running checked command: ../adder 1
Running checked command: ../adder 1 2 3
adder_test: I: Testing bad_arguments... PASSED
adder_test: I: Ran 4 tests; ALL PASSED
Target //:adder_test up-to-date:
INFO: Elapsed time: 0.504s, Critical Path: 0.33s
INFO: 2 processes: 2 linux-sandbox.
INFO: Build completed successfully, 2 total actions
//:adder_test                                                            PASSED in 0.2s

Executed 1 out of 1 test: 1 test passes.
There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.

As much hate as “large” shell scripts get, the shell is a uniquely positioned language to help you write end-to-end tests for your tools. After all, the shell is what you use to run those precious tools, so the shell is also what you should use to run them in an automated fashion for testing purposes.

It’s your turn now. Go read the online documentation for shtk, the previous introductory blog post, and get testing! Your users will be much happier if your tools offer excellent interfaces. And if you need ideas on what to test or how to improve the interface of your tools, take a look at the CLI design series from 2013.