While working on this static blog a few days ago, I made a change to its templates that warranted an automated test. I could have written a trivial shell script to do it, but instead I reached for shtk’s unit-testing module. I tweeted about it right away to say that you can, in fact, write tests in shell, because lots of developers are skeptical of any script longer than 10 lines of code.
Interestingly, this reply came through: a pointer to a contemporary, under-development library for writing tests in Bash. That made me think: “Hey, I had already done that years ago… but nobody knows about it. Gotta fix that with a blog post!” But first, I had to bring shtk back from its ashes because I had not touched it for more than 6 years and it wasn’t ready for show and tell. So I did something that I wanted to do back in the day but never did: I put together a website for shtk to host its reference manual and I fixed a few obvious rough edges.
With those tweaks out of the way, we come to this article. Here, I want to show you that writing decent tests in shell is entirely possible and that shtk’s testing platform provides unique features for integration-testing CLI apps written in any language.
What is shtk anyway?
The Shell Toolkit, or shtk for short, is a collection of libraries to support the writing of portable shell scripts. shtk grew out of the common code of sysbuild and sysupgrade, a couple of tools I wrote over 10 years ago for NetBSD; they are fully written in shell because of the constraints of the NetBSD base system. In turn, this means that shtk is not Bash-specific, so it avoids imposing a useless use of GNU.
From the get-go, all of shtk, sysbuild, and sysupgrade had unit tests written with atf-sh, the shell testing library of the Automated Testing Framework. atf-sh was a rather simplistic library created by yours truly in 2007, and it required a complex runtime (atf-run or, later, Kyua) to be functional. By 2014, while working at Google, I had been exposed to better ways of writing tests that blended more naturally with the languages they supported (pyUnit and JUnit), and I knew that I needed a replacement for atf-sh.
Hence, in 2014, I took the best parts of atf-sh and the core concepts of the xUnit test frameworks, and I created shtk’s unittest module. I then proceeded to migrate all existing tests to this new framework and also used shtk to build pkg_comp 2.x later on in 2017. But I failed to publicize the library because I didn’t quite know how to put together a cool-looking website and I didn’t have a good platform to talk about it—both of which are fixed now.
Installing shtk
If you are on NetBSD or on FreeBSD, you are in luck! There are packages for shtk ready to install, so go ahead and use those.
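For instance, something like the following should do (I’m assuming here that the package is named shtk in both collections):

$ pkgin install shtk     # NetBSD, from pkgsrc.
$ pkg install shtk       # FreeBSD.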
On any other system, you’ll have to build shtk from source. Fear not, it’s easy:
$ curl -LO https://github.com/jmmv/shtk/releases/download/shtk-1.7/shtk-1.7.tar.gz
$ tar xzf shtk-1.7.tar.gz
$ cd shtk-1.7
$ ./configure --prefix ~/.local
$ make
$ make install
After a successful installation, you should have the shtk tool in your path. If that’s not the case, do this:
$ PATH="${HOME}/.local/bin:${PATH}"
$ export MANPATH="${HOME}/.local/share/man:${MANPATH}"
Setting up the MANPATH is important. shtk’s official documentation is written as manual pages to seamlessly integrate with the Unix-y environment it’s intended to complement, so you’ll want man invocations to work. If you are curious, start by peeking into shtk(1) and shtk(3).
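For example, with the variables exported as above, both of these should bring up the manuals:

$ man 1 shtk
$ man 3 shtk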
Creating our first test
Now that we have shtk up and running, let’s create a test. Write the following contents to a file named demo_test.sh:
shtk_import unittest

shtk_unittest_add_test always_fails
always_fails_test() {
    assert_equal 2 3
    echo "NOT REACHED!"
}
Done? OK. Something looks funny, doesn’t it? shtk_import is a shell function that brings the unittest module into scope. But… where does that function come from? Well, here is the thing: shtk scripts need to be “built” before they can run. In order for the above to become a runnable test program, you have to do the following:
$ shtk build -m shtk_unittest_main demo_test.sh
After running the above, you’ll end up with a demo_test executable. This “executable” is essentially the same as demo_test.sh but with some preamble code to set up the module import features and a call to the shtk_unittest_main entry point to execute the tests.
Once the script is built, run it and see the single test fail:
$ ./demo_test
demo_test: I: Testing always_fails...
demo_test: E: Expected value 2 but got 3
demo_test: W: Testing always_fails... FAILED
demo_test: W: Ran 1 tests; 1 FAILED
Advanced xUnit-like features
The above is nice but… pretty… simple? Anyone can write a conditional to check if two values are equal without the need for “fancy frameworks”, right? Right. But shtk provides much more.
Just like asserts, shtk also comes with expects to record soft failures: all assert_* functions have an expect_* counterpart. If we tweak our previous test to look like this:
shtk_import unittest

shtk_unittest_add_test always_fails
always_fails_test() {
    expect_equal 2 3
    echo "REACHED!"
    expect_equal 4 5
}
we get the following output, which shows that both expect calls ran, each detected a failure, and execution did not stop:
$ shtk build -m shtk_unittest_main demo_test.sh
$ ./demo_test
demo_test: I: Testing always_fails...
demo_test: W: Delayed failure: Expected value 2 but got 3
REACHED!
demo_test: W: Delayed failure: Expected value 4 but got 5
demo_test: W: Testing always_fails... FAILED (2 delayed failures)
demo_test: W: Ran 1 tests; 1 FAILED
In general, when writing a test, you use asserts for any step that prepares the test scenario, and you use expects for the test scenario itself. This way, the test fails early if it is unable to set up the scenario (because it makes no sense to continue if the scenario is not set up), but it prints as much diagnostic information as possible if the actual test detects problems half-way through.
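As a tiny illustration of that split, here is a hypothetical test that uses only the primitives shown so far: the assert stops the test cold if the input is not what we need, while the expects keep going to report every mismatch:

shtk_import unittest

shtk_unittest_add_test counts_lines
counts_lines_test() {
    # Scenario setup: abort immediately if the input is not as expected.
    printf 'one\ntwo\n' >input.txt
    assert_equal 2 "$(wc -l <input.txt | tr -d ' ')"

    # The checks proper: soft failures, so every mismatch gets reported.
    expect_equal one "$(head -n 1 input.txt)"
    expect_equal two "$(tail -n 1 input.txt)"
}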
What about fixtures? Setup and teardown routines? shtk has got you covered too:
shtk_import unittest

shtk_unittest_add_fixture sample
sample_fixture() {
    setup() {
        echo "Common setup code"
        echo 123 >data.txt
    }

    teardown() {
        echo "Common cleanup code"
        rm -f data.txt
    }

    shtk_unittest_add_test ok
    ok_test() {
        assert_equal 123 "$(cat data.txt)"
        echo 125 >data.txt  # Overwrites the transient file.
    }

    shtk_unittest_add_test not_ok
    not_ok_test() {
        assert_equal 125 "$(cat data.txt)"
    }
}
Running the above behaves as you would expect:
$ shtk build -m shtk_unittest_main demo_test.sh
$ ./demo_test
demo_test: I: Testing sample__ok...
Common setup code
Common cleanup code
demo_test: I: Testing sample__ok... PASSED
demo_test: I: Testing sample__not_ok...
Common setup code
demo_test: E: Expected value 125 but got 123
Common cleanup code
demo_test: W: Testing sample__not_ok... FAILED
demo_test: W: Ran 2 tests; 1 FAILED
Note how the setup and teardown routines were executed for each test, which means that the data.txt file was recreated for every test case and the modifications to the file from one test didn’t impact the outcome of the other.
The secret sauce: assert_command
So far, everything looks very xUnit-like. We have asserts and expects; we have test setup and teardown hooks; we have fixtures. But shell scripts are uniquely suited to test the user interface of a CLI app: after all, users interact with CLI apps from the shell, so it’s only natural to use the shell to test arbitrary tools no matter what language they are written in. This is where shtk’s magic sauce comes into play.
shtk’s assert_command check allows you to run an arbitrary command and to declaratively check its exit condition and its side effects on stdout and stderr. The feature is inspired by AT_CHECK in GNU Autoconf, to which I was exposed even longer ago (circa 2005) while working on the Monotone project.
Take a look at these tests that exercise the cp tool and pay close attention to the assert_command calls:
shtk_import unittest

shtk_unittest_add_test cp_ok
cp_ok_test() {
    touch a
    assert_command cp a b
    [ -f b ] || fail "b was not created"
}

shtk_unittest_add_test cp_missing_source
cp_missing_source_test() {
    assert_command -s exit:1 -e match:"No such file" cp a b
}

shtk_unittest_add_test cp_unexpected_output
cp_unexpected_output_test() {
    assert_command -s exit:1 -o match:"Hello" cp a b
}
In the first test, cp_ok, the call to assert_command has no flags. This means that we expect the command given to the assert to finish successfully and quietly: the exit code of cp a b should be 0, and both stdout and stderr should be silent.
In the second test, cp_missing_source, the call to assert_command specifies that the command under test has to terminate with an exit code of 1 and that stderr has to match the No such file regular expression.
In the third test, cp_unexpected_output, the call to assert_command expects a message on stdout that cp will not print, and also implies that stderr should be empty.
When we build and run the above, we get:
$ shtk build -m shtk_unittest_main demo_test.sh
$ ./demo_test
demo_test: I: Testing cp_ok...
Running checked command: cp a b
demo_test: I: Testing cp_ok... PASSED
demo_test: I: Testing cp_missing_source...
Running checked command: cp a b
demo_test: I: Testing cp_missing_source... PASSED
demo_test: I: Testing cp_unexpected_output...
Running checked command: cp a b
Expected regexp 'Hello' not found in standard output:
Expected standard error to be empty; found:
cp: a: No such file or directory
demo_test: E: Check of 'cp a b' failed; see stdout for details
demo_test: W: Testing cp_unexpected_output... FAILED
demo_test: W: Ran 3 tests; 1 FAILED
Note that the first two tests passed, but pay attention to the output of the third, failed test: the call to assert_command reports that neither stdout nor stderr matched what we expected and provides details on why.
I invite you to take a look at the documentation in shtk_unittest_assert_command(3) and its supporting shtk_unittest_assert_file(3), as the features they provide are too numerous to be captured in this post.
Using GNU Automake as a test runner
Up to this point, I have shown you the features that the shtk library itself provides… but a library for writing test programs is not very useful on its own. What happens when you want to run more than one test program at once? How do you set up the test environment so that tests always run in a consistent manner, free from side effects? How do you collect results for reporting?
This is where test runners come into play, and there are many to choose from. But again, given the nature of shtk and its desire to blend into Unix-y environments, integrating with the de-facto build system of all foundational software is important. So, yes, shtk tests can run via GNU Automake. The test runner provided by this build system isn’t state-of-the-art, but it is more than enough for most situations.
Here is all it takes to hook the earlier demo_test.sh shtk-based test into an Automake project:
TESTS = demo_test
check_SCRIPTS = demo_test
CLEANFILES += demo_test
EXTRA_DIST += demo_test.sh

demo_test: $(srcdir)/demo_test.sh
	$(AM_V_GEN)shtk build -m shtk_unittest_main -o $@ $<
You’d probably want to add a pkg-config check in configure.ac to load the path to shtk from the provided shtk.pc and reference it from here as $(SHTK), but I’ll leave that as an exercise for you, dear reader.
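That said, here is a minimal sketch of what such a check might look like, relying only on standard Autoconf and pkg-config macros and assuming the shtk tool is reachable via the PATH:

dnl configure.ac sketch: fail early if shtk is missing and expose the
dnl tool's location as @SHTK@ so that Makefile.am can say $(SHTK) build.
PKG_PROG_PKG_CONFIG
PKG_CHECK_EXISTS([shtk], [], [AC_MSG_ERROR([shtk.pc not found])])
AC_PATH_PROG([SHTK], [shtk], [not-found])
AS_IF([test "x${SHTK}" = xnot-found],
      [AC_MSG_ERROR([cannot find the shtk tool in the PATH])])
AC_SUBST([SHTK])

With that in place, the build rule above would invoke $(SHTK) build instead of the bare shtk build.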
After that, if we run make check, we’ll see something like this:
make check-TESTS
PASS: demo_test
============================================================================
Testsuite summary for Demo
============================================================================
# TOTAL: 1
# PASS: 1
# SKIP: 0
# XFAIL: 0
# FAIL: 0
# XPASS: 0
# ERROR: 0
============================================================================
Neat. Take a look at the test suites documentation for GNU Automake for more details on how to communicate specific return codes (such as “skip”) to the runner.
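For example, the default Automake test driver interprets exit code 77 as a skip, so a thin wrapper script (a generic Automake convention, independent of shtk) can report SKIP when a prerequisite is missing:

#!/bin/sh
# Tell Automake's test driver to report SKIP if cp is not available.
command -v cp >/dev/null 2>&1 || exit 77
exec ./demo_test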
Integrating with Bazel
GNU Automake is the de-facto build system for open source projects but it’s also… far from great. This is why I created Buildtool eons ago and why I got interested in Bazel in the first place. So the question is: can we integrate shtk with Bazel? Of course we can!
For the purposes of this post, I hacked up a Bazel rule to show off how running shtk tests with it would look. And it looks exactly like you would expect:
load("shtk.bzl", "shtk_test")
shtk_test(
name = "demo_test",
src = "demo_test.sh",
)
Followed by a test run (shown here against a separate, deliberately failing faulty_test target):
$ bazel test --test_output=streamed :faulty_test
WARNING: Streamed test output requested. All tests will be run locally, without sharding, one at a time
INFO: Analyzed target //:faulty_test (0 packages loaded, 2 targets configured).
INFO: Found 1 test target...
faulty_test: I: Testing faulty...
faulty_test: E: This test fails
faulty_test: W: Testing faulty... FAILED
faulty_test: W: Ran 1 tests; 1 FAILED
FAIL: //:faulty_test (see /home/jmmv/.cache/bazel/_bazel_jmmv/828d51923bded9f03acff0119df51adc/execroot/demo/bazel-out/k8-fastbuild/testlogs/faulty_test/test.log)
Target //:faulty_test up-to-date:
bazel-bin/faulty_test
INFO: Elapsed time: 0.355s, Critical Path: 0.14s
INFO: 2 processes: 2 linux-sandbox.
INFO: Build completed, 1 test FAILED, 2 total actions
//:faulty_test FAILED in 0.0s
/home/jmmv/.cache/bazel/_bazel_jmmv/828d51923bded9f03acff0119df51adc/execroot/demo/bazel-out/k8-fastbuild/testlogs/faulty_test/test.log
Executed 1 out of 1 test: 1 fails locally.
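In case you are wondering, the shtk.bzl behind this demo could be as simple as the following sketch. To be clear, this is a hypothetical reconstruction, not the actual file: it shells out to an installed shtk from a genrule (so the tool must be reachable from the sandbox’s PATH) and wraps the result in a plain sh_test:

# shtk.bzl (sketch): build the test program with shtk, then run it as a test.
def shtk_test(name, src):
    native.genrule(
        name = name + "_build",
        srcs = [src],
        outs = [name + ".gen.sh"],
        cmd = "shtk build -m shtk_unittest_main -o $@ $<",
        executable = True,
    )
    native.sh_test(
        name = name,
        srcs = [name + ".gen.sh"],
    )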
Some of the benefits of the Bazel test runner over GNU Automake’s are that Bazel adds proper test isolation and cleanup via its local sandboxing feature and that Bazel provides fine-grained test invalidation when dependencies change. These are the two primary properties you want in a modern test runner because they ensure that only the minimum subset of tests run on a given source code change, and that the test results are deterministic across invocations and machines.
The future
Despite its many haters, the shell is a pretty OK language if you treat it as such. You must learn its syntax and its oddities, but once you do, you can write maintainable and moderately long programs that are, often enough, much simpler than their Python counterparts. These programs have few dependencies and, given sufficient test coverage, can be as robust as other tools. shtk is just an ingredient that can help you in writing such large scripts in a principled manner and, especially, in testing them.
As for what the future will bring for shtk… you tell me! I had not worked on this project for 6 years and absolutely nobody asked about it during this time. But… things have changed a lot since then and there might actually be some interest out there. In preparation for this blog post, I migrated shtk’s CI system from Travis to GitHub Actions, created a simple website to serve the API documentation—which was previously locked behind man invocations in a terminal—and moved off Kyua to GNU Automake as the test runner, for simplicity.
Some ideas about what could be done: additional library modules/functions; a “static build” feature where the built scripts don’t require shtk to be pre-installed; and real Bazel rules (what I showed above was a macro-based hack) to incorporate shtk into your projects so that you can test your command-line tools end-to-end no matter what language they are written in.
Please let me know if you have any interest by either voting/replying below or reaching out via social media! And don’t forget to visit the brand-new homepage at https://shtk.jmmv.dev/ to read the documentation.