Like most programming languages with support for functions, the shell offers locally-scoped variables. Unfortunately, local variables are not the default. You must explicitly declare variables as local
and you should be very strict about doing this to prevent subtle but hard-to-diagnose bugs.
That’s it! What else is there to say about this trivial keyword? As it turns out, more than you might think.
For today’s post, let’s pick apart the documentation for the local
keyword from the sh(1)
manual page:
Variables may be declared to be local to a function by using a
local
command. This should appear as the first statement of a function, and the syntax is:
local [variable | -] ...
Local is implemented as a builtin command.
One thing to notice so far is that the local
call should appear as the first statement of a function. I’ve never observed problems with local
being called anywhere else and this sentence says should, not must, so I’m going to ignore this suggestion and I suggest you do too. It’s always a better practice to declare variables at the time of their first assignment; we are way past the C89 days.
Another thing to notice is that the syntax declaration doesn’t mention a value anywhere in the local
statement. While it’s often possible to write local foo=bar
as a statement, this can lead to subtle bugs. Take a look at these seemingly-equal definitions:
local dir="$(mktemp -d "${TMPDIR:-/tmp}/test.XXXXXX")" \
    || fail "mktemp failed"

local dir
dir="$(mktemp -d "${TMPDIR:-/tmp}/test.XXXXXX")" \
    || fail "mktemp failed"
The first use of local
is wrong: when the mktemp
command is expanded, its output is assigned to dir
. We expect the guard clause to trigger if mktemp
fails, but this never happens: the local
“call” itself returns success and masks the exit status of the command substitution. As a result, the guard clause never runs and the code continues executing.
The second usage of local
is correct. If you want to respect the return value of expansions, you must decouple expansions from the actual variable assignment. This is especially important if you expect set -e
to catch errors on these expressions.
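To see the difference in practice, here is a minimal sketch. The always_fails helper and the combined and separate functions are hypothetical names made up for this illustration; always_fails simply stands in for a failing mktemp.

```shell
# Hypothetical helper that always fails, standing in for a failing mktemp.
always_fails() {
    return 1
}

combined() {
    # The status tested by || is that of local, not of always_fails,
    # so the guard never triggers.
    local value="$(always_fails)" || return 1
    return 0
}

separate() {
    local value
    # A plain assignment propagates the status of the command substitution.
    value="$(always_fails)" || return 1
    return 0
}

combined_status=0
combined || combined_status=$?

separate_status=0
separate || separate_status=$?

echo "combined: ${combined_status}, separate: ${separate_status}"
```

The combined variant reports success even though the helper failed, while the separate variant reports the failure.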
The description for local
continues:
When a variable is made local, it inherits the initial value and exported and read-only flags from the variable with the same name in the surrounding scope, if there is one. Otherwise, the variable is initially unset. The shell uses dynamic scoping, so that if you make the variable x local to function f, which then calls function g, references to the variable x made inside g will refer to the variable x declared inside f, not to the global variable named x.
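The f, g, and x scenario from the manual page can be sketched directly. The variable values and the seen_via_f and seen_directly names below are assumptions chosen only for this illustration.

```shell
x="global"

g() {
    # g declares no x of its own, so this resolves to the x of the
    # nearest caller that declared one, or to the global otherwise.
    echo "${x}"
}

f() {
    local x
    x="local to f"
    g
}

seen_via_f="$(f)"
seen_directly="$(g)"

echo "g called from f sees: ${seen_via_f}"
echo "g called directly sees: ${seen_directly}"
```

The same function g reads a different x depending on who called it, which is what dynamic scoping means here.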
The dynamic scoping described here is where things get interesting but the description might be a bit hard to read. To elaborate on this point, let’s look at a fictitious piece of code:
exists_in_path() {
    local program="${1}"; shift

    local oldifs="${IFS}"
    IFS=:
    local dir  # <------------------------- IMPORTANT!
    for dir in ${PATH}; do
        if [ -f "${dir}/${program}" ]; then
            IFS="${oldifs}"
            return 0
        fi
    done
    IFS="${oldifs}"
    return 1
}

run_in_tmpdir() {
    local program="${1}"; shift

    local dir="${TMPDIR:-/tmp}"
    if exists_in_path "${program}"; then
        ( cd "${dir}" && "${program}" "${@}" )
    else
        echo "Program not found; won't try to run it under ${dir}" 1>&2
        return 1
    fi
}
In this piece of code, run_in_tmpdir
invokes exists_in_path
. Both functions define and use a variable dir
. The assignment of dir
within exists_in_path
is tricky to spot though because this variable only appears as the iterator in the for loop. It would be reasonable to assume that such a variable is local by default, but it’s not: that variable must be declared local as well.
What I mean to say is that it’s way too easy to miss variables when sprinkling local
through the code. If you forgot to do so for dir
in the code above, run_in_tmpdir
would later malfunction because the call to exists_in_path
would clobber the value of dir
within run_in_tmpdir
. The solution, of course, is to have unit tests to catch these sometimes-subtle issues.
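A stripped-down sketch of that clobbering, and of the kind of check a unit test could make, might look like this. The visit_leaky, visit_safe, and caller functions are hypothetical and exist only for this illustration.

```shell
# Hypothetical callee that forgets to declare its loop variable as local.
visit_leaky() {
    for dir in "$@"; do
        :  # Pretend to do something with each item.
    done
}

# Same callee with the loop variable properly declared.
visit_safe() {
    local dir
    for dir in "$@"; do
        :
    done
}

# Hypothetical caller that owns a dir variable and invokes the callee
# given as its first argument.
caller() {
    local dir="/var/tmp"
    "$1" one two three
    echo "${dir}"
}

after_leaky="$(caller visit_leaky)"
after_safe="$(caller visit_safe)"

echo "after leaky callee: ${after_leaky}"
echo "after safe callee: ${after_safe}"
```

With the leaky callee, the caller's dir ends up holding the last loop item instead of its original value; with the safe callee, it is preserved.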
And finally, the description for local
ends with:
The only special parameter that can be made local is “-”. Making “-” local, any shell options that are changed via the set command inside the function are restored to their original values when the function returns.
This may sound cryptic so let’s look at another example:
set -eu  # Enable strict mode for the whole script.

do_easter_egg() {
    local -
    set +eu  # Disable strict mode.
    [ "${ENABLE_SECRET_BEHAVIOR}" = yes ] && echo "You found me!"
}
In this example, we have a script that enables the shell’s unofficial strict mode and then we have a function that violates the rules that strict mode enforces. Such a function can make the options local by calling local -
. This way, when the function exits, the shell restores the original shell-wide state without having to know what it was. (Manually calling set -eu
at the end of the function would require keeping that line in sync with the top of the script, which is fragile.)
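One way to check that the options really are restored is to compare the value of $- before and after the call. This is a minimal sketch that assumes a shell supporting local - (such as dash, or Bash 4.4 and later as far as I know); the lenient function and the DEMO_UNSET_VARIABLE name are made up for the illustration.

```shell
set -u  # Treat references to unset variables as errors.

# Make sure the demo variable really is unset.
unset DEMO_UNSET_VARIABLE

# Hypothetical function that needs to relax the nounset option.
lenient() {
    local -
    set +u
    # Without set +u, this reference would abort the script.
    echo "value: '${DEMO_UNSET_VARIABLE}'"
}

before="$-"
lenient
after="$-"

echo "options before: ${before}, after: ${after}"
```

If the local - line were removed, the u flag would be missing from the options after the call.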
To conclude, some food for thought:
The fact that the local keyword is optional always feels like a language flaw: the keyword was added in retrospect once the developers realized that it was necessary. Other languages, such as Lua, suffer from this exact same problem. And C++ suffers from a similar issue with its const keyword.
As a reviewer, the lack of local definitions denotes to me that the code author is not very familiar with the shell. This typically correlates well with the code having many other readability issues.
Why didn’t I quote the Bash manual page instead of ash’s? Because the Bash manual page intermingles the explanation of local with the declare command which, as you may guess, is not standard. And as I argued in a previous post, I doubt you need a strict dependency on Bash in most cases.
But… as it turns out, local is not mandated by POSIX either. As @jperkin pointed out in a reply to this post on Twitter, the Korn shell that Solaris 10+ ships does not support this feature. I’d recommend staying away from the “broken” shells that Solaris includes by default and assume a saner alternative is already available on the machine… but that’s something else to be aware of.