While there are lots of shell programming pitfalls, at least the interpreter will tell you immediately about them. The mistakes I describe below generally mean that your script will run fine now, but if the data changes or you move your script to another system, you may have problems.
I think part of the reason shell scripts tend to have lots of issues is that one commonly doesn't learn shell scripting like "traditional" programming languages. Instead scripts tend to evolve from existing interactive command line use, or are based on existing scripts which themselves have propagated the limitations of ancient shell script interpreters.
It's definitely worth spending the relatively small amount of time required to learn the shell script language correctly, if one uses Linux/BSD/Mac OS X desktops or servers. This is because shell is the main domain specific language designed to manipulate the UNIX abstractions for data and logic, i.e. files and processes. So as well as being useful at the command line, its use permeates any UNIX system.
Stylistic issues
First I'll mention some ways to clean up shell scripts without changing their functionality. Note I use a shortcut form of the conditional operator below (and in my shell scripts), when doing simple conditional operations, as it's much more concise. So I use [ "$var" = "find" ] && echo "found" instead of the equivalent:
if [ "$var" = "find" ]; then
    echo "found"
fi
[ x"$var" = x"find" ] && echo found
The use of x"$var" was required in case var is "" or "-hyphen". Thinking about this for a moment should indicate that the shell can handle both of these cases unambiguously, and if it doesn't it's a bug. This bug was probably fixed about 20 years ago, so stop propagating this nonsense please! Shell doesn't have the cleanest syntax to start with, so polluting it with stuff like this is horrible.
[ ! -z "$var" ] && echo "var not empty"
This is a double negative, and is very prevalent in shell scripts for some reason. Just test the string directly like [ "$var" ] && echo "var not empty"
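As a quick check of the two points above, here is a minimal sketch showing that plain quoted tests cope with empty values and values starting with a hyphen. The `classify` function is a hypothetical helper made up for this illustration, and I use the long `if` form only so the three outcomes are easy to see.

```sh
#!/bin/sh
# Hypothetical helper (not from any existing script): classify a value
# using plain quoted tests, with no x"$var" prefix.
classify() {
    var=$1
    if [ "$var" = "find" ]; then
        echo "found"
    elif [ "$var" ]; then       # true when the string is not empty
        echo "other"
    else
        echo "empty"
    fi
}

classify "find"       # found
classify ""           # empty
classify "-hyphen"    # other
classify "-z"         # other
```

The quoting is what makes this safe: an unquoted empty $var would vanish from the argument list and change what the test sees.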
redundant use of $?
For example:
pidof program
if [ $? = 1 ]; then
    echo "program not found"
fi
Note this is actually not just stylistic. Consider what happens if `pidof` returns 2: the comparison with 1 fails, so the problem goes unreported. Instead just test the exit status of the process directly as in these examples:
if ! pidof program; then
    echo "program not found"
fi
if grep -qF "string" file; then
    echo 'file contains "string"'
fi
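If you do need to tell the failure cases apart, one option is to save $? once, immediately, in the else branch. The sketch below assumes the usual `grep` convention of 0 for a match, 1 for no match and anything higher for an error. `check_string` and `rc` are names I've made up for the example, and the demo assumes `mktemp` is available for its scratch directory.

```sh
#!/bin/sh
# Hypothetical helper: report whether a file contains a fixed string,
# keeping "no match" (status 1) separate from "grep itself failed" (2 or more).
check_string() {
    if grep -qF -- "$1" "$2" 2>/dev/null; then
        echo "match"
    else
        rc=$?    # status of grep, saved before another command overwrites it
        if [ "$rc" -eq 1 ]; then
            echo "no match"
        else
            echo "error"
        fi
    fi
}

dir=$(mktemp -d) || exit 1
printf 'some string here\n' > "$dir/file"
check_string "string" "$dir/file"       # match
check_string "absent" "$dir/file"       # no match
check_string "string" "$dir/missing"    # error
rm -rf "$dir"
```

Note that `if ! command` would not work here, as the negation reduces the status to 0 or 1.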
needless shell logic
We'll expand on this below, but we should do as little in shell as possible, outside its domain of connecting processes to files. For example the following common shell idiom of testing for files and directories can often be pushed into the programs themselves. I.e. instead of:
[ ! -d "$dir" ] && mkdir "$dir"
[ -f "$file" ] && rm "$file"
do:
mkdir -p "$dir" #also creates a hierarchy for you
rm -f "$file" #also never prompts
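A small sketch to confirm that both commands succeed whether or not the target already exists, which is why the surrounding tests are unnecessary. The `base`, `mk_status` and `rm_status` variables are just names for this demonstration, and `mktemp` is assumed to be available for the scratch directory.

```sh
#!/bin/sh
# Scratch area for the demonstration
base=$(mktemp -d) || exit 1

mkdir -p "$base/a/b/c"        # creates the whole hierarchy
mkdir -p "$base/a/b/c"        # already exists: still succeeds
mk_status=$?
echo "mkdir -p status: $mk_status"    # mkdir -p status: 0

rm -f "$base/no-such-file"    # nothing to remove: still succeeds
rm_status=$?
echo "rm -f status: $rm_status"       # rm -f status: 0

rm -rf "$base"
```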
Robustness
globbing
In the example below to count the lines in each file, there is a common mistake.
for file in `ls *`; do
    wc -l $file
done
Perhaps the idiom above stems from a common system where
the shell does not do globbing, but in any case it's neither scalable nor robust.
It's not robust because the output of `ls` is word split, so file names containing spaces are broken apart.
It also redundantly starts an `ls` process to list the files, and on some systems this
form can overflow static command line buffers when there are many files.
Shell script is a language designed to operate on files so it has this functionality built in!
for file in *; do
    wc -l "$file"
done
Notice how we just use the '*' directly, which as well as not starting the redundant `ls` process,
doesn't do word splitting on file names containing spaces. Note this is still slow, as we use shell looping and
start a `wc` process per file, so we'll come back to this example in the performance section below.
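To make the difference concrete, the sketch below counts entries both ways in a scratch directory. It also guards against one more corner case: when a pattern matches nothing, the shell leaves it as a literal string, so the loop body would run once with a name that doesn't exist. `count_entries` is a hypothetical helper, and `mktemp` is assumed to be available.

```sh
#!/bin/sh
# Hypothetical helper: count the entries in a directory using a glob
count_entries() {
    n=0
    for file in "$1"/*; do
        # An unmatched pattern is passed through literally, so skip it
        # (this also skips dangling symlinks)
        [ -e "$file" ] || continue
        n=$((n + 1))
    done
    echo "$n"
}

dir=$(mktemp -d) || exit 1
count_entries "$dir"            # 0
touch "$dir/plain" "$dir/with space"
count_entries "$dir"            # 2

# The `ls` form is word split, so it sees more words than there are files
set -- $(ls "$dir")
echo "$#"                       # 3
rm -rf "$dir"
```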
stopping automatically on error
Often you don't want a script to proceed if some commands fail. Checking the status of each command though can become very messy and error prone. One can instead execute set -e at the top of the script, which usually just works as expected, terminating the script when any command fails (that is not already part of a conditional etc.).
cleaning up temp files
One should always try to avoid temp files for performance/maintainability reasons, and instead use pipes if at all possible to pass data between processes. Temporary files can be slow as they're usually written to disk, and also you must handle cleaning them up when your script exits, possibly in unexpected ways. The general method for cleaning up temp files if you really do need them is to use traps as follows:
#!/bin/sh
tf=/tmp/tf.$$
cleanup() {
    rm -f "$tf"
}
trap "cleanup" EXIT
touch "$tf"
echo "$tf created"
sleep 10 #Can Ctrl-C and temp file will still be removed
#temp file auto removed on exit
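A variation on the script above, offered as a sketch rather than the one true way. It assumes the `mktemp` utility is available (it's not in POSIX but is very common), so that the file name is not predictable from the process id. It also converts a few signals into a normal exit, since as far as I know some shells only run the EXIT trap when the script exits normally.

```sh
#!/bin/sh
# mktemp creates the file itself and prints its unique name
tf=$(mktemp) || exit 1

cleanup() {
    rm -f "$tf"
}
trap cleanup EXIT
# Turn common signals into a normal exit so the EXIT trap
# also runs in shells that would otherwise skip it
trap 'exit 1' HUP INT TERM

echo "scratch data" > "$tf"
cat "$tf"    # scratch data
#temp file auto removed on exit
```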
echoing errors
If you just echo "Error occurred" then you will not be able to pipe or redirect any normal output from your script independently. It's much more standard and maintainable to output errors to stderr like echo "Error occurred" >&2. Note you can echo multiple lines together as in the following example:
echo "\
Usage: $(basename $0) option1
    more info
    even more" >&2
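If a script reports errors in many places, it can be tidier to wrap the redirection in a function. The sketch below uses hypothetical `err`, `usage` and `process` functions, and shows that normal output can then be captured on its own.

```sh
#!/bin/sh
# Hypothetical helpers, named for illustration only

# Print all arguments as one line on stderr
err() {
    printf '%s\n' "$*" >&2
}

# A here-document is another way to emit several lines at once
usage() {
    cat >&2 <<EOF
Usage: ${0##*/} option1
    more info
    even more
EOF
}

# Stand-in for real work that writes to both streams
process() {
    echo "normal output"
    err "Error occurred"
}

captured=$(process 2>/dev/null)
echo "$captured"    # normal output
```

Running `process >/dev/null` instead would show only the error message, which is what lets callers pipe or log the two streams independently.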
Portability
There are really two aspects to portability for shell scripts: the shell language itself, and the various tools being called by the script. We'll just consider the former here. To support really old implementations of shell script one can test with the heirloom shell for example, but for a contemporary list of portable shell capabilities, see The Open Group spec which describes the POSIX standard. Note also the Autoconf info on shell portability which lists details you need to consider when writing very portable shell scripts.
It's much better to test scripts directly in a POSIX compliant shell if possible. The `bash --posix` option doesn't suffice as it still accepts some "bashisms", but the `dash` shell, which is the default interpreter of shell scripts on Ubuntu, is very good in this regard. One should be testing with this shell anyway due to the popularity of Ubuntu, and dash is easy to install on Fedora for example.
bashisms
`bash` is the most common interactive shell used on UNIX systems, and consequently, syntax specific to `bash` is often used in shell scripts. Note I've never needed to resort to bash specific constructs in my scripts. If you find yourself doing complex string manipulations or loops in bash, then you should probably be considering existing UNIX tools instead, or a more general scripting language like python for example.
[ "$var" == "find" ] && echo "found"
Shell script can't assign variable values in conditional constructs so the double equals is redundant. Moreover it gives a syntax error on older busybox (ash) and dash at least, so avoid it.
echo {not,portable}
Brace expansion is not portable. While useful it's mostly so at the interactive prompt, and can easily be worked around in scripts.
signal specifications
Be wary when specifying signals, to the trap builtin for example, which was mentioned above. I was even caught out by this in my timeout script. That script handles the "CHLD" signal which for bash at least can be specified as "sigchld", "SIGCHLD", "chld", "17" or "CHLD", only the last of which is portable.
echo $(seq 15) $((0x10))
The command above containing both $(command substitution) and an $((arithmetic expression)) is portable. Traditionally one did command substitution using backquotes like `seq 15`. That's awkward to nest though and not very readable in the presence of other quoting. $((arithmetic expressions)) can be handy also for quick calculations, rather than spawning off `bc` or `expr` for example. Note bash supports the non portable form of $[1+1] for arithmetic expressions which you should avoid. Note also that vim 7.1.135 at least, highlights $() as a syntax error unless #!/bin/bash is at the top of the script. I must send a patch. [Update June 2008: Strangely it looks like vim explicitly chooses to highlight #!/bin/sh scripts as original Bourne shell scripts rather than to the POSIX standard which the vast majority of systems currently use. I've asked for this to be changed, but in the meantime you can add "let g:is_posix = 1" to your .vimrc]
Performance
We'll expand here on our globbing example above to illustrate some performance characteristics of the shell script interpreter. Comparing the `bash` and `dash` interpreters for this example, where a process is spawned for each of 30,000 files, shows that dash can fork the `wc` processes nearly twice as fast as `bash`
$ time dash -c 'for i in *; do wc -l "$i">/dev/null; done'
real 0m14.440s
user 0m3.753s
sys 0m10.329s
$ time bash -c 'for i in *; do wc -l "$i">/dev/null; done'
real 0m24.251s
user 0m8.660s
sys 0m14.871s
Comparing the base looping speed by not invoking the `wc` processes, shows that dash's looping is more than 4 times faster in elapsed time!
$ time bash -c 'for i in *; do echo "$i">/dev/null; done'
real 0m1.715s
user 0m1.459s
sys 0m0.252s
$ time dash -c 'for i in *; do echo "$i">/dev/null; done'
real 0m0.375s
user 0m0.169s
sys 0m0.203s
The looping is still relatively slow in either shell as demonstrated previously, so for scalability we should try and use more functional techniques so iteration is performed in compiled processes.
$ time find -type f -print0 | wc -l --files0-from=- | tail -n1
30000 total
real 0m0.299s
user 0m0.072s
sys 0m0.221s
The above is by far the most efficient solution and illustrates the point well that
one should do as little as possible in shell script and aim just to use it to connect the
existing logic available in the rich set of utilities available on a UNIX system.
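The --files0-from option used above is, as far as I know, specific to GNU `wc`. Where it isn't available, the following sketch gets the same total by letting `find` batch the file names into as few `cat` processes as possible. `total_lines` is a hypothetical helper, the `tr` is only there because some `wc` implementations pad their output with spaces, and `mktemp` is assumed for the scratch directory.

```sh
#!/bin/sh
# Hypothetical helper: total number of lines in all regular files under a directory
total_lines() {
    # "-exec ... {} +" passes many names to each cat, so no shell loop is needed
    find "$1" -type f -exec cat {} + | wc -l | tr -d ' '
}

dir=$(mktemp -d) || exit 1
printf 'a\nb\n' > "$dir/one"
printf 'c\n' > "$dir/with space"
total_lines "$dir"    # 3
rm -rf "$dir"
```

This only gives the grand total, not the per file counts, but the iteration still happens in compiled processes rather than in the shell.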