Shell Scripts Are Lousy Infrastructure

The best way to automate a task is usually to perform it manually a few times, then capture the procedure in some kind of macro or computer program. For command-line users, the easiest automation is often a shell script: Yank the commands from your shell history into a file, and call it a day.

Even for folks who aren't ordinarily terminal jockeys, Unix shells (and their spiritual cousins like PowerShell) are powerful laboratories for experimentation, letting us combine and reorder incantations until we get what we want. Shell scripting may be the closest thing to medieval alchemy that a 21st-century human can experience. It lets you iterate so quickly that you can almost see the lightning bolts shooting from your fingertips.

Now for the bad news: Shell scripts are a horrific way to define infrastructure. They are non-portable across both space and time, have no real type systems and very few safety mechanisms, and often amplify simple bugs into catastrophes. A shebang line like ‘#!/bin/sh’ may run Bash, Dash, or something else, depending on the host machine's config. A carefully thought-out shebang like ‘#!/usr/bin/env bash’ is little better, because subtly different versions of Bash are in common use on different platforms. As of this writing, Apple still ships Bash 3 (but recently changed the default shell for new users to Zsh), whereas Ubuntu is already on Bash 5.

Most experienced shell scripters do a Safety Dance near the top of each script. (Mine is ‘set -euo pipefail’.) Don't kid yourself: Your shell script is still wildly unsafe. For example, the Safety Dance causes an early exit if the script expands an unset variable:

$ cat > && chmod +x
set -euo pipefail
echo "rm -rf /$SOME_PATH"

$ ./
./ line 3: SOME_PATH: unbound variable

But what if the variable is technically set, just empty?

rm -rf /

Oops. Even if you nail down the shell itself, the very commands you’re issuing depend on the target platform and the caller’s PATH. It's not just .sh files, either: Shell commands embedded in build scripts (Makefiles, package.json), CI/CD config, and the like are all maintenance nightmares.
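
For contrast, here is a minimal Python sketch of how the empty-variable case can be made an explicit error instead of a silent "rm -rf /". It assumes the same SOME_PATH environment variable as the transcript above; the helper names require_env and target_dir are hypothetical, invented for this illustration, and the policy (reject empty values, refuse paths outside a base directory) is one reasonable choice rather than the only one.

```python
import os
from pathlib import Path


def require_env(name: str) -> str:
    """Return an environment variable's value, rejecting unset AND empty."""
    value = os.environ.get(name, "")
    if not value.strip():
        raise SystemExit(f"{name} must be set to a non-empty value")
    return value


def target_dir(base: Path, name: str) -> Path:
    """Resolve base/$name and refuse anything that is not strictly inside base."""
    base = base.resolve()
    candidate = (base / require_env(name)).resolve()
    # An absolute value or a ".." component would escape base; reject both.
    if candidate == base or base not in candidate.parents:
        raise SystemExit(f"refusing to operate on {candidate}")
    return candidate
```

Only a path returned by target_dir would then be handed to something destructive such as shutil.rmtree. The check is a few lines long, and a linter or type checker can see it.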

So, what are our alternatives? A good one is to use a scripting language that happens to have an interactive interpreter, rather than an interactive interpreter that can technically be used as a scripting language. Python, Ruby, and Node.js (yes, Node) are all good options. A hundred lines of well-written Python are infinitely less likely to cause heartache than a half dozen lines of shell script.
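
As a rough illustration of what "well-written Python" can look like for this kind of work, the sketch below runs an external program without involving a shell at all. The wrapper name run is hypothetical; shutil.which and subprocess.run are standard-library calls. Arguments are passed as a list, so nothing is re-split or glob-expanded, a missing executable is reported up front, and a nonzero exit status raises an exception instead of being ignored.

```python
import shutil
import subprocess
import sys


def run(program: str, *args: str) -> str:
    """Run a program with no shell in between and return its stdout as text."""
    executable = shutil.which(program)
    if executable is None:
        raise FileNotFoundError(f"{program} not found on PATH")
    result = subprocess.run(
        [executable, *args],   # a list of arguments: no word splitting, no globbing
        check=True,            # nonzero exit status raises CalledProcessError
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    try:
        # The current interpreter stands in for whatever tool you would call.
        print(run(sys.executable, "-c", "print('hello from a subprocess')"), end="")
    except (FileNotFoundError, subprocess.CalledProcessError) as err:
        sys.exit(f"command failed: {err}")
```

This does not remove the dependency on PATH, but it makes the lookup a visible, testable step rather than an accident of the caller's environment.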

An underrated approach is to use a compiled, statically typed language for simple automation. People sometimes see compilers and static types as productivity killers, but the real productivity killer is unmaintainable infrastructure. Honestly, writing a small Go or Rust program doesn't take that much longer than writing a script. The program can easily execute subprocesses as needed, much as a shell script would, but with vastly superior flow control, static safety and run-time error handling, a proper package ecosystem, linter, autoformatter, documentation system, better run-time performance, and any number of other quality of life improvements.

Here's a quick and dirty Rust “script” I wrote last year to generate TLS certificates for local web development. I'm not claiming this particular program is spectacular, but it's under a hundred lines long including whitespace and comments, accepts two command-line arguments (one mandatory, one optional), calls ‘openssl’ as a subprocess (as a shell would), has no build dependencies, and was roughly as easy to write as the tiny script that preceded it.
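
The Rust program itself is not reproduced here. To give a feel for the shape described above, the following is a Python sketch of a comparable tool, not the author's code. Everything specific in it is an assumption made for illustration: the arguments (a mandatory hostname and an optional validity period in days), the default of 30 days, the output file names, and the choice of a self-signed RSA certificate. The flags are ordinary ‘openssl req’ options.

```python
import argparse
import subprocess
import sys


def build_command(hostname: str, days: int) -> list[str]:
    """Build the openssl invocation for a self-signed certificate (assumed layout)."""
    return [
        "openssl", "req", "-x509", "-newkey", "rsa:2048", "-nodes",
        "-keyout", f"{hostname}.key",   # hypothetical output names
        "-out", f"{hostname}.crt",
        "-days", str(days),
        "-subj", f"/CN={hostname}",
    ]


def main(argv: list[str]) -> int:
    parser = argparse.ArgumentParser(description="Generate a local development certificate.")
    parser.add_argument("hostname", help="mandatory: name to put in the certificate")
    parser.add_argument("days", nargs="?", type=int, default=30,
                        help="optional: validity period in days")
    args = parser.parse_args(argv)
    try:
        subprocess.run(build_command(args.hostname, args.days), check=True)
    except FileNotFoundError:
        print("openssl not found on PATH", file=sys.stderr)
        return 1
    except subprocess.CalledProcessError as err:
        print(f"openssl exited with status {err.returncode}", file=sys.stderr)
        return err.returncode
    return 0
```

Calling sys.exit(main(sys.argv[1:])) at the bottom of the file would turn this into a command-line tool. Keeping build_command separate from main means the argument handling can be unit-tested without ever invoking openssl.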

We learn about systems by interacting with them, but what’s good for interaction is rarely good for automation. The next time you’re tempted to commit shell script (in any of its multitudinous forms) to a repository, try to keep it to a single line that calls a real program written in Python, Go, or whatever general purpose language your codebase already uses.