Here's Why You Should Quote Your Variables in Bash

Nick Janetakis - Oct 16 '18 - Dev Community

This article was originally posted on October 2nd 2018 at: https://nickjanetakis.com/blog/here-is-why-you-should-quote-your-variables-in-bash


While working towards finishing my next course, I was wrapping up an Ansible role that issues SSL certificates using acme.sh.

The role supports issuing single domain, multi-domain and wildcard certificates using Let's Encrypt's v2 API. All of that is handled by acme.sh, but I made a judgment call to name the certificate's files on disk after the first domain passed into acme.sh.

In Ansible terms, it looks like this:

acme_sh_domain:
  - domains: ["*.example.com", "example.com"]

When everything finishes running you would end up with *.example.com.key and *.example.com.pem files to use with nginx or whatever is reading your certs.

I normally don't encounter files that start with *, so it led to some fun. I'm relentless about quoting variables, so I didn't get into too much trouble, but while playing around in the terminal to test the role I forgot to quote a mv command once, and here we are.

Demonstrating the Problem

If you have a Unix-like environment to test this on (macOS, Linux or WSL should work), then feel free to follow along.

The order in which you create the directories matters:
mkdir -p /tmp/demoproblem
cd /tmp/demoproblem

mkdir test.example.com
mkdir *.example.com
# mkdir: cannot create directory 'test.example.com': File exists

rm -rf *

mkdir *.example.com
mkdir test.example.com

ls -la
# drwxr-xr-x 1 nick nick 4096 Sep 29 15:15 *.example.com
# drwxr-xr-x 1 nick nick 4096 Sep 29 15:15 test.example.com

On WSL, ls printed the name without quotes. On other systems you may see '*.example.com' wrapped in single quotes instead (newer GNU ls releases quote names that contain special characters when printing to a terminal), but that's only a presentation difference with ls.

Accidentally deleting unexpected directories:
rm -rf *.example.com

ls -la
# drwxr-xr-x 1 nick nick 4096 Sep 29 15:19 .
# drwxrwxrwt 1 root root 4096 Sep 29 14:59 ..

If you only wanted to delete *.example.com (the single folder) then you're in a world of hurt: the shell expanded the unquoted pattern before rm ever ran, so rm deleted every directory that matched it.

Protecting against deleting everything that matches the asterisk:
mkdir *.example.com test.example.com

rm -rf "*.example.com"

ls -la
# drwxr-xr-x 1 nick nick 4096 Sep 29 15:20 .
# drwxrwxrwt 1 root root 4096 Sep 29 14:59 ..
# drwxr-xr-x 1 nick nick 4096 Sep 29 15:20 test.example.com

There we go, with quotes everything works as intended. You could have also escaped the asterisk with rm -rf \*.example.com instead.
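If you want to see the difference without deleting anything, here is a minimal sketch. It assumes mktemp is available, works in a throwaway directory, and uses a made-up helper called count_args (not part of the original demo) that only reports how many arguments the shell handed to it:

```shell
# Sketch only: everything happens inside a scratch directory from mktemp.
scratch="$(mktemp -d)" || exit 1
cd "$scratch" || exit 1

mkdir "*.example.com" test.example.com

# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

unquoted_count="$(count_args *.example.com)"     # the shell expands the pattern first
quoted_count="$(count_args "*.example.com")"     # the pattern is passed through literally
escaped_count="$(count_args \*.example.com)"     # escaping the asterisk does the same

echo "unquoted: $unquoted_count"   # unquoted: 2
echo "quoted:   $quoted_count"     # quoted:   1
echo "escaped:  $escaped_count"    # escaped:  1

# Clean up the scratch directory.
cd / && rm -rf "$scratch"
```

Whatever command you put in place of count_args (rm, mv, mkdir) receives exactly those arguments, which is why the unquoted rm removed both directories.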

Double checking to make sure it works with mkdir too:
mkdir "*.example.com"

ls -la
# drwxr-xr-x 1 nick nick 4096 Sep 29 15:23 *.example.com
# drwxr-xr-x 1 nick nick 4096 Sep 29 15:23 test.example.com

As we can see, quoting also lets us create *.example.com even if test.example.com exists.

So Why Should You Quote Your Variables?

Imagine if that directory name was in a variable and it wasn't quoted, and you ran a script that deleted a certificate. You would have deleted that cert along with every other certificate you had in that directory.

That could have affected multiple projects, and it could easily have crept into production undetected if you had never issued a wildcard certificate before: the 20 other times you issued a certificate it worked fine, so you thought you were in the clear.
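To make that scenario concrete, here is a rough sketch of such a cleanup step. The file names, the domain variable and the two directory names are all made up for illustration, and everything runs inside a scratch directory from mktemp so no real certificates are involved:

```shell
# Sketch only: every file lives in a throwaway directory created by mktemp.
scratch="$(mktemp -d)" || exit 1
cd "$scratch" || exit 1

# Two identical sets of fake certificate files (hypothetical names).
for dir in unquoted quoted; do
  mkdir "$dir"
  touch "$dir/*.example.com.pem" "$dir/test.example.com.pem" "$dir/blog.example.com.pem"
done

domain="*.example.com"   # assumed to arrive from config or a script argument

# Unquoted: the shell expands *.example.com.pem to every matching file.
(cd unquoted && rm -f -- $domain.pem)

# Quoted: rm receives the one literal file name.
(cd quoted && rm -f -- "$domain.pem")

unquoted_left="$(find unquoted -type f | wc -l | tr -d '[:space:]')"
quoted_left="$(find quoted -type f | wc -l | tr -d '[:space:]')"

echo "files left after unquoted rm: $unquoted_left"   # 0
echo "files left after quoted rm:   $quoted_left"     # 2

# Clean up the scratch directory.
cd / && rm -rf "$scratch"
```

With a non-wildcard domain both versions behave the same, which is exactly why this kind of bug can sit unnoticed for a long time.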

Here's an example of a script not working as intended due to missing quotes:
DOMAIN_DIR=*.example.com
[ -d $DOMAIN_DIR ] && echo "Directory found" || echo "Directory not found"
# bash: [: *.example.com: binary operator expected
# Directory not found

Now how's this for madness? At this point we have *.example.com and test.example.com in our directory, but now run mkdir aaa.example.com to create a third directory.

DOMAIN_DIR=*.example.com
[ -d $DOMAIN_DIR ] && echo "Directory found" || echo "Directory not found"
# bash: [: too many arguments
# Directory not found

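Now we get a different error. With two matching directories, [ received three arguments and tried to read the middle one as a binary operator. With three matches it received four arguments and gave up entirely.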

And here's the same setup but quoting DOMAIN_DIR at assignment time:
DOMAIN_DIR="*.example.com"
[ -d $DOMAIN_DIR ] && echo "Directory found" || echo "Directory not found"
# bash: [: too many arguments
# Directory not found

That still doesn't help us, because the pattern is expanded when the unquoted variable is used, not when it is assigned.

Here's the same script but quoting just the variable:
DOMAIN_DIR=*.example.com
[ -d "$DOMAIN_DIR" ] && echo "Directory found" || echo "Directory not found"
# Directory found

That worked, but we should still quote the assignment just to be safe.
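Here is a small sketch of why the quoted and unquoted assignments behaved the same in the examples above: a plain assignment doesn't expand the pattern at all, so the expansion only happens later when the variable is used without quotes. The variable names and the count_args helper are made up for this illustration, and it runs in a scratch directory from mktemp:

```shell
# Sketch only: runs inside a scratch directory created by mktemp.
scratch="$(mktemp -d)" || exit 1
cd "$scratch" || exit 1
mkdir "*.example.com" test.example.com aaa.example.com

# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

unquoted_assign=*.example.com     # no pathname expansion in a plain assignment
quoted_assign="*.example.com"

# Both variables hold the same literal string.
if [ "$unquoted_assign" = "$quoted_assign" ]; then
  same_value=yes
else
  same_value=no
fi
echo "same stored value: $same_value"   # same stored value: yes

# The expansion happens at the point of use.
words_unquoted_use="$(count_args $quoted_assign)"     # 3 directories match
words_quoted_use="$(count_args "$quoted_assign")"     # 1 literal name

echo "unquoted use: $words_unquoted_use"   # unquoted use: 3
echo "quoted use:   $words_quoted_use"     # quoted use:   1

# Clean up the scratch directory.
cd / && rm -rf "$scratch"
```

Quoting the assignment is still a good habit, since it is what protects values containing spaces.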

While we're at it, let's quote both the assignment and variable AND use dollar curlies:
DOMAIN_DIR="*.example.com"
[ -d "${DOMAIN_DIR}" ] && echo "Directory found" || echo "Directory not found"
# Directory found

When in doubt, quote your assignments and use quotes + dollar curlies together when referencing variables. Yes, it's more verbose, but a few more characters is a lot better than waking up to a cron job that deleted every certificate on your server in the middle of the night or caused some other weird side effects in your script.

Just don't use dollar curlies by themselves (even with a quoted assignment):
DOMAIN_DIR="*.example.com"
[ -d ${DOMAIN_DIR} ] && echo "Directory found" || echo "Directory not found"
# bash: [: too many arguments
# Directory not found

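It's not enough. The curly braces only mark where the variable name ends; the unquoted expansion still gets split and globbed.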
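As a rough illustration of what the braces do and don't do, here is a sketch that again uses a scratch directory and the made-up count_args helper. The _old suffix is only an example name:

```shell
# Sketch only: runs inside a scratch directory created by mktemp.
scratch="$(mktemp -d)" || exit 1
cd "$scratch" || exit 1
mkdir "*.example.com" test.example.com aaa.example.com

# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

DOMAIN_DIR="*.example.com"

braces_only="$(count_args ${DOMAIN_DIR})"        # still globbed: 3 arguments
braces_quoted="$(count_args "${DOMAIN_DIR}")"    # quotes do the protecting: 1 argument

echo "braces only:     $braces_only"     # braces only:     3
echo "braces + quotes: $braces_quoted"   # braces + quotes: 1

# What the braces are for: marking where the variable name ends.
# Without them, the shell would look up a variable named DOMAIN_DIR_old.
backup_name="${DOMAIN_DIR}_old"
echo "backup name: $backup_name"         # backup name: *.example.com_old

# Clean up the scratch directory.
cd / && rm -rf "$scratch"
```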

Using Tools to Help Us Remember Our Quotes

Even with a lot of experience you can forget to use quotes once in a while. Lucky for us, there are tools to help prevent this situation from coming up.

I highly recommend you install shellcheck. It's an excellent shell script linting tool that will catch and warn you about a ton of potential issues (including missing quotes). There's even a VSCode extension for it.
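If you want to try it on the example from this post, here is a minimal sketch. It writes the unquoted test into a temporary file and runs shellcheck on it only if the tool is installed. I'd expect it to report SC2086 (double quote to prevent globbing and word splitting) for the unquoted $DOMAIN_DIR, though the exact output depends on the version you have:

```shell
# Sketch only: the script under test is written to a temporary file.
script="$(mktemp)" || exit 1

cat > "$script" <<'EOF'
#!/usr/bin/env bash
DOMAIN_DIR="*.example.com"
[ -d $DOMAIN_DIR ] && echo "Directory found"
EOF

if command -v shellcheck >/dev/null 2>&1; then
  lint_step="ran"
  # shellcheck exits non-zero when it finds issues, so don't treat that as fatal.
  shellcheck "$script" || true
else
  lint_step="skipped"
  echo "shellcheck is not installed; skipping the lint step"
fi

# Remove the temporary script.
rm -f "$script"
```

Running it against your real scripts before they reach a cron job is the cheapest place to catch a missing quote.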

Have you ever been hit by bugs related to missing Bash quotes? Let me know below!
