What exactly was the point of [ "x$var" = "xval" ]?
April 12, 2021 1:39 AM

Shellcheck developer Vidar Holen asks and answers a question about arcane command-line syntax in about 1000 words.
posted by cgc373 (30 comments total) 26 users marked this as a favorite
 
I really appreciate this. Even though I've been working in Linux for almost two decades, I've largely avoided shell scripting because it's so arcane. It's cool seeing the history unearthed of some of its baffling idioms.

ShellCheck, for those who might not be aware, is awesome. It's actually helped cure me of my phobia of shell scripting. Plus it's written in Haskell. So it's kind of like Neo fighting Morpheus. Or something.
posted by Alex404 at 1:56 AM on April 12, 2021 [8 favorites]


I've felt guilty about this many times and thought myself lazy for not doing it right (whatever that should be) and that I was the only one who skated by this way. What a relief. Welcome x$friends.
posted by Richard Upton Pickman at 2:12 AM on April 12, 2021 [1 favorite]


I've written autoconf shell before, and the target for that is...well, any shell that ever existed. The genius of the autoconf folks was that they found the perfect Turing-complete universal subset of all shell scripting languages, and came up with a style guide.

It absolutely broke me for several years once bash 3.x-or-higher became the universal shell, and we no longer cared if it ran on a ...checks notes... Apollo workstation or ...flips pages... Tandem switch.

Long live [[!
posted by rum-soaked space hobo at 2:41 AM on April 12, 2021 [13 favorites]


Hm, my recollection of picking up the x$whatever syntax had to do with preventing undesired behavior when matching empty strings and (separately) unknown input from a directory listing in Bash circa 1998.

Being the sort that likes arcane syntax, I always enjoyed shell scripts. It served me well when trying to figure out CL scripts on OS/400. The problem with learning mostly by reading other people's work is that I didn't grasp that [ was actually an operator for an embarrassingly long time.
posted by wierdo at 2:47 AM on April 12, 2021 [1 favorite]


Yes, an inability to handle "" as an argument was a problem in some shells. The shell would reduce it too soon, and not leave a token in to represent the empty string. You can tell that your shell doesn't do this because you get something like the following:
$ ls ""
ls: cannot access '': No such file or directory
Pre-POSIX shells would often just reduce the empty quotes to nothing there, and give you a normal ls listing as if you hadn't typed any arguments. That's one reason there were special operators to [/test to check if a variable was empty, but it was always quicker to just use the x-hack in your scripts.
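
For anyone who hasn't seen the idiom spelled out, here is a minimal sketch that should run in any POSIX sh. The variable name and the value "val" come from the article title; the helper function names are made up for illustration. In a current shell all three forms behave sensibly, which is why the x prefix is now mostly a historical habit.

```sh
# Hypothetical helper names; only the test expressions matter.
var=""

# The old defensive form: both sides get a leading "x", so neither
# operand can ever be empty or look like an operator such as "-n" or "(".
x_hack()   { [ "x$1" = "xval" ]; }

# The plain quoted form, which is fine in current shells.
plain()    { [ "$1" = "val" ]; }

# The dedicated emptiness operator mentioned above.
is_empty() { [ -z "$1" ]; }

if x_hack "$var"; then echo "x-hack: match"; else echo "x-hack: no match"; fi
if plain "$var";  then echo "plain: match";  else echo "plain: no match";  fi
if is_empty "$var"; then echo "var is empty"; fi
```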
posted by rum-soaked space hobo at 3:30 AM on April 12, 2021 [2 favorites]


I'm always kind of dismayed to see people not taking advantage of modern shell features. I once wrote a library to help people write IRC bots in bash and it has no fewer than four ways to set up the network connection to the server, the final one being built right into bash itself.

When I sang [['s praises above, I wasn't kidding: It supports regex matches! This is built into bash, folks!
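
A small sketch of what that looks like, assuming bash (this will not run under plain sh). The input string and the pattern are invented for the example; the capture groups end up in the BASH_REMATCH array.

```bash
# Bash-only: [[ ... =~ ... ]] matches an extended regular expression
# and stores capture groups in the BASH_REMATCH array.
version="bash-5.1"            # example input, made up for this sketch
pattern='^([a-z]+)-([0-9]+)\.([0-9]+)$'

# Keeping the pattern in a variable and expanding it unquoted avoids
# having to escape regex characters for the shell parser.
if [[ $version =~ $pattern ]]; then
  name=${BASH_REMATCH[1]}
  major=${BASH_REMATCH[2]}
  minor=${BASH_REMATCH[3]}
  echo "name=$name major=$major minor=$minor"
else
  echo "no match"
fi
```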
posted by rum-soaked space hobo at 3:35 AM on April 12, 2021 [4 favorites]


> The last one managed to stay until 2015

Hehehe. The “last bug”. That’s funny.
posted by erniepan at 4:22 AM on April 12, 2021 [8 favorites]


This brings me back to the mid 80s when the way we tested our *nix (AT&T wouldn't allow us to say "Unix" in those days) was by getting the Bourne shell to run on it. This was surprisingly difficult to pull off.
posted by Obscure Reference at 4:48 AM on April 12, 2021 [1 favorite]


Back in the late 80s a friend wrote a complete system installation automation system in Bourne sh. When I saw it I was impressed. He wrote it because he “couldn’t program” and only “had a degree in international business” (though curiously he was at Berkeley in the early 80s). He had learned and codified all of these tricks for testing error conditions so that all of this could happen unattended, as we had no field support at several thousand locations. [ RIP, Mikey. ].

Later we adopted AIX in the machine room. (It was a little too soon to call it a data center.) I actually thought we should adopt Korn shell. Hey, at least there was a standard. I am reminded of the famous Korn anecdote. We didn’t. Something about different sub-shell behavior killed the adoption.

As much as I love shell, I am reminded of Larry Wall’s quote about Perl. And without Perl, would Python and Ruby and several other languages enjoy the prominence they have today?
posted by grimjeer at 5:21 AM on April 12, 2021 [3 favorites]


Shell isn't arcane. This is arcane.
posted by flabdablet at 6:57 AM on April 12, 2021


Ah, the fun of standardizing things after-the-fact that were hastily hacked together in marathon overnight coding sessions (also see: ECMAScript equality operators)
posted by RobotVoodooPower at 8:45 AM on April 12, 2021


grimjeer, could you narrow down which parts of the articles you linked contain the famous "anecdote" and "quote" about ksh/Perl respectively?
posted by rum-soaked space hobo at 8:55 AM on April 12, 2021


I do admire Vidar's commitment to the absurd speculative fiction blog premise: "What if people never stopped writing shell scripts?"
posted by pwnguin at 10:36 AM on April 12, 2021 [1 favorite]


In the DOS "shell", some Windows update caused batch file tests like
if %1==add goto add
to fail with no %1 arg, now requires
if x%1==xadd goto add
-- sort of the Bob version of x$var, complete with goto...
posted by FroggyTheGremlin at 10:41 AM on April 12, 2021 [1 favorite]


> Hehehe. The “last bug”. That’s funny.
The last bug was fixed for the first time, half in jest, on May 21, 2061, at a time when humanity first stepped into the light.

“Will humankind one day without the net expenditure of energy be able to restore the POSIX shell to its original syntax?”
posted by Phssthpok at 11:06 AM on April 12, 2021 [1 favorite]


I know I'm in a minority as someone who actually likes not only Python but Javascript (at least ES6) as well, but my goal with shell scripting has always been to get it working without having to learn any of the details, because I want to retain my faith in humanity as a rational and compassionate species.
posted by bjrubble at 11:30 AM on April 12, 2021 [4 favorites]


Shell script is rational, once you understand that from the shell's point of view, everything is strings. Most of the arcana come down to a need to express breaking one string (a line typed at a terminal and/or read from a script file) into others (typically, arguments to be supplied to executables) in ways that are mostly unobtrusive at the command prompt.

The thing about shell is that it does need to strike a balance between two use cases - interactive command entry, and script construction - and it tends to favour succinct command entry wherever these two cases conflict.

Consequently, as a language for gluing together arbitrary executables it's better than most. As a general purpose problem solving language, not so much.

If Javascript is a Chrysler and Python is a Renault, then shell is a moped, PowerShell is a bulldozer and CMD is a pair of Crocs with drawing pins stuck in the soles. I know which I prefer for a quick trip to the shop.
posted by flabdablet at 12:17 PM on April 12, 2021 [4 favorites]


Also, when reading the shell manual it helps to understand that a "word", for the purposes of that manual, is just a string with completely arbitrary contents, taken as a whole. You can force the shell to treat pretty much anything as a single word if you just quote it properly, and it does you the courtesy of not lumbering your quoted strings with quotes and escapes that only its parser required when passing them on to commands as arguments. The POSIX executable invocation API allows commands to accept an arbitrary number of arbitrary null-terminated strings as arguments, and the shell was designed to be able to generate these cleanly.

If you're ever in doubt about how the shell has chosen to break your command line into words, you can check it by inserting
printf '[%s]\n'
before the start of the line and re-entering it. The shell will hand the format string [%s]\n, then each of the words it's broken the subsequent line into, to the printf builtin as individual arguments; printf will then reapply the format string passed as its first argument as many times as it needs to in order to consume the rest. There's only one format specifier in the format string in this case, so what it outputs is all the words of the original command line as parsed by the shell, one per output line, surrounded by square brackets so you can easily see stuff like empty words and leading or trailing spaces and embedded newlines and whatnot. I use this fairly often when debugging obstreperous shell scripts.
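
A short sketch of the trick in action, with two made-up variables chosen to make the difference between quoted and unquoted expansion visible. It assumes the default IFS.

```sh
# A variable with embedded spaces and an empty one.
msg="hello  world"
empty=""

echo "--- unquoted ---"
# Word splitting breaks msg in two, and the empty expansion vanishes.
printf '[%s]\n' $msg $empty

echo "--- quoted ---"
# Each quoted expansion stays one word, including the empty one.
printf '[%s]\n' "$msg" "$empty"
```

The unquoted call prints two bracketed words and nothing for the empty variable; the quoted call prints the original string intact followed by an empty pair of brackets.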

> In the DOS "shell", some windows update caused batch file tests like
> if %1==add goto add
> to fail with no %1 arg, now requires
> if x%1==xadd goto add


In cmd scripts you can use

if "%~1" == "add" goto add

which also handles the case where %1 is completely missing in a way that at first blush looks kind of cleaner, but is actually much much gnarlier under the hood. The DOS/Windows parsers work by doing a first pass over the entire script line, substituting %n parameter references and %variable% expansions into place before parsing the line in any other way; this makes all kinds of parsing edge cases fail in ways that just don't happen with shell.
posted by flabdablet at 1:05 PM on April 12, 2021


rum-soaked space hobo [ not a phrase I get to type every day, thank you]

- Korn shell story

From a slashdot interview with David Korn: http://slashdot.org/articles/01/02/06/2030205.shtml

> It was at a USENIX Windows NT conference and Microsoft was presenting their future directions for NT. One of their speakers said that they would release a UNIX integration package for NT that would contain the Korn Shell.

> I knew that Microsoft had licensed a number of tools from MKS so I came to the microphone to tell the speaker that this was not the "real" Korn Shell and that MKS was not even compatible with ksh88. I had no intention of embarrassing him and thought that he would explain the compromises that Microsoft had to make in choosing MKS Korn Shell. Instead, he insisted that I was wrong and that Microsoft had indeed chosen a "real" Korn Shell. After a couple of exchanges, I shut up and let him dig himself in deeper. Finally someone in the audience stood up and told him what almost everyone in the audience knew, that I had written the 'real' Korn Shell. I think that this is symbolic about the way the company works.


- As for perl, what I was really thinking of was "It's easier to port a shell than a shell script." - Larry Wall
posted by grimjeer at 1:15 PM on April 12, 2021 [8 favorites]


> I think that this is symbolic about the way the company works.

This behavior is by design.
posted by flabdablet at 1:22 PM on April 12, 2021


I know this is sorta off-topic, but

- I remember using PC-DOS back in 1981? (somewhere around there) and my boss at the time complained that it wasn't a real shell because you couldn't clear the screen (CLS). He ran an IBM System/34 shop, 5251s, so to him text, terminal, cursors, all that was 'shell'.

- Despite the Korn incident, and despite Microsoft pushing PowerShell as your one-stop-shop for compatible, cross OS scripting, clearly there are parts of MS that get it.
[ and, yeah, my comment about PowerShell is unfair. If I'd grown up around NT and later, I'd likely want PowerShell to succeed against bash, and zsh, and oil, and fish, etc. ]

I admit to my fair share of MS bashing (SWIDT?). WSL has been fun for me, and useful. A lot of my stuff relies on peripheral access (USB devices, etc.), so I still need other machines to do some of my tinkering, but I like it, and I like that they have officially embraced the idea.

I think it is unrealistic to think that they will "replace Windows with Linux", as some folks suggest. I think WSL satisfies a lot of people's itches, and it keeps people using Windows, which is what MS wants. I think it is all part of the 'productivity for devs' 'hearts and minds tour'.
posted by grimjeer at 2:13 PM on April 12, 2021


Shell quirks also show up in several entries on the Cursed Computer Iceberg poster (previously), such as the lack-of-quote oopsie rm -rf $STEAM_ROOT/, the line noise wtf of :(){ :|:& };:, and the oddly named /usr/bin/[, which is related to the [ "x$var" = "xval" ] oddity.
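
As a harmless sketch of why an empty variable in front of a slash is dangerous, the snippet below uses printf in place of rm, so nothing is deleted. The variable name is taken from the comment above; the guard shown is one common defensive option, not a claim about how the original incident was fixed.

```sh
# STEAM_ROOT is deliberately left empty here.
STEAM_ROOT=""

# Unchecked: with an empty variable the word expands to just "/".
printf 'would remove: [%s]\n' $STEAM_ROOT/

# ${var:?message} makes a non-interactive shell give up when the variable
# is unset or empty. It runs in a subshell here so the demo itself survives.
if ( : "${STEAM_ROOT:?STEAM_ROOT is empty}" ) 2>/dev/null; then
  printf 'would remove: [%s]\n' "$STEAM_ROOT/"
else
  echo "refusing to run: STEAM_ROOT is empty"
fi
```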
posted by autopilot at 2:35 PM on April 12, 2021


It is a fun bit of trivia that pre-Bourne shell, as in Ken Thompson's rough hack at a shell, had no flow-control statements built in. It had some basic one-line ternary logic operators I think, but it was extremely limited. So if you wanted to branch somewhere else in your script, you had to shell out to /bin/goto.

Yes, that's right. You forked a subprocess which would run goto and the parent process would resume from the new location.

How, you ask? Well, the subprocess would inherit all file descriptors, including the fd for the script itself. All it needed to do was seek() that; and when it exited, the parent would drop the needle where the subprocess had left it and play the chosen tune.
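
The mechanism is easy to see in a modern shell. This is only a sketch of the shared file offset, not a reimplementation of the old goto: a scratch file (created with mktemp) stands in for the script, and a subshell stands in for the child process.

```sh
# Scratch file standing in for "the script".
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

exec 3< "$tmp"        # parent opens the file on descriptor 3

read -r a <&3         # parent reads line 1
( read -r b <&3 )     # a forked subshell reads line 2 and exits
read -r c <&3         # parent carries on from where the child left off

echo "parent saw: $a, then $c"

exec 3<&-             # close the descriptor
rm -f "$tmp"
```

The parent never sees "two": the child consumed it, and because both processes share one open file description, the parent's next read starts at the position the child left behind.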
posted by rum-soaked space hobo at 3:25 PM on April 12, 2021 [2 favorites]


A personal rule of thumb I've sometimes ignored and regretted: once your shell script gets past 20 lines or so, unless there's some reason it absolutely has to be a shell script, give up and start over in your choice of Perl, Python, Ruby.

There are a couple of things that are easier in a shell script than in those, but a zillion things that aren't.

I did enjoy the archaeological dig into bugs/regrettable behaviors across decades.
posted by Zed at 3:30 PM on April 12, 2021 [4 favorites]


Zed, the largest program I have converted from shell to a "real" language was 1600 lines and I may still have nightmares. I suggest 2 lines as the limit.

rum-soaked space hobo, you tempt me with that goto. It could be done without shell support, by the goto command parsing the shebang and execing it on /dev/fd/n seeked appropriately. (Local variables and such would be a small problem.)
posted by joeyh at 4:04 PM on April 12, 2021 [1 favorite]


The mention of AIX and HP-UX dredged up some old memories of supporting a large C application (a COBOL compiler & byte-coded runtime) on a number of increasingly purported Unix variants. The old joke at the time was that AIX was the product of two different aliens trying to explain Unix to each other, only with a broken universal translator.

HP-UX was what made AIX seem like a good idea.


I definitely love shellcheck, especially as a CI check, but am nothing but pleased with my decision in the 2000s to convert anything to Python if it didn't fit in a single 80x25 window or used a data structure more complicated than a string. In about half of the cases this produced smaller code and added features which nobody had implemented because it was just too tedious to contemplate.
posted by adamsc at 4:17 PM on April 12, 2021 [1 favorite]


> When I sang [['s praises above, I wasn't kidding: It supports regex matches! This is built into bash, folks!

So, it won't work in crontab (uses dash, typically) and might get the side-eye on Mac OS, which is on zsh these days.
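
For simple checks there is a portable fallback that also works in dash: case patterns. A minimal sketch, with a hypothetical helper name, that tests whether a string is all digits without any regex support:

```sh
# Hypothetical helper: succeeds if $1 is a non-empty string of digits.
# Uses only POSIX case patterns, so it does not need bash.
is_number() {
  case $1 in
    '' | *[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    *)             return 0 ;;
  esac
}

for candidate in 42 "" 4x2; do
  if is_number "$candidate"; then
    echo "[$candidate] is a number"
  else
    echo "[$candidate] is not a number"
  fi
done
```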

Still here for /usr/bin/[ as a symlink to /usr/bin/test (which it doesn't seem to be any more)
posted by scruss at 4:33 PM on April 12, 2021


The difference between them is that /usr/bin/[ supports --help and --version options and needs to enforce the presence of a trailing ] argument, while /usr/bin/test doesn't.

Both binaries are built from the test source code; the source for [ is just
#define LBRACKET 1
#include "test.c"
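
A quick sketch of the two spellings side by side. The variable names are invented, and most shells will run their builtin versions of both commands rather than the binaries in /usr/bin, but the trailing-bracket check behaves the same way in the shells I'm aware of.

```sh
value="hello"

# Same check, two spellings of the same command.
test -n "$value" && echo "test: non-empty"
[ -n "$value" ]  && echo "[: non-empty"

# Leaving off the closing bracket is a usage error, reported with an
# exit status greater than 1 rather than a plain true/false answer.
status=0
[ -n "$value" 2>/dev/null || status=$?
echo "status without closing bracket: $status"
```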
In other disappointments, /bin/true is now a 35K executable instead of an empty file.
stephen@jellyshot:~$ which true
/bin/true
stephen@jellyshot:~$ ls -al /bin/true
-rwxr-xr-x 1 root root 35424 Aug  7  2019 /bin/true
stephen@jellyshot:~$ sudo touch /usr/local/bin/true
stephen@jellyshot:~$ sudo chmod +x /usr/local/bin/true
stephen@jellyshot:~$ which true
/usr/local/bin/true
stephen@jellyshot:~$ ls -al /usr/local/bin/true
-rwxr-xr-x 1 root staff 0 Apr 13 14:49 /usr/local/bin/true
stephen@jellyshot:~$ true && echo yes
yes
stephen@jellyshot:~$ 
Fight the power.
posted by flabdablet at 9:51 PM on April 12, 2021 [2 favorites]


joeyh: it is tempting, isn't it? I've sort of pub-quizzed this among colleagues before, and pointed out that I could probably write such a /usr/local/bin/goto in almost any language.

The original relied on the fact that the script for the Thompson shell was always stdin (you couldn't read user input in a script, for example), but the fds for the scripts in bash etc are well-known and documented. I haven't looked up if they're pinned in POSIX or anything, but for this level of silly hack it's not necessary.
posted by rum-soaked space hobo at 4:35 AM on April 13, 2021


I've never had any issues with shell scripts, and I think there are many things they do that are way easier than in Perl or Python. Basic database queries, reading simple files, and reading and writing files would be some examples, though I haven't regularly programmed in shell in a long time.

Caveat would be that shell scripts are programs for programmers, meant to be quick, and not for end users. If you have an end user in mind, then more appropriate languages should be used for version control, a defined upgrade path, etc.

The company I work for actually wraps every batch job in a kshell script.
posted by The_Vegetables at 8:21 AM on April 13, 2021




This thread has been archived and is closed to new comments