Dash helps Bash users relive the bad old days

Just as you thought the old shell wars of the 1980s and 1990s were over, just as you thought Bash was the only shell left standing, along comes a stupid idea.

I've talked before about the trials and tribulations of installation, and how there are two schools of thought when it comes to dependencies: the "I'm the admin and I should be able to install whatever implementations/versions of prerequisites I please and force their use system-wide" view, and the "I'm the developer, and if I want the thing to Just Work (and avoid a lifetime of handling support calls) I need to explicitly ensure and directly refer to my own dependencies" view. So in the case of a JVM, the first guy thinks "I should be able to run your application with whatever JVM I choose" and the second guy thinks "I'm going to ensure I run on the JVM I've tested".

They're both right in their own way, but in the real world we can't satisfy them both.

Me? I started off as the first guy (even a developer starts off as a user, after all) and only later learned that my sympathies were misplaced: both the developer and the user are better off when things Just Work, and it's simply too hard to do that if you leave yourself completely at the user's mercy.

There's a reason why Linux distributions with good packaging systems are the Linux mainstream (and why such a Linux system is a much more comfortable place to be than any other Unix, including Mac OS). Something like Debian packages is the best compromise between the two somewhat-conflicting interests of administrator/user and developer. (I'm ignoring the conflicting interests of administrators and users because I think they mainly play out in the commercial Unixes and Windows. No-one gets a Linux box to be locked down and locked in, and no-one wanting to lock you down and lock you in gives you a Linux box.)

I've also talked before about #! lines, and the different choices you have for what comes next, and how and why an explicit absolute path is the best choice. At the time, I was concentrating on Ruby. For various reasons, though, there are still shell scripts out there. And they're a good example of these same problems.

Once upon a time, we'd write "#!/bin/sh" and we'd be careful to write pure Bourne shell code. (Or maybe we're old enough that there weren't really any alternatives to tempt us at the time.) Then Bash came along, and it sucked slightly less, especially interactively. Some of us may have dallied with better shells like rc(1) in the meantime, but Bash, for the usual sad reasons, was the first real contender for the default shell crown. And Bash took that crown, first on Linux, where it was a bit of a walkover, and later on Mac OS. (The two AIX users, three Solaris users, and that guy with the IRIX box that hasn't actually been turned on in a couple of years in the audience: you don't count. No-one cares. You probably don't even have Python and Ruby out of the box. You may not come with GCC or GNU Make. You probably have real killjoy Bill-Joy vi instead of Vim. If a program is available in an old and crufty flavor, you're still using it. This is why no-one cares. People have to be paid to write software for you, and they still see it as a chore, and you know what kind of stuff that attracts? Oracle. Enjoy!)

Anyway, POSIX specifies the behavior guaranteed by /bin/sh, but doesn't offer an implementation. So much politicking occurs. Note that this doesn't affect the default login shell. That choice is also important for your OS' overall usability, as anyone who used Mac OS in the tcsh(1) days can attest. /bin/sh is effectively the default shell for shell scripts not wise enough to say what they mean. And though you're told to be "portable" or "POSIX compliant" and ask for whatever random shell /bin/sh should happen to be (and that choice should be left to someone other than the developer of the script for what reason?) and so on and so forth, it's actually terrible advice. The people who give it are well-meaning, and in a perfect world it might be good advice. But in the real world, it sucks.

In the real world, not all shells are created equal. No, you shouldn't be doing anything very complicated in a shell script these days. Wrong decade. But still.

In the real world, people have the not unreasonable expectation that when you ask them for shell-like input (their .xsession, say) it will be run by the same shell they normally use. Unfortunately, I'm not aware of any software (beyond shells themselves) that works like this, or that says "just give me an executable, and I'll run it rather than 'source' it into some random shell for you". (I realize this is a slightly different issue, with a slightly different solution, but it's related and relevant and worth mentioning in passing.)

In the real world, programs have bugs. Use only "POSIX compliant" features all you like, but run with enough "POSIX compliant" shells, and you will come across differences in behavior. Sometimes it's because some implementations are broken. Sometimes it's because all implementations are broken. Sometimes it's because the specification isn't implementable. Sometimes it's because the specification is ambiguous. Sometimes the specification is incomplete. This isn't just true of POSIX or POSIX shells; this is true of software. More generally...
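
A small illustration of the kind of difference I mean (my own example, not one taken from the scripts discussed below): POSIX leaves the behavior of echo implementation-defined when its arguments contain backslashes, so Bash's builtin prints them literally by default while dash's builtin interprets them. The sketch below sidesteps that with printf, whose behavior in this case is specified. The helper name "say" is made up for illustration.

```shell
# The same "POSIX compliant" line, run by two compliant shells:
#   echo 'a\nb'    bash's builtin prints a\nb on one line (by default)
#                  dash's builtin prints a and b on two separate lines
#
# printf with an explicit format behaves the same in both.
# "say" is a hypothetical helper name, not a standard command.
say() {
    printf '%s\n' "$*"
}

say 'a\nb'    # prints the four characters a\nb literally, in either shell
```

Note that this only papers over one known difference. It says nothing about the ones you haven't tripped over yet.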

In the real world, only the configurations you've actually tested really work.

Those are some reasons why, next time you think of writing "#!/bin/sh", you should be more specific instead.

As an example, I had trouble switching from Debian to Ubuntu at work because various scripts had, over the years, come to depend on Bash features. But Ubuntu uses "dash" as /bin/sh, and the scripts weren't asking for /bin/bash explicitly. They'd worked for years on dash-free Linux installations. And now they were quietly or subtly broken. If the authors of those scripts hadn't wasted their time chasing the illusion of portability, those scripts would have worked on Ubuntu too.
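
If you want to go looking for scripts like these, here's a rough sketch of one way to do it. It assumes that a "#!/bin/sh" line combined with "[[ " or the "function" keyword (neither of which dash accepts) is worth a closer look. The function names are made up, and the check is a crude grep, not a parser, so expect false positives and misses (a "#!/bin/sh -e" line with options, for instance, isn't matched).

```shell
# Print a script's interpreter: its first line with the leading "#!" removed.
# Prints nothing if the file has no #! line.
interpreter_of() {
    head -n 1 "$1" | sed -n 's/^#![[:space:]]*//p'
}

# Succeed if the file asks for /bin/sh but contains "[[ " or a "function"
# keyword, two common Bash-only constructs. A heuristic only.
looks_mislabelled() {
    [ "$(interpreter_of "$1")" = "/bin/sh" ] &&
        grep -Eq '\[\[ |^[[:space:]]*function[[:space:]]' "$1"
}
```

Loop over a directory of your own scripts, call looks_mislabelled on each file, and print the names of the ones it flags. The fix for anything it finds is the one argued for above: change the first line to say /bin/bash, since that's what the script was actually written for and tested with.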

A friend had a similar problem, where his .xsession (in his home directory accessed over the network via NFS) relied on a Bash feature which meant he couldn't log in to GDM on Ubuntu. He wasn't told what the problem was, of course, or given any reasonable way to diagnose it. And his assumption, that a file of commands he was supposed to write but couldn't specify a shell for would use his default shell, was a perfectly reasonable assumption. Of course it ought to work like that. Anything else would be crazy. ("Welcome to X11", as they say.)

The really funny part, though, is that the usual hobgoblin of POSIX compliance doesn't seem to have been the reason. The reason is speed, according to the "dash as /bin/sh" spec. If you read that, you'll see some fairly impressive numbers. 30s knocked off the boot of an "old laptop"? Wow.

My friend tried booting Ubuntu on an old laptop. A Dell Latitude C640, which a quick web search tells me is from the year 2002. Booting as far as rc.local takes 45s with Bash and 43s with dash. Not 15s, which would be awesome, and well worth the inconvenience of having a different default /bin/sh from the default login shell. Nope. A whole 2s were saved.

It's probably no coincidence that Nokia are mentioned. And, yes, they run Linux on a machine with about as much processing power as my wristwatch. But that's their mistake, not ours. And something they should be paying for, not us.

There's no real lesson here that we didn't already know. Be explicit and specific. Portability comes from hard work, special cases, and – probably most importantly of all – testing, not from specification compliance. (I consider myself a Java programmer, and most of my hard-won experience is with Java applications, so don't waste your time thinking there are any magic bullets.)

Oh, and don't trust benchmarks unless you understand them and ran them yourself.
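
If you'd like a number of your own, here's a minimal sketch of a shell-startup microbenchmark. It is not the benchmark behind the figures quoted above, and it measures only how long it takes to start a shell that does nothing, which is not the same thing as a boot. The function name is made up.

```shell
# Start the given shell N times, each time running the no-op ":" command.
# Prints the number of runs completed; stops early if the shell fails to start.
# "spawn_n" is a hypothetical name. The body runs in a subshell so its
# variables don't leak into the caller.
spawn_n() (
    shell=$1
    n=$2
    i=0
    while [ "$i" -lt "$n" ]; do
        "$shell" -c ':' || exit 1
        i=$((i + 1))
    done
    echo "$i"
)
```

From a Bash prompt, "time spawn_n /bin/bash 1000" followed by "time spawn_n /bin/dash 1000" gives you two figures to compare, assuming both shells are installed at those paths. Whether a difference there means anything for your own boot time is something only timing your own boot will tell you.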