A lesson about using env(1) in script #! lines

If you've read the "perlrun" man page, or you've just read a lot of shell scripts, you'll surely have come across "the env trick". The idea is that instead of using a hard-coded path to the interpreter, you use a hard-coded path to env(1) and let it find the interpreter on the user's path instead.
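Concretely, instead of something like the first line below, you write the second (using Ruby as the example interpreter; the /usr/local/bin/ path is just for illustration):

#!/usr/local/bin/ruby
#!/usr/bin/env ruby

The first form only works if Ruby lives at that exact path; the second asks env(1) to search the user's path for something called "ruby" at run time.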

The supposed advantage of the env trick (and the newer your scripting language, the more useful the trick) is that although popular scripting languages gravitate towards /usr/bin/, they tend to start off in such lowly places as ~/bin/ and /usr/local/bin/. env(1), on the other hand, is as old as the hills, and (though POSIX doesn't guarantee it) it's always /usr/bin/env, except on systems you don't care about: according to "'#!'-magic, details about the shebang mechanism", the only problem cases are OpenServer 5.0.6 and Unicos 9.0.2.

So by using env(1), you make your script more portable, and you also make it possible for users to run their own versions of the interpreter by modifying their path.

So how come the env trick didn't take the world by storm? I've seen more scripts that didn't use it than ones that did. And in the past that's always been reason enough not to use the trick. "There must be something wrong with it, so I won't use it either, even if I don't know what's wrong."

Recently software.jessies.org moved server, and in doing so moved from Linux to Solaris. This meant, as you'd expect, some modification of our build system. Solaris has rather old-fashioned versions of various utilities, and even if you don't need any fancy modern features, you can forget about using options with meaningful names. Apart from switching a few long-name options to their POSIX equivalents, though, the main problem was that all our Ruby scripts expected Ruby to be in /usr/bin/, as it is on Cygwin, Linux, and Mac OS. Solaris doesn't ship with Ruby, and on our server it was installed in /usr/local/bin/.

Why not just put Ruby in /usr/bin/ on the Solaris box? Well, it's the usual conundrum you face when you combine the notions "cross-platform" and "consistency". You can either be consistent across the platforms, or consistent with each individual platform, and either choice is perfectly reasonable, but the two choices are often mutually exclusive. Here, if you modify your Solaris installation, you're more comfortable because it's then like Cygwin, Linux, and Mac OS, but you haven't really solved your original problem: you still don't work on Solaris. There's no "right" answer, and both alternatives suck in different ways. Had it been my server, I'd have gone for consistency with Linux and Mac OS. The guy whose server it is, though, went for consistency with Solaris.

So, we needed to do something. At this point, we thought the env trick would come to our rescue. We could think of two possible negatives:

1. Performance. There's going to be some tiny overhead in calling env(1) and searching the path, but our scripts don't get executed often enough for that to be a concern.

2. Security/Reproducibility. Using env(1) gives away a little bit of control over what executes your scripts. It's hard to see how relying on one exact interpreter isn't itself a mistake, though, and it's certainly not something we were concerned about. What might have caused us trouble is the reproducibility problem you get when it turns out you were testing with a different version of the interpreter than was being used in production.

Not caring about those, we switched to using env(1). So where we'd had:

#!/usr/bin/ruby -w

We now had:

#!/usr/bin/env ruby -w

Solaris was fixed, and Mac OS worked fine. Linux, though, was now broken:

/usr/bin/env: ruby -w: No such file or directory.

If you look at what POSIX has to say, or you look at the GNU info page, it's clear that env(1) supports multiple arguments. The problem is that env(1) isn't being passed multiple arguments, at least not on Linux. Linux's binfmt_script.c contains the kernel's code for interpreting #! lines, and it only allows a single argument after the interpreter name. So if your "interpreter" is env(1), you're going to have "ruby -w" as your single argument. Mac OS 10.3 (but not 10.4) has a similar problem.
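To make the failure mode concrete, here's a toy sketch of the path search env(1) performs, written in Ruby and much simplified (the real env(1) also handles NAME=VALUE pairs and its own options):

# toy_env.rb: a very simplified sketch of env(1)'s program lookup.
# Purely illustrative; ignores NAME=VALUE pairs and env's own options.
def toy_env(argv)
  program = argv.first
  dirs = ENV["PATH"].split(File::PATH_SEPARATOR)
  found = dirs.map { |dir| File.join(dir, program) }.find { |f| File.executable?(f) }
  abort("env: #{program}: No such file or directory") unless found
  exec(found, *argv[1..-1])
end

# From a shell, env gets separate arguments, and everything works:
#   toy_env(["ruby", "-w", "script.rb"])
# From a Linux #! line, the kernel glues everything after "env" into a single
# argument, so we go looking for a file literally called "ruby -w" and fail:
#   toy_env(["ruby -w", "script.rb"])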

[If you want to know almost everything about how #! behaves on different platforms, have a look at #! - the Unix truth as far as I know it by Andries Brouwer. The only thing it seems to be missing is the fact that POSIX doesn't even guarantee that #! does anything (though I don't know of a contemporary Unix where it isn't supported), and Mac OS seems a strange omission, even for 2002.]

So that's not going to work.

3. Passing arguments to the interpreter via env(1) is not portable. In particular, all versions of Linux and older versions of Mac OS are broken. And this is an ancient design decision that's unlikely to change, like the one that often forbids calling an interpreter that's itself a script from a #! line. And even if it were fixed today, there are a lot of machines out there that would still be broken, many of which will remain broken until the day they're unplugged from the wall.

We could try to get away with calling Ruby with no arguments. The $VERBOSE, $-v, and $-w variables are all bound to the same variable inside the implementation, the one the -w command-line option sets. So you could write:

#!/usr/bin/env ruby
$-w = true

But that doesn't mean the same thing, despite what I said above about it being bound to the same C variable in the implementation. Why not? Because by the time Ruby gets round to assigning true to $-w, it's already done a lot of work. In particular, it's compiled your entire script with the -w flag off. Which is almost certainly not what you wanted.
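Here's a concrete illustration (the exact warning text varies between Ruby versions, so take the messages as approximate):

#!/usr/bin/env ruby
$-w = true
# By the time this assignment runs, the whole file has already been compiled
# with warnings off. With "ruby -w" the next line draws a compile-time warning
# along the lines of "useless use of == in void context"; with $-w set here
# instead, it compiles silently.
1 == 1
# Run-time warnings, such as the one for reading an uninitialized instance
# variable, do still appear, because $-w is true by the time they're checked.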

You could try switching to Python. I had expected that, unlike Perl and Ruby, which seem to quite like the idea that they might get to watch you shoot your foot off, Python might get more enjoyment out of preventing you from shooting your foot off. But even there, as far as I can tell, warnings are optional and have to be turned on. (Diagnostics is one area where Ruby is significantly worse than Perl. Not only do syntax errors refer to the lexical analyzer's implementation rather than sticking to user-visible terminology, talking about kEND instead of "end", for example, but Ruby has nothing corresponding to Perl's very useful "Name "handel" used only once: possible typo" warning. Because obviously every single script needs to add methods and fields at run time, every script should pay for that feature. Stroustrup needs to give scripting language authors a good hard beating with his cluestick. Trouble is, they'd probably enjoy it.)
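As an example of the token-name problem, feed Ruby one "end" too many:

def greet
  puts "hello"
end
end
# Ruby 1.8 reports something like "syntax error, unexpected kEND, expecting
# $end": the parser's internal token names, rather than plain "end".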

So for now we've gone back to hard-coding /usr/bin/ruby, and we've put specific changes in place to provide the scripts as input to Ruby on Solaris, rather than rely on being able to execute them directly. Sad, but good enough. I guess the only real solutions would be either to find a scripting language where warnings are on by default, or to rewrite the scripts on installation to contain a suitable path. The latter is of little use when you mainly run directly out of a checked-out repository and don't go through any kind of installation phase, as is the case for us.
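For what it's worth, the rewrite-on-installation idea needn't be much code. Here's a hypothetical sketch (the helper name, the script path, and the Ruby path are made up for illustration):

# fix_shebang.rb: hypothetical installer helper that points a script's
# #! line at a specific Ruby. Not something we actually use.
def fix_shebang(script, ruby_path)
  lines = File.readlines(script)
  lines[0] = "#!#{ruby_path} -w\n" if lines.first =~ /\A#!/
  File.open(script, "w") { |file| file.write(lines.join) }
end

fix_shebang("bin/some-script.rb", "/usr/local/bin/ruby")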