Bug in #! processing - One More Time
Garance A Drosihn
drosih at rpi.edu
Wed Feb 23 22:24:59 PST 2005
Sometimes it's the simplest little changes which can suck the
life out of you... I am aware that this is a trivial issue,
but now that I've figured out what is really going on, I am
not sure what the "best" fix would be.
To recap some history:
a) In Jan 2000, someone sent in a PR that perl documentation
(including the famous "Camel" book from O'Reilly) claims
that users can start a script with the line:
#!/bin/sh -- # -*- perl -*- -p
to avoid a variety of issues when writing cross-platform
scripts. Ignore the question of "but why?" for the moment,
it *is* documented by perl (and in books on some other
scripting languages). He proposed a fix, and that was
committed to src/sys/kern/imgact_shell.c as revision 1.21
on Feb 15 2000 (predating 4.0-release). It was MFC'ed
into release 3.5 on March 20, 2000.
The PR is:
http://www.FreeBSD.org/cgi/query-pr.cgi?pr=16393
NOTE: People *do* use this "feature".
Counter: This feature doesn't actually work on recent
releases of Redhat Linux. I don't know about
other linuxes.
b) In 2002, some other user updated that PR saying that the
new behavior wasn't quite right either. I assume nothing
much was done at the time, but he spent time to collect
a lot of details (which will be given below).
c) In 2004, after 5.3-release, the issue came up again. I assume
that is in another PR, but I haven't checked. In any case,
kern/imgact_shell.c was changed to remove that special
processing for '#', after discussion in -current. The change
was committed to HEAD (6.x) on October 31st as revision 1.27.
It was MFC'ed to 5.3-stable on November 8th.
This broke scripts which depended on the special-handling of
'#', but the conclusion in -current was that /bin/sh should
handle such processing (if it wanted to), and not execve().
d) In January I was finally bitten by this running 6.x-current,
and a friend of mine happened to get hit by it at the same
time running 5.3-stable. So I wrote up a quick fix and did
some minimal testing. I posted that to -current on Jan 31st,
but I didn't want to commit it until I did more testing,
which I wanted to do *after* I brought my systems up-to-date.
e) On January 29th, sobomax committed an "unrelated" fix to
kern/imgact_shell.c, except that it just happened to bring back
the special '#' processing which had been removed in October...
f) I update my systems, do extensive testing of my patch, and I
committed it once I was confident it worked in all situations.
However, I didn't notice that the shell was no longer even
*seeing* the parameters after '#' (I had tested that part
back in #d), so it turns out the key loop that I had added
was never actually getting triggered.
I committed it to 6.x-current last week.
g) On Monday I get ready to MFC the change to 5.3 (ahead of the
rush to beat the code-freeze!). But... the damn thing does
NOT work right in some common situations!! WTF?!?
So, I figure out all the above history, and I locally modify
kern/imgact_shell.c to again remove the special '#'-processing.
I go to fix my patch to /bin/sh, and I realize...
There is no simple, "make everyone happy" fix for it. Sigh.
The problem is in the way the execve() system call passes all
arguments to the shell. Given a script named /tmp/list_args.pl,
which starts out as:
#!/bin/sh -x -- # -*- perl -*- -p
and is executed via:
/tmp/list_args.pl aaa bbb
What /bin/sh sees for arguments are:
arg[0] == '-x'
arg[1] == '--'
arg[2] == '#'
arg[3] == '-*-'
arg[4] == 'perl'
arg[5] == '-*-'
arg[6] == '-p'
arg[7] == '/tmp/list_args.pl'
arg[8] == 'aaa'
arg[9] == 'bbb'
The problem is that /bin/sh has no way of knowing where the
"shebang-line options" end, and the "command-line options" start.
(or does it? I couldn't think of any reliable way, given that
the '#' could be followed by any totally arbitrary strings).
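To make the ambiguity concrete, here is a minimal Python sketch of the argument list described above. It is only a model of the splitting behavior, not the kernel code; the function name and the /tmp/s and /tmp/t paths are made up for illustration. It shows two different scripts, run two different ways, that hand the shell exactly the same list of arguments.

```python
def freebsd_style_argv(shebang_line, script_path, user_args):
    """Model the list shown above: the shebang words split on
    whitespace, then the script path, then the user's arguments.
    (Hypothetical helper, not taken from imgact_shell.c.)"""
    assert shebang_line.startswith("#!")
    words = shebang_line[2:].split()
    shebang_opts = words[1:]  # words[0] is the interpreter path itself
    return shebang_opts + [script_path] + list(user_args)


# The example from the text.
listing = freebsd_style_argv(
    "#!/bin/sh -x -- # -*- perl -*- -p", "/tmp/list_args.pl", ["aaa", "bbb"]
)

# Case A: script /tmp/s, executed as "/tmp/s /tmp/t b".
case_a = freebsd_style_argv("#!/bin/sh -- # a", "/tmp/s", ["/tmp/t", "b"])

# Case B: script /tmp/t, whose comment happens to mention /tmp/s,
# executed as "/tmp/t b".
case_b = freebsd_style_argv("#!/bin/sh -- # a /tmp/s", "/tmp/t", ["b"])

# Both cases produce the same list, so the shell cannot tell which
# word is the script and which words are just comment text.
print(case_a == case_b)
```

Once the words are flattened into one list, nothing marks the point where the shebang line stopped and the command line began.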
Going back to the follow-up to PR 16393, part of the challenge
with fixing this is that many other OS's do *not* break up the
options on the shebang line the way FreeBSD does.
From the PR:
Given a file called '/tmp/x2' with shebang line:
#!/tmp/interp -a -b -c #dee eee
If /tmp/x2 is exec'd, the operating system runs /tmp/interp
with the following arguments:
Solaris 8:
args: "/tmp/interp" "-a" "/tmp/x2"
Tru64 4.0:
args: "interp" "-a -b -c #dee eee" "/tmp/x2"
FreeBSD 2.2.7:
args: "/tmp/interp" "-a" "-b" "-c" "#dee" "eee" "/tmp/x2"
FreeBSD 4.0:
args: "/tmp/interp" "-a" "-b" "-c" "/tmp/x2"
Linux 2.4.12:
args: "/tmp/interp" "-a -b -c #dee eee" "/tmp/x2"
Linux 2.2.19:
args: "interp" "-a -b -c #dee eee" "/tmp/x2"
Irix 6.5:
args: "/tmp/interp" "-a -b -c #dee eee" "/tmp/x2"
HPUX 11.00:
args: "/tmp/x2" "-a -b -c #dee eee" "/tmp/x2"
AIX 4.3:
args: "interp" "-a -b -c #dee eee" "/tmp/x2"
Mac OS X:
args: "interp" "-a -b -c #dee eee" "/tmp/x2"
The most common behavior is:
argv[0]: full path of interpreter
argv[1]: all remaining args, coalesced into one string
argv[2]: The file exec'd.
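The four distinct ways of handling the text after the interpreter name can be modeled in a few lines of Python. This is a rough sketch that reproduces the results quoted from the PR for that one example line; the function names are invented here, argv[0] is always shown as the full interpreter path (the systems above differ on that), and the sketch does not claim to match how any of those kernels treat other inputs.

```python
def parse_shebang(line):
    """Return (interpreter, rest-of-line) for a '#!' line."""
    parts = line[2:].strip().split(None, 1)
    rest = parts[1] if len(parts) > 1 else ""
    return parts[0], rest


def split_all(line, script):
    """Every word becomes its own argument (FreeBSD 2.2.7 result)."""
    interp, rest = parse_shebang(line)
    return [interp] + rest.split() + [script]


def split_until_hash(line, script):
    """Split into words, stopping at a word starting with '#'
    (FreeBSD 4.0 result)."""
    interp, rest = parse_shebang(line)
    args = []
    for word in rest.split():
        if word.startswith("#"):
            break
        args.append(word)
    return [interp] + args + [script]


def coalesce(line, script):
    """Everything after the interpreter is one argument
    (Linux 2.4.12 and Irix 6.5 result)."""
    interp, rest = parse_shebang(line)
    return [interp] + ([rest] if rest else []) + [script]


def first_word_only(line, script):
    """Only the first word is passed along (Solaris 8 result)."""
    interp, rest = parse_shebang(line)
    return [interp] + rest.split()[:1] + [script]


line = "#!/tmp/interp -a -b -c #dee eee"
for model in (split_all, split_until_hash, coalesce, first_word_only):
    print(model.__name__, model(line, "/tmp/x2"))
```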
The change committed back in 2000 made the comment: "This complies
to POSIX 1003.2, in that Posix says the implementation is free to
choose whatever it likes.". I actually like the idea that FreeBSD
splits up the arguments from the shebang-line, but that leaves us
with the problem of figuring out shebang-options from user-specified
options given on the command-line.
As I see it, we have the following choices to fix this:
1) MFC the January 31st change to kern/imgact_shell.c to 5.3-stable,
as it is. This means we haven't fixed the problem that people
complained about in 2002 and again in 2004. And I still think
it is "not appropriate" for the execve() system call to be deciding
what '#' means on that line. The biggest advantage is that this
means 5.4-release will behave exactly the same as 3.5 through
5.3-release have behaved.
2) Remove '#'-processing from kern/imgact_shell.c, and remove my
change to bin/sh/options.c (which doesn't work right once we
do that). This breaks shell-scripts which use the feature as
documented by perl (and other scripting languages), and fixes
the problem people complained about in 2002/2004.
3) Change kern/imgact_shell.c to process shebang options the same
way other (non-BSD?) operating systems do. By that I mean:
send the entire string as arg[1], and let the scripting
language sort it out. This is an incompatible change from
FreeBSD 5.3 to 5.4, but would make us "more consistent"
with other operating systems.
4) Provide some way for /bin/sh to find out where the shebang
options end, and the user-specified options begin. This could
make everyone happy, but it's more work and right now (this
close to 5.4-release) that wouldn't make me particularly happy...
Or we could do #1 for now, and plan to do #4 after 5.4-release.
Or do #1 now in 5.3, and go with some incompatible change (#2
or #3) only in 6.x-current.
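For what it's worth, here is a small Python sketch of what choice #4 would buy us. It assumes, purely hypothetically, that the shell is somehow told how many of its leading arguments came from the shebang line (the shebang_count parameter below). How that number would actually reach /bin/sh is exactly the part that needs design work, and nothing here proposes a mechanism.

```python
def split_for_shell(args, shebang_count):
    """args: the arguments the shell sees, as in the arg[] listing
    earlier. shebang_count: hypothetical count of leading words that
    came from the '#!' line."""
    shebang_words = args[:shebang_count]
    script_path = args[shebang_count]
    user_args = args[shebang_count + 1:]

    # Only inside the shebang words does '#' start a comment.
    options = []
    for word in shebang_words:
        if word.startswith("#"):
            break
        options.append(word)
    return options, script_path, user_args


args = ["-x", "--", "#", "-*-", "perl", "-*-", "-p",
        "/tmp/list_args.pl", "aaa", "bbb"]
print(split_for_shell(args, 7))
```

With the boundary known, the comment handling can live in the shell, and a user argument that happens to start with '#' is left alone.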
What do people think? I know this is a mind-numbingly trivial
issue to care about, but I figured that if I just went ahead
with any particular solution, someone would be irritated with me
and assume I must not have understood "the issues". They will
then commit yet *another* change which undoes whatever I did,
while they fix something they feel that I broke.
And if nothing else, this is proof that one can't just blindly
MFC some change, no matter how trivial it seems.
--
Garance Alistair Drosehn = gad at gilead.netel.rpi.edu
Senior Systems Programmer or gad at freebsd.org
Rensselaer Polytechnic Institute or drosih at rpi.edu