Bug in #! processing - One More Time
Maxim Sobolev
sobomax at portaone.com
Thu Feb 24 03:00:32 PST 2005
Garance A Drosihn wrote:
> Sometimes it's the simplest little changes which can suck the
> life out of you... I am aware that this is a trivial issue,
> but now that I've figured out what is really going on, I am
> not sure what the "best" fix would be.
>
> To recap some history:
>
> a) In Jan 2000, someone sent in a PR that perl documentation
> (including the famous "Camel" book from O'Reilly) claims
> that users can start a script with the line:
>
> #!/bin/sh -- # -*- perl -*- -p
>
> to avoid a variety of issues when writing cross-platform
> scripts. Ignore the question of "but why?" for the moment,
> it *is* documented by perl (and in books on some other
> scripting languages). He proposed a fix, and that was
> committed to src/sys/kern/imgact_shell.c as revision 1.21
> back in Feb 15 2000 (predating 4.0-release). It was MFC'ed
> into release 3.5 on March 20, 2000.
>
> The PR is:
> http://www.FreeBSD.org/cgi/query-pr.cgi?pr=16393
>
> NOTE: People *do* use this "feature".
> Counter: This feature doesn't actually work on recent
> releases of Redhat Linux. I don't know about
> other linuxes.
>
> b) In 2002, some other user updated that PR saying that the
> new behavior wasn't quite right either. I assume nothing
> much was done at the time, but he spent time to collect
> a lot of details (which will be given below).
>
> c) In 2004, after 5.3-release, the issue came up again. I assume
> that is in another PR, but I haven't checked. In any case,
> kern/imgact_shell.c was changed to remove that special
> processing for '#', after discussion in -current. The change
> was committed to HEAD (6.x) on October 31st as revision 1.27.
> It was MFC'ed to 5.3-stable on November 8th.
>
> This broke scripts which depended on the special-handling of
> '#', but the conclusion in -current was that /bin/sh should
> handle such processing (if it wanted to), and not execve().
>
> d) In January I was finally bitten by this running 6.x-current,
> and a friend of mine happened to get hit by it at the same
> time running 5.3-stable. So I wrote up a quick fix and did
> some minimal testing. I posted that to -current on Jan 31st,
> but I didn't want to commit it until I did more testing,
> which I wanted to do *after* I brought my systems up-to-date.
>
> e) On January 29th, sobomax committed an "unrelated" fix to
> kern/imgact_shell.c, except that it just happened to bring back
> the special '#' processing which had been removed in October...
>
> f) I update my systems, do extensive testing of my patch, and I
> committed it once I was confident it worked in all situations.
> However, I didn't notice that the shell was no longer even
> *seeing* the parameters after '#' (I had tested that part
> back in #d), so it turns out the key loop that I had added
> was never actually getting triggered.
>
> I committed it to 6.x-current last week.
>
> g) On Monday I get ready to MFC the change to 5.3 (ahead of the
> rush to beat the code-freeze!). But... the damn thing does
> NOT work right in some common situations!! WTF?!?
>
> So, I figure out all the above history, and I locally modify
> kern/imgact_shell.c to again remove the special '#'-processing.
> I go to fix my patch to /bin/sh, and I realize...
>
> There is no simple, "make everyone happy" fix for it. Sigh.
>
> The problem is in the way the execve() system call passes all
> arguments to the shell. Given a shell script named /tmp/list_args.pl,
> which starts out as:
> #!/bin/sh -x -- # -*- perl -*- -p
>
> and is executed via:
> /tmp/list_args.pl aaa bbb
>
> What /bin/sh sees for arguments are:
> arg[0] == '-x'
> arg[1] == '--'
> arg[2] == '#'
> arg[3] == '-*-'
> arg[4] == 'perl'
> arg[5] == '-*-'
> arg[6] == '-p'
> arg[7] == '/tmp/list_args.pl'
> arg[8] == 'aaa'
> arg[9] == 'bbb'
>
> The problem is that /bin/sh has no way of knowing where the
> "shebang-line options" end, and the "command-line options" start.
> (or does it? I couldn't think of any reliable way, given that
> the '#' could be followed by any totally arbitrary strings).
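
To make the ambiguity concrete, here is a minimal Python sketch (a model
for illustration, not the kernel code) of an execve() that splits every
shebang token into its own argument and does no '#' processing. The
function name and the two sample scripts are hypothetical. Two different
shebang-line/command-line combinations produce the identical argument
vector, so the shell cannot tell which element is the script name.

```python
def freebsd_style_argv(shebang_line, script_path, user_args):
    """Model: each whitespace-separated shebang token becomes one argument."""
    tokens = shebang_line[2:].split()          # drop the leading "#!"
    interpreter, shebang_opts = tokens[0], tokens[1:]
    # The kernel appends the script path, then the caller's own arguments.
    return [interpreter] + shebang_opts + [script_path] + list(user_args)


# Case 1: the text after '#' happens to look like a path; no user arguments.
first = freebsd_style_argv("#!/bin/sh -- # /tmp/a", "/tmp/b", [])

# Case 2: nothing after '#'; the user passes /tmp/b on the command line.
second = freebsd_style_argv("#!/bin/sh -- #", "/tmp/a", ["/tmp/b"])

# Both calls yield the same list, so the boundary is not recoverable.
print(first == second)
```

Feeding the list_args.pl example through the same function gives the
ten arguments listed above, preceded by the interpreter name.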
>
> Going back to the follow-up to PR 16393, part of the challenge
> with fixing this is that many other OS's do *not* break up the
> options on the shebang line the way FreeBSD does.
> From the PR:
>
> Given a file called '/tmp/x2' with shebang line:
> #!/tmp/interp -a -b -c #dee eee
>
> If /tmp/x2 is exec'd, the operating system runs /tmp/interp
> with the following arguments:
>
> Solaris 8:
> args: "/tmp/interp" "-a" "/tmp/x2"
>
> Tru64 4.0:
> args: "interp" "-a -b -c #dee eee" "/tmp/x2"
>
> FreeBSD 2.2.7:
> args: "/tmp/interp" "-a" "-b" "-c" "#dee" "eee" "/tmp/x2"
>
> FreeBSD 4.0:
> args: "/tmp/interp" "-a" "-b" "-c" "/tmp/x2"
>
> Linux 2.4.12:
> args: "/tmp/interp" "-a -b -c #dee eee" "/tmp/x2"
>
> Linux 2.2.19:
> args: "interp" "-a -b -c #dee eee" "/tmp/x2"
>
> Irix 6.5:
> args: "/tmp/interp" "-a -b -c #dee eee" "/tmp/x2"
>
> HPUX 11.00:
> args: "/tmp/x2" "-a -b -c #dee eee" "/tmp/x2"
>
> AIX 4.3:
> args: "interp" "-a -b -c #dee eee" "/tmp/x2"
>
> Mac OS X:
> args: "interp" "-a -b -c #dee eee" "/tmp/x2"
>
> The most common behavior is:
> argv[0]: full path of interpreter
> argv[1]: all remaining args, coalesced into one string
> argv[2]: The file exec'd.
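
The three main strategies in the table can be sketched as follows. This
is a rough Python model, not any system's actual implementation; all
function names are made up for illustration, and treating a token that
begins with '#' as the start of a comment is an assumption about how the
FreeBSD 4.0 result comes about.

```python
def parse_shebang(line):
    """Return (interpreter, rest): rest is the raw text after the interpreter."""
    parts = line[2:].strip().split(None, 1)    # drop "#!", split off the path
    return parts[0], (parts[1] if len(parts) > 1 else "")


def argv_split_all(line, script):
    """Every token is a separate argument (the FreeBSD 2.2.7 row)."""
    interp, rest = parse_shebang(line)
    return [interp] + rest.split() + [script]


def argv_split_until_hash(line, script):
    """Split tokens, but stop at the first one starting with '#' (FreeBSD 4.0 row)."""
    interp, rest = parse_shebang(line)
    opts = []
    for tok in rest.split():
        if tok.startswith("#"):
            break
        opts.append(tok)
    return [interp] + opts + [script]


def argv_coalesced(line, script):
    """Everything after the interpreter is one argument (Linux 2.4.12, Irix 6.5 rows)."""
    interp, rest = parse_shebang(line)
    return [interp] + ([rest] if rest else []) + [script]


line = "#!/tmp/interp -a -b -c #dee eee"
for build in (argv_split_all, argv_split_until_hash, argv_coalesced):
    print(build.__name__, build(line, "/tmp/x2"))
```

The systems that report "interp" or "/tmp/x2" as the first argument
differ only in what they put in argv[0]; that detail is not modelled
here.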
>
> The change committed back in 2000 made the comment: "This complies
> to POSIX 1003.2, in that Posix says the implementation is free to
> choose whatever it likes.". I actually like the idea that FreeBSD
> splits up the arguments from the shebang-line, but that leaves us
> with the problem of figuring out shebang-options from user-specified
> options given on the command-line.
>
> As I see it, we have the following choices to fix this:
>
> 1) MFC the January 31st change to kern/imgact_shell.c to 5.3-stable,
> as it is. This means we haven't fixed the problem that people
> complained about in 2002 and again in 2004. And I still think
> it is "not appropriate" for the execve() system to be deciding
> what '#' means on that line. The biggest advantage is that this
> means 5.4-release will behave exactly the same as 3.5 through
> 5.3-release have behaved.
>
> 2) Remove '#'-processing from kern/imgact_shell.c, and remove my
> change to bin/sh/options.c (which doesn't work right once we
> do that). This breaks shell-scripts which use the feature as
> documented by perl (and other scripting languages), and fixes
> the problem people complained about in 2002/2004.
>
> 3) Change kern/imgact_shell.c to process shebang options the same
> way other (non-BSD?) operating systems do. By that I mean:
> send the entire string as arg[1], and let the scripting
> language sort it out. This is an incompatible change from
> FreeBSD 5.3 to 5.4, but would make us "more consistent"
> with other operating systems.
>
> 4) Provide some way for /bin/sh to find out where the shebang
> options end, and the user-specified options begin. This could
> make everyone happy, but it's more work and right now (this
> close to 5.4-release) that wouldn't make me particularly happy...
>
> Or we could do #1 for now, and plan to do #4 after 5.4-release.
> Or do #1 now in 5.3, and go with some incompatible change (#2
> or #3) only in 6.x-current.
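
For choice #3, the interpreter-side work might look roughly like the
Python sketch below. It is only an illustration of the idea: both
function names are hypothetical, and it assumes the shebang line carried
at least one option, so that argv[1] is the coalesced option string and
argv[2] is the script. With that layout the boundary between shebang
options and user arguments is known by position.

```python
def split_coalesced_options(arg):
    """Split the single shebang argument into options, dropping a '#' comment."""
    opts = []
    for tok in arg.split():
        if tok.startswith("#"):      # assumption: '#' starts a comment
            break
        opts.append(tok)
    return opts


def interpret(argv):
    """Return (shebang_options, script, user_args) for a coalesced argv."""
    # Assumes argv = [interpreter, "all shebang options", script, user args...]
    return split_coalesced_options(argv[1]), argv[2], argv[3:]


argv = ["/bin/sh", "-x -- # -*- perl -*- -p", "/tmp/list_args.pl", "aaa", "bbb"]
opts, script, user_args = interpret(argv)
print(opts, script, user_args)
```

A real interpreter would still need some way to tell this layout apart
from a direct invocation such as "sh -x script", which the sketch does
not attempt.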
>
> What do people think? I know this is a mind-numbingly trivial
> issue to care about, but I figured that if I just went ahead
> with any particular solution, someone would be irritated with me
> and assume I must not have understood "the issues". They will
> then commit yet *another* change which undoes whatever I did,
> while they fix something they feel that I broke.
>
> And if nothing else, this is proof that one can't just blindly
> MFC some change, no matter how trivial it seems.
I would vote for doing #3, along with the corresponding /bin/sh changes,
and MFCing them into 5.4. We don't have that many shell scripts that rely
on the previous functionality: the ones in the base system (if any) can
be easily fixed, while the ones in /usr/ports can be conditionalized on
OSVERSION. Removing yet another superfluous difference between FreeBSD
and other systems out there is a good thing, especially considering that
the BSD way creates serious problems that can't be resolved without
changing semantics anyway.
-Maxim