Bug in #! processing - One More Time

Maxim Sobolev sobomax at portaone.com
Thu Feb 24 03:00:32 PST 2005


Garance A Drosihn wrote:
> Sometimes it's the simplest little changes which can suck the
> life out of you...  I am aware that this is a trivial issue,
> but now that I've figured out what is really going on, I am
> not sure what the "best" fix would be.
> 
> To recap some history:
> 
> a) In Jan 2000, someone sent in a PR that perl documentation
>    (including the famous "Camel" book from O'Reilly) claims
>    that users can start a script with the line:
> 
>         #!/bin/sh -- # -*- perl -*- -p
> 
>    to avoid a variety of issues when writing cross-platform
>    scripts.  Ignore the question of "but why?" for the moment,
>    it *is* documented by perl (and in books on some other
>    scripting languages).  He proposed a fix, and that was
>    committed to src/sys/kern/imgact_shell.c as revision 1.21
>    back in Feb 15 2000 (predating 4.0-release).  It was MFC'ed
>    into release 3.5 on March 20, 2000.
> 
>    The PR is:
>    http://www.FreeBSD.org/cgi/query-pr.cgi?pr=16393
> 
>       NOTE: People *do* use this "feature".
>    Counter: This feature doesn't actually work on recent
>             releases of Redhat Linux.  I don't know about
>             other linuxes.
> 
> b) In 2002, some other user updated that PR saying that the
>    new behavior wasn't quite right either.  I assume nothing
>    much was done at the time, but he spent time to collect
>    a lot of details (which will be given below).
> 
> c) In 2004, after 5.3-release, the issue came up again.  I assume
>    that is in another PR, but I haven't checked.  In any case,
>    kern/imgact_shell.c was changed to remove that special
>    processing for '#', after discussion in -current.  The change
>    was committed to HEAD (6.x) on October 31st as revision 1.27.
>    It was MFC'ed to 5.3-stable on November 8th.
> 
>    This broke scripts which depended on the special-handling of
>    '#', but the conclusion in -current was that /bin/sh should
>    handle such processing (if it wanted to), and not execve().
> 
> d) In January I was finally bitten by this running 6.x-current,
>    and a friend of mine happened to get hit by it at the same
>    time running 5.3-stable.  So I wrote up a quick fix and did
>    some minimal testing.  I posted that to -current on Jan 31st,
>    but I didn't want to commit it until I did more testing,
>    which I wanted to do *after* I brought my systems up-to-date.
> 
> e) On January 29th, sobomax committed an "unrelated" fix to
>    kern/imgact_shell.c, except that it just happened to bring back
>    the special '#' processing which had been removed in October...
> 
> f) I update my systems, do extensive testing of my patch, and I
>    committed it once I was confident it worked in all situations.
>    However, I didn't notice that the shell was no longer even
>    *seeing* the parameters after '#' (I had tested that part
>    back in #d), so it turns out the key loop that I had added
>    was never actually getting triggered.
> 
>    I committed it to 6.x-current last week.
> 
> g) On Monday I get ready to MFC the change to 5.3 (ahead of the
>    rush to beat the code-freeze!).   But... the damn thing does
>    NOT work right in some common situations!!  WTF?!?
> 
> So, I figure out all the above history, and I locally modify
> kern/imgact_shell.c to again remove the special '#'-processing.
> I go to fix my patch to /bin/sh, and I realize...
> 
> There is no simple, "make everyone happy" fix for it.  Sigh.
> 
> The problem is in the way the execve() system call passes all
> arguments to the shell.  Given a script named /tmp/list_args.pl,
> which starts out as:
>     #!/bin/sh -x -- # -*- perl -*- -p
> 
> and is executed via:
>     /tmp/list_args.pl aaa bbb
> 
> What /bin/sh sees for arguments are:
>      arg[0] == '-x'
>      arg[1] == '--'
>      arg[2] == '#'
>      arg[3] == '-*-'
>      arg[4] == 'perl'
>      arg[5] == '-*-'
>      arg[6] == '-p'
>      arg[7] == '/tmp/list_args.pl'
>      arg[8] == 'aaa'
>      arg[9] == 'bbb'
> 
> The problem is that /bin/sh has no way of knowing where the
> "shebang-line options" end, and the "command-line options" start.
> (or does it?  I couldn't think of any reliable way, given that
> the '#' could be followed by any totally arbitrary strings).
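
[Illustration of the ambiguity described above. This is a toy model, not
kernel or /bin/sh code: flat_argv() is a hypothetical helper that just
concatenates whitespace-split shebang options, the script path, and the
user's arguments, the way the arg[] listing shows them arriving. The second
script (named "bbb") is invented for the example.]

```python
def flat_argv(shebang_opts, script, user_args):
    """Model the flat list the interpreter receives: split shebang options,
    then the script path, then the user's command-line arguments."""
    return shebang_opts.split() + [script] + list(user_args)

# The case from the message: /tmp/list_args.pl run with "aaa bbb".
a = flat_argv("-x -- # -*- perl -*- -p", "/tmp/list_args.pl", ["aaa", "bbb"])

# A hypothetical script called "bbb" whose shebang line happens to continue
# with the same words, run with no user arguments at all.
b = flat_argv("-x -- # -*- perl -*- -p /tmp/list_args.pl aaa", "bbb", [])

# Both invocations hand the interpreter exactly the same list, so the list
# alone cannot say where the shebang options stop.
print(a == b)  # prints True
```

Since anything may follow the '#', no scan of the words themselves can
recover the boundary; it has to be communicated some other way.
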
> 
> Going back to the follow-up to PR 16393, part of the challenge
> with fixing this is that many other OS's do *not* break up the
> options on the shebang line the way FreeBSD does.
> From the PR:
> 
>     Given a file called '/tmp/x2' with shebang line:
>     #!/tmp/interp -a -b -c #dee eee
> 
>     If /tmp/x2 is exec'd, the operating system runs /tmp/interp
>     with the following arguments:
> 
>     Solaris 8:
>          args: "/tmp/interp" "-a" "/tmp/x2"
> 
>     Tru64 4.0:
>          args: "interp" "-a -b -c #dee eee" "/tmp/x2"
> 
>     FreeBSD 2.2.7:
>          args: "/tmp/interp" "-a" "-b" "-c" "#dee" "eee" "/tmp/x2"
> 
>     FreeBSD 4.0:
>          args: "/tmp/interp" "-a" "-b" "-c" "/tmp/x2"
> 
>     Linux 2.4.12:
>          args: "/tmp/interp" "-a -b -c #dee eee" "/tmp/x2"
> 
>     Linux 2.2.19:
>          args: "interp" "-a -b -c #dee eee" "/tmp/x2"
> 
>     Irix 6.5:
>          args: "/tmp/interp" "-a -b -c #dee eee" "/tmp/x2"
> 
>     HPUX 11.00:
>          args: "/tmp/x2" "-a -b -c #dee eee" "/tmp/x2"
> 
>     AIX 4.3:
>          args: "interp" "-a -b -c #dee eee" "/tmp/x2"
> 
>     Mac OS X:
>          args: "interp" "-a -b -c #dee eee" "/tmp/x2"
> 
>     The most common behavior is:
>          argv[0]: full path of interpreter
>          argv[1]: all remaining args, coalesced into one string
>          argv[2]: the file exec'd.
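
[The table above boils down to a handful of policies for turning the
shebang line into arguments. The sketch below is a simplified model that
reproduces the listed results for four of them; it is not the code of any
of those kernels. build_argv() and the policy names are made up for the
illustration, argv[0] differences between systems are ignored, and treating
only a word that *starts* with '#' as a comment is an assumption, since the
table does not cover a '#' in the middle of a word.]

```python
def build_argv(interp, opts, script, policy):
    """Return the argument list an interpreter would see under one policy."""
    if policy == "split":            # FreeBSD 2.2.7: every word is an argument
        args = opts.split()
    elif policy == "split_hash":     # FreeBSD 4.0: split, but stop at '#'
        args = []
        for word in opts.split():
            if word.startswith("#"):
                break
            args.append(word)
    elif policy == "first_only":     # Solaris 8: only the first word survives
        args = opts.split()[:1]
    elif policy == "coalesce":       # Linux 2.4.12, Irix 6.5: one big string
        args = [opts] if opts else []
    else:
        raise ValueError("unknown policy: " + policy)
    return [interp] + args + [script]

line = "-a -b -c #dee eee"
for policy in ("split", "split_hash", "first_only", "coalesce"):
    print(policy, build_argv("/tmp/interp", line, "/tmp/x2", policy))
```

Only the "coalesce" policy keeps the shebang options in a single, fixed
position, which is what makes them distinguishable from whatever the user
typed after the script name.
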
> 
> The change committed back in 2000 made the comment: "This complies
> to POSIX 1003.2, in that Posix says the implementation is free to
> choose whatever it likes.".  I actually like the idea that FreeBSD
> splits up the arguments from the shebang-line, but that leaves us
> with the problem of figuring out shebang-options from user-specified
> options given on the command-line.
> 
> As I see it, we have the following choices to fix this:
> 
> 1) MFC the January 31st change to kern/imgact_shell.c to 5.3-stable,
>    as it is.  This means we haven't fixed the problem that people
>    complained about in 2002 and again in 2004.  And I still think
>    it is "not appropriate" for the execve() system call to be deciding
>    what '#' means on that line.  The biggest advantage is that this
>    means 5.4-release will behave exactly the same as 3.5 through
>    5.3-release have behaved.
> 
> 2) Remove '#'-processing from kern/imgact_shell.c, and remove my
>    change to bin/sh/options.c (which doesn't work right once we
>    do that).  This breaks shell-scripts which use the feature as
>    documented by perl (and other scripting languages), and fixes
>    the problem people complained about in 2002/2004.
> 
> 3) Change kern/imgact_shell.c to process shebang options the same
>    way other (non-BSD?) operating systems do.  By that I mean:
>    send the entire string as arg[1], and let the scripting
>    language sort it out.  This is an incompatible change from
>    FreeBSD 5.3 to 5.4, but would make us "more consistent"
>    with other operating systems.
> 
> 4) Provide some way for /bin/sh to find out where the shebang
>    options end, and the user-specified options begin.  This could
>    make everyone happy, but it's more work and right now (this
>    close to 5.4-release) that wouldn't make me particularly happy...
> 
> Or we could do #1 for now, and plan to do #4 after 5.4-release.
> Or do #1 now in 5.3, and go with some incompatible change (#2
> or #3) only in 6.x-current.
> 
> What do people think?  I know this is a mind-numbingly trivial
> issue to care about, but I figured that if I just went ahead
> with any particular solution, someone would be irritated with me
> and assume I must not have understood "the issues".  They will
> then commit yet *another* change which undoes whatever I did,
> while they fix something they feel that I broke.
> 
> And if nothing else, this is proof that one can't just blindly
> MFC some change, no matter how trivial it seems.

I would vote for doing #3, together with the corresponding /bin/sh
changes, and MFCing them into 5.4. We don't have that many shell scripts
that rely on the previous functionality: the ones in the base system (if
any) can be easily fixed, while the ones in /usr/ports can be
conditionalized on OSVERSION. Removing yet another superfluous difference
between FreeBSD and other systems out there is a good thing, especially
considering that the BSD way creates serious problems that can't be
resolved without changing semantics anyway.
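
[A rough sketch of what "let the scripting language sort it out" could look
like on the interpreter side under #3. It is not taken from any actual
/bin/sh patch. split_shebang_arg() is a hypothetical helper, and two things
in it are assumptions: that a coalesced option string can be recognized by
its leading '-', and that a word starting with '#' begins a comment.]

```python
def split_shebang_arg(argv):
    """Return (shebang_opts, script, user_args) for an argv in which the
    shebang options, if any, arrive as one string in argv[1]."""
    rest = argv[1:]
    # Assumed heuristic: an option string starts with '-', a script path
    # does not. At least two entries are needed for options plus script.
    if len(rest) >= 2 and rest[0].startswith("-"):
        opts = []
        for word in rest[0].split():
            if word.startswith("#"):   # assumed: '#' starts a comment
                break
            opts.append(word)
        return opts, rest[1], rest[2:]
    # No shebang options: argv[1] is the script itself.
    return [], rest[0], rest[1:]

argv = ["/bin/sh", "-x -- # -*- perl -*- -p", "/tmp/list_args.pl", "aaa", "bbb"]
print(split_shebang_arg(argv))
```

With the options confined to argv[1], the user's "aaa bbb" can no longer be
mistaken for part of the shebang line, whatever follows the '#'.
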

-Maxim

