svn commit: r326554 - in head: . usr.bin/sponge usr.bin/sponge/tests usr.bin/tee

Tue Dec 5 17:18:30 UTC 2017

>> On Dec 5, 2017, at 8:29 AM, Matt Joras <matt.joras at gmail.com> wrote:
>> 
>> On Tue, Dec 5, 2017 at 8:06 AM, Devin Teske <devin at shxd.cx> wrote:
>> 
>> 
>> The problems I have are:
>> 
>> 1. Should be in ports
>> 
>> Not pre-installed on Linux, why should we have it in base?
> "Pre-installed on Linux" is meaningless. The closest analog to our
> base is arguably GNU coreutils. It is indeed not part of GNU coreutils
> but then again there are several things in our base that are not in
> coreutils, and vice versa. Why should we have anything in base? If
> people find it useful and it doesn't have a high cost of
> maintainership then why not have it?

Hand-waving and baseless conjecture are not going to change the fact that I have administered tens of thousands of Linux machines and:

1. None have ever had moreutils installed
2. Before today I have never heard of sponge or moreutils

You point to Linux not having a "base" as it were, but this is patently false.

In the kickstart scripts for anaconda, you can specify @^minimal or @base or @whatever to get a classification of packages that have been glommed together. When installing Linux (be it CentOS, Red Hat, Ubuntu, or whatever) there are package collections.

No collection that I have ever installed anywhere has ever installed moreutils. Ever.

Contrast that with the fact that:

Every Linux distro, regardless of which package collection you choose, gives you awk, sed, tr, etc.

So while you point to the notion that "Linux has no base" and "everything is a package anyway" (my paraphrasing), this is not the case.

Not wanting sponge in base is not a "slippery slope" argument. The point is that sponge literally does not get installed with any package collection in Linux, and therefore no Linux base has it. The "base" in Linux is determined by the package collection you choose. Since no package collection contains sponge, you cannot build a Linux system that has sponge on first boot unless you explicitly list sponge in the packages section of a kickstart/preseed file or select it manually during install.

> 
>> If in base, people will target it thinking it solves a need that can't
>> otherwise be solved and thus end up creating a script that is less portable
>> because it is encumbered with dependencies specific to our base.
> It's not even a homegrown idea though... As we've already covered this
> is a tool that exists in the broader OSS ecosystem.

Tens of thousands of machines across multiple companies tells me otherwise.

> As long as it is
> compatible with the more common implementation I don't see the issue.
> Anything one writes using it is just "encumbered" with a dependency on
> sponge.

>> 2. Teaches bad practice
>> 
>> sed ... somefile | sponge somefile
>> 
>> Ignores if there is a sed error and indiscriminately zeroes somefile.
> 
> Calling this unequivocally bad practice is silly.

The sh(1) manual calls it bad practice.

> There are plenty
> uses of sponge that aren't bad practice. I have a git commit hook that
> utilizes sponge to do the same "auto-culling" that our svn patches do.

Ignoring errors in a pipeline, as was pointed out.

> I like the sponge version better than creating temporary files myself:

You.
Do not.
Need.
Temporary files.

> sed '/^$/d' $(git config commit.message) | awk 'NR==FNR{a[$0];next}
> !($0 in a)' /dev/fd/0 "$1" | sponge "$1"

No temp files AND proper error checking.

set -e # All errors fatal
# Capture each stage before writing; a failed stage aborts with "$1" untouched
data=$( sed '/^$/d' "$(git config commit.message)" )
uncommon_lines=$( printf '%s\n' "$data" | awk 'NR==FNR{a[$0];next} !($0 in a)' /dev/fd/0 "$1" )
printf '%s\n' "$uncommon_lines" > "$1"

If you don't want the "set -e", just throw "|| exit" at the end of the two assignment lines. The important bit is:

1. If the path produced by $( git config commit.message ) makes sed throw an error (e.g., ENOENT) resulting in error status, you don't blindly go on to overwrite "$1"

2. If awk is unable to open and read "$1" on the second line, you do not blindly forge ahead and overwrite "$1"

3. Only if both sed and awk ran successfully should "$1" be updated.
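
As a rough sketch of the "|| exit" variant (not the hook from this thread verbatim): the logic is wrapped in a hypothetical function, cull_common, so "|| return" plays the role of "|| exit", and the template path is passed as an argument instead of coming from git config. The file names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: remove from "$2" every line that also appears in "$1".
# "$1" stands in for the path that `git config commit.message` would print.
cull_common() {
    template=$1 target=$2
    # Each stage is captured first; "|| return" stops before "$target" is touched.
    data=$( sed '/^$/d' "$template" ) || return
    uncommon_lines=$( printf '%s\n' "$data" |
        awk 'NR==FNR{a[$0];next} !($0 in a)' /dev/fd/0 "$target" ) || return
    # Reached only when both sed and awk succeeded.
    printf '%s\n' "$uncommon_lines" > "$target"
}

dir=$(mktemp -d)
printf 'keep me\n# boilerplate\n' > "$dir/msg"
printf '# boilerplate\n\n' > "$dir/template"

# Normal case: the shared "# boilerplate" line is culled from msg.
cull_common "$dir/template" "$dir/msg"

# Failure case: sed cannot open its input, so msg is never rewritten.
cull_common "$dir/no-such-file" "$dir/msg" 2>/dev/null || echo "msg left untouched"
```

Because the write to the target is the last step and is only reached after both captures succeed, a failure in either stage leaves the file as it was.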

The sponge approach is just plain bad practice because:

A. An error on the commit message results in "$1" being truncated to zero bytes

B. An error on /dev/fd/0 results in "$1" being truncated to zero bytes

C. An error by awk on "$1" results in "$1" being truncated to zero bytes (e.g., a flaky NFS connection wherein an awk read() fails but the sponge write() succeeds).
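
Case A can be reproduced without sponge installed. The sketch below uses a hypothetical stand-in function, soak, which is assumed to behave like sponge in the one respect that matters here: it reads all of stdin and then writes whatever it received to the named file. The file names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for sponge(1): soak up all of stdin, then write it to "$1".
soak() {
    buf=$(cat; echo x)              # the trailing "x" preserves final newlines
    printf '%s' "${buf%x}" > "$1"
}

dir=$(mktemp -d)
printf 'precious data\n' > "$dir/somefile"

# sed fails (its input file does not exist), but the pipeline still runs soak
# on empty input. In plain sh the pipeline's status is that of its last command.
if sed 's/a/b/' "$dir/missing" 2>/dev/null | soak "$dir/somefile"; then
    echo "pipeline reported success"
fi

wc -c < "$dir/somefile"             # byte count of somefile after the failed sed
```

The failure of sed is invisible to the caller, and the only copy of the data is gone.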

> 
>> 3. Solution in search of a problem
> Again, stating this unequivocally is silly. I discovered sponge years
> ago when I was searching how best to handle something where I wanted
> to write output back to the same file in a shell pipeline.

Something that is explicitly warned against in sh(1).

> I was
> literally someone with a problem in search of a solution, and that
> solution was and still is sponge.

Sure, if you fly in the face of abundant warnings.

> Since then I have seen it
> recommended numerous times in passing.

Ubiquitous bad practice is still bad practice.
-- 
Devin