svn commit: r268376 - head/bin/rm

Bruce Evans brde at optusnet.com.au
Sat Jul 12 04:51:54 UTC 2014


On Fri, 11 Jul 2014, Jilles Tjoelker wrote:

> On Thu, Jul 10, 2014 at 12:34:17AM +1000, Bruce Evans wrote:
>> On Mon, 7 Jul 2014, Warner Losh wrote:
>
>>> Log:
>>>  rm -rf can fail sometimes with an error from fts_read. Make it honor
>>>  fflag to ignore fts_read errors, but stop deleting from that directory
>>>  because no further progress can be made.
>
>>>  When building a kernel with a high -j value on a high core count
>>>  machine, during the cleanobj phase we can wind up doing multiple rm
>>>  -rf at the same time for modules that have subdirectories. This
>>>  exposed this race (sometimes) as fts_read can return an error if the
>>>  directory is removed by another rm -rf. Since the intent of the -f
>>>  flag was to ignore errors, even if this was a bug in fts_read, we
>>>  should ignore the error like we've been instructed to do.
>
>> This seems to be more broken than before.  Although the intent of using
>> -f is often to ignore errors, -f is supposed to only ignore ENOENT
>> errors except in some cases for complicated interactions with -i.
>
> Yes, it looks like POSIX requires ignoring non-existent files only
> before recursing (for directory files). Hence, in particular, if a file
> is deleted by another process between being read from the directory and
> being checked for existence/type, -f shall cause the error to be
> ignored, but if a directory is deleted by another process while files
> are being deleted from it, the [ENOENT] error from rmdir() shall not be
> ignored.
>
> This probably does not make much sense and may be a defect in the
> standard.

Actually it makes perfect sense -- see below.

> In this case, the error is probably because fts tried to change
> directory to ".." which no longer exists in the deleted directory. This
> causes fts to abort and rm may not do all required work. In the case
> that two instances rm attempt to remove the same directory tree in
> parallel, hiding the error message is indeed correct.

The change is completely wrong because the case that it "fixes" is
the one that most needs a diagnostic, since the error is recoverable
if you know about it.  It is only when the 2 instances remove the same
tree that you can expect that at least one of them completes the removal
for both (even that is not clear, since if the traversal order is
a little different then each might cause problems for the other by
removing a subtree).

I tested this with a directory /tmp/b containing /tmp/b/0/0/0 and
/tmp/b/1.  Run "rm -rfv /tmp/b" under gdb and stop it after it removes
/tmp/b/0/0/0.  Then rm -rf /tmp/b/0/0 separately.  This causes fts_read()
in the first rm to return NULL with errno ENOENT when it tries to get back
to /tmp/b/0/0.  Although the errno is ENOENT, ignoring the error is wrong
since the first rm failed to remove /tmp/b/0 and /tmp/b/1.

The recent change doesn't even fix the non-removal of /tmp/b/1 by
ignoring the error in the /tmp/b/0 subtree.  This is because fts handles
all the args in one fts_open() and an error for any arg is fatal for
all args.  So a simpler example is possible: just /tmp/b/0/0.  Run "rm
-rfv /tmp/b" under gdb and stop it after it removes /tmp/b/0/0.  Then
rm -rf /tmp/b/0 separately.  The first rm then exits without removing
/tmp/b.  The change breaks its reporting of an error for this in the
-f case.

fts's handling of all the args is precisely what is required to break
rm's specification of "go on to any remaining files" after an error in
one of the files.  I think fts could clean up well enough to go on
provided either the current directory doesn't go away or the pathname of
the next file is absolute.  But fts couldn't reasonably do the same
error handling as all utilities (just for rm, the error handling depends
on -f and -i).  rm could repeat the whole operation if it can chdir()
to the original directory (this is easy to try using fts_close()) or
all the pathnames are absolute, but without -f this would often cause
a spew of diagnostics for files already removed, so it seems best to
just print a diagnostic and let the user repeat the operation.  I
would just change the diagnostic to give a hint about this.

The standard makes perfect sense since it is only at the top level that
ENOENT corresponds exactly to removal of the file.  At lower levels it
is unexpected, and it is not always clear if it is harmless.  Since it
is harmful in the most likely case where it happens (contending rm's),
the standard shouldn't have complications to say when it is safe to
ignore it or require ignoring it.


>> % 	while ((p = fts_read(fts)) != NULL) {
>
>> Normal loop exit is when this is NULL.  Then there is no error, and the
>> errno that is tested is pure garbage.   Fixing this alone might fix the
>> problem.  Otherwise, hopefully any error from things going away causes
>> an ENOENT error, so checking for that alone would fix the error handling.
>> That's a bit much to ask for -- if a whole subtree is removed then there
>> should be multiple ENOENT errors and probably other errors, and fts_read()
>> would find it hard to reduce to a single ENOENT without missing a more
>> serious error.
>
> Various fts(3) functions such as fts_read() violate the C standard rules
> for library functions in that they may set errno to 0. If fts_read()
> reaches the end, it sets errno to 0.

Apparently only when it reaches the end successfully, since I got ENOENT.
This was in an old version of FreeBSD.  Checking shows interesting
details:
- after fts_close(), errno may be the one from fchdir() (this is the most
   important one, and has precedence) or an earlier error.  Errors from
   close() are ignored and errno is preserved across close().  However,
   the man page says that the error may be from chdir() or close().
- error handling after fchdir() failure in fts_read was too complicated
   to check.  Errors from close() are ignored/not allowed to affect errno
   almost everywhere.

>> % 		case FTS_ERR:
>> % 			errx(1, "%s: %s", p->fts_path, strerror(p->fts_errno));
>
>> Exit for fatal errors.  I think POSIX doesn't allow this, but requires
>> continuing to the next arg.  Exiting the loop should give that, unless
>> fts_close() fails.

Exiting the loop doesn't give that, and fts_close() isn't even called,
due to fts's feature of handling all the args.

> It looks like FTS_ERR does not require an immediate exit. find(1) does
> not exit for FTS_ERR. It is a rare error though.

Does fts often manage to continue after it?  I tried to get to it using
another type of error in the above example: instead of rm -rf /tmp/b/0,
chmod 0 /tmp/b/0.  But this causes fts_read() to return NULL as before
except the errno is EACCES instead of ENOENT.

>> % 		case FTS_NS:
>> % 			/*
>> % 			 * Assume that since fts_read() couldn't stat the
>> % 			 * file, it can't be unlinked.
>> % 			 */
>> % 			if (!needstat)
>> % 				break;
>
>> Here we exit the loop for an assumed stat() error.  If this error is
>> anything other than ENOENT, then POSIX doesn't allow ignoring it but
>> it might allow quitting for the current arg.  If this error is
>> ENOENT, then the error handling seems to be even more broken.  We
>> shouldn't quit for the current arg on ENOENT, irrespective of fflag.
>
>> Permissions 000 on a file don't prevent stat'ing it, and I couldn't
>> find a simple example where this code is reached.
>
> The break exits from the switch, not the loop, and therefore continues
> with the next file.

Oops.  So the only exits from the loop are when fts_read() returns NULL,
and the above errx().

> Also note that there are two switches on p->fts_info here. The first
> one prints an error message for FTS_NS (except for [ENOENT] if -f is
> active, or if stat information is not needed). The latter part looks
> wrong: the second switch also ignores everything if -f is active, so
> even non-[ENOENT] errors are not written.

It is only the FTS_NS case in the second switch that ignores everything
if -f is active.

I don't like the looks of the -v and info printing code.  E.g.:

@ 			default:
@ 				rval = unlink(p->fts_accpath);
@ 				if (rval == 0 || (fflag && errno == ENOENT)) {
@ 					if (rval == 0 && vflag)
@ 						(void)printf("%s\n",
@ 						    p->fts_path);
@ 					if (rval == 0 && info) {
@ 						info = 0;
@ 						(void)printf("%s\n",
@ 						    p->fts_path);
@ 					}
@ 					continue;
@ 				}
@ 			}

The ugly duplication in this helps give the bug of printing the filename
twice if rval == 0 && vflag && info.  Quick fix:

@ 			default:
@ 				rval = unlink(p->fts_accpath);
@ 				if (rval != 0 && (!fflag || errno != ENOENT))
@ 					break;
@ 				if (rval == 0 && (vflag || info)) {
@ 					info = 0;
@ 					(void)printf("%s\n", p->fts_path);
@ 				}
@ 				continue;
@ 			}

This printing code is duplicated 3 times in rm_tree().  It may be cleaner
to do it after breaking from the case statement (set some printing variable
to control things there -- we now use break to reach a diagnostic message
and continue to avoid getting there).

This printing code is also duplicated in rm_file(), except with a
different arrangement and even more bugs.  In rm_tree(), it has a
complicated arrangement to avoid printing -v and info if a warning is
printed or the errno is ENOENT.  In rm_file(), the filename is printed
in the while loop after printing the warning, and only the warning is
suppressed if the errno is ENOENT (and !vflag).

Bruce


More information about the svn-src-head mailing list