File remove problem

Bruce Evans brde at optusnet.com.au
Fri Nov 30 18:01:35 PST 2007


On Fri, 30 Nov 2007, Matthew D. Fuller wrote:

> On Fri, Nov 30, 2007 at 05:00:21PM +1100 I heard the voice of
> Bruce Evans, and lo! it spake thus:
>>
>> Oops, this is missing a rm, and doesn't work with it.
>
> Last year, it used to not cause the softdep_waitidle messages and
> prevent the fs from being remounted.  Instead, it would give an error
> like:
>
> hostname kernel: /: update error: blocks 28 files 2

softdep_waitidle() is new.  It now detects the problem earlier and
handles it more robustly by not allowing the mount -u.  Well, maybe
this is less robust since it also doesn't allow unmount.

> and WOULD remount it, and even set the clean flag, but would still
> leave turds lying around that would need a manual fsck to clean up
> (fsck -p obviously would completely skip it, since it was marked
> clean).  It was early this year that it moved from that annoying to

It also shows bugs in fsck:
- even with the file system not marked clean (with later versions),
   fsck -p doesn't notice the problem.
- fsck without -p notices the problem, but takes 2 or 3 passes to fix it,
   and doesn't notice that it needs several passes.

> the "locked fs" crippling variant.  (n.b.: I don't have any real
> evidence that it's a mutation of the same problem, rather than two
> different ones, aside from the trigger condition apparently being the
> same, and the newer completely replacing the older.)

I think it is the same.  softdep_waitidle() just waits a bit to flush
the dependencies after starting the flushing, but the bug gives an
unflushable dependency so the wait always times out.
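
To illustrate the failure mode described above, here is a minimal user-space sketch of a bounded wait that can never succeed when one dependency is unflushable. It is not the kernel's softdep_waitidle() code. The names fake_fs, flush_step() and wait_idle(), and the retry count, are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a file system's pending soft-updates work. */
struct fake_fs {
    int pending_deps;   /* dependencies still waiting to be flushed */
    int stuck_deps;     /* dependencies that can never be flushed (the bug) */
};

/* One flush step: retire a single dependency unless only stuck ones remain. */
void flush_step(struct fake_fs *fs)
{
    if (fs->pending_deps > fs->stuck_deps)
        fs->pending_deps--;
}

/*
 * Bounded wait: keep flushing for at most max_tries steps.
 * Returns true if the file system became idle (no pending dependencies),
 * false if the wait timed out with work still outstanding.
 */
bool wait_idle(struct fake_fs *fs, int max_tries)
{
    assert(max_tries > 0);
    for (int i = 0; i < max_tries; i++) {
        if (fs->pending_deps == 0)
            return true;
        flush_step(fs);
    }
    return fs->pending_deps == 0;
}
```

With stuck_deps set to 0 the wait succeeds once the pending count drains. With stuck_deps set to 1 the count stops at 1 and every call times out, however large max_tries is, which matches the behaviour where neither mount -u nor unmount is allowed to proceed.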

>> It takes a reboot per test.

And has remarkable timing dependencies.  Once I got into a state in which
the bug didn't appear when exercised in a loop with the same delays that
seemed to cause it fairly deterministically other times.

Bruce

