Stress testing the UFS2 filesystem

Kris Kennaway kris at obsecurity.org
Wed May 3 18:44:13 UTC 2006


On Wed, May 03, 2006 at 09:54:50AM +0200, Peter Holm wrote:
> 
> > On Wed, May 03, 2006 at 07:48:17AM +0200, Björn König wrote:
> >> Kris Kennaway schrieb:
> >> >On Wed, May 03, 2006 at 12:32:29AM +0400, Pavel Merdine wrote:
> >> >>Of course I think we could do patches to overcome corrupting panics,
> >> >>but the core FreeBSD team would not accept this, as they are happy
> >> >>with panics and corruptions they make to other filesystems.
> >> >
> >> >Of course not, don't make silly accusations :-)
> >> >
> >> >The problem is much more difficult to solve than "making the panic an
> >> >error return".
> >>
> >> I'm interested in more information about this issue. Do you have a
> >> reference to an old discussion about this topic or would you like to
> >> explain it a little bit further for me (and probably others)?
> >
> > See the URL that Peter provided in his original post.
> >
> > The issue that he is testing is how well the filesystem behaves when
> > you arbitrarily damage it and then run fsck (ideally, fsck should
> > detect all of the damage and repair it).  He seems to have found cases
> > where fsck does not detect and repair the damage, leading to panics at
> > runtime.
> >
> 
> Actually the filesystem mounts without any problems if fsck is run first.
> 
> The objective of this exercise was to show that background fsck may lead
> to panics. This was a problem I saw a lot a year ago when I did some
> testing of patches and in the course of a working day saw two or three
> panics. With background fsck I would from time to time get a secondary
> panic, which typically zapped the original crash dump.

Oh, I misunderstood then :(

Yes, this is pretty much to be expected: bg fsck depends on the
(fairly strong) assumption that the only kinds of filesystem damage
that are present at startup on filesystems with soft updates enabled
are

a) survivable (i.e. will not cause runtime problems before they are
repaired), and

b) repairable online.

Modulo modern disk hardware violating these assumptions anyway, bg
fsck more or less works as long as you only have "power failure"
shutdowns.

When your kernel panics instead (especially if it's a
filesystem-related panic), all bets are off.  With its dying breath,
your kernel may decide to scribble all over your filesystem, causing
arbitrary damage to it.  Many of these types of damage are not
"survivable", as you have demonstrated -- in fact the very existence
of fsck is proof that the kernel is not designed to handle arbitrary
damage at runtime.

So the moral is that if your kernel is panicking a lot, turn off bg
fsck (background_fsck="NO" in /etc/rc.conf) or you'll probably hit
other filesystem panics at runtime.

I don't think you can completely prevent this problem, but one thing
that may help would be for the kernel to attempt to write a marker to
the dump device when it panics, and if this marker is present at boot
time a fg fsck is forced.  Of course the kernel will not always be
able to do this, but it should work most of the time (since crashdumps
usually work for most panics).
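
To make the marker idea concrete, here is a rough user-space sketch in
Python. It is not FreeBSD kernel code and nothing like it exists in the
tree as far as this post is concerned. The dump device is simulated as a
byte buffer, and the marker value, its offset and all function names are
hypothetical, chosen only for illustration.

```python
# Hypothetical simulation of the "panic marker" idea; not kernel code.
MARKER = b"PANICMARK"  # assumed magic value, invented for this sketch
MARKER_OFFSET = 0      # assumed location on the simulated dump device


def write_panic_marker(dump_device: bytearray) -> bool:
    """Panic path: best-effort write of the marker. It may fail."""
    end = MARKER_OFFSET + len(MARKER)
    if len(dump_device) < end:
        return False  # device unusable; the kernel cannot always do this
    dump_device[MARKER_OFFSET:end] = MARKER
    return True


def panic_marker_present(dump_device: bytearray) -> bool:
    """Boot path: check whether the previous shutdown was a panic."""
    end = MARKER_OFFSET + len(MARKER)
    return bytes(dump_device[MARKER_OFFSET:end]) == MARKER


def clear_panic_marker(dump_device: bytearray) -> None:
    """Zero the marker once the foreground fsck has completed."""
    end = MARKER_OFFSET + len(MARKER)
    if len(dump_device) >= end:
        dump_device[MARKER_OFFSET:end] = bytes(len(MARKER))


def choose_fsck_mode(dump_device: bytearray, background_enabled: bool) -> str:
    """Force a foreground fsck after a panic, else follow the setting."""
    if panic_marker_present(dump_device):
        return "foreground"
    return "background" if background_enabled else "foreground"
```

If the marker write fails, boot falls back to whatever the administrator
configured, which matches the "should work most of the time" caveat
above.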

Kris
