background fsck considered harmful? (Re: panic: handle_written_inodeblock: bad size)

Kirk McKusick mckusick at mckusick.com
Thu Jul 22 03:35:11 UTC 2010


> Date: Wed, 21 Jul 2010 17:15:28 -0400
> From: "Mikhail T." <mi+thun at aldan.algebra.com>
> Organization: Virtual Estates, Inc.
> To: Kirk McKusick <mckusick at mckusick.com>
> Cc: fs at freebsd.org
> Subject: Re: background fsck considered harmful?
> 
> 21.07.2010 16:15, Kirk McKusick:
> > Certainly disabling background fsck will eliminate that from your
> > possible set of issues and may prevent a recurrence. It does mean
> > that after a crash you will have to wait while your filesystems
> > are checked before your system will come up. If your filesystems
> > are below 0.5TB that should be tolerable.
> >
> > The longer term solution is to use journaled soft updates when they
> > become available in 9.0.
>    
> We are about to ship 8.1 -- with background fsck enabled by default 
> possibly causing problems requiring far more admin time (and involving 
> real data-loss).
> 
> If the existing fsck cannot be improved to properly fix the fs, when 
> running in background mode, just as well as when it is running 
> pre-mount, then, IMHO, it should not be enabled by default.
> 
> Crashes are quite rare and waiting once in a while for fsck to rumble 
> through would be better than to have some people enter into a vicious 
> circle of mysterious panics (even if Jeremy's ongoing work makes them 
> slightly less mysterious).
> 
> Respectfully yours,
> 
>     -mi

I believe that you are being excessively harsh on background fsck.
Generally the problems are caused by hard-disk errors. Because
background fsck checks only a small subset of the disk, it does
not find them, and once enough of them accumulate they cause
difficult problems. Foreground fsck checks all the disk metadata
every time, so hard-disk errors are caught right away, before
they have had a chance to accumulate. Background fsck then gets
the blame because it did not find them.

If you have small disk systems, running foreground fsck is an
acceptable solution (and indeed I would recommend it). But when
you are running systems with 20TB of disks, you are not willing
to have your system down for 10 hours after every crash.
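
To put rough numbers on that tradeoff, here is a minimal sketch that
scales the one figure given above (20TB taking about 10 hours) down
to smaller filesystems. The linear scaling is an assumption made only
for illustration; real fsck time also depends on inode count, layout,
and the disks themselves. The function and constant names are
hypothetical.

```python
# Rough downtime estimate for a foreground fsck.
# ASSUMPTION: check time scales linearly with filesystem size, anchored
# on the single data point from this message (20 TB -> about 10 hours).

REFERENCE_TB = 20.0      # size quoted in the message
REFERENCE_HOURS = 10.0   # downtime quoted in the message


def estimated_fsck_hours(size_tb: float) -> float:
    """Return an estimated foreground fsck time in hours for size_tb."""
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    return size_tb * REFERENCE_HOURS / REFERENCE_TB


if __name__ == "__main__":
    for size in (0.5, 2.0, 20.0):
        minutes = estimated_fsck_hours(size) * 60
        print(f"{size:5.1f} TB: about {minutes:.0f} minutes")
```

Under that assumption a 0.5TB filesystem comes out at roughly 15
minutes, which fits the earlier remark that filesystems of that size
should be tolerable to check in the foreground.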

A reasonable intermediate solution is to use background fsck by
default, but schedule downtime to run a full foreground fsck once
a month or so to check for accumulated hard-disk errors.

	Kirk McKusick

