background fsck considered harmful? (Re: panic: handle_written_inodeblock: bad size)
mckusick at mckusick.com
Thu Jul 22 16:50:42 UTC 2010
> Date: Thu, 22 Jul 2010 11:21:45 -0400
> From: "Mikhail T." <mi+thun at aldan.algebra.com>
> Organization: Virtual Estates, Inc.
> To: Kirk McKusick <mckusick at mckusick.com>
> CC: fs at freebsd.org
> Subject: Re: background fsck considered harmful?
> 21.07.2010 23:35, Kirk McKusick
> > Foreground fsck checks all the disk
> > metadata every time, so hard disk errors are captured immediately
> > before they have had a chance to accumulate. But background fsck
> > users blame it because it has not found them.
> I don't blame the program itself -- if it was deliberately /designed/ to
> only do partial checking. However, I was under the impression, that the
> background fsck was meant to do the same job as the "real" one, and
> that, whenever it did not, it was simply a bug in the /implementation/.
> I suspect, this misconception is shared by plenty of other users...
> Indeed, even if an inquisitive admin wanted to find out, fsck(8) gives
> absolutely no warning to that effect -- it simply states that
> background fsck will be attempted whenever possible.
> > If you have small disk systems, running foreground fsck is an
> > acceptable solution (and indeed I would recommend it). But when
> > you are running systems with 20TB of disks, you are not willing
> > to have your system down for 10 hours after every crash.
> > A reasonable intermediate solution is to use background fsck by
> > default, but schedule down time to run a full fsck once a month
> > or so to check for accumulated hard disk errors.
> Maybe, filesystems less than, say, 100GB (default threshold, subject to
> admin's adjustment) in size should always be foreground fsck-ed? This
> should, at least, cover the system file-systems (such as / and /var) on
> typical installations...
If we did not have a better solution in the pipeline (journaled
soft updates), I would agree with you that always doing a full
check on small filesystems would be a useful enhancement. However,
since we do have a solution that will work well for all sizes of
filesystems in -current and expected out of the box with 9.0, I do
not think that it would be useful to add this extra complexity
at this time.
> And a stern warning issued, when a background fsck is attempted -- for
> whatever reason. Something like:
> background fsck, although faster, may be unable to detect certain
> rare forms of filesystem corruption. You are advised to perform a
> full fsck on %s on a regular basis. See fsck(8).
> should go into the right place under fsck_ffs/ -- not sure, where exactly...
Since most folks do not look at the output from background fsck, and
given the changes noted above, I do not feel that adding this message
would be all that helpful at this time.
> Below is a simple patch for the top-level fsck(8). Somebody more
> knowledgeable of the details should augment fsck_ffs(8) -- it currently
> gives the lists of inconsistencies checked for without mentioning the
> difference in coverage between full and background modes...
> diff -U 2 -r18.104.22.168 fsck.8
> --- fsck.8 3 Aug 2009 08:13:06 -0000 22.214.171.124
> +++ fsck.8 22 Jul 2010 15:19:25 -0000
> @@ -170,4 +170,12 @@
> When running in background mode,
> only one file system at a time will be checked.
> +.Sy Warning:
> +because background fsck is performed while the filesystem
> +is in use, it is limited to checking for only the most commonly
> +occurring filesystem abnormalities. Under certain circumstances,
> +some errors can escape background fsck. It is recommended that you
> +perform full fsck on your systems once in a while -- or whenever
> +you encounter filesystem-related panics.
> .It Fl t Ar fstype
I concur that adding a note to fsck(8) would be a good idea, as best
practice is to run a full fsck after a disk-related panic. I would
be happy with your checking in:
diff -U 2 -r126.96.36.199 fsck.8
--- fsck.8 3 Aug 2009 08:13:06 -0000 188.8.131.52
+++ fsck.8 22 Jul 2010 15:19:25 -0000
@@ -170,4 +170,12 @@
When running in background mode,
only one file system at a time will be checked.
+Background fsck is limited to checking for only the most commonly
+occurring filesystem abnormalities. Under certain circumstances,
+some errors can escape background fsck. It is recommended that you
+perform full fsck on your systems once in a while -- or whenever
+you encounter filesystem-related panics.
.It Fl t Ar fstype
Does this work for you?
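
For anyone wanting to apply the advice in this thread by hand, here is a
minimal sketch; it is not part of the proposed patch. It assumes a stock
/etc/rc.conf and uses a hypothetical device name (/dev/da0s1d), so
substitute your own. The forced check should be run with the filesystem
unmounted, for example from single-user mode.

```shell
# /etc/rc.conf: disable background fsck so that filesystems needing a
# check after a crash get a full foreground check at boot (reasonable
# on small disk systems).
background_fsck="NO"

# Alternatively, keep background fsck enabled and, during scheduled
# down time, force a full check even if the filesystem is marked clean.
# /dev/da0s1d is a hypothetical device name used only for illustration.
fsck -t ufs -f /dev/da0s1d
```

The first line is a configuration setting and the last line is a command
typed at the console; they are shown together only for brevity.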