[bug] fsck refuses to repair damaged UFS using backup superblock

Rick Macklem rmacklem at uoguelph.ca
Sun Nov 25 23:45:09 UTC 2018


Kirk McKusick wrote:
>> From: Rick Macklem <rmacklem at uoguelph.ca>
>> To: "soralx at cydem.org" <soralx at cydem.org>,
>>         Kirk McKusick <mckusick at mckusick.com>
>> CC: "freebsd-fs at freebsd.org" <freebsd-fs at freebsd.org>,
>>         "Julian H. Stacey"
>>       <jhs at berklix.com>
>> Subject: Re: [bug] fsck refuses to repair damaged UFS using backup superblock
>> Date: Sun, 25 Nov 2018 15:25:21 +0000
>>
>>> Kirk McKusick wrote:
>>>
>>> Below is a proposed fix for fsck_ffs to properly handle superblock
>>> check-hash failures (notably to optionally search for a usable
>>> alternate superblock). Let me know if you still have a filesystem
>>> on which you can test it, and if so whether it works correctly.
>>
>> As above, I think you can reproduce this by running an older kernel
>> that mounts the file system. I ended up re-installing when I ran
>> into this yesterday (no biggy, it was just a test machine). It
>> happened after I had been running a kernel built from stable/12 on
>> the system and then tried to boot it.  (Since the root fs got these
>> errors, I couldn't boot any kernel on the root fs.)
>
>Kernels before -r339671 clear the CK_SUPERBLOCK flag in the superblock.
>Kernels at and after -r339671 ignore the check-hash if the CK_SUPERBLOCK
>flag is clear. So you should be able to run on older kernels without
>causing superblock check-hash failures on later kernels. Fsck will offer
>to enable the superblock check-hash if you are running on a kernel at
>or newer than -r339671.
Not if the older kernel is stable/11 (I realized that was what I had booted
when the machine got trashed).
For stable/11, fs_metackhash is just fs_sparecon32[22].
I'm guessing that fs_sparecon32[22] happened to contain memory with the
CK_SUPERBLOCK bit set when the superblock was written by the stable/11
kernel. The file system then had a bogus check-hash when I tried to boot it,
because fs_ckhash was just whatever garbage had been written for
fs_sparecon32[21]?

Maybe setting all elements of fs_sparecon32[] to zero before writing the
superblock out would minimize these issues in the future, and that change
could be MFC'd.
(I'm not claiming that a new FFS2 should be movable between stable/11 and
 head, but it might be a nice feature?)
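
To make the guess above concrete, here is a rough sketch in C. It is a toy
model and not the real struct fs: the TOY_* names, the array size, and the
bit value used for CK_SUPERBLOCK are assumptions for illustration. Only the
two indices (21 for fs_ckhash, 22 for fs_metackhash) come from the text
above. It shows how stale memory in a spare slot can look like an enabled
check-hash to a newer kernel, and how zeroing the spares would avoid that.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: NOT the real struct fs from ufs/ffs/fs.h. */
#define TOY_NSPARE        23      /* just enough slots to reach index 22 */
#define TOY_CKHASH_IDX    21      /* slot newer kernels read as fs_ckhash */
#define TOY_METACK_IDX    22      /* slot newer kernels read as fs_metackhash */
#define TOY_CK_SUPERBLOCK 0x0001u /* assumed bit value, for illustration */

/* Older-kernel write that leaks whatever was in memory into the spares. */
static void
toy_write_stale(uint32_t *spare, const uint32_t *stale)
{
	assert(spare != NULL && stale != NULL);
	memcpy(spare, stale, TOY_NSPARE * sizeof(spare[0]));
}

/* Suggested behaviour: zero every spare slot before writing. */
static void
toy_write_zeroed(uint32_t *spare)
{
	assert(spare != NULL);
	memset(spare, 0, TOY_NSPARE * sizeof(spare[0]));
}

/* Newer-kernel view: would the superblock check-hash be enforced? */
static int
toy_hash_enforced(const uint32_t *spare)
{
	assert(spare != NULL);
	return ((spare[TOY_METACK_IDX] & TOY_CK_SUPERBLOCK) != 0);
}
```

With stale contents such as 0xdeadbeef in slot 22 the low bit is set, so the
toy "newer kernel" enforces a hash that slot 21 never really held. After
toy_write_zeroed() the flag reads as clear and the hash is ignored.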

>> It would be nice if there was a way to override the check and boot
>> the system.  (Is a loader tunable reasonable for this?)
>>
>> rick
>
>I have fixed the problem with fsck being unable to check filesystems
>with check-hash failures in -r340925.
>
>Rather than adding a loader tunable to override the check (which people
>would have to track down in the midst of a crisis), it might be better
>to simply have the loader print a warning when there is a mismatch and
>proceed to try using the filesystem. If successful, an fsck could then
>be run to try and clean it up. Does this seem reasonable?
Yes, that sounds fine to me.
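
For what it's worth, the warn-and-proceed policy could look roughly like the
sketch below. This is not the actual loader code: the function name, the
result enum, the warning text, and the CK_SUPERBLOCK bit value are all
made up for illustration. The only behaviour taken from this thread is that
the hash is ignored when the flag is clear, and that a mismatch produces a
warning instead of a refusal.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_CK_SUPERBLOCK 0x0001u /* assumed bit value, for illustration */

enum toy_sb_result {
	TOY_SB_OK,     /* hash not enabled, or hash matches */
	TOY_SB_WARNED  /* mismatch: warned, caller should still proceed */
};

/*
 * Hypothetical check: never fails hard.  'flags' stands in for
 * fs_metackhash, 'stored' for fs_ckhash, and 'computed' for the hash
 * recalculated over the superblock that was just read.
 */
static enum toy_sb_result
toy_loader_check(uint32_t flags, uint32_t stored, uint32_t computed,
    FILE *out)
{
	assert(out != NULL);
	if ((flags & TOY_CK_SUPERBLOCK) == 0)
		return (TOY_SB_OK);	/* check-hash not enabled: ignore it */
	if (stored == computed)
		return (TOY_SB_OK);
	fprintf(out, "warning: superblock check-hash mismatch "
	    "(stored %#x, computed %#x); continuing, run fsck\n",
	    (unsigned)stored, (unsigned)computed);
	return (TOY_SB_WARNED);
}
```

The caller would keep going in both cases, so a bad hash alone would no
longer leave the root fs unbootable, and fsck could be run afterwards.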

rick

