[bug] fsck refuses to repair damaged UFS using backup superblock

Julian H. Stacey jhs at berklix.com
Fri Nov 23 01:20:34 UTC 2018


Hi soralx at cydem.org,
Added cc: <freebsd-fs at freebsd.org> to ensure file system specialists see this.

Reference:
> From:		<soralx at cydem.org>
> Date:		Tue, 20 Nov 2018 05:30:00 -0800

soralx at cydem.org wrote:
> 
> Howdy!
> 
>  Since send-pr(1) is now gone, I guess the next option is to send a
>  message directly to the developers...
> 
>  Yesterday, I ran into a bug in fsck_ffs that gave me a little scare.
> 
>  Short story: on -CURRENT, fsck refuses to check a FS with a corrupted
>  superblock, even when an alternate (backup) SB location is given.
> 
>  Long story. I've been testing a newly-built system based on an X399
>  platform with a 2950X CPU and an Optane 905P 480GB U.2 drive. The
>  system ran a ~2-day old -CURRENT; when compiling newest world and
>  kernel, I found the machine in a locked-up state. After a hard reset,
>  boot failed because the root FS became corrupted & was not available:
>    kernel: Superblock check-hash failed: recorded check-hash XXX != computed check-hash YYY
> 
>  I have not yet figured out why the corruption happened... bad hardware?
>  bug in the NVMe driver?
> 
>  "OK", I thought, "No worries. We'll just boot using another disk, fsck
>  the corrupted FS with a backup superblock, and be up in a moment".
>  The machine was doing nothing but compiling, so no valuable data loss.
> 
>  So I did `dumpfs -m /dev/ada0p3` on the spare disk (which was the
>  source for the new disk image, thus had almost identical partitions
>  and filesystems) to get the FS details, then did `newfs -N [...]
>  /dev/ada0p3` to find locations of superblock backups, then finally
>  ran `fsck_ffs -b 192 /dev/nvd0p3` -- only to get the same
>  "check-hash failed" message, plus another strange message: "Can't
>  open /dev/nvd0p3: [...]". Then fsck quits.
>  Note that `fsck_ffs -b ...` on a FS with a good superblock works OK.
> 
>  After fiddling with a debugger for a bit, I commented out the line
>  "return (0);" in /usr/src/sbin/fsck_ffs/setup.c:136, recompiled fsck,
>  and the FS was recovered successfully.
> 
>  What was actually happening: fsck's setup.c calls ufs_disk_fillout()
>  from libufs' type.c, which in turn calls sbread() from the same
>  library, which then calls sbget(disk->d_fd, &fs, -1) [[where '-1'
>  is hard-coded to indicate the primary superblock]] that then simply
>  invokes ffs_sbget() from the ffs kernel driver -- and this returns ENOENT,
>  which eventually causes fsck to give up before even looking at the
>  specified backup superblock.
> 
>  I don't know what exactly ufs_disk_fillout() does, but fortunately
>  for me, fsck worked even though the "sbread(disk)" part of that
>  function failed on the disk with the corrupted superblock. Also, I
>  have a feeling that calling a kernel ffs driver function when using
>  fsck to fix a broken filesystem is not the best thing to do...
> 
>  Please CC, as I am not subscribed.
> 
> -- 
> [SorAlx]  ridin' VN2000 Classic LT
> 
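
For readers following along, the control flow described above (primary
superblock validated first, failure treated as fatal, backup never
consulted) can be modelled with a small toy program. This is only a
sketch under loud assumptions: every name here (toy_sb, toy_sbget,
toy_setup_reported, toy_setup_alt_first, TOY_MAGIC, TOY_PRIMARY) is
hypothetical, the hash is not the real UFS check-hash, and none of this
is the actual libufs or fsck_ffs source. The second setup function is
just one possible direction, not a proposed patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a superblock; NOT the real struct fs. */
struct toy_sb {
	uint32_t magic;
	uint32_t data;
	uint32_t ckhash;	/* recorded check-hash */
};

#define TOY_MAGIC	0x0badf00du	/* arbitrary illustrative value */
#define TOY_PRIMARY	(-1)		/* plays the role of the hard-coded -1 */

/* Illustrative hash only; real UFS computes its check-hash differently. */
static uint32_t
toy_hash(const struct toy_sb *sb)
{
	return (sb->magic ^ (sb->data * 2654435761u));
}

/* Read and validate the superblock in slot loc (TOY_PRIMARY means slot 0). */
static int
toy_sbget(const struct toy_sb *disk, size_t nslots, int loc,
    struct toy_sb *out)
{
	size_t slot = (loc == TOY_PRIMARY) ? 0 : (size_t)loc;

	assert(slot < nslots);
	if (disk[slot].magic != TOY_MAGIC ||
	    disk[slot].ckhash != toy_hash(&disk[slot]))
		return (ENOENT);
	*out = disk[slot];
	return (0);
}

/* Models the reported behavior: a bad primary is fatal, alt is never read. */
static int
toy_setup_reported(const struct toy_sb *disk, size_t nslots, int alt,
    struct toy_sb *out)
{
	int error;

	error = toy_sbget(disk, nslots, TOY_PRIMARY, out);
	if (error != 0)
		return (error);	/* gives up before looking at alt */
	if (alt != TOY_PRIMARY)
		return (toy_sbget(disk, nslots, alt, out));
	return (0);
}

/* Alternative: when a backup slot is given, do not touch the primary. */
static int
toy_setup_alt_first(const struct toy_sb *disk, size_t nslots, int alt,
    struct toy_sb *out)
{
	return (toy_sbget(disk, nslots, alt, out));
}
```

With a four-slot toy disk whose slot 0 has a damaged check-hash and
whose slot 3 is intact, toy_setup_reported(disk, 4, 3, &sb) returns
ENOENT, while toy_setup_alt_first(disk, 4, 3, &sb) returns 0 and fills
in sb from the backup. With an undamaged slot 0, both succeed, which
matches the observation that `fsck_ffs -b ...` works on a healthy FS.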

Cheers,
Julian
-- 
Julian Stacey, Computer Consultant, Systems Engineer, BSD Linux Unix, Munich.
 Brexit referendum stole 3,700,000 votes from Brits abroad, inc. 700,000 in EU
 UK PM lied it's democratic in Article 50   http://exitbrexit.uk/brexit/#lie
 Campaign lies, criminal funded; Markets, jobs & pound down; New Referendum!

