Help recovering damaged drive - fsck segfaults, read-only mount looks ok

Polytropon freebsd at
Tue Feb 2 03:04:21 UTC 2021

On Mon, 1 Feb 2021 18:42:04 -0500, Matt Emmerton wrote:
> On Sun, 31 Jan 2021 17:12:40 -0500, Matt Emmerton wrote:
> > > If I force a mount in readonly mode, I can inspect the drive and at 
> > > first glance, everything seems valid.  Since this machine is used for 
> > > backups, I have lots of other metadata (e.g., checksums) and I'm slowly 
> > > working through to see if anything important is damaged.
> >
> > At this point: STOP.
> > If your data is important to you, get a copy of it NOW.
> Yep, I've been down this road before, which is why I have backups in the
> first place.  Still working through which is the "best" place to seed my new
> backups from - the backup system (maybe damaged) or the source system (still
> healthy).

Probably the source system. In the worst case you can compare (at least
via checksums) to see if everything you have is what you expect it
to be.
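
A minimal sketch of such a comparison, assuming both trees are reachable
as directories (for example the read-only mount of the backup and a mount
of the source). The function names and the choice of SHA-256 are my own
illustration, not something you need to follow:

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each regular file's path (relative to root) to its digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are hashed.
            if os.path.isfile(full) and not os.path.islink(full):
                digests[os.path.relpath(full, root)] = file_digest(full)
    return digests


def compare_trees(source_root, backup_root):
    """Return (missing_in_backup, differing) as sorted relative paths."""
    src = tree_digests(source_root)
    bak = tree_digests(backup_root)
    missing = sorted(p for p in src if p not in bak)
    differing = sorted(p for p in src if p in bak and src[p] != bak[p])
    return missing, differing
```

Files that only exist on the backup side are deliberately ignored here,
because the question is whether the backup still holds what the source has.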

Nothing is worse than assuming you got your file back - same location
as before, same name, same size - only to discover that it has been
filled with \0 bytes. That happened to me a few times on a buggy
system that froze. Luckily, with "grepdisk" I was able to find the
expected data "elsewhere" on the disk... :-)
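
A quick way to look for that kind of damage on the read-only mount is to
scan for non-empty files that contain nothing but \0 bytes. This is only
a sketch with made-up function names; note that a legitimately all-zero
file (a sparse image, for instance) would be reported as well:

```python
import os


def is_zero_filled(path, chunk_size=1 << 20):
    """Return True if the file is non-empty and contains only NUL bytes."""
    seen_data = False
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            seen_data = True
            # bytes.count(0) counts NUL bytes; any other byte ends the scan.
            if chunk.count(0) != len(chunk):
                return False
    return seen_data


def find_zero_filled(root):
    """Return sorted relative paths of regular files under root that are all NUL."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            if is_zero_filled(full):
                hits.append(os.path.relpath(full, root))
    return sorted(hits)
```

Empty files are not reported, since a zero-length file has no content
that could have been overwritten.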

> The thing is - I don't care about the deleted files.  What I care about is
> the fact that trying to "fix" the partially deleted directory tree is
> sending fsck down paths which might not be right, which has me worried.

So that is not a problem - any unreferenced stuff will end up in lost+found/
and can then be deleted from there (if fsck doesn't already perform
the pending removal). Everything else should go unaffected, unless - and
that's where the guessing, hoping and maybe praying starts - some
other aspect of the filesystem has been damaged.

The fact that fsck segfaults is a sign that there is a severe (!)
problem with the filesystem, so severe that even fsck cannot repair
it. In my (probably different) case that problem was a directory
called .snap that stopped fsck from doing its job. I don't know why
or how, but after I forcibly removed it (on the still unclean
filesystem, that is), fsck would run two times, and everything was back
to almost-normal.

> Preen mode doesn't do anything useful (basically exits immediately).
> Force mode sounds scary.  And I agree that fsck -y might not do the right
> thing.

The -f flag is meant specifically for the case where you want to check
a filesystem that fsck assumes to be clean and would therefore
skip.

The -y flag will answer "YES" to all questions, even if "NO" would
have been your choice for a particular case, e.g., salvaging files,
removing entries, padding files to their expected size with \0 bytes,
or truncating files to zero size.

> > And _that_ is how I finally got my files back (the initial "severe data
> > loss" problem more than 10 years ago): With ls -i, I determined the
> > inode of an offending directory, then used fsdb (which I found out
> > about reading a reference manual about a GDR UNIX system) to
> > remove it, and _then_ (!) fsck was able, after two runs, to bring
> > the filesystem back to a consistent state.
> Ahah!  fsdb!  That's the tool I remember poking around with a long time ago.

Yes, this is the tool to "delete on an unclean filesystem" so that
fsck will then pick up the deletion and clean up what's remaining.
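
The first step of that procedure, finding the inode number that ls -i
reports, can also be scripted. Below is a small sketch (the function name
is made up for illustration); the fsdb step itself is not shown, its
commands are best taken from the fsdb(8) manual page:

```python
import os


def list_inodes(directory):
    """Return (inode, name) pairs for a directory's entries, sorted by name."""
    entries = []
    with os.scandir(directory) as it:
        for entry in it:
            # DirEntry.inode() gives the entry's own inode number,
            # without following symlinks.
            entries.append((entry.inode(), entry.name))
    return sorted(entries, key=lambda e: e[1])
```

Printing the result for the parent of the offending directory gives the
number you would then refer to inside fsdb.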

Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...
