identifying filesystem blocks (was Re: better disk reliability on a desktop machine)

Nick Barnes Nick.Barnes at pobox.com
Fri Jul 15 22:39:44 GMT 2005


At 2005-07-15 17:01:18+0000, Chuck Swiger writes:
> Nick Barnes wrote:
> [ ... ]
> > I don't want to have to do all that ever again, after this iteration.
> 
> You've had a learning experience, I see.  :-)

Yeah, and I've had them before, and this time enough is enough.

On a related subject, the last time I lost a disk, or maybe the time
before, I asked on one of these lists whether there is a tool which
will identify the files (or inodes, or other filesystem metadata)
which are affected by one or more bad blocks.  At the time I was told
that there is no such tool, and started to write my own.  Maybe this
time around I'll finish the tool and distribute it.

Semi-automated binary-chop use of dd tells me that the following
blocks in my filesystem are broken:

65255940, 65255941, 65255942, 65255943, 65255944, 65255954, 65255965,
65256256, 65257133, 65257134, 65257514, 66713152, 66713158, 66713164,
66713536, 66713537, 66714306, 66714308, 66715648, 66715650

but without a suitable tool this information is useless.
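
For anyone wanting to repeat the binary chop, here is a rough sketch of the idea in Python. It is an illustration under assumptions, not the procedure actually used above: the names find_bad_blocks and make_device_probe are hypothetical, the 512-byte block size is assumed, and the probe reads the device directly instead of invoking dd.

```python
def find_bad_blocks(range_reads_ok, start, count):
    """Return the unreadable block numbers in [start, start + count).

    range_reads_ok(start, count) must return True when the whole
    range reads cleanly.  A failing range is split in half and each
    half is probed again, until single bad blocks are isolated.
    """
    if count == 0 or range_reads_ok(start, count):
        return []
    if count == 1:
        return [start]
    half = count // 2
    return (find_bad_blocks(range_reads_ok, start, half)
            + find_bad_blocks(range_reads_ok, start + half, count - half))


def make_device_probe(path, block_size=512):
    """Build a probe for a device or image file (block_size is assumed)."""
    def range_reads_ok(start, count):
        try:
            with open(path, "rb", buffering=0) as dev:
                dev.seek(start * block_size)
                remaining = count * block_size
                while remaining:
                    chunk = dev.read(min(remaining, 1 << 20))
                    if not chunk:
                        return False  # ran off the end: treat as unreadable
                    remaining -= len(chunk)
            return True
        except OSError:
            return False  # the kernel reported a read error in this range
    return range_reads_ok
```

A clean range costs one read, so large healthy stretches are skipped quickly; only the neighbourhood of each bad block is probed repeatedly.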

Incidentally, two weeks ago I recovered a broken filesystem on a FreeBSD 4.10
server machine by dd'ing the working sectors (i.e. all but 2) onto a
freshly newfs'ed partition.  The broken filesystem wouldn't fsck at
all: some metadata was lost to a bad sector and fsck borked out in
phase 2.  But after the dd's (i.e. with those bad sectors replaced
with metadata fresh from newfs), fsck told me that the recovered
filesystem was fine.  As it happens, the filesystem was the repository
for an SCM system (Perforce) with internal checksums: after recovery
we checked those out and they all passed.
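
The copy step amounts to transferring every run of good blocks and leaving the destination untouched where the source is bad. A minimal sketch in Python follows, assuming the bad block numbers are already known. The names good_ranges and copy_good_blocks and the 512-byte block size are hypothetical; the real recovery used dd, not this code.

```python
def good_ranges(total_blocks, bad_blocks):
    """Return (start, count) runs of blocks not listed in bad_blocks."""
    runs = []
    next_start = 0
    for bad in sorted({b for b in bad_blocks if 0 <= b < total_blocks}):
        if bad > next_start:
            runs.append((next_start, bad - next_start))
        next_start = bad + 1
    if next_start < total_blocks:
        runs.append((next_start, total_blocks - next_start))
    return runs


def copy_good_blocks(src_path, dst_path, total_blocks, bad_blocks,
                     block_size=512):
    """Copy only the good runs; bad blocks keep the destination's data.

    dst_path must already exist (for example a freshly newfs'ed
    partition), so the skipped blocks retain whatever it holds.
    """
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        for start, count in good_ranges(total_blocks, bad_blocks):
            src.seek(start * block_size)
            dst.seek(start * block_size)
            remaining = count * block_size
            while remaining:
                chunk = src.read(min(remaining, 1 << 20))
                if not chunk:
                    raise OSError("unexpected end of source")
                dst.write(chunk)
                remaining -= len(chunk)
```

Getting the offsets right matters: every run must land at the same block number in the destination as it occupied in the source.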

One interesting aspect of that war story is that I got one of the dd
commands wrong the first time, and tried to fsck a filesystem which
was partly Just Plain Missing.  The whole system went down: network
connections dropped and the console became completely unresponsive,
ignoring ctrl-C, ctrl-T, alt-Fn, and ctrl-alt-del.  It seems to me
that fsck shouldn't be able to do that....

Nick Barnes


More information about the freebsd-questions mailing list