ZFS on 8.1 - various problems after a disk failure.

Howard Jones howie at thingy.com
Fri Jun 10 12:37:45 UTC 2011

I have a FreeBSD 8.2 server at home with four 2TB drives in it running ZFS
in a raidz. Some time ago, I had a disk fail. Initially it wasn't
totally obvious the disk had failed so I ran a 'zpool scrub' on the
pool, which threw up a lot of errors and also produced a lot of sense
errors, making it obvious I had a dead disk.

I replaced the disk, then ran "zpool replace zjumbo ad4 ad4" to replace
the bad disk in-place, and start a resilver.

Now I have a few problems:
1) The old ad4 is still listed, even after several scrub/resilvers.
Shouldn't it go away?
2) Although I lost a whole directory with ~1TB of music, the space
allocated to that directory is still around according to df.
3) I have another bunch of files that appear in directory listings, but
I get "Illegal byte sequence" errors when trying to read them (with
anything - du, file, wc).

I have backups of most of the stuff on the pool (although it'd be nice
to recover the more recent data), but how do I get out of this situation
without nuking the site from orbit (my current plan)? Firstly, I'd like
to get a reliable representation of what's actually on the filesystem
and, for bonus points, get back some of the data that should be intact
(only one disk in the set was actually bad, right?).

Here's my current zpool status. Thanks in advance for any pointers!


# zpool status
  pool: zjumbo
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver completed after 10h57m with 15190 errors on Thu May 19 09:26:59 2011

        NAME           STATE     READ WRITE CKSUM
        zjumbo         DEGRADED     0     0  199K
          raidz1       DEGRADED     0     0  792K
            replacing  DEGRADED     0     0     0
              ad4/old  UNAVAIL      0 16.1M     0  cannot open
              ad4      ONLINE       0     0     0  1.15T resilvered
            ad6        ONLINE       0     0     0  677M resilvered
            ad8        ONLINE       0     0     0  660M resilvered
            ad10       ONLINE       0     0     0  535M resilvered

errors: 15190 data errors, use '-v' for a list
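
For readers skimming the table above, a minimal sketch of a filter that
prints only the rows whose STATE is not ONLINE. It is an illustration, not
part of the original post: the function name unhealthy_vdevs is made up, and
it assumes the device table starts at the "NAME STATE ..." header line and
ends at the next blank line, as in the output shown here.

```shell
# Hypothetical helper: read `zpool status` output on stdin and print
# "<name> <state>" for every row of the device table that is not ONLINE.
# Assumes the table begins after the "NAME STATE ..." header and ends at
# the first blank line, as in the status output above.
unhealthy_vdevs() {
  awk '
    $1 == "NAME" && $2 == "STATE" { in_table = 1; next }  # header row found
    in_table && NF == 0           { in_table = 0 }        # blank line ends table
    in_table && $2 != "ONLINE"    { print $1, $2 }        # report anything not ONLINE
  '
}

# Example use on a live system (not run here):
#   zpool status zjumbo | unhealthy_vdevs
```

Fed the status output above, this would list zjumbo, raidz1, replacing and
ad4/old, which makes the leftover ad4/old entry from problem 1 easy to spot.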

More information about the freebsd-stable mailing list