Errors on a file on a zpool: How to remove?
Wes Morgan
morganw at chemikals.org
Sun Jan 24 00:40:19 UTC 2010
On Sat, 23 Jan 2010, Rich wrote:
> I have no files named 0x0.
>
> I have a number of files which, on attempting to do anything to them
> (stat, mv, rm), EIO occurs, the checksum error number on three of the
> disks in that pool ticks up, and /var/log/messages reports what I
> reported in my initial post. (I discovered this due to FreeBSD's daily
> check-for-setuid-bits-in-strange-places find command reporting EIO on
> some files.)
>
> My original post in this thread is about how to resolve this.
Do these bad files show up on "zpool status -v" after a scrub?
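If you want to script that check, here is a minimal Python sketch that pulls the entries listed under the "Permanent errors" heading out of saved "zpool status -v" text. The function name is made up for illustration, and it assumes the output layout seen in the status paste quoted further down; other ZFS versions may format this differently.

```python
def permanent_errors(status_text: str) -> list[str]:
    """Return the entries listed under the 'Permanent errors' heading.

    Assumes the heading line starts with 'errors: Permanent errors' and
    that every non-blank line after it names one damaged object.
    """
    entries = []
    in_error_list = False
    for line in status_text.splitlines():
        if line.strip().startswith("errors: Permanent errors"):
            in_error_list = True
            continue
        if in_error_list and line.strip():
            entries.append(line.strip())
    return entries


if __name__ == "__main__":
    sample = (
        "errors: Permanent errors have been detected in the following files:\n"
        "\n"
        "        rigatoni/mirrors:<0x0>\n"
    )
    for entry in permanent_errors(sample):
        print(entry)
```

Feeding it the real output (for example, redirected to a file after the scrub finishes) is left out on purpose, so the sketch does not depend on having a pool available.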
This really sounds much more like an issue of corrupt metadata. ZFS keeps
multiple copies of filesystem metadata even on non-redundant pools (ditto
blocks). You said there was bad RAM in this machine at one point, which
may mean that *all* of the copies of that metadata are corrupt.
In my encounter with a bad stick of RAM, the data was correct but the
stored checksums were wrong. I was able to "recover" the data by simply
changing zfs_read() to not return EIO when it encounters an ECKSUM error
from the ZFS layer -- essentially ignoring the checksum error. I have no
idea what this might do if the metadata itself is corrupt, so that could
be risky.
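To illustrate the idea only (this is a toy model in Python, not the kernel change and not ZFS code): a block whose data is intact but whose stored checksum is wrong fails a strict read with EIO, while a read that ignores the mismatch hands the data back. The Block class, the function names, and the use of SHA-256 as the checksum are all assumptions made for the sketch.

```python
import errno
import hashlib
from dataclasses import dataclass


@dataclass
class Block:
    data: bytes
    stored_checksum: bytes  # checksum recorded when the block was written


def checksum(data: bytes) -> bytes:
    # SHA-256 is just a stand-in; it says nothing about the pool's setting.
    return hashlib.sha256(data).digest()


def read_block(block: Block, ignore_checksum: bool = False) -> bytes:
    """Return the block's data, failing with EIO on a checksum mismatch
    unless the caller asked to ignore it."""
    if checksum(block.data) != block.stored_checksum and not ignore_checksum:
        raise OSError(errno.EIO, "checksum mismatch")
    return block.data


if __name__ == "__main__":
    # Good data, but the checksum stored next to it is wrong.
    blk = Block(b"payload", checksum(b"something else"))
    try:
        read_block(blk)
    except OSError as exc:
        print("strict read failed with errno", exc.errno)
    print("lenient read returned", read_block(blk, ignore_checksum=True))
```

The lenient path cannot tell a wrong checksum from wrong data, which is exactly why this is only safe when you have another reason to believe the data itself is good.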
Another option is the zdb solution mentioned earlier.
>
> On Sat, Jan 23, 2010 at 6:34 PM, Wes Morgan <morganw at chemikals.org> wrote:
> > On Sat, 23 Jan 2010, Rich wrote:
> >
> >> On Sat, Jan 23, 2010 at 4:21 PM, Wes Morgan <morganw at chemikals.org> wrote:
> >> > On Sat, 23 Jan 2010, Rich wrote:
> >> >
> >> >> I already diagnosed the bad hardware - one of the two sticks of RAM
> >> >> had gone bad, and fails memtest in the other machine.
> >> >>
> >> >> pool: rigatoni
> >> >> state: ONLINE
> >> >> status: One or more devices has experienced an error resulting in data
> >> >> corruption. Applications may be affected.
> >> >> action: Restore the file in question if possible. Otherwise restore the
> >> >> entire pool from backup.
> >> >> see: http://www.sun.com/msg/ZFS-8000-8A
> >> >> scrub: scrub completed after 15h28m with 1 errors on Thu Jan 21 18:09:25 2010
> >> >> config:
> >> >>
> >> >>         NAME        STATE     READ WRITE CKSUM
> >> >>         rigatoni    ONLINE       0     0     1
> >> >>           da4       ONLINE       0     0     2
> >> >>           da5       ONLINE       0     0     2
> >> >>           da7       ONLINE       0     0     0
> >> >>           da6       ONLINE       0     0     0
> >> >>           da2       ONLINE       0     0     2
> >> >>
> >> >> errors: Permanent errors have been detected in the following files:
> >> >>
> >> >> rigatoni/mirrors:<0x0>
> >> >
> >> > Can you post your entire pool filesystem structure? That message above
> >> > looks like an unreferenced block or corrupted metadata rather than an
> >> > actual file. Also, if it's part of a snapshot, you simply have to destroy
> >> > the snapshot.
> >> >
> >> > I had a pool become corrupted due to bad memory, and all of the files were
> >> > still able to be manipulated. The only time EIO popped up was on the
> >> > specific block that had a checksum error.
> >>
> >> # zfs list -r -t all rigatoni
> >> NAME                  USED  AVAIL  REFER  MOUNTPOINT
> >> rigatoni             5.73T   984G    19K  /rigatoni
> >> rigatoni/logs_bitch   269M   984G   269M  /rigatoni/logs_bitch
> >> rigatoni/mirrors     5.73T   984G  5.73T  /mirrors
> >>
> >> No snapshots here. :/
> >>
> >> EIO only pops up on the files I mentioned above - everything else in
> >> those directories, including renaming that directory, is fine.
> >
> > I must have missed it, what files is it showing besides the <0x0> address?
> > Or do you have a file named "<0x0>"?