ZFS I/O errors
Jeremy Chadwick
freebsd at jdc.parodius.com
Mon May 30 10:33:51 UTC 2011
On Mon, May 30, 2011 at 12:10:51PM +0200, Olaf Seibert wrote:
> On Mon 30 May 2011 at 11:35:46 +0200, Olaf Seibert wrote:
> > How do I identify which files it is listing here?
On Solaris anyway, "zpool status -v" is supposed to show this.
Occasionally it shows something like <0xXX>:<something>, rather than a
path/filename, and on FreeBSD it appears to behave the same. Per your
own output:
tank/vol-fourquid-1:<0x0>
tank/vol-fourquid-1:<0xc8190da>
I'm not sure why these didn't map to filenames on the system, however.
I've never quite understood what the hexadecimal values shown represent
(I have ideas, but it'd be useful to know what they mean).
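If you want to feed those values to other tools, here is a minimal sh
sketch that splits such an entry into the dataset name and the decimal
form of the number. parse_zpool_entry is a made-up helper name, not a
ZFS command, and the sketch makes no claim about what the number
represents; it only does the conversion.

```shell
#!/bin/sh
# parse_zpool_entry ENTRY
# Split a "dataset:<0xHEX>" entry, as printed by "zpool status -v",
# into the dataset name and the decimal value of the hex number.
# Hypothetical helper for illustration only.
parse_zpool_entry() {
    entry=$1
    dataset=${entry%%:\<*}     # everything before the first ":<"
    hex=${entry##*:\<}         # everything after the last ":<"
    hex=${hex%\>}              # drop the trailing ">"
    # printf accepts C-style hex constants such as 0xc8190da
    printf '%s %d\n' "$dataset" "$hex"
}

parse_zpool_entry 'tank/vol-fourquid-1:<0xc8190da>'
```

For the second entry in your output this prints
"tank/vol-fourquid-1 209817818".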
> The nightly 'find' has found the files for me. It is actually a bunch of
> directories, that were likely in use when the crash occurred.
>
> They give an "interesting" error when you try to ls them:
>
> find: /tank/vol-fourquid-1/evadh/CLEF-IP11/PARSED_CORPUS/EP/000000/45/97: Illegal byte sequence
Yes, and this is what error 86 was in the very first line of your kernel
output:
May 30 10:38:28 fourquid root: ZFS: zpool I/O failure, zpool=tank error=86
$ perror 86
Illegal byte sequence
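If perror isn't handy, the symbolic name can be pulled out of the
errno header instead. This is only a rough sketch: errno_name is a
made-up helper, and the default header path is an assumption that
holds on FreeBSD but may differ elsewhere, so it can be overridden
with a second argument.

```shell
#!/bin/sh
# errno_name NUMBER [HEADER]
# Print the symbolic name(s) that HEADER defines for errno NUMBER.
# Hypothetical helper; the default path is assumed, not guaranteed.
errno_name() {
    num=$1
    header=${2:-/usr/include/sys/errno.h}
    # Match lines of the form: #define NAME NUMBER ...
    awk -v n="$num" '
        $1 == "#define" && $3 == n { print $2; found = 1 }
        END { exit !found }
    ' "$header"
}

# Usage on a FreeBSD box:
#   errno_name 86
```

It returns a non-zero exit status when the header has no define for
the given number.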
> The file system is compressed, that may be the reason it can identify
> "illegal byte sequence"s.
>
> It isn't even possible to rm -r the directories, or even to mv them...
>
> (fortunately the standard trick works: move the parent directory
> instead, create a new one in its old place, and move the old, good,
> contents back).
>
> but now I seem to be left with some directories (elsewhere) that I still
> can't remove...
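For anyone following along, the trick you describe can be sketched
roughly as below. set_aside_bad_entry and the ".damaged" suffix are
names I made up for illustration. The sketch assumes the parent can
still be renamed and that you know the name of the damaged entry; it
does not preserve the ownership or mode of the recreated parent.

```shell
#!/bin/sh
# set_aside_bad_entry PARENT BAD
# Move PARENT aside, recreate it, then move every entry except BAD
# back into the new PARENT. BAD stays behind in PARENT.damaged.
# Hypothetical helper for illustration only.
set_aside_bad_entry() {
    parent=$1
    bad=$2
    aside="${parent}.damaged"
    mv "$parent" "$aside" || return 1
    mkdir "$parent" || return 1
    # The three patterns cover normal names and dot files.
    for entry in "$aside"/* "$aside"/.[!.]* "$aside"/..?*; do
        name=${entry##*/}
        [ "$name" = "$bad" ] && continue
        # Skip patterns that matched nothing.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        mv "$entry" "$parent"/ || return 1
    done
}

# Usage (hypothetical paths):
#   set_aside_bad_entry /tank/some/parent damaged-dir
```

The damaged directory is never touched directly; only its parent is
renamed, which is why this can work when rm and mv on the directory
itself fail.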
I sincerely hope your "zpool scrub" addresses these problems. Otherwise
I hope you have backups and can recreate the pool (zpool destroy, etc.).
Try running without compression and see if that improves things.
It's important to note that the I/O errors shown happened on "random
disks" (meaning more than just one device). What you didn't disclose is
what the disks are attached to. "camcontrol devlist -v" would have been
a good start, followed by any details of controller/driver/etc. you're
using. Possibly it's an underlying (silent) storage driver bug.
Finally, and most importantly: you didn't state what FreeBSD version
you're using, and didn't provide uname -a output (to see kernel build
date, etc.). It matters greatly.
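Something along these lines would gather the details asked for above
in one go. It's a minimal sketch: collect_report is a made-up name, it
only runs the three commands already mentioned in this mail, and it
notes when a tool isn't installed rather than failing.

```shell
#!/bin/sh
# collect_report
# Print the output of each diagnostic command under a header line.
# Hypothetical helper for illustration only.
collect_report() {
    for cmd in "uname -a" "camcontrol devlist -v" "zpool status -v"; do
        echo "### $cmd"
        tool=${cmd%% *}                    # first word is the program
        if command -v "$tool" >/dev/null 2>&1; then
            # Unquoted on purpose so the arguments are split.
            $cmd 2>&1 || echo "(exit status $?)"
        else
            echo "($tool not found on this system)"
        fi
        echo
    done
}

# Usage:
#   collect_report > report.txt
```

Attach the resulting file, plus any controller and driver details, to
your reply.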
--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP 4BD6C0CB |