ZFS I/O errors

Jeremy Chadwick freebsd at jdc.parodius.com
Mon May 30 10:33:51 UTC 2011


On Mon, May 30, 2011 at 12:10:51PM +0200, Olaf Seibert wrote:
> On Mon 30 May 2011 at 11:35:46 +0200, Olaf Seibert wrote:
> > How do I identify which files it is listing here?

On Solaris anyway, "zpool status -v" is supposed to show this.
Occasionally it shows something like <0xXX>:<something>, rather than a
path/filename, and on FreeBSD it appears to behave the same.  Per your
own output:

        tank/vol-fourquid-1:<0x0>
        tank/vol-fourquid-1:<0xc8190da>

I'm not sure why this didn't actually map to a filename on the system
however.  I've never quite understood what the hexadecimal values shown
represent (I have ideas but it'd be useful to know what they meant).
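As a rough sketch only: the helper below pulls the unresolved dataset:<0x...> entries out of "zpool status -v" output and prints each hex value in decimal. The function name unresolved_entries is made up for illustration. Treating the decimal value as an inode number that find(1) can search for is an assumption that nothing in this thread confirms.

```shell
# Hypothetical helper: read "zpool status -v" output on stdin and print
# "dataset decimal-id" for every entry of the form dataset:<0xHEX>.
unresolved_entries() {
    sed -n 's/^[[:space:]]*\([^[:space:]:]*\):<\(0x[0-9a-fA-F]*\)>[[:space:]]*$/\1 \2/p' |
    while read -r dataset hex; do
        # printf accepts C-style hex constants for %d
        printf '%s %d\n' "$dataset" "$hex"
    done
}

# Intended use (ASSUMPTION: the id is a ZFS object number that matches
# the file's inode number; this is not confirmed anywhere in the thread):
#   zpool status -v tank | unresolved_entries
#   find /tank/vol-fourquid-1 -xdev -inum <decimal-id>
```

If the assumption is wrong, the find simply returns nothing; it does not modify the pool.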

> The nightly 'find' has found the files for me. It is actually a bunch of
> directories, that were likely in use when the crash occurred.
> 
> They give an "interesting" error when you try to ls them:
> 
> find: /tank/vol-fourquid-1/evadh/CLEF-IP11/PARSED_CORPUS/EP/000000/45/97:       Illegal byte sequence

Yes, and that is what error=86 in the very first line of your kernel
output means:

May 30 10:38:28 fourquid root: ZFS: zpool I/O failure, zpool=tank error=86

$ perror 86
Illegal byte sequence
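If perror is not at hand, the symbolic name can also be looked up in the system header. A minimal sketch, assuming the header lives at /usr/include/sys/errno.h (the FreeBSD default; other systems differ); errno_name is a hypothetical helper name:

```shell
# Hypothetical helper: print the symbolic name defined for an errno value.
# $1 = numeric errno, $2 = header to search (ASSUMED default shown below).
errno_name() {
    awk -v n="$1" '$1 == "#define" && $3 == n { print $2 }' \
        "${2:-/usr/include/sys/errno.h}"
}

# Intended use on the affected FreeBSD machine:
#   errno_name 86
```

Error numbers are not portable between operating systems, so run this on the machine that logged the error rather than on another box.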

> The file system is compressed, that may be the reason it can identify
> "illegal byte sequence"s.
> 
> It isn't even possible to rm -r the directories, or even to mv them...
> 
> (fortunately the standard trick works: move the parent directory
> instead, create a new one in its old place, and move the old, good,
> contents back).
> 
> but now I seem to be left with some directories (elsewhere) that I still
> can't remove...

I sincerely hope your "zpool scrub" addresses these problems.  Otherwise
I hope you have backups and can recreate the pool (zpool destroy, etc.).

Try running without compression and see if that improves things.

It's important to note that the I/O errors shown happened on "random
disks" (meaning more than just one device).  What you didn't disclose is
what the disks are attached to.  "camcontrol devlist -v" would have been
a good start, followed by any details of controller/driver/etc. you're
using.  Possibly it's an underlying (silent) storage driver bug.

Finally, and leaving the most important point for last: you didn't state
what FreeBSD version you're using, and you didn't provide uname -a output
(to see kernel build date, etc.).  It matters greatly.
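The details asked for above can be gathered in one pass. A small sketch, where collect_report is a hypothetical name and the command list is just the three commands mentioned in this message:

```shell
# Hypothetical helper: run each diagnostic command mentioned in this
# message and label its output, skipping any command that is not installed.
collect_report() {
    for cmd in "uname -a" "camcontrol devlist -v" "zpool status -v"; do
        printf '### %s\n' "$cmd"
        if command -v "${cmd%% *}" >/dev/null 2>&1; then
            # Unquoted on purpose so the string splits into command + args
            $cmd 2>&1 || printf '(command failed)\n'
        else
            printf '(not available on this system)\n'
        fi
    done
}

# Intended use: collect_report > report.txt, then paste it into the reply.
```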

-- 
| Jeremy Chadwick                                   jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |



More information about the freebsd-stable mailing list