ZFS: reproducible inability to access a pool (process hangs; other pools fine)

Peter Schuller peter.schuller at infidyne.com
Mon Oct 22 08:35:22 PDT 2007


Hello,

On the same system I recently posted about on -stable, with RELENG_7
from a few days ago, I am now running a SiL 3114 on a raidz2 in
degraded mode with one disk missing (it is degraded by design because
I wanted to create a 5 disk array but only had 4).

For the purpose of discovering any stability issues with the 3114
controller I did some stress tests that have yet to reveal controller
problems, but have triggered what appears to be a ZFS problem.

Test case:

/promraid       - root of the pool in question
/promraid/ports - copy of /usr/ports tree from my machine
/promraid/1     - empty directory
/promraid/2     - empty directory

I now run concurrently in two shells:

while [ 1 ] ; do rsync -a /promraid/ports /promraid/1/pp ; rm -rf /promraid/1/pp ; done

and:

while [ 1 ] ; do rsync -a /promraid/ports /promraid/2/pp ; rm -rf /promraid/2/pp ; done
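The two loops above could also be driven from one script. The following is
only a sketch of that idea, not the exact commands used in the report: the
function name stress_worker, the optional pass limit and the timestamped
progress line are additions for illustration. The progress line is there so
that the last line printed gives a rough time for when a worker stopped
making progress.

```shell
#!/bin/sh
# Sketch: one copy/delete stress worker, so that both loops from the
# test case share the same code. The pass limit and the progress line
# are illustrative additions; a limit of 0 (the default) loops forever,
# like the original "while [ 1 ]" loops.

stress_worker() {
    src=$1          # tree to copy, e.g. /promraid/ports
    dst=$2          # scratch target, e.g. /promraid/1/pp
    limit=${3:-0}   # number of passes; 0 means no limit
    n=0
    while [ "$limit" -eq 0 ] || [ "$n" -lt "$limit" ]; do
        rsync -a "$src" "$dst" || return 1
        rm -rf "$dst"
        n=$((n + 1))
        # Timestamp each completed pass.
        echo "$(date '+%Y-%m-%d %H:%M:%S') $dst pass $n done"
    done
}
```

To reproduce the concurrent load, both workers would be started in the
background from a script that contains the function:

    stress_worker /promraid/ports /promraid/1/pp &
    stress_worker /promraid/ports /promraid/2/pp &
    wait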

This runs fine for some hours, but eventually I end up with hung
rsyncs in "zfs" state according to top. Attempting to e.g. ls /promraid
hangs as well. Yet ZFS continues working (another pool is entirely
fine), and there are no errors in dmesg.

iostat -x does NOT indicate that it is perpetually waiting on I/O from
a disk or something like that (0% utilization). The processes are
unkillable, even by SIGKILL.
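
To record which processes are stuck, the ps output can be filtered on the
wait channel. This is a rough sketch under two assumptions: that ps reports
the same "zfs" wait-channel name that top shows, and that the columns come
in the order requested in the usage line below. The helper name filter_wchan
is made up for this example.

```shell
#!/bin/sh
# Sketch: print only the processes sleeping on a given wait channel.
# Expects "pid state wchan command..." lines on stdin, for example from
#   ps -ax -o pid,state,wchan,command
# The header line is skipped because its first field is not a number.

filter_wchan() {
    awk -v want="$1" '$1 ~ /^[0-9]+$/ && $3 == want { print }'
}
```

Usage, once the hang has occurred:

    ps -ax -o pid,state,wchan,command | filter_wchan zfs

The PIDs listed could then be given to ktrace -p to attach to the processes
that are already running, rather than tracing from the start. A process that
is blocked inside the kernel may of course produce no further trace records.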

I should have this environment for a few more days, so can hopefully
reproduce this again. It has happened at least twice already (the
first time I was in X and X hung; I thought I had a panic so re-ran
the tests in the console; these two times I didn't get a panic but I
am unsure whether the failure case is different).

Does anyone have suggestions for what to do to produce the best
information possible, given that there are no errors, no panic, etc.?

I realize one obvious step is to ktrace them, if that gives me anything
(the size of the trace, if I were to trace from the beginning, would,
I suspect, be prohibitive). I will do that next time.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller at infidyne.com>'
Key retrieval: Send an E-Mail to getpgpkey at scode.org
E-Mail: peter.schuller at infidyne.com Web: http://www.scode.org
