ZFS hanging
Dennis Glatting
freebsd at pki2.com
Mon Jul 9 20:13:17 UTC 2012
I have a ZFS array of disks where the system simply stops, as if blocked
forever on some I/O mutex. This happens often. The following is the
output of top:
last pid:  6075;  load averages:  0.00,  0.00,  0.00    up 0+16:54:41  13:04:10
135 processes: 1 running, 134 sleeping
CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 47M Active, 24M Inact, 18G Wired, 120M Buf, 44G Free
Swap: 32G Total, 32G Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 2410 root        1  33    0 11992K  2820K zio->i  7 331:25  0.00% bzip2
 2621 root        1  52    4 28640K  5544K tx->tx 24 245:33  0.00% john
 2624 root        1  48    4 28640K  5544K tx->tx  4 239:08  0.00% john
 2623 root        1  49    4 28640K  5544K tx->tx  7 238:44  0.00% john
 2640 root        1  42    4 28640K  5420K tx->tx 23 206:51  0.00% john
 2638 root        1  42    4 28640K  5420K tx->tx 28 206:34  0.00% john
 2639 root        1  42    4 28640K  5420K tx->tx  9 206:30  0.00% john
 2637 root        1  42    4 28640K  5420K tx->tx 18 206:24  0.00% john
This system is presently resilvering a disk, but these stops also
happened before the resilver started.
iirc# zpool status disk-1
  pool: disk-1
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Jul  8 13:07:46 2012
        104G scanned out of 12.4T at 1.73M/s, (scan is slow, no estimated time)
        10.3G resilvered, 0.82% done
config:

        NAME                        STATE     READ WRITE CKSUM
        disk-1                      DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            da1                     ONLINE       0     0     0
            da2                     ONLINE       0     0     0
            da10                    ONLINE       0     0     0
            da9                     ONLINE       0     0     0
            da5                     ONLINE       0     0     0
            da6                     ONLINE       0     0     0
            da7                     ONLINE       0     0     0
            replacing-7             DEGRADED     0     0     0
              17938531774236227186  UNAVAIL      0     0     0  was /dev/da8
              da3                   ONLINE       0     0     0  (resilvering)
            da8                     ONLINE       0     0     0
            da4                     ONLINE       0     0     0
        logs
          ada2p1                    ONLINE       0     0     0
        cache
          ada1                      ONLINE       0     0     0

errors: No known data errors
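For a sense of scale, the figures on the scan line imply a very long
resilver. The following is only a rough sketch in Python. It assumes that
zpool's T, G and M suffixes are binary units (powers of 1024) and that the
reported scan rate stays constant, neither of which is guaranteed. The
helper names are made up for illustration.

```python
# Back-of-the-envelope resilver estimate from the zpool status figures.
# Assumptions: T/G/M are powers of 1024, and the scan rate stays constant.
UNITS = {"M": 1024**2, "G": 1024**3, "T": 1024**4}
SECONDS_PER_DAY = 86400


def to_bytes(value, unit):
    """Convert a zpool-style size such as (12.4, "T") to bytes."""
    return value * UNITS[unit]


def days_remaining(total, scanned, rate_per_second):
    """Days left if the remaining data is scanned at a constant rate."""
    return (total - scanned) / rate_per_second / SECONDS_PER_DAY


total = to_bytes(12.4, "T")    # "out of 12.4T"
scanned = to_bytes(104, "G")   # "104G scanned"
rate = to_bytes(1.73, "M")     # "at 1.73M/s"

percent_done = scanned / total * 100
eta_days = days_remaining(total, scanned, rate)

print(f"{percent_done:.2f}% scanned, roughly {eta_days:.0f} days left")
```

Under those assumptions the scanned fraction comes out at the same 0.82%
that zpool reports, and the remaining time is on the order of 86 days,
which is consistent with zpool declining to print an estimate.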
This system has dissimilar disks, which I understand should not be a
problem, but the stopping also happened before I started the slow disk
upgrade process.
The disks are served by:
* An LSI 9211 flashed to IT mode, and
* An LSI 2008 controller on the motherboard, also flashed to IT mode.
The 2008 BIOS and firmware are the most recent from LSI. The motherboard
is a Supermicro H8DG6-F.
My question is: what should I be looking at, and how should I look at it?
There is nothing in the logs or on the console. The system is simply
paused forever, and entering commands gets no response (it's as if
everything is deadlocked).