ZFS hanging
George Kontostanos
gkontos.mail at gmail.com
Tue Jul 10 08:32:40 UTC 2012
On Mon, Jul 9, 2012 at 11:13 PM, Dennis Glatting <freebsd at pki2.com> wrote:
> I have a ZFS array of disks where the system simply stops as if forever
> blocked by some IO mutex. This happens often and the following is the
> output of top:
>
> last pid: 6075; load averages: 0.00, 0.00, 0.00 up 0+16:54:41 13:04:10
> 135 processes: 1 running, 134 sleeping
> CPU: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
> Mem: 47M Active, 24M Inact, 18G Wired, 120M Buf, 44G Free
> Swap: 32G Total, 32G Free
>
>   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C    TIME   WCPU COMMAND
>  2410 root        1  33    0 11992K  2820K zio->i  7  331:25  0.00% bzip2
>  2621 root        1  52    4 28640K  5544K tx->tx 24  245:33  0.00% john
>  2624 root        1  48    4 28640K  5544K tx->tx  4  239:08  0.00% john
>  2623 root        1  49    4 28640K  5544K tx->tx  7  238:44  0.00% john
>  2640 root        1  42    4 28640K  5420K tx->tx 23  206:51  0.00% john
>  2638 root        1  42    4 28640K  5420K tx->tx 28  206:34  0.00% john
>  2639 root        1  42    4 28640K  5420K tx->tx  9  206:30  0.00% john
>  2637 root        1  42    4 28640K  5420K tx->tx 18  206:24  0.00% john
>
>
> This system is presently resilvering a disk but these stops have
> happened before.
>
>
> iirc# zpool status disk-1
>   pool: disk-1
>  state: DEGRADED
> status: One or more devices is currently being resilvered. The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>   scan: resilver in progress since Sun Jul 8 13:07:46 2012
>         104G scanned out of 12.4T at 1.73M/s, (scan is slow, no estimated time)
>         10.3G resilvered, 0.82% done
> config:
>
>         NAME                        STATE     READ WRITE CKSUM
>         disk-1                      DEGRADED     0     0     0
>           raidz2-0                  DEGRADED     0     0     0
>             da1                     ONLINE       0     0     0
>             da2                     ONLINE       0     0     0
>             da10                    ONLINE       0     0     0
>             da9                     ONLINE       0     0     0
>             da5                     ONLINE       0     0     0
>             da6                     ONLINE       0     0     0
>             da7                     ONLINE       0     0     0
>             replacing-7             DEGRADED     0     0     0
>               17938531774236227186  UNAVAIL      0     0     0  was /dev/da8
>               da3                   ONLINE       0     0     0  (resilvering)
>             da8                     ONLINE       0     0     0
>             da4                     ONLINE       0     0     0
>         logs
>           ada2p1                    ONLINE       0     0     0
>         cache
>           ada1                      ONLINE       0     0     0
>
> errors: No known data errors
>
>
> This system has dissimilar disks, which I understand should not be a
> problem but the stopping also happened before I started the slow disk
> upgrade process.
>
> The disks are served by:
>
> * An LSI 9211 flashed to IT, and
> * An LSI 2008 controller on the motherboard also flashed to IT.
>
> The 2008 BIOS and firmware are the most recent from LSI. The motherboard
> is a Supermicro H8DG6-F.
>
>
> My question is what should I be looking at and how should I look at it?
> There is nothing in the logs or the console, rather the system is
> forever paused and entering commands results in no response (it's as if
> everything is deadlocked).
>
> _______________________________________________
> freebsd-fs at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"
Can you post the output of 'dmesg | grep mps' and the FreeBSD version
you are running?
Also, is there any chance that those disks are 4K-sector drives?
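
Something along these lines should gather all of that in one go. It is
only a rough sketch: the pool name (disk-1) and the device names are
taken from the zpool status quoted above, and sector_report and collect
are made-up helper names, not existing tools. Note that many 4K drives
still report 512-byte logical sectors, so a "512" answer here does not
rule them out; the stripesize line from diskinfo only helps when the
drive advertises its physical sector size.

```shell
#!/bin/sh
# Sketch only. Assumes FreeBSD with diskinfo(8) and zdb(8) available,
# the pool name "disk-1", and da* device names from the quoted output.

# Read `diskinfo -v` output on stdin and print "4K" when either the
# reported sectorsize or the stripesize is 4096 bytes.
sector_report() {
    awk '
        /# sectorsize/ { sect = $1 }
        /# stripesize/ { stripe = $1 }
        END {
            if (sect == 4096 || stripe == 4096)
                print "4K"
            else
                print "512 (or physical size not reported)"
        }'
}

# Gather the details requested above; run as root after a reboot.
collect() {
    uname -a                      # FreeBSD version
    dmesg | grep mps              # mps(4) driver and firmware messages
    for d in da1 da2 da3 da4 da5 da6 da7 da8 da9 da10; do
        printf '%s: ' "$d"
        diskinfo -v "$d" | sector_report
    done
    zdb -C disk-1 | grep ashift   # ashift=9 means 512-byte, 12 means 4K
}
```

Running collect and pasting its output into a reply would cover the
driver messages, the release, and the sector size question together.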
--
George Kontostanos
Aicom telecoms ltd
http://www.aisecure.net