CAM hangs in 9-STABLE? [Was: NFS/ZFS hangs after upgrading from 9.0-RELEASE to -STABLE]
olivier <olivier777a7 at gmail.com>
Tue Jan 15 19:55:25 UTC 2013
Dear All,
I'm still experiencing the same hangs I reported earlier with 9.1. I've
been running a kernel with WITNESS enabled to provide more information.
During one occurrence of the hang, running "show alllocks" in ddb gave:
Process 25777 (sysctl) thread 0xfffffe014c5b2920 (102567)
exclusive sleep mutex Giant (Giant) r = 0 (0xffffffff811e34c0) locked @
/usr/src/sys/dev/usb/usb_transfer.c:3171
Process 25750 (sshd) thread 0xfffffe015a688000 (104313)
exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xfffffe0204e0bb98) locked @
/usr/src/sys/kern/uipc_sockbuf.c:148
Process 24922 (cnid_dbd) thread 0xfffffe0187ac4920 (103597)
shared lockmgr zfs (zfs) r = 0 (0xfffffe0973062488) locked @
/usr/src/sys/kern/vfs_syscalls.c:3591
Process 24117 (sshd) thread 0xfffffe07bd914490 (104195)
exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xfffffe0204e0a8f0) locked @
/usr/src/sys/kern/uipc_sockbuf.c:148
Process 1243 (java) thread 0xfffffe01ca85d000 (102704)
exclusive sleep mutex pmap (pmap) r = 0 (0xfffffe015aec1440) locked @
/usr/src/sys/amd64/amd64/pmap.c:4840
exclusive rw pmap pv global (pmap pv global) r = 0 (0xffffffff81409780)
locked @ /usr/src/sys/amd64/amd64/pmap.c:4802
exclusive sleep mutex vm page (vm page) r = 0 (0xffffffff813f0a80) locked @
/usr/src/sys/vm/vm_object.c:1128
exclusive sleep mutex vm object (standard object) r = 0
(0xfffffe01458e43a0) locked @ /usr/src/sys/vm/vm_object.c:1076
shared sx vm map (user) (vm map (user)) r = 0 (0xfffffe015aec1388) locked @
/usr/src/sys/vm/vm_map.c:2045
Process 994 (nfsd) thread 0xfffffe015a0df000 (102426)
shared lockmgr zfs (zfs) r = 0 (0xfffffe0c3b505878) locked @
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1760
Process 994 (nfsd) thread 0xfffffe015a0f8490 (102422)
exclusive lockmgr zfs (zfs) r = 0 (0xfffffe02db3b3e60) locked @
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1760
Process 931 (syslogd) thread 0xfffffe015af18920 (102365)
shared lockmgr zfs (zfs) r = 0 (0xfffffe0141dd6680) locked @
/usr/src/sys/kern/vfs_syscalls.c:3591
Process 22 (syncer) thread 0xfffffe0125077000 (100279)
exclusive lockmgr syncer (syncer) r = 0 (0xfffffe015a2ff680) locked @
/usr/src/sys/kern/vfs_subr.c:1809
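When a "show alllocks" dump gets long, it can help to tally where the held
locks were acquired. A rough Python sketch, assuming nothing beyond the
"locked @ file:line" shape visible above (the site may wrap onto the next
line, as it does here):

```python
import re
from collections import Counter

def tally_lock_sites(alllocks_output):
    """Count how many held locks were acquired at each file:line site.

    The "locked @" marker and the path may be split across lines, so the
    regex lets whitespace (including a newline) separate them.
    """
    sites = re.findall(r"locked @\s*(\S+:\d+)", alllocks_output)
    return Counter(sites)
```

Sites that show up repeatedly (here, zfs_vfsops.c:1760 and
uipc_sockbuf.c:148) are the ones worth staring at first.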
I don't have the full "show lockedvnods" output: it does not get captured
by ddb even after "capture on", it doesn't fit on a single screen, and ddb
has no "more"-style pager to page through it. What I did manage to get
(copied by hand, typos possible) is:
0xfffffe0c3b5057e0: tag zfs, type VREG
usecount 1, writecount 0, refcount 1 mountedhere 0
flags (VI_ACTIVE)
v_object 0xfffffe089bc1b828 ref 0 pages 0
lock type zfs: SHARED (count 1)
0xfffffe02db3b3dc8: tag zfs, type VREG
usecount 6, writecount 0, refcount 6 mountedhere 0
flags (VI_ACTIVE)
v_object 0xfffffe0b79583ae0 ref 0 pages 0
lock type zfs: EXCL by thread 0xfffffe015a0f8490 (pid 994)
with exclusive waiters pending
The output of show witness is at http://pastebin.com/eSRb3FEu
The output of alltrace is at http://pastebin.com/X1LruNrf (a number of
threads are stuck in zio_wait, I can find none in zio_interrupt, and
judging from gstat and from the disks eventually going to sleep, all disk
IO seems to be stuck for good; I think Andriy explained earlier that these
criteria might indicate a ZFS hang).
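Sifting a long alltrace or procstat -kk dump by hand is tedious, so I have
been tallying wait sites mechanically. A rough Python sketch; it assumes
only that kernel stack frames appear as "function+0xoffset" tokens, and the
list of generic sleep frames to skip is my own guess:

```python
from collections import Counter

# Generic scheduler/sleep frames that appear at the top of every blocked
# thread's stack; skip them to find the frame that actually tells us why
# the thread is waiting.
GENERIC = ("mi_switch", "sleepq_wait", "_cv_wait", "_sleep")

def tally_wait_sites(procstat_kk):
    """Count the first non-generic kernel stack frame per thread line."""
    counts = Counter()
    for line in procstat_kk.splitlines():
        frames = [tok.split("+")[0] for tok in line.split() if "+0x" in tok]
        interesting = [f for f in frames if f not in GENERIC]
        if interesting:
            counts[interesting[0]] += 1
    return counts
```

A large count for zio_wait with nothing in zio_interrupt is the pattern
described above: I/O issued to the vdevs but completions never arriving.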
The output of show geom is at http://pastebin.com/6nwQbKr4
The output of vmstat -i is at http://pastebin.com/9LcZ7Mi0. As far as I
can tell, interrupts are occurring at a normal rate during the hang.
Any help would be greatly appreciated.
Thanks
Olivier
PS: my kernel was compiled from 9-STABLE from December, with CAM and ahci
from 9.0 (in the hope it would fix the hangs I was experiencing in plain
9-STABLE; obviously the hangs are still occurring). The rest of my
configuration is the same as posted earlier.
On Mon, Dec 24, 2012 at 9:42 PM, olivier <olivier777a7 at gmail.com> wrote:
> Dear All
> It turns out that reverting to an older version of the mps driver did not
> fix the ZFS hangs I've been struggling with in 9.1 and 9-STABLE after all
> (they just took a bit longer to occur again, possibly just by chance). I
> followed steps along the lines suggested by Andriy to collect more information
> when the problem occurs. Hopefully this will help figure out what's going
> on.
>
> As far as I can tell, what happens is that at some point IO operations to
> a bunch of drives that belong to different pools get stuck. For these
> drives, gstat shows no activity but one pending operation, like this:
>
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da1
>
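The stuck pattern above (L(q) > 0 with zero ops/s) can be flagged
mechanically from gstat's batch output. A rough Python sketch; the column
positions are assumed from the header shown above:

```python
def find_stuck_devices(gstat_lines):
    """Flag devices with queued I/O (L(q) > 0) but no completions (ops/s == 0).

    Expects gstat rows in the column order shown above:
    L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
    """
    stuck = []
    for line in gstat_lines:
        fields = line.split()
        if len(fields) < 13 or not fields[0].isdigit():
            continue  # skip headers, blanks, wrapped fragments
        lq, ops = int(fields[0]), int(fields[1])
        if lq > 0 and ops == 0:
            stuck.append(fields[-1])  # device name is the last column
    return stuck
```

Run over successive samples, a device that stays on this list across
polls is one whose pending request never completed.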
> I've been running gstat in a loop (every 100s) to monitor the machine.
> Just before the hang occurs, everything seems fine (see full gstat output
> below). Right after the hang occurs a number of drives seem stuck (see full
> gstat output below). Notably, some stuck drives are seen through the mps
> driver and others through the mpt driver. So the problem doesn't seem to be
> driver-specific. I have had the problem occur (at a lower frequency) on
> similar machines that don't use the mpt driver (and only have 1 disk
> provided through mps), so the problem doesn't seem to be caused by the mpt
> driver (and is likely not caused by defective hardware). Since, based on
> the information I provided earlier, Andriy thinks the problem might not
> originate in ZFS, perhaps the problem lies in the CAM layer?
>
> camcontrol tags -v (as suggested by Andriy) in the hung state shows for
> example
>
> (pass56:mpt1:0:8:20): dev_openings 254
> (pass56:mpt1:0:8:20): dev_active 1
> (pass56:mpt1:0:8:20): devq_openings 254
> (pass56:mpt1:0:8:20): devq_queued 0
> (pass56:mpt1:0:8:20): held 0
> (pass56:mpt1:0:8:20): mintags 2
> (pass56:mpt1:0:8:20): maxtags 255
> (I'm not providing full camcontrol tags output below because I couldn't
> get it to run during the specific hang I documented most thoroughly; the
> example above is from a different occurrence of the hang).
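The counters above can be read mechanically: dev_active > 0 together with
devq_queued == 0 means a command is outstanding at the device or controller
while CAM has nothing further queued behind it. A rough Python sketch,
assuming only the line format of the excerpt above:

```python
import re

def parse_camcontrol_tags(output):
    """Collect the per-device counters printed by 'camcontrol tags -v'.

    Lines look like: (pass56:mpt1:0:8:20): dev_active 1
    """
    stats = {}
    for dev, key, val in re.findall(r"\((\S+?)\): (\w+) (\d+)", output):
        stats.setdefault(dev, {})[key] = int(val)
    return stats

def looks_wedged(stats):
    """Devices with a command outstanding below CAM and nothing queued in CAM."""
    return {dev for dev, s in stats.items()
            if s.get("dev_active", 0) > 0 and s.get("devq_queued", 0) == 0}
```

For the pass56 device above this flags exactly the situation described:
one transaction stuck somewhere below the CAM device queue.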
>
> The buses don't seem completely frozen: if I manually remove drives while
> the machine is hanging, that's picked up by the mpt driver, which prints
> out corresponding messages to the console. But "camcontrol reset all" and
> "camcontrol rescan all" don't seem to do anything.
>
> I've tried reducing vfs.zfs.vdev.min_pending and vfs.zfs.vdev.max_pending
> to 1, to no avail.
>
> Any suggestions to resolve this problem, work around it, or further
> investigate it would be greatly appreciated!
> Thanks a lot
> Olivier
>
> Detailed information:
>
> Output of procstat -a -kk when the machine is hanging is available at
> http://pastebin.com/7D2KtT35 (not putting it here because it's pretty
> long)
>
> dmesg is available at http://pastebin.com/9zJQwWJG . Note that I'm using
> LUN masking, so the "illegal requests" reported aren't really errors. Maybe
> one day if I get my problems sorted out I'll use geom multipathing instead.
>
> My kernel config is
> include GENERIC
> ident MYKERNEL
>
> options IPSEC
> device crypto
>
> options OFED # Infiniband protocol
>
> device mlx4ib # ConnectX Infiniband support
> device mlxen # ConnectX Ethernet support
> device mthca # Infinihost cards
> device ipoib # IP over IB devices
>
> options ATA_CAM # Handle legacy controllers with CAM
> options ATA_STATIC_ID # Static device numbering
>
> options KDB
> options DDB
>
>
>
> Full output of gstat just before the hang (at most 100s before the hang):
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da2
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da0
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da2/da2
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da0/da0
>     1     85     48     79    4.7     35     84    0.5      0      0    0.0   24.3  da1
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da1/da1
>     1     83     47     77    4.3     34     79    0.5      0      0    0.0   22.1  da4
>     1   1324   1303  21433    0.6     19     42    0.7      0      0    0.0   79.8  da3
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da5
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da6
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da7
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da8
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da9
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da10
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da11
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da12
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da13
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da14
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da15
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da16
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da17
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da18
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da19
>     0     97     57     93    3.5     38     84    0.3      0      0    0.0   21.3  da20
>     0     85     47     69    3.3     36     86    0.4      0      0    0.0   16.8  da21
>     0   1666   1641  18992    0.3     23     43    0.4      0      0    0.0   57.9  da22
>     0     93     55     98    3.5     36     87    0.4      0      0    0.0   20.6  da23
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da24
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da25
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da26
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da27
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da28
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da29
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da30
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da31
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da32
>     0   1200      0      0    0.0   1198  11751    0.6      0      0    0.0   67.3  da33
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da34
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da35
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da36
>     0     81     44     67    2.0     35     84    0.3      0      0    0.0   10.1  da37
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da38
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da39
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da40
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da41
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da42
>     1   1020    999  22028    0.8     19     42    0.7      0      0    0.0   84.8  da43
>     0   1050   1029  23479    0.8     19     47    0.7      0      0    0.0   83.3  da44
>     1   1006    984  22758    0.8     21     46    0.6      0      0    0.0   84.8  da45
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da46
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da47
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da48
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da49
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da50
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  cd0
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da4/da4
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da3/da3
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da5/da5
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da6/da6
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da7/da7
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da8/da8
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da9/da9
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da10/da10
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da11/da11
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da12/da12
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da13/da13
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da14/da14
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da15/da15
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da16/da16
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da17/da17
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da18/da18
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da19/da19
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da20/da20
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da21/da21
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da22/da22
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da23/da23
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da24/da24
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da25/da25
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da26/da26
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  PART/da26/da26
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da26p1
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da26p2
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da26p3
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da27/da27
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da28/da28
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da29/da29
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da30/da30
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da31/da31
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da32/da32
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da33/da33
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da34/da34
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da35/da35
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da36/da36
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da37/da37
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da38/da38
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da39/da39
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da40/da40
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da41/da41
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da42/da42
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da43/da43
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da44/da44
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da45/da45
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da46/da46
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da47/da47
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da48/da48
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da49/da49
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da50/da50
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/cd0/cd0
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da26p1/da26p1
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da26p2/da26p2
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  LABEL/da26p1/da26p1
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  gptid/84d4487b-34e3-11e2-b773-00259058949a
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da26p3/da26p3
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  LABEL/da26p2/da26p2
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  gptid/b4255780-34e3-11e2-b773-00259058949a
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/gptid/84d4487b-34e3-11e2-b773-00259058949a/gptid/84d4487b-34e3-11e2-b773-00259058949a
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da25
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/gptid/b4255780-34e3-11e2-b773-00259058949a/gptid/b4255780-34e3-11e2-b773-00259058949a
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da40
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da41
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da26p3
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da29
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da30
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da24
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da6
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da7
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da16
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da17
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da20
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da21
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da37
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da23
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da1
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da4
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da43
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da44
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da22
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da33
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da45
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da3
>
>
> Full output of gstat just after the hang (at most 100s after the hang):
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da2
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da0
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da2/da2
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da0/da0
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da1
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da1/da1
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da4
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da3
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da5
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da6
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da7
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da8
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da9
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da10
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da11
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da12
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da13
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da14
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da15
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da16
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da17
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da18
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da19
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da20
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da21
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da22
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da23
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da24
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da25
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da26
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da27
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da28
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da29
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da30
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da31
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da32
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da33
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da34
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da35
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da36
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da37
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da38
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da39
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da40
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da41
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da42
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da43
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da44
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da45
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da46
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da47
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da48
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da49
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da50
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  cd0
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da4/da4
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da3/da3
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da5/da5
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da6/da6
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da7/da7
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da8/da8
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da9/da9
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da10/da10
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da11/da11
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da12/da12
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da13/da13
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da14/da14
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da15/da15
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da16/da16
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da17/da17
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da18/da18
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da19/da19
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da20/da20
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da21/da21
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da22/da22
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da23/da23
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da24/da24
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da25/da25
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da26/da26
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  PART/da26/da26
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da26p1
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da26p2
>     1      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  da26p3
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da27/da27
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da28/da28
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da29/da29
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da30/da30
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da31/da31
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da32/da32
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da33/da33
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da34/da34
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da35/da35
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da36/da36
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da37/da37
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da38/da38
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da39/da39
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da40/da40
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da41/da41
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da42/da42
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da43/da43
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da44/da44
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da45/da45
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da46/da46
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da47/da47
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da48/da48
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da49/da49
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da50/da50
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/cd0/cd0
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da26p1/da26p1
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da26p2/da26p2
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  LABEL/da26p1/da26p1
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  gptid/84d4487b-34e3-11e2-b773-00259058949a
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/da26p3/da26p3
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  LABEL/da26p2/da26p2
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  gptid/b4255780-34e3-11e2-b773-00259058949a
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/gptid/84d4487b-34e3-11e2-b773-00259058949a/gptid/84d4487b-34e3-11e2-b773-00259058949a
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da25
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  DEV/gptid/b4255780-34e3-11e2-b773-00259058949a/gptid/b4255780-34e3-11e2-b773-00259058949a
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da40
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da41
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da26p3
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da29
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da30
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da24
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da6
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da7
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da16
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da17
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da20
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da21
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da37
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da23
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da1
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da4
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da43
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da44
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da22
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da33
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da45
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0  ZFS::VDEV/zfs::vdev/da3
>
>
> On Thu, Dec 13, 2012 at 10:14 PM, olivier <olivier777a7 at gmail.com> wrote:
>
>> For what it's worth, I think I might have solved my problem by reverting
>> to an older version of the mps driver. I checked out a recent version of
>> 9-STABLE and reversed the changes in
>> http://svnweb.freebsd.org/base?view=revision&revision=230592 (perhaps
>> there was a simpler way of reverting to the older mps driver). So far so
>> good, no hang even when hammering the file system.
>>
>> This does not conclusively prove that the new LSI mps driver is at fault,
>> but that seems to be a likely explanation.
>>
>> Thanks to everybody who pointed me in the right direction. I hope this
>> helps others who run into similar problems with 9.1.
>> Olivier
>>
>>
>> On Thu, Dec 13, 2012 at 10:14 AM, olivier <olivier777a7 at gmail.com> wrote:
>>
>>>
>>>
>>> On Thu, Dec 13, 2012 at 9:54 AM, Andriy Gapon <avg at freebsd.org> wrote:
>>>
>>>> Google for "zfs deadman". This is already committed upstream and I
>>>> think that it
>>>> is imported into FreeBSD, but I am not sure... Maybe it's imported
>>>> just into the
>>>> vendor area and is not merged yet.
>>>>
>>>
>>> Yes, that's exactly what I had in mind. The logic for panicking makes
>>> sense.
>>> As far as I can tell you're correct that deadman is in the vendor area
>>> but not merged. Any idea when it might make it into 9-STABLE?
>>> Thanks
>>> Olivier
>>>
>>>
>>>
>>>
>>>> So, when enabled, this logic would panic the system as a way of letting
>>>> you know that something is wrong. You can read in the links why panic
>>>> was selected for this job.
>>>>
>>>> And speaking FreeBSD-centric - I think that our CAM layer would be a
>>>> perfect place to detect such issues in a non-ZFS-specific way.
>>>>
>>>> --
>>>> Andriy Gapon
>>>>
>>>
>>>
>>
>