zpool on Dell MD3000 causes frequent hangs

Thomas Johnson tommyj27 at gmail.com
Fri May 22 19:10:17 UTC 2015


Hello,

I am trying to track down an ongoing issue, and am looking for
suggestions on a possible cause or on how I might troubleshoot
further.

The issue seems to be related to a Dell MD3000 storage array, which
contains a zpool. It seems that the host attached to the array will
occasionally hang, usually during periods of high disk activity
(annoyingly, usually about 0300).

When the system hangs, I can ping the host and switch between virtual
consoles (but not interact with them). The system is otherwise
unresponsive, with no errors reported on the console or in the logs. The
only remedy I have found is to hard-reset the host.
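
Since nothing reaches the console or the logs, one way to narrow down when a
hang begins would be a small heartbeat logger. The following is only a
minimal sketch under stated assumptions, not something taken from the setup
described here: the function names, the log path, and the 10-second interval
are all hypothetical, and the log file should live on a disk that is not part
of the affected pool.

```python
import os
import time


def heartbeat_line(now, load):
    """Format one log line: UTC timestamp plus 1/5/15-minute load averages."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now))
    return "%s load=%.2f,%.2f,%.2f\n" % ((stamp,) + tuple(load))


def run_heartbeat(path, interval=10.0, count=None):
    """Append a heartbeat line every `interval` seconds, syncing each write.

    `count` limits the number of lines written (None means run until killed).
    Returns the number of lines written.
    """
    written = 0
    with open(path, "a") as log:
        while count is None or written < count:
            log.write(heartbeat_line(time.time(), os.getloadavg()))
            log.flush()
            os.fsync(log.fileno())  # push the line to disk before a possible hang
            written += 1
            if count is None or written < count:
                time.sleep(interval)
    return written


# Hypothetical usage (the path is an assumption; keep it off the ZFS pool):
# run_heartbeat("/var/log/heartbeat.log")
```

After a hard reset, the last timestamp in the file gives the approximate time
at which the host stopped making progress, which can then be compared against
cron jobs or backup schedules.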

I believe this issue is tied to the MD3000. I have tried swapping out SAS
cables, HBAs, the controller on the MD3000, and the host itself. I have
updated all the firmware I can find. Before I upgraded the host OS to
FreeBSD 10.1 (from 10.0) last month, I experienced hangs about once a
month. Since the upgrade, I have seen several events per week.

In addition to the MD3000, I have a set of USB drives that are used in a
rotation as offsite backups for the zpool. I have seen a number of hang
events during zfs send/receive transfers to the USB disk.

After the most recent hang, I removed two [consumer] SSDs from the pool
that were being used as cache devices. It is too early to tell if this
change had any impact.

Here is some of the pertinent output from the host. I can provide any other
information that would be helpful.

root@leopard:/home/tom-> uname -a
FreeBSD leopard 10.1-RELEASE-p9 FreeBSD 10.1-RELEASE-p9 #0 r281232: Tue Apr  7 17:38:04 CDT 2015 root@cheshire-b:/pkg/base/obj_10.1-RELEASE-p9/pkg/base/src_10.1-RELEASE-p9/sys/GENERIC amd64
root@leopard:/home/tom-> zpool list
NAME          SIZE  ALLOC   FREE   FRAG  EXPANDSZ    CAP  DEDUP  HEALTH  ALTROOT
backup       5.31T  3.61T  1.70T    22%         -    68%  1.00x  ONLINE  -
jumpdrive_f  2.72T  2.04T   693G    30%         -    75%  1.00x  ONLINE  -
root@leopard:/home/tom-> zpool status backup
  pool: backup
 state: ONLINE
  scan: scrub repaired 0 in 13h15m with 0 errors on Wed May 13 16:17:29 2015
config:

    NAME        STATE     READ WRITE CKSUM
    backup      ONLINE       0     0     0
      da0       ONLINE       0     0     0

errors: No known data errors
root@leopard:/home/tom-> zpool get all backup
NAME    PROPERTY                       VALUE                          SOURCE
backup  size                           5.31T                          -
backup  capacity                       68%                            -
backup  altroot                        -                              default
backup  health                         ONLINE                         -
backup  guid                           12638712474922952450           default
backup  version                        -                              default
backup  bootfs                         -                              default
backup  delegation                     on                             default
backup  autoreplace                    off                            default
backup  cachefile                      -                              default
backup  failmode                       wait                           default
backup  listsnapshots                  off                            default
backup  autoexpand                     off                            default
backup  dedupditto                     0                              default
backup  dedupratio                     1.00x                          -
backup  free                           1.70T                          -
backup  allocated                      3.61T                          -
backup  readonly                       off                            -
backup  comment                        -                              default
backup  expandsize                     0                              -
backup  freeing                        0                              default
backup  fragmentation                  22%                            -
backup  leaked                         0                              default
backup  feature@async_destroy          enabled                        local
backup  feature@empty_bpobj            active                         local
backup  feature@lz4_compress           active                         local
backup  feature@multi_vdev_crash_dump  enabled                        local
backup  feature@spacemap_histogram     active                         local
backup  feature@enabled_txg            active                         local
backup  feature@hole_birth             active                         local
backup  feature@extensible_dataset     enabled                        local
backup  feature@embedded_data          active                         local
backup  feature@bookmarks              enabled                        local
backup  feature@filesystem_limits      enabled                        local

