zpool on Dell MD3000 causes frequent hangs

Karli Sjöberg karli.sjoberg at slu.se
Fri May 22 20:42:33 UTC 2015


On 22 May 2015 at 9:10 PM, Thomas Johnson <tommyj27 at gmail.com> wrote:
>
> Hello,
>
> I am trying to track down an ongoing issue, and I am looking for
> suggestions on a possible cause or on how I might troubleshoot it
> further.
>
> The issue seems to be related to a Dell MD3000 storage array, which
> contains a zpool. It seems that the host attached to the array will
> occasionally hang, usually during periods of high disk activity
> (annoyingly, usually about 0300).
>
> When the system hangs, I can ping the host, and switch between virtual
> consoles (but not interact with them). The system is otherwise
> unresponsive, with no errors reported on the console or in the logs. The
> only remedy I have found is to hard-reset the host.
>
> I believe this issue is tied to the MD3000. I have tried swapping out SAS
> cables, HBAs, the controller on the MD3000, and the host itself. I have
> updated all the firmware I can find. Before I upgraded the host OS to
> FreeBSD 10.1 (from 10.0) last month, I experienced hangs about once a
> month. Since the upgrade, I have seen several events per week.

My bet is on this:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197164

/K

>
> In addition to the MD3000, I have a set of USB drives that are used in a
> rotation as offsite backups for the zpool. I have seen a number of hang
> events during zfs send/receive transfers to the USB disk.
>
> After the most recent hang, I removed two [consumer] SSDs from the pool
> that were being used as cache devices. It is too early to tell if this
> change had any impact.
>
> Here is some of the pertinent output from the host. I can provide any other
> information that would be helpful.
>
> root@leopard:/home/tom-> uname -a
> FreeBSD leopard 10.1-RELEASE-p9 FreeBSD 10.1-RELEASE-p9 #0 r281232: Tue Apr  7 17:38:04 CDT 2015 root@cheshire-b:/pkg/base/obj_10.1-RELEASE-p9/pkg/base/src_10.1-RELEASE-p9/sys/GENERIC amd64
> root@leopard:/home/tom-> zpool list
> NAME          SIZE  ALLOC   FREE   FRAG  EXPANDSZ    CAP  DEDUP  HEALTH  ALTROOT
> backup       5.31T  3.61T  1.70T    22%         -    68%  1.00x  ONLINE  -
> jumpdrive_f  2.72T  2.04T   693G    30%         -    75%  1.00x  ONLINE  -
> root@leopard:/home/tom-> zpool status backup
>   pool: backup
>  state: ONLINE
>   scan: scrub repaired 0 in 13h15m with 0 errors on Wed May 13 16:17:29 2015
> config:
>
>     NAME        STATE     READ WRITE CKSUM
>     backup      ONLINE       0     0     0
>       da0       ONLINE       0     0     0
>
> errors: No known data errors
> root@leopard:/home/tom-> zpool get all backup
> NAME    PROPERTY                       VALUE                 SOURCE
> backup  size                           5.31T                 -
> backup  capacity                       68%                   -
> backup  altroot                        -                     default
> backup  health                         ONLINE                -
> backup  guid                           12638712474922952450  default
> backup  version                        -                     default
> backup  bootfs                         -                     default
> backup  delegation                     on                    default
> backup  autoreplace                    off                   default
> backup  cachefile                      -                     default
> backup  failmode                       wait                  default
> backup  listsnapshots                  off                   default
> backup  autoexpand                     off                   default
> backup  dedupditto                     0                     default
> backup  dedupratio                     1.00x                 -
> backup  free                           1.70T                 -
> backup  allocated                      3.61T                 -
> backup  readonly                       off                   -
> backup  comment                        -                     default
> backup  expandsize                     0                     -
> backup  freeing                        0                     default
> backup  fragmentation                  22%                   -
> backup  leaked                         0                     default
> backup  feature@async_destroy          enabled               local
> backup  feature@empty_bpobj            active                local
> backup  feature@lz4_compress           active                local
> backup  feature@multi_vdev_crash_dump  enabled               local
> backup  feature@spacemap_histogram     active                local
> backup  feature@enabled_txg            active                local
> backup  feature@hole_birth             active                local
> backup  feature@extensible_dataset     enabled               local
> backup  feature@embedded_data          active                local
> backup  feature@bookmarks              enabled               local
> backup  feature@filesystem_limits      enabled               local

