zpool on Dell MD3000 causes frequent hangs

Thomas Johnson tommyj27 at gmail.com
Fri May 22 21:05:06 UTC 2015


That looks like a match. I'll turn the "abuse" knob up to 11 and see if I can
break it.

Thanks!

On Fri, May 22, 2015 at 3:27 PM, Karli Sjöberg <karli.sjoberg at slu.se> wrote:

>
> On 22 May 2015 at 9:10 PM, Thomas Johnson <tommyj27 at gmail.com> wrote:
> >
> > Hello,
> >
> > I am trying to track down an ongoing issue, and am looking for suggestions
> > on a possible cause or on how I might troubleshoot further.
> >
> > The issue seems to be related to a Dell MD3000 storage array, which
> > contains a zpool. It seems that the host attached to the array will
> > occasionally hang, usually during periods of high disk activity
> > (annoyingly, usually about 0300).
> >
> > When the system hangs, I can ping the host and switch between virtual
> > consoles (but not interact with them). The system is otherwise
> > unresponsive, with no errors reported on the console or in the logs. The only
> > remedy I have found is to hard-reset the host.
> >
> > I believe this issue is tied to the MD3000. I have tried swapping out SAS
> > cables, HBAs, the controller on the MD3000, and the host itself. I have
> > updated all the firmware I can find. Before I upgraded the host OS to
> > FreeBSD 10.1 (from 10.0) last month, I experienced hangs about once a
> > month. Since the upgrade, I have seen several events per week.
>
> My bet is on this:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197164
>
> /K
>
> >
> > In addition to the MD3000, I have a set of USB drives that are used in a
> > rotation as offsite backups for the zpool. I have seen a number of hang
> > events during zfs send/receive transfers to the USB disk.
> >
> > After the most recent hang, I removed two [consumer] SSDs from the pool
> > that were being used as cache devices. It is too early to tell if this
> > change had any impact.
> >
> > Here is some of the pertinent output from the host. I can provide any other
> > information that would be helpful.
> >
> > root at leopard:/home/tom-> uname -a
> > FreeBSD leopard 10.1-RELEASE-p9 FreeBSD 10.1-RELEASE-p9 #0 r281232: Tue Apr  7 17:38:04 CDT 2015 root at cheshire-b:/pkg/base/obj_10.1-RELEASE-p9/pkg/base/src_10.1-RELEASE-p9/sys/GENERIC amd64
> > root at leopard:/home/tom-> zpool list
> > NAME          SIZE  ALLOC   FREE   FRAG  EXPANDSZ    CAP  DEDUP  HEALTH  ALTROOT
> > backup       5.31T  3.61T  1.70T    22%         -    68%  1.00x  ONLINE  -
> > jumpdrive_f  2.72T  2.04T   693G    30%         -    75%  1.00x  ONLINE  -
> > root at leopard:/home/tom-> zpool status backup
> >   pool: backup
> >  state: ONLINE
> >   scan: scrub repaired 0 in 13h15m with 0 errors on Wed May 13 16:17:29 2015
> > config:
> >
> >     NAME        STATE     READ WRITE CKSUM
> >     backup      ONLINE       0     0     0
> >       da0       ONLINE       0     0     0
> >
> > errors: No known data errors
> > root at leopard:/home/tom-> zpool get all backup
> > NAME    PROPERTY                       VALUE                          SOURCE
> > backup  size                           5.31T                          -
> > backup  capacity                       68%                            -
> > backup  altroot                        -                              default
> > backup  health                         ONLINE                         -
> > backup  guid                           12638712474922952450           default
> > backup  version                        -                              default
> > backup  bootfs                         -                              default
> > backup  delegation                     on                             default
> > backup  autoreplace                    off                            default
> > backup  cachefile                      -                              default
> > backup  failmode                       wait                           default
> > backup  listsnapshots                  off                            default
> > backup  autoexpand                     off                            default
> > backup  dedupditto                     0                              default
> > backup  dedupratio                     1.00x                          -
> > backup  free                           1.70T                          -
> > backup  allocated                      3.61T                          -
> > backup  readonly                       off                            -
> > backup  comment                        -                              default
> > backup  expandsize                     0                              -
> > backup  freeing                        0                              default
> > backup  fragmentation                  22%                            -
> > backup  leaked                         0                              default
> > backup  feature at async_destroy          enabled                        local
> > backup  feature at empty_bpobj            active                         local
> > backup  feature at lz4_compress           active                         local
> > backup  feature at multi_vdev_crash_dump  enabled                        local
> > backup  feature at spacemap_histogram     active                         local
> > backup  feature at enabled_txg            active                         local
> > backup  feature at hole_birth             active                         local
> > backup  feature at extensible_dataset     enabled                        local
> > backup  feature at embedded_data          active                         local
> > backup  feature at bookmarks              enabled                        local
> > backup  feature at filesystem_limits      enabled                        local
> > _______________________________________________
> > freebsd-fs at freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"
>
>

