zpool on Dell MD3000 causes frequent hangs
Karli Sjöberg
karli.sjoberg at slu.se
Fri May 22 20:42:33 UTC 2015
On 22 May 2015 at 9:10 PM, Thomas Johnson <tommyj27 at gmail.com> wrote:
>
> Hello,
>
> I am trying to track down an ongoing issue and am looking for suggestions
> on a possible cause, or on how I might troubleshoot further.
>
> The issue seems to be related to a Dell MD3000 storage array, which
> contains a zpool. It seems that the host attached to the array will
> occasionally hang, usually during periods of high disk activity
> (annoyingly, usually about 0300).
>
> When the system hangs, I can ping the host and switch between virtual
> consoles (but not interact with them). The system is otherwise
> unresponsive, with no errors reported on the console or in the logs. The
> only remedy I have found is to hard-reset the host.
>
> I believe this issue is tied to the MD3000. I have tried swapping out SAS
> cables, HBAs, the controller on the MD3000, and the host itself. I have
> updated all the firmware I can find. Before I upgraded the host OS to
> FreeBSD 10.1 (from 10.0) last month, I experienced hangs about once a
> month. Since the upgrade, I have seen several events per week.
My bet is on this:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197164
/K
>
> In addition to the MD3000, I have a set of USB drives that are used in a
> rotation as offsite backups for the zpool. I have seen a number of hang
> events during zfs send/receive transfers to the USB disk.
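For readers following along, the kind of transfer described above can be
sketched roughly as below. This is only an illustration, not necessarily the
poster's actual procedure: the pool names are taken from the zpool list
output further down, while the snapshot name, the DRY_RUN switch and the
replicate function are assumptions made up for the example.

```sh
#!/bin/sh
# Sketch of an offsite replication step from pool "backup" to the USB pool
# "jumpdrive_f". The snapshot name is hypothetical. With DRY_RUN=1 (the
# default here) the commands are only printed, not executed.

SRC_POOL="${SRC_POOL:-backup}"
DST_POOL="${DST_POOL:-jumpdrive_f}"
SNAP="${SNAP:-offsite-$(date +%Y%m%d)}"

replicate() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "zfs snapshot -r ${SRC_POOL}@${SNAP}"
        echo "zfs send -R ${SRC_POOL}@${SNAP} | zfs receive -Fdu ${DST_POOL}"
    else
        # Recursive snapshot, then a replication stream into the USB pool.
        # -F rolls the target back if needed, -d keeps the dataset paths,
        # -u leaves the received filesystems unmounted.
        zfs snapshot -r "${SRC_POOL}@${SNAP}" &&
        zfs send -R "${SRC_POOL}@${SNAP}" | zfs receive -Fdu "${DST_POOL}"
    fi
}
```

Calling replicate with DRY_RUN=1 prints the two commands so they can be
reviewed first; DRY_RUN=0 runs them for real.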
>
> After the most recent hang, I removed two [consumer] SSDs from the pool
> that were being used as cache devices. It is too early to tell if this
> change had any impact.
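Removing cache devices from a pool is done with zpool remove. A minimal
sketch, assuming hypothetical device names (the original message does not
say what the SSDs were called); the remove_cache helper and the DRY_RUN
switch are likewise invented for this example:

```sh
#!/bin/sh
# remove_cache POOL DEV...
# Remove each listed cache (L2ARC) device from POOL.
# With DRY_RUN=1 (the default here) the commands are only printed.
remove_cache() {
    pool="$1"; shift
    for dev in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "zpool remove ${pool} ${dev}"
        else
            zpool remove "${pool}" "${dev}" || return 1
        fi
    done
}
```

For example, "remove_cache backup ada1 ada2" would print the two zpool remove
commands, where ada1 and ada2 stand in for whatever names zpool status shows
under the cache section.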
>
> Here is some of the pertinent output from the host. I can provide any other
> information that would be helpful.
>
> root@leopard:/home/tom-> uname -a
> FreeBSD leopard 10.1-RELEASE-p9 FreeBSD 10.1-RELEASE-p9 #0 r281232: Tue Apr 7 17:38:04 CDT 2015 root@cheshire-b:/pkg/base/obj_10.1-RELEASE-p9/pkg/base/src_10.1-RELEASE-p9/sys/GENERIC amd64
> root@leopard:/home/tom-> zpool list
> NAME          SIZE  ALLOC   FREE   FRAG  EXPANDSZ    CAP  DEDUP  HEALTH  ALTROOT
> backup       5.31T  3.61T  1.70T    22%         -    68%  1.00x  ONLINE  -
> jumpdrive_f  2.72T  2.04T   693G    30%         -    75%  1.00x  ONLINE  -
> root@leopard:/home/tom-> zpool status backup
>   pool: backup
>  state: ONLINE
>   scan: scrub repaired 0 in 13h15m with 0 errors on Wed May 13 16:17:29 2015
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         backup      ONLINE       0     0     0
>           da0       ONLINE       0     0     0
>
> errors: No known data errors
> root@leopard:/home/tom-> zpool get all backup
> NAME    PROPERTY                       VALUE                 SOURCE
> backup  size                           5.31T                 -
> backup  capacity                       68%                   -
> backup  altroot                        -                     default
> backup  health                         ONLINE                -
> backup  guid                           12638712474922952450  default
> backup  version                        -                     default
> backup  bootfs                         -                     default
> backup  delegation                     on                    default
> backup  autoreplace                    off                   default
> backup  cachefile                      -                     default
> backup  failmode                       wait                  default
> backup  listsnapshots                  off                   default
> backup  autoexpand                     off                   default
> backup  dedupditto                     0                     default
> backup  dedupratio                     1.00x                 -
> backup  free                           1.70T                 -
> backup  allocated                      3.61T                 -
> backup  readonly                       off                   -
> backup  comment                        -                     default
> backup  expandsize                     0                     -
> backup  freeing                        0                     default
> backup  fragmentation                  22%                   -
> backup  leaked                         0                     default
> backup  feature@async_destroy          enabled               local
> backup  feature@empty_bpobj            active                local
> backup  feature@lz4_compress           active                local
> backup  feature@multi_vdev_crash_dump  enabled               local
> backup  feature@spacemap_histogram     active                local
> backup  feature@enabled_txg            active                local
> backup  feature@hole_birth             active                local
> backup  feature@extensible_dataset     enabled               local
> backup  feature@embedded_data          active                local
> backup  feature@bookmarks              enabled               local
> backup  feature@filesystem_limits      enabled               local