Deadlock on zfs import

David Wimsey david at wimsey.us
Sat Oct 6 06:01:04 UTC 2012


I have a FreeBSD 9-RELEASE machine that deadlocks when importing one of its ZFS pools.  When I run zpool import zfs01 (the offending pool), the data becomes available and the mounts show up as expected.  Then it continues to chew away at the disks as if it were doing a scrub, eventually deadlocking.  I've confirmed I can get to some of the data before the deadlock happens, so I can get data off of it, but it's a tedious process and doesn't help me long term.

Here's how I got to this point:

This machine is essentially a network file server: it serves NFS for a VMware ESXi machine, Samba for the family's Windows machines, and afpd for the Macs, as well as a couple of jails for subversion and tftp/netboot services.  Other than home directories, all of the mount points on this machine are generally set read-only and are never writable over the network.  If I need to add something to the server, it's dropped into my home directory and then moved to its final location from the command line on the server itself.

Noticing the offending pool was at 94% capacity, I started rearranging things and cleaning up.  I had multiple shells open, copying to multiple different filesystems on the same ZFS pool.  This normally works fine; this time it didn't.  At some point while copying roughly 25GB between different filesystems on the same pool, the machine deadlocked.

On reboot the machine reaches the 'mounting local filesystems' phase and then starts chugging away at the disks until it locks up again.  The only way to get it to boot is to boot to single-user mode and then zpool export the offending pool.  After doing so the machine works fine, with the exception of the bits that depend on filesystems on the offending pool.  The two other pools on the machine (zboot and zfs02) work perfectly.

If I boot with zfs01 exported and then import it after boot, it chugs away at the disks for a long time and then eventually deadlocks the machine.
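One thing I've been considering to make the rescue copying less risky is a read-only import, which as I understand it avoids writing any transactions to the pool at all (a sketch; the paths are just this machine's layout, and I haven't verified -o readonly=on behaves this way on a pool that deadlocks on a normal import):

```shell
# From single-user mode, import the pool read-only so nothing is
# written back to it while I copy data off.
zpool import -o readonly=on zfs01

# Copy what I can to the healthy pool before anything wedges.
# (rsync is from ports; cp -R would do as well.)
rsync -a /zfs01/important/ /zfs02/rescue/

# Export again before rebooting normally.
zpool export zfs01
```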

Some filesystems have compression and/or dedup enabled, but I have been turning those off because the machine only has 4GB of RAM.
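If it matters: from what I've read, turning dedup off only affects new writes, so the existing dedup table (DDT) is still there and still has to fit in memory.  I believe its size can be checked with zdb, something like the following (untested on this pool, since importing it deadlocks the box):

```shell
# Print dedup table statistics for the pool; the histogram shows
# how many DDT entries exist, and each entry costs RAM on the
# order of a few hundred bytes.
zdb -DD zfs01
```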

So, can someone point me in the right direction for figuring out what's wrong and how to go about fixing it?  How can I tell whether it's memory exhaustion that's causing the problem?
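For what it's worth, here's roughly what I plan to run in another shell while the import chews on the disks, to see whether memory is the problem (a sketch; the arcstats sysctl names are from memory):

```shell
# ARC size versus the cap I set in loader.conf
sysctl kstat.zfs.misc.arcstats.size
sysctl vfs.zfs.arc_max

# Overall memory pressure, refreshed once in batch mode
top -b -d 1 | head -n 10

# Kernel allocator zone usage (kmem exhaustion with a small
# vm.kmem_size is a classic ZFS lockup trigger)
vmstat -z | head -n 25
```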

Is there a way to roll the pool back (without snapshots, which I had actually just deleted from the pool, heh) to maybe its last valid state?
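I did find -F in the zpool(8) man page, which is supposed to discard the last few transactions and rewind the pool to an earlier consistent state; is that the right tool here?  Something like this is what I had in mind (a sketch, and I understand -F can lose the most recent writes):

```shell
# From single-user mode, with the pool exported. The -n flag does
# a dry run: it reports whether a rewind would succeed without
# actually performing it.
zpool import -F -n zfs01

# If the dry run looks sane, do the actual recovery import.
zpool import -F zfs01
```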



Summary of machine config (output of various commands shown at the bottom due to their size):

4GB of RAM
2 SSDs, 64GB each
4 standard drives, 500GB each (2 Western Digital, 2 Seagate)
3 ZFS pools

zboot - Configured with one vdev, a mirror of two slices from the SSDs - this pool imports normally with no issues
zfs02 - Configured with one vdev, a mirror of two slices from the SSDs - this pool imports normally with no issues
zfs01 - This is the offending pool, and of course the only one with data that can't be replaced easily, if at all:

1 raid-z vdev consisting of 3 HDDs and one hot spare HDD.
1 mirrored vdev consisting of 2 slices from the SSDs for the ZIL
2 slices from the SSDs for L2ARC

Drives are all SATA, split between the motherboard SATA ports and a 4-port RocketPort PCI-e 'raid controller' with no RAID configured; I'm just using it for additional SATA ports, and splitting the drives across controllers provides some fault tolerance if the onboard controller fails.




Output of various commands:

mayham# dmesg | head -n 15
Copyright (c) 1992-2012 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 9.0-RELEASE-p3 #0: Tue Jun 12 02:52:29 UTC 2012
    root at amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
CPU: AMD Phenom(tm) II X4 945 Processor (3013.28-MHz K8-class CPU)
  Origin = "AuthenticAMD"  Id = 0x100f42  Family = 10  Model = 4  Stepping = 2
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x802009<SSE3,MON,CX16,POPCNT>
  AMD Features=0xee500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM,3DNow!+,3DNow!>
  AMD Features2=0x37ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,SKINIT,WDT>
  TSC: P-state invariant
real memory  = 4294967296 (4096 MB)
avail memory = 4075692032 (3886 MB)
 
 
 
mayham# dmesg | grep ada
ada0 at ahcich1 bus 0 scbus1 target 0 lun 0
ada0: <ST3500418AS CC34> ATA-8 SATA 2.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 476940MB (976773168 512 byte sectors: 16H 63S/T 16383C)
ada0: Previously was known as ad6
ada1 at ahcich2 bus 0 scbus2 target 0 lun 0
ada1: <ST3500418AS CC34> ATA-8 SATA 2.x device
ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 476940MB (976773168 512 byte sectors: 16H 63S/T 16383C)
ada1: Previously was known as ad8
ada2 at ahcich3 bus 0 scbus3 target 0 lun 0
ada2: <M4-CT064M4SSD2 0309> ATA-9 SATA 3.x device
ada2: 600.000MB/s transfers (SATA 3.x, UDMA5, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 61057MB (125045424 512 byte sectors: 16H 63S/T 16383C)
ada2: Previously was known as ad10
ada3 at ahcich4 bus 0 scbus5 target 0 lun 0
ada3: <M4-CT064M4SSD2 0309> ATA-9 SATA 3.x device
ada3: 300.000MB/s transfers (SATA 2.x, UDMA5, PIO 8192bytes)
ada3: Command Queueing enabled
ada3: 61057MB (125045424 512 byte sectors: 16H 63S/T 16383C)
ada3: Previously was known as ad14
ada4 at ahcich5 bus 0 scbus6 target 0 lun 0
ada4: <WDC WD5000AAKS-65YGA0 12.01C02> ATA-8 SATA 2.x device
ada4: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada4: Command Queueing enabled
ada4: 476940MB (976773168 512 byte sectors: 16H 63S/T 16383C)
ada4: Previously was known as ad16
ada5 at ahcich6 bus 0 scbus7 target 0 lun 0
ada5: <WDC WD5000AACS-00ZUB0 01.01B01> ATA-8 SATA 2.x device
ada5: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada5: Command Queueing enabled
ada5: 476940MB (976773168 512 byte sectors: 16H 63S/T 16383C)
ada5: Previously was known as ad18
 
 
 
mayham# dmesg | grep -i zfs
ZFS filesystem version 5
ZFS storage pool version 28
Trying to mount root from zfs:zboot []...
 
 
 
mayham# cat /boot/loader.conf
zfs_load="YES"
vfs.root.mountfrom="zfs:zboot"
splash_bmp_load="YES"
vesa_load="YES"
loader_logo="orb"
loader_color="YES"
bitmap_load="YES"
if_vlan_load="YES"


# Added after deadlock occured
vm.kmem_size="512M"
vm.kmem_size_max="512M"
vfs.zfs.arc_max="40M"
vfs.zfs.vdev.cache.size="5M"
vfs.zfs.prefetch_disable="1"
 
 
 
mayham# zpool status
  pool: zboot
 state: ONLINE
 scan: scrub repaired 0 in 0h1m with 0 errors on Sun Aug 12 03:35:52 2012
config:

	NAME        STATE     READ WRITE CKSUM
	zboot       ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    ada2p2  ONLINE       0     0     0
	    ada3p2  ONLINE       0     0     0

errors: No known data errors

  pool: zfs01
 state: ONLINE
 scan: resilvered 144K in 0h0m with 0 errors on Thu Aug 30 02:35:33 2012
config:

	NAME         STATE     READ WRITE CKSUM
	zfs01        ONLINE       0     0     0
	  raidz1-0   ONLINE       0     0     0
	    ada1p3   ONLINE       0     0     0
	    ada0p3   ONLINE       0     0     0
	    ada5p3   ONLINE       0     0     0
	logs
	  ada2p4     ONLINE       0     0     0
	  ada3p4     ONLINE       0     0     0
	cache
	  ada2p5     ONLINE       0     0     0
	  ada3p5     ONLINE       0     0     0
	spares
	  gpt/disk3  AVAIL   

errors: No known data errors

  pool: zfs02
 state: ONLINE
 scan: scrub repaired 0 in 0h1m with 0 errors on Fri Oct  5 04:42:19 2012
config:

	NAME        STATE     READ WRITE CKSUM
	zfs02       ONLINE       0     0     0
	  ada2p6    ONLINE       0     0     0
	  ada3p6    ONLINE       0     0     0

errors: No known data errors

mayham# zpool list
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
zboot  3.97G  2.32G  1.65G    58%  1.12x  ONLINE  -
zfs01  1.30T  1.24T  69.3G    94%  1.36x  ONLINE  -
zfs02    41G  39.4G  1.63G    96%  1.21x  ONLINE  -



More information about the freebsd-fs mailing list