RELENG_6: I/O deadlock under load

Ulrich Spoerlein uspoerlein at gmail.com
Fri Oct 27 06:37:43 UTC 2006


Hi all,

Our fileserver deadlocked, again. It is running RELENG_6 checked out
yesterday. I have enabled DDB, WITNESS and INVARIANTS and have it
hooked up via serial console.

I can not give out shell access, but I can run any command you might
consider useful, here's more details:

The system has two 3Ware controllers with a big RAID5 volume each:

3ware device driver for 9000 series storage controllers, version: 3.60.02.012
twa0: <3ware 9000 series Storage Controller> port 0x3000-0x303f mem
0xdc000000-0xddffffff,0xd8300000-0xd8300fff irq 48 at device 1.0 on
pci3
twa0: [FAST]
twa0: INFO: (0x15: 0x1300): Controller details:: Model 9550SXU-8LP, 8
ports, Firmware FE9X 3.04.00.005, BIOS BE9X 3.04.00.002
em0: <Intel(R) PRO/1000 Network Connection Version - 6.1.4> port
0x3040-0x307f mem 0xd8320000-0xd833ffff irq 54 at device 2.0 on pci3
em0: Ethernet address: 00:30:48:30:11:a2
em0: [FAST]
em1: <Intel(R) PRO/1000 Network Connection Version - 6.1.4> port
0x3080-0x30bf mem 0xd8340000-0xd835ffff irq 55 at device 2.1 on pci3
em1: Ethernet address: 00:30:48:30:11:a3
em1: [FAST]
pci1: <base peripheral, interrupt controller> at device 0.3 (no driver attached)
pcib4: <ACPI PCI-PCI bridge> irq 16 at device 4.0 on pci0
pci4: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> irq 16 at device 6.0 on pci0
pci5: <ACPI PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> at device 0.0 on pci5
pci6: <ACPI PCI bus> on pcib6
pci5: <base peripheral, interrupt controller> at device 0.1 (no driver attached)
pcib7: <ACPI PCI-PCI bridge> at device 0.2 on pci5
pci7: <ACPI PCI bus> on pcib7
twa1: <3ware 9000 series Storage Controller> port 0x4000-0x403f mem
0xde000000-0xdfffffff,0xd8500000-0xd8500fff irq 96 at device 1.0 on
pci7
twa1: [FAST]
twa1: INFO: (0x15: 0x1300): Controller details:: Model 9550SXU-8LP, 8
ports, Firmware FE9X 3.04.00.005, BIOS BE9X 3.04.00.002
da0 at twa0 bus 0 target 0 lun 0
da0: <AMCC 9550SXU-8L DISK 3.04> Fixed Direct Access SCSI-3 device
da0: 100.000MB/s transfers
da0: 1430448MB (2929557504 512 byte sectors: 255H 63S/T 182356C)
da1 at twa1 bus 0 target 0 lun 0
da1: <AMCC 9550SXU-8L DISK 3.04> Fixed Direct Access SCSI-3 device
da1: 100.000MB/s transfers
da1: 1430448MB (2929557504 512 byte sectors: 255H 63S/T 182356C)
SMP: AP CPU #1 Launched!
GEOM_MIRROR: Device gm0 created (id=3977032851).
GEOM_MIRROR: Device gm0: provider da0s1 detected.
GEOM_MIRROR: Device gm0: provider da1s1 detected.
GEOM_MIRROR: Device gm0: provider da1s1 activated.
GEOM_MIRROR: Device gm0: provider da0s1 activated.
GEOM_MIRROR: Device gm0: provider mirror/gm0 launched.

The base OS is sitting on a 8GB GMIRROR device across those two
volumes. There were multiple processes running at the time of the
deadlock:
Two dd if=/dev/urandom were writing to the filesystems on each volume.
An rsync was pumping data to a different server. This server also
exposed a part of the volume via GEOM_GATE to the deadlocked host.
This ggate device and a local device formed another gmirror, which was
just rebuilding.

I startet a dump of this gmirrored filesystem, but had to abort
because the tape drive was not recognized. I aborted the dump, ran
camcontrol rescan to get my /dev/sa0 device. mksnap_ffs was still
running, and as I was inpatient, I restarted my dump script. dump(8)
was blocking, because another mksnap_ffs was running. It looks like as
soon as the first mksnap_ffs was finished, the system deadlocked.

Yeah, this is pretty much, but the system has deadlocked before, with
*only* mksnap_ffs running, so I suspect this is the only culprit. I
could still enter the debugger via serial break (pinging the host
still works, switching virtual console work, BUT pressing enter on any
console produces nothing). It also continues to push out syslog
messages to the console ...

db> ps
  pid  ppid  pgrp   uid   state   wmesg     wchan    cmd
74669 82674 82674    25  N                           sendmail
35897 80497 80497     0  N                           sendmail
13932 81866  9485     0  SL      vnread   0xdc38a690 grep
81866 64561  9485     0  S       wait     0xc89a5000 sh
54507 32103 32103     0  SL+     pfault   0xc096db18 sleep
64561  9485  9485     0  S       piperd   0xc9cce4c8 perl5.8.8
 9485 24955  9485     0  Ss      wait     0xca6ad000 sh
24955  3564  3564     0  S       piperd   0xc9c85b28 cron
24201 10966 75715     0  SL+     physrd   0xdc38f600 dump
72560 10966 75715     0  SL+     physrd   0xdc389c50 dump
31224 10966 75715     0  SL+     physrd   0xdc38ae40 dump
10966  5349 75715     0  S+      sbwait   0xc86a7370 dump
 5349 43148 75715     0  S+      wait     0xca690430 dump
43148 75715 75715     0  S+      wait     0xcc284c90 sh
95955 59838 59838     0  S       nfslockd 0xc0967f08 rpc.lockd
59838     1 59838     0  Ss      select   0xc0964224 rpc.lockd
11779     1 11779     0  Ss      select   0xc0964224 rpc.statd
53756 59946 59946     0  S       -        0xc84fbc00 nfsd
50902 59946 59946     0  S       -        0xcc812200 nfsd
97900 59946 59946     0  S       -        0xca9e3000 nfsd
37548 59946 59946     0  S       -        0xcc811c00 nfsd
30961 59946 59946     0  S       -        0xcc811000 nfsd
62718 59946 59946     0  S       -        0xcc811600 nfsd
52612 59946 59946     0  S       -        0xcc811400 nfsd
51990 59946 59946     0  S       -        0xcc7d5a00 nfsd
24333 59946 59946     0  S       -        0xc8942a00 nfsd
35604 59946 59946     0  S       -        0xca9e3400 nfsd
96924 59946 59946     0  S       -        0xcc811800 nfsd
 8213 59946 59946     0  S       -        0xca9e3200 nfsd
91777 59946 59946     0  S       -        0xca9e3e00 nfsd
27135 59946 59946     0  S       -        0xcc811a00 nfsd
88659 59946 59946     0  S       -        0xc94a9c00 nfsd
22957 59946 59946     0  S       -        0xc8e25c00 nfsd
59946     1 59946     0  Ss      accept   0xc881019e nfsd
25073     1 25073     0  Ss      select   0xc0964224 mountd
61130     1 61130     0  Ss      select   0xc0964224 rpcbind
84138 96207 96207     0  S       sbwait   0xc8a3b638 heartbeat
28790 96207 96207     0  S       select   0xc0964224 heartbeat
65938 96207 96207     0  S       sbwait   0xc8ab079c heartbeat
96207     1 96207     0  Ss      select   0xc0964224 heartbeat
85402 68495 13044     0  S       select   0xc0964224 ssh
68495 31406 13044     0  S       pfault   0xc096db18 rsync
31406 80822 13044     0  SW      wait     0xc87e2a78 sh
80822 97693 13044     0  SW      wait     0xc87e2218 sh
97693 83489 13044     0  SW      wait     0xc8a62a78 sh
83489 13044 13044     0  SW      wait     0xc908d218 time
13044 32445 13044     0  SWs     wait     0xc908d648 sh
32445  3564  3564     0  S       piperd   0xc8751660 cron
32103 73903 32103     0  S+      wait     0xc908b218 sh
50439     1 50439     0  Ss      (threaded)          ggatec
100126                   S       kserel   0xc84d48d4 ggatec
100131                   S       kserel   0xc84d48d4 ggatec
100092                   S       connec   0xc86b219e ggatec
100111                   S       ksesigwa 0xc89a5350 ggatec
73903 82004 73903     0  SWs+    pause    0xc86c7034 tcsh
45760 82004 45760     0  Ss+     ttyin    0xc8ea0010 tcsh
75715 18700 75715     0  S+      wait     0xc89a5430 sh
18700 82004 18700     0  SWs+    pause    0xc8a64034 tcsh
79812 30532 79812     0  S+      nanslp   0xc0916ecc iostat
60930 82004 60930     0  Ss+     ttyin    0xc8426410 tcsh
30532 82004 30532     0  SWs+    pause    0xc850767c tcsh
82004     1 82004     0  Ss      select   0xc0964224 screen
78913     1 78913     0  ?s+                         getty
63737     1 63737     0  SWs+    ttyin    0xc8520010 getty
 4887     1  4887     0  SWs+    ttyin    0xc8520410 getty
57691     1 57691     0  SWs+    ttyin    0xc8515010 getty
95149     1 95149     0  SWs+    ttyin    0xc8517410 getty
 7519     1  7519     0  SWs+    ttyin    0xc8510c10 getty
33308     1 33308     0  SWs+    ttyin    0xc8514410 getty
57883     1 57883     0  SWs+    ttyin    0xc8516010 getty
30755     1 30755     0  SWs+    ttyin    0xc8516810 getty
63367     1 63367     0  SLs     pfault   0xc096db18 bsnmpd
 6138     1  6138     0  SWs     select   0xc0964224 inetd
 3564     1  3564     0  SLs     pfault   0xc096db18 cron
82674     1 82674    25  SLs     vmwait   0xc096db18 sendmail
80497     1 80497     0  SLs     vmwait   0xc096db18 sendmail
43757     1 43757     0  ?s                          sshd
14937     1 14937     0  Ss      biowr    0xdc49e708 ntpd
51469     1 11351     0  S       nanslp   0xc0916ecc smartd
37144     1 37144     0  Ss      select   0xc0964224 amd
96296     1 96296   115  SLs     pfault   0xc096db18 symon
28150     1 28150     0  Ss      bo_wwait 0xcc423a74 syslogd
17976     1 17976     0  SWs     select   0xc0964224 devd
   51     0     0     0  SL      gwrite   0xc8fdd420 [g_mirror share]
   50     0     0     0  SL      m:w1     0xc8420400 [g_mirror gm0]
   49     0     0     0  SL      -        0xe8c17d04 [schedcpu]
   48     0     0     0  SL      sdflush  0xc096d334 [softdepflush]
   47     0     0     0  SL      vlruwt   0xc84d2c90 [vnlru]
   46     0     0     0  SL      getblk   0xdc52ff70 [syncer]
   45     0     0     0  SL      psleep   0xc096478c [bufdaemon]
   44     0     0     0  SL      pgzero   0xc096e2a4 [pagezero]
   43     0     0     0  SL      psleep   0xc096ddf4 [vmdaemon]
   42     0     0     0  SL      wswbuf0  0xc096d554 [pagedaemon]
   41     0     0     0  WL                          [swi0: sio]
   40     0     0     0  WL                          [irq1: atkbd0]
   39     0     0     0  WL                          [irq17: ichsmb0]
   38     0     0     0  WL                          [irq15: ata1]
   37     0     0     0  WL                          [irq14: ata0]
   36     0     0     0  SL      usbevt   0xc841aa10 [usb4]
   35     0     0     0  WL                          [irq23: ehci0]
   34     0     0     0  SL      usbevt   0xc84e9210 [usb3]
   33     0     0     0  SL      usbevt   0xc84de210 [usb2]
   32     0     0     0  WL                          [irq18: uhci2]
   31     0     0     0  SL      usbevt   0xc84d6210 [usb1]
   30     0     0     0  WL                          [irq19: uhci1]
   29     0     0     0  SL      usbtsk   0xc0913b64 [usbtask]
   28     0     0     0  SL      usbevt   0xc848e210 [usb0]
   27     0     0     0  WL                          [irq16: uhci0 uhci3]
   26     0     0     0  SL      -        0xc844bb00 [em1 taskq]
   25     0     0     0  SL      -        0xc8480380 [em0 taskq]
   24     0     0     0  SL      idle     0xc844c000 [aic_recovery1]
   23     0     0     0  WL                          [irq25: ahd1]
   22     0     0     0  SL      idle     0xc8448000 [aic_recovery0]
   21     0     0     0  WL                          [irq24: ahd0]
   20     0     0     0  WL                          [irq9: acpi0]
   19     0     0     0  WL                          [swi5: +]
    9     0     0     0  SL      -        0xc840da80 [thread taskq]
   18     0     0     0  WL                          [swi6: Giant taskq]
    8     0     0     0  SL      -        0xc840dc00 [acpi_task_2]
    7     0     0     0  SL      -        0xc840dc00 [acpi_task_1]
    6     0     0     0  SL      -        0xc840dc00 [acpi_task_0]
   17     0     0     0  WL                          [swi6: task queue]
    5     0     0     0  SL      -        0xc840dd80 [kqueue taskq]
   16     0     0     0  WL                          [swi2: cambio]
   15     0     0     0  SL      -        0xc09118a0 [yarrow]
    4     0     0     0  SL      -        0xc0914388 [g_down]
    3     0     0     0  SL      -        0xc0914384 [g_up]
    2     0     0     0  SL      -        0xc091437c [g_event]
   14     0     0     0  WL                          [swi3: vm]
   13     0     0     0  WL                          [swi4: clock sio]
   12     0     0     0  WL                          [swi1: net]
   11     0     0     0  RL      CPU 0               [idle: cpu0]
   10     0     0     0  RL      CPU 1               [idle: cpu1]
    1     0     1     0  SLs     wait     0xc8337000 [init]
    0     0     0     0  SLs     vmwait   0xc096db18 [swapper]
41681 64561  9485     0  Z                           ifconfig
86595 64561  9485     0  Z                           df
db> show alllocks
Process 51 (g_mirror share) thread 0xc84d3a80 (100038)
exclusive sx gmirror:lock r = 0 (0xc84b8c2c) locked @
/usr/src/sys/modules/geom/geom_mirror/../../../geom/mirror/g_mirror.c:1809
db> where 51
Tracing pid 51 tid 100038 td 0xc84d3a80
sched_switch(c84d3a80,0,1) at sched_switch+0x177
mi_switch(1,0) at mi_switch+0x270
sleepq_switch(c8fdd420,c09165a0,0,c084049e,236,...) at sleepq_switch+0xc1
sleepq_timedwait(c8fdd420) at sleepq_timedwait+0x4a
msleep(c8fdd420,c0964860,4c,c0837cfb,64) at msleep+0x255
biowait(c8fdd420,c0837cfb,c8fdd420,c851b9c0,e8c1ac40,...) at biowait+0x52
g_write_data(c851b9c0,bb1cb200,ee,ca26a800,200,...) at g_write_data+0x8d
g_mirror_write_metadata(c84b8c00,c8710480,e8c1abc0,4d4f4547,494d3a3a,...)
at g_mirror_write_metadata+0x2e0
g_mirror_update_metadata(c8710480,e8c1ac7c,c0651510,c0a7e9a5,c080a66e,...)
at g_mirror_update_metadata+0x27
g_mirror_register_request(c083ee95,c84b8c2c,c84b8c6c,c84b8c80,202,...)
at g_mirror_register_request+0xea
g_mirror_worker(c84b8c00,e8c1ad38,c84b8c00,c0a79554,0,...) at
g_mirror_worker+0x6b8
fork_exit(c0a79554,c84b8c00,e8c1ad38) at fork_exit+0xa0
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe8c1ad6c, ebp = 0 ---

db> show allpcpu
Current CPU: 0

cpuid        = 0
curthread    = 0xc8333780: pid 11 "idle: cpu0"
curpcb       = 0xe6e95d90
fpcurthread  = none
idlethread   = 0xc8333780: pid 11 "idle: cpu0"
APIC ID      = 0
currentldt   = 0x50
spin locks held:

cpuid        = 1
curthread    = 0xc8333600: pid 10 "idle: cpu1"
curpcb       = 0xe6e92d90
fpcurthread  = none
idlethread   = 0xc8333600: pid 10 "idle: cpu1"
APIC ID      = 6
currentldt   = 0x50
spin locks held:

db> show lockedvnods
Locked vnodes

0xc87f6990: tag ufs, type VDIR
    usecount 1, writecount 0, refcount 2 mountedhere 0
    flags ()
    v_object 0xcbb47294 ref 0 pages 0
     lock type ufs: EXCL (count 1) by thread 0xc8508600 (pid 14937)
        ino 94210, on dev ufs/var

0xcc423990: tag ufs, type VREG
    usecount 1, writecount 1, refcount 3 mountedhere 0
    flags ()
    v_object 0xcaea44a4 ref 0 pages 13
     lock type ufs: EXCL (count 1) by thread 0xc8509d80 (pid 28150)
        ino 94280, on dev ufs/var

0xca3ab330: tag ufs, type VREG
    usecount 1, writecount 0, refcount 1 mountedhere 0
    flags ()
     lock type ufs: EXCL (count 1) by thread 0xc8508600 (pid 14937)
        ino 94248, on dev ufs/var

0xc87c2770: tag ufs, type VREG
    usecount 5, writecount 0, refcount 6 mountedhere 0
    flags ()
    v_object 0xc8771210 ref 4 pages 6
     lock type ufs: SHARED (count 1)
        ino 565709, on dev ufs/usr

0xc8ee9660: tag syncer, type VNON
    usecount 1, writecount 0, refcount 2 mountedhere 0
    flags ()
     lock type syncer: EXCL (count 1) by thread 0xc84d3480 (pid 46)

0xc9278220: tag ufs, type VREG
    usecount 2, writecount 0, refcount 3 mountedhere 0
    flags (VV_SYSTEM)
    v_object 0xc8eca108 ref 0 pages 19358
     lock type snaplk: EXCL (count 1) by thread 0xc84d3480 (pid 46)
        ino 5, on dev mirror/share

If you need any further info or wish me to try different patches
against RELENG_6, don't hesistate. But please be quick about it, since
this file server has to go into production soon.

Uli


More information about the freebsd-stable mailing list