Patch for quota deadlock

Niki Denev nike_d at cytexbg.com
Tue Feb 28 01:56:21 PST 2006


On Tuesday 28 February 2006 09:51, Kris Kennaway wrote:
> On Tue, Feb 28, 2006 at 08:40:37AM +0100, Oliver Brandmueller wrote:
> > Anyway, I'll discuss with my colleagues and we'll see if we want to take
> > the risk.
>
> Thanks, it would be a big help if you're willing to try.
>
> Kris

I'm jumping in here from the [snapshot timestamps] thread.

Here is what happened when I compiled the kernel with options QUOTA, set
enable_quotas="YES" in rc.conf, and applied the LK_NOWAIT patch:

[...]
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xd2000-0xd37ff,0xd3800-0xd4fff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
module_register_init: MOD_LOAD (amr_linux, 0xffffffff806124c0, 0) error 6
acd0: CDRW <YAMAHA CRW2200E/1.0D> at ata1-master UDMA33
da2 at ahd1 bus 0 target 2 lun 0
da2: <SEAGATE ST336807LW 0C01> Fixed Direct Access SCSI-3 device
da2: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da2: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
da0 at ahd0 bus 0 target 0 lun 0
da0: <SEAGATE ST336807LW 0C01> Fixed Direct Access SCSI-3 device
da0: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
da1 at ahd0 bus 0 target 1 lun 0
da1: <SEAGATE ST336807LW 0C01> Fixed Direct Access SCSI-3 device
da1: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da1: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
da3 at ahd1 bus 0 target 3 lun 0
da3: <SEAGATE ST336807LW 0C01> Fixed Direct Access SCSI-3 device
da3: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da3: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
SMP: AP CPU #1 Launched!
GEOM_MIRROR: Device gm0 created (id=4003401795).
GEOM_MIRROR: Device gm0: provider da0s1a detected.
GEOM_MIRROR: Device gm1 created (id=3399956475).
GEOM_MIRROR: Device gm1: provider da0s1d detected.
GEOM_MIRROR: Device gm0: provider da1s1a detected.
GEOM_MIRROR: Device gm1: provider da1s1d detected.
GEOM_MIRROR: Device gm1: provider da1s1d activated.
GEOM_MIRROR: Device gm1: provider da0s1d activated.
GEOM_MIRROR: Device gm1: provider mirror/gm1 launched.
GEOM_MIRROR: Device gm0: provider da2s1a detected.
GEOM_MIRROR: Device gm2 created (id=2672834031).
GEOM_MIRROR: Device gm2: provider da2s1d detected.
GEOM_MIRROR: Device gm0: provider da3s1a detected.
GEOM_MIRROR: Device gm0: provider da3s1a activated.
GEOM_MIRROR: Device gm0: provider da2s1a activated.
GEOM_MIRROR: Device gm0: provider da1s1a activated.
GEOM_MIRROR: Device gm0: provider da0s1a activated.
GEOM_MIRROR: Device gm0: provider mirror/gm0 launched.
GEOM_MIRROR: Device gm2: provider da3s1d detected.
GEOM_MIRROR: Device gm2: provider da3s1d activated.
GEOM_MIRROR: Device gm2: provider da2s1d activated.
GEOM_MIRROR: Device gm2: provider mirror/gm2 launched.
GEOM_STRIPE: Device gs0 created (id=496013215).
GEOM_STRIPE: Disk mirror/gm1 attached to gs0.
GEOM_STRIPE: Disk mirror/gm2 attached to gs0.
GEOM_STRIPE: Device gs0 activated.
Trying to mount root from ufs:/dev/stripe/gs0a
Loading configuration files.
Entropy harvesting: interrupts ethernet point_to_point kickstart.
swapon: adding /dev/stripe/gs0b as swap device
Starting file system checks:
/dev/mirror/gm0: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0: clean, 3605 free (53 frags, 444 blocks, 0.2% fragmentation)
/dev/stripe/gs0a: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/stripe/gs0a: clean, 24175769 free (102961 frags, 3009101 blocks, 0.3% fragmentation)
Setting hostname: srv.office
kern.corefile: %N.core -> /var/coredumps/%U/%N-%P.core
bge0: link state changed to DOWN
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
        inet 127.0.0.1 netmask 0xff000000
bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>
        inet6 xxxxxxxxxxxxxxxxxxxxxxx prefixlen 64 tentative scopeid 0x1
        inet xx.xx.xx.xx netmask 0xffffff80 broadcast xxxxxxxxxxxxxx
        ether xxxxxxxxxxxxxxxxx
        media: Ethernet autoselect (none)
        status: no carrier
add net default: gateway xxxxxxxxxxxxxx
Additional routing options:.
Starting devd.
hw.acpi.cpu.cx_lowest: C1 -> C1
Mounting NFS file systems:.
Turning on accounting.
Accounting enabled
Creating and/or trimming log files:.
Starting syslogd.
ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/X11R6/lib /usr/local/lib /usr/local/libdata/ldconfig/mysql
32-bit compatibility ldconfig path:
Checking quotas: done.
Enabling quotas:KDB: stack backtrace:
vfs_badlock() at vfs_badlock+0x95
assert_vop_locked() at assert_vop_locked+0x77
quotaon() at quotaon+0x182
ufs_quotactl() at ufs_quotactl+0x150
quotactl() at quotactl+0x15c
syscall() at syscall+0x642
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (148, FreeBSD ELF64, quotactl), rip = 0x80068671c, rsp = 0x7fffffffeda8, rbp = 0x1 ---
quotaon: 0xffffff00c3aa7d50 is not locked but should be
KDB: enter: lock violation
[thread pid 398 tid 100106 ]
Stopped at      kdb_enter+0x2f: nop

db> bt
Tracing pid 398 tid 100106 td 0xffffff00c3840be0
kdb_enter() at kdb_enter+0x2f
assert_vop_locked() at assert_vop_locked+0x77
quotaon() at quotaon+0x182
ufs_quotactl() at ufs_quotactl+0x150
quotactl() at quotactl+0x15c
syscall() at syscall+0x642
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (148, FreeBSD ELF64, quotactl), rip = 0x80068671c, rsp = 0x7fffffffeda8, rbp = 0 ---
db> alltrace
Tracing command quotaon pid 398 tid 100106 td 0xffffff00c3840be0
kdb_enter() at kdb_enter+0x2f
assert_vop_locked() at assert_vop_locked+0x77
quotaon() at quotaon+0x182
ufs_quotactl() at ufs_quotactl+0x150
quotactl() at quotactl+0x15c
syscall() at syscall+0x642
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (148, FreeBSD ELF64, quotactl), rip = 0x80068671c, rsp = 0x7fffffffeda8, rbp = 0 ---

Tracing command sh pid 392 tid 100083 td 0xffffff011e2de980
sched_switch() at sched_switch+0x11f
mi_switch() at mi_switch+0x14c
sleepq_wait_sig() at sleepq_wait_sig+0xe
msleep() at msleep+0x21b
kern_wait() at kern_wait+0x5c8
wait4() at wait4+0x38
syscall() at syscall+0x642
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (7, FreeBSD ELF64, wait4), rip = 0x80093e1cc, rsp = 0x7fffffffcc08, rbp = 0x49 ---

Tracing command syslogd pid 349 tid 100077 td 0xffffff011efa5980
sched_switch() at sched_switch+0x11f
mi_switch() at mi_switch+0x14c
critical_exit() at critical_exit+0xa0
lapic_handle_timer() at lapic_handle_timer+0xba
Xtimerint() at Xtimerint+0x76
vn_write() at vn_write+0x363
dofilewrite() at dofilewrite+0x86
kern_writev() at kern_writev+0x51
writev() at writev+0x51
syscall() at syscall+0x642
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (121, FreeBSD ELF64, writev), rip = 0x80080f6ac, rsp = 0x7fffffffc808, rbp = 0x7fffffffcf70 ---

Tracing command accounting pid 330 tid 100070 td 0xffffff011f186be0
sched_switch() at sched_switch+0x11f
mi_switch() at mi_switch+0x14c
sleepq_timedwait() at sleepq_timedwait+0xe
msleep() at msleep+0x572
acct_thread() at acct_thread+0x248
fork_exit() at fork_exit+0x86
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffffb3c06d00, rbp = 0 ---

Tracing command devd pid 310 tid 100103 td 0xffffff00c3ec14c0
sched_switch() at sched_switch+0x11f
mi_switch() at mi_switch+0x14c
sleepq_wait_sig() at sleepq_wait_sig+0xe
cv_wait_sig() at cv_wait_sig+0x17f
kern_select() at kern_select+0xbd4
select() at select+0x3e
syscall() at syscall+0x642
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (93, FreeBSD ELF64, select), rip = 0x43be6c, rsp = 0x7fffffffe958, rbp = 0x7fffffffe980 ---

Tracing command adjkerntz pid 177 tid 100073 td 0xffffff011f1864c0
sched_switch() at sched_switch+0x11f
mi_switch() at mi_switch+0x14c
sleepq_wait_sig() at sleepq_wait_sig+0xe
msleep() at msleep+0x21b
kern_sigsuspend() at kern_sigsuspend+0xb1
sigsuspend() at sigsuspend+0x40
syscall() at syscall+0x642
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (341, FreeBSD ELF64, sigsuspend), rip = 0x800684cdc, rsp = 0x7fffffffed28, rbp = 0xffffffffffffe3e0 ---
(null)() at 0x800684cdc
*** error reading from address ffffffffffffe3e8 ***
db> ps
  pid   proc     uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
  398 ffffff00c4586680    0   392    73 0004002 [CPU 1] quotaon
  392 ffffff011f3899c0    0    73    73 0000002 [SLPQ wait 0xffffff011f3899c0][SLP] sh
  349 ffffff011f1e8340    0     1   349 0000000 [RUNQ] syslogd
  330 ffffff011ef44680    0     0     0 0000204 [SLPQ - 0xffffffff80870300][SLP] accounting
  310 ffffff00c45659c0    0     1   310 0000000 [SLPQ select 0xffffffff80881150][SLP] devd
  177 ffffff011f1c7340    0     1   177 0000000 [SLPQ pause 0xffffff011f1c73a8][SLP] adjkerntz
   73 ffffff00a987d9c0    0     1    73 0004002 [SLPQ wait 0xffffff00a987d9c0][SLP] sh
   72 ffffff00aa33e000    0     0     0 0000204 [SLPQ m:w1 0xffffff00c5322000][SLP] g_mirror gm2
   71 ffffff00aa33e340    0     0     0 0000204 [SLPQ m:w1 0xffffff011f17ba00][SLP] g_mirror gm1
   70 ffffff00aa33e680    0     0     0 0000204 [SLPQ m:w2 0xffffff00010ad400][SLP] g_mirror gm0
   69 ffffff011f389000    0     0     0 0000204 [SLPQ - 0xffffffffb3bd4be4][SLP] schedcpu
   68 ffffff011d71c000    0     0     0 0000204 [SLPQ - 0xffffffff8088c3f8][SLP] nfsiod 3
   67 ffffff011d71c340    0     0     0 0000204 [SLPQ - 0xffffffff8088c3f0][SLP] nfsiod 2
   66 ffffff011d71c680    0     0     0 0000204 [SLPQ - 0xffffffff8088c3e8][SLP] nfsiod 1
   65 ffffff011d71c9c0    0     0     0 0000204 [SLPQ - 0xffffffff8088c3e0][SLP] nfsiod 0
   64 ffffff011e6df000    0     0     0 0000204 [SLPQ syncer 0xffffffff80873d40][SLP] syncer
   63 ffffff011e6df340    0     0     0 0000204 [SLPQ vlruwt 0xffffff011e6df340][SLP] vnlru
   62 ffffff011e6df680    0     0     0 0000204 [SLPQ psleep 0xffffffff80881a18][SLP] bufdaemon
   61 ffffff011e6df9c0    0     0     0 000020c [SLPQ pgzero 0xffffffff808948a0][SLP] pagezero
   60 ffffff011e6a0000    0     0     0 0000204 [SLPQ psleep 0xffffffff80893f6c][SLP] vmdaemon
   59 ffffff011e6a0340    0     0     0 0000204 [SLPQ psleep 0xffffffff80893f1c][SLP] pagedaemon
   58 ffffff011e6a0680    0     0     0 0000204 [SLPQ - 0xffffff011eec3248][SLP] fdc0
   57 ffffff011e6a09c0    0     0     0 0000204 [IWAIT] swi0: sio
   56 ffffff011f5b9340    0     0     0 0000204 [SLPQ idle 0xffffffff8860d000][SLP] aic_recovery1
   55 ffffff011f5b9680    0     0     0 0000204 [SLPQ idle 0xffffffff88609000][SLP] aic_recovery0
   54 ffffff011f5b99c0    0     0     0 0000204 [SLPQ usbevt 0xffffffff88607420][SLP] usb1
   53 ffffff011f5de000    0     0     0 0000204 [SLPQ usbtsk 0xffffffff8086eed0][SLP] usbtask
   52 ffffff011f5de340    0     0     0 0000204 [SLPQ usbevt 0xffffffff88605420][SLP] usb0
   51 ffffff011f5de680    0     0     0 0000204 [IWAIT] swi6: task queue
   50 ffffff011f5de9c0    0     0     0 0000204 [IWAIT] swi6:+
    9 ffffff011f5df000    0     0     0 0000204 [SLPQ - 0xffffff0000caa900][SLP] thread taskq
   49 ffffff011f5df340    0     0     0 0000204 [IWAIT] swi5:+
    8 ffffff011f5df680    0     0     0 0000204 [SLPQ - 0xffffff0000caac00][SLP] acpi_task2
    7 ffffff011f5df9c0    0     0     0 0000204 [SLPQ - 0xffffff0000caac00][SLP] acpi_task1
    6 ffffff011f5879c0    0     0     0 0000204 [SLPQ - 0xffffff0000caac00][SLP] acpi_task0
    5 ffffff011f577000    0     0     0 0000204 [SLPQ - 0xffffff0000caad00][SLP] kqueue taskq
   48 ffffff011f577340    0     0     0 0000204 [IWAIT] swi2: cambio
   47 ffffff011f577680    0     0     0 0000204 [SLPQ - 0xffffffff8086caa0][SLP] yarrow
    4 ffffff011f5779c0    0     0     0 0000204 [SLPQ - 0xffffffff8086f808][SLP] g_down
    3 ffffff011f598000    0     0     0 0000204 [SLPQ - 0xffffffff8086f800][SLP] g_up
    2 ffffff011f598340    0     0     0 0000204 [SLPQ - 0xffffffff8086f7f0][SLP] g_event
   46 ffffff011f598680    0     0     0 0000204 [IWAIT] swi3: vm
   45 ffffff011f5989c0    0     0     0 000020c [CPU 0] swi4: clock sio
   44 ffffff011f5b9000    0     0     0 0000204 [IWAIT] swi1: net
   43 ffffff011f597680    0     0     0 0000204 [IWAIT] irq31:
   42 ffffff011f5979c0    0     0     0 0000204 [IWAIT] irq30:
   41 ffffff011f584000    0     0     0 0000204 [IWAIT] irq29:
   40 ffffff011f584340    0     0     0 0000204 [IWAIT] irq28:
   39 ffffff011f584680    0     0     0 0000204 [IWAIT] irq27: bge1
   38 ffffff011f5849c0    0     0     0 0000204 [IWAIT] irq26: bge0
   37 ffffff011f587000    0     0     0 0000204 [IWAIT] irq25: ahd1
   36 ffffff011f587340    0     0     0 0000204 [IWAIT] irq24: ahd0
   35 ffffff011f587680    0     0     0 0000204 [IWAIT] irq23:
   34 ffffff011f58a680    0     0     0 0000204 [IWAIT] irq22:
   33 ffffff011f58a9c0    0     0     0 0000204 [IWAIT] irq21:
   32 ffffff011f5b5000    0     0     0 0000204 [IWAIT] irq20:
   31 ffffff011f5b5340    0     0     0 0000204 [IWAIT] irq19: ohci0 ohci1
   30 ffffff011f5b5680    0     0     0 0000204 [IWAIT] irq18:
   29 ffffff011f5b59c0    0     0     0 0000204 [IWAIT] irq17:
   28 ffffff011f597000    0     0     0 0000204 [IWAIT] irq16:
   27 ffffff011f597340    0     0     0 0000204 [IWAIT] irq15: ata1
   26 ffffff011f5da9c0    0     0     0 0000204 [IWAIT] irq14: ata0
   25 ffffff011f588000    0     0     0 0000204 [IWAIT] irq13:
   24 ffffff011f588340    0     0     0 0000204 [IWAIT] irq12:
   23 ffffff011f588680    0     0     0 0000204 [IWAIT] irq11:
   22 ffffff011f5889c0    0     0     0 0000204 [IWAIT] irq10:
   21 ffffff011f58a000    0     0     0 0000204 [IWAIT] irq9: acpi0
   20 ffffff011f58a340    0     0     0 0000204 [IWAIT] irq8:
   19 ffffff011f5b6340    0     0     0 0000204 [IWAIT] irq7: ppc0
   18 ffffff011f5b6680    0     0     0 0000204 [IWAIT] irq6: fdc0
   17 ffffff011f5b69c0    0     0     0 0000204 [IWAIT] irq5:
   16 ffffff011f5da000    0     0     0 0000204 [IWAIT] irq4: sio0
   15 ffffff011f5da340    0     0     0 0000204 [IWAIT] irq3:
   14 ffffff011f5da680    0     0     0 0000204 [IWAIT] irq0:
   13 ffffff011f5af000    0     0     0 0000204 [IWAIT] irq1: atkbd0
   12 ffffff011f5af340    0     0     0 000020c [Can run] idle: cpu0
   11 ffffff011f5af680    0     0     0 000020c [Can run] idle: cpu1
    1 ffffff011f5af9c0    0     0     1 0004200 [SLPQ wait 0xffffff011f5af9c0][SLP] init
   10 ffffff011f5b6000    0     0     0 0000204 [SLPQ ktrace 0xffffffff80870900][SLP] ktrace
    0 ffffffff8086f960    0     0     0 0000200 [IWAIT] swapper
db> show lockedvnods
Locked vnodes
db>


The kernel config is:
ident           OFFICE
include         SMP
options         QUOTA
options         ALT_BREAK_TO_DEBUGGER
options         KDB                     # Enable kernel debugger support.
options         DDB                     # Support DDB.
makeoptions     DEBUG=-g
options         DEBUG_VFS_LOCKS
options         DEBUG_LOCKS

I think that it will boot properly without DEBUG_VFS_LOCKS, and I can test
that if needed.
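
For readers unfamiliar with the "is not locked but should be" message in the
trace above: DEBUG_VFS_LOCKS enables vnode lock assertions, and the backtrace
shows one firing in quotaon(). The following is only a toy user-space model of
that kind of check, not the kernel code. ToyVnode, assert_locked and
toy_quotaon are hypothetical names made up for illustration.

```python
import threading


class ToyVnode:
    """Hypothetical stand-in for a vnode: just a name and a lock."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()


def assert_locked(vp, caller):
    """Toy lock assertion: complain if the caller did not take the lock."""
    if not vp.lock.locked():
        raise AssertionError(
            f"{caller}: {vp.name} is not locked but should be"
        )


def toy_quotaon(vp):
    """Toy operation that expects its caller to hold the vnode lock."""
    assert_locked(vp, "quotaon")
    # Work that depends on the lock being held would go here.
    return "quotas enabled"


if __name__ == "__main__":
    vp = ToyVnode("quota file")
    try:
        toy_quotaon(vp)          # lock not held: the assertion fires
    except AssertionError as err:
        print(err)
    with vp.lock:                # lock held: the assertion stays quiet
        print(toy_quotaon(vp))
```

In this model the check only reports a caller that skipped the lock. It says
nothing about whether the real code path is actually unsafe, which would match
the guess that the box boots without DEBUG_VFS_LOCKS.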

Niki

