Patch for quota deadlock
Niki Denev
nike_d at cytexbg.com
Tue Feb 28 01:56:21 PST 2006
On Tuesday 28 February 2006 09:51, Kris Kennaway wrote:
> On Tue, Feb 28, 2006 at 08:40:37AM +0100, Oliver Brandmueller wrote:
> > Anyway, I'll discuss with my colleagues and we'll see if we want to take
> > the risk.
>
> Thanks, it would be a big help if you're willing to try.
>
> Kris
I'm jumping in here from the [snapshot timestamps] thread.
Here is what happened when I compiled the kernel with QUOTA enabled,
set enable_quotas="YES" in rc.conf, and applied the LK_NOWAIT patch:
[...]
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xd2000-0xd37ff,0xd3800-0xd4fff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
module_register_init: MOD_LOAD (amr_linux, 0xffffffff806124c0, 0) error 6
acd0: CDRW <YAMAHA CRW2200E/1.0D> at ata1-master UDMA33
da2 at ahd1 bus 0 target 2 lun 0
da2: <SEAGATE ST336807LW 0C01> Fixed Direct Access SCSI-3 device
da2: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da2: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
da0 at ahd0 bus 0 target 0 lun 0
da0: <SEAGATE ST336807LW 0C01> Fixed Direct Access SCSI-3 device
da0: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
da1 at ahd0 bus 0 target 1 lun 0
da1: <SEAGATE ST336807LW 0C01> Fixed Direct Access SCSI-3 device
da1: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da1: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
da3 at ahd1 bus 0 target 3 lun 0
da3: <SEAGATE ST336807LW 0C01> Fixed Direct Access SCSI-3 device
da3: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da3: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
SMP: AP CPU #1 Launched!
GEOM_MIRROR: Device gm0 created (id=4003401795).
GEOM_MIRROR: Device gm0: provider da0s1a detected.
GEOM_MIRROR: Device gm1 created (id=3399956475).
GEOM_MIRROR: Device gm1: provider da0s1d detected.
GEOM_MIRROR: Device gm0: provider da1s1a detected.
GEOM_MIRROR: Device gm1: provider da1s1d detected.
GEOM_MIRROR: Device gm1: provider da1s1d activated.
GEOM_MIRROR: Device gm1: provider da0s1d activated.
GEOM_MIRROR: Device gm1: provider mirror/gm1 launched.
GEOM_MIRROR: Device gm0: provider da2s1a detected.
GEOM_MIRROR: Device gm2 created (id=2672834031).
GEOM_MIRROR: Device gm2: provider da2s1d detected.
GEOM_MIRROR: Device gm0: provider da3s1a detected.
GEOM_MIRROR: Device gm0: provider da3s1a activated.
GEOM_MIRROR: Device gm0: provider da2s1a activated.
GEOM_MIRROR: Device gm0: provider da1s1a activated.
GEOM_MIRROR: Device gm0: provider da0s1a activated.
GEOM_MIRROR: Device gm0: provider mirror/gm0 launched.
GEOM_MIRROR: Device gm2: provider da3s1d detected.
GEOM_MIRROR: Device gm2: provider da3s1d activated.
GEOM_MIRROR: Device gm2: provider da2s1d activated.
GEOM_MIRROR: Device gm2: provider mirror/gm2 launched.
GEOM_STRIPE: Device gs0 created (id=496013215).
GEOM_STRIPE: Disk mirror/gm1 attached to gs0.
GEOM_STRIPE: Disk mirror/gm2 attached to gs0.
GEOM_STRIPE: Device gs0 activated.
Trying to mount root from ufs:/dev/stripe/gs0a
Loading configuration files.
Entropy harvesting: interrupts ethernet point_to_point kickstart.
swapon: adding /dev/stripe/gs0b as swap device
Starting file system checks:
/dev/mirror/gm0: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0: clean, 3605 free (53 frags, 444 blocks, 0.2% fragmentation)
/dev/stripe/gs0a: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/stripe/gs0a: clean, 24175769 free (102961 frags, 3009101 blocks, 0.3% fragmentation)
Setting hostname: srv.office
kern.corefile: %N.core -> /var/coredumps/%U/%N-%P.core
bge0: link state changed to DOWN
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
inet 127.0.0.1 netmask 0xff000000
bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>
inet6 xxxxxxxxxxxxxxxxxxxxxxx prefixlen 64 tentative scopeid 0x1
inet xx.xx.xx.xx netmask 0xffffff80 broadcast xxxxxxxxxxxxxx
ether xxxxxxxxxxxxxxxxx
media: Ethernet autoselect (none)
status: no carrier
add net default: gateway xxxxxxxxxxxxxx
Additional routing options:.
Starting devd.
hw.acpi.cpu.cx_lowest: C1 -> C1
Mounting NFS file systems:.
Turning on accounting.
Accounting enabled
Creating and/or trimming log files:.
Starting syslogd.
ELF ldconfig
path: /lib /usr/lib /usr/lib/compat /usr/X11R6/lib /usr/local/lib /usr/local/libdata/ldconfig/mysql
32-bit compatibility ldconfig path:
Checking quotas: done.
Enabling quotas:KDB: stack backtrace:
vfs_badlock() at vfs_badlock+0x95
assert_vop_locked() at assert_vop_locked+0x77
quotaon() at quotaon+0x182
ufs_quotactl() at ufs_quotactl+0x150
quotactl() at quotactl+0x15c
syscall() at syscall+0x642
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (148, FreeBSD ELF64, quotactl), rip = 0x80068671c, rsp = 0x7fffffffeda8, rbp = 0x1 ---
quotaon: 0xffffff00c3aa7d50 is not locked but should be
KDB: enter: lock violation
[thread pid 398 tid 100106 ]
Stopped at kdb_enter+0x2f: nop
db> bt
Tracing pid 398 tid 100106 td 0xffffff00c3840be0
kdb_enter() at kdb_enter+0x2f
assert_vop_locked() at assert_vop_locked+0x77
quotaon() at quotaon+0x182
ufs_quotactl() at ufs_quotactl+0x150
quotactl() at quotactl+0x15c
syscall() at syscall+0x642
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (148, FreeBSD ELF64, quotactl), rip = 0x80068671c, rsp = 0x7fffffffeda8, rbp = 0 ---
db> alltrace
Tracing command quotaon pid 398 tid 100106 td 0xffffff00c3840be0
kdb_enter() at kdb_enter+0x2f
assert_vop_locked() at assert_vop_locked+0x77
quotaon() at quotaon+0x182
ufs_quotactl() at ufs_quotactl+0x150
quotactl() at quotactl+0x15c
syscall() at syscall+0x642
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (148, FreeBSD ELF64, quotactl), rip = 0x80068671c, rsp = 0x7fffffffeda8, rbp = 0 ---
Tracing command sh pid 392 tid 100083 td 0xffffff011e2de980
sched_switch() at sched_switch+0x11f
mi_switch() at mi_switch+0x14c
sleepq_wait_sig() at sleepq_wait_sig+0xe
msleep() at msleep+0x21b
kern_wait() at kern_wait+0x5c8
wait4() at wait4+0x38
syscall() at syscall+0x642
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (7, FreeBSD ELF64, wait4), rip = 0x80093e1cc, rsp = 0x7fffffffcc08, rbp = 0x49 ---
Tracing command syslogd pid 349 tid 100077 td 0xffffff011efa5980
sched_switch() at sched_switch+0x11f
mi_switch() at mi_switch+0x14c
critical_exit() at critical_exit+0xa0
lapic_handle_timer() at lapic_handle_timer+0xba
Xtimerint() at Xtimerint+0x76
vn_write() at vn_write+0x363
dofilewrite() at dofilewrite+0x86
kern_writev() at kern_writev+0x51
writev() at writev+0x51
syscall() at syscall+0x642
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (121, FreeBSD ELF64, writev), rip = 0x80080f6ac, rsp = 0x7fffffffc808, rbp = 0x7fffffffcf70 ---
Tracing command accounting pid 330 tid 100070 td 0xffffff011f186be0
sched_switch() at sched_switch+0x11f
mi_switch() at mi_switch+0x14c
sleepq_timedwait() at sleepq_timedwait+0xe
msleep() at msleep+0x572
acct_thread() at acct_thread+0x248
fork_exit() at fork_exit+0x86
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffffb3c06d00, rbp = 0 ---
Tracing command devd pid 310 tid 100103 td 0xffffff00c3ec14c0
sched_switch() at sched_switch+0x11f
mi_switch() at mi_switch+0x14c
sleepq_wait_sig() at sleepq_wait_sig+0xe
cv_wait_sig() at cv_wait_sig+0x17f
kern_select() at kern_select+0xbd4
select() at select+0x3e
syscall() at syscall+0x642
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (93, FreeBSD ELF64, select), rip = 0x43be6c, rsp = 0x7fffffffe958, rbp = 0x7fffffffe980 ---
Tracing command adjkerntz pid 177 tid 100073 td 0xffffff011f1864c0
sched_switch() at sched_switch+0x11f
mi_switch() at mi_switch+0x14c
sleepq_wait_sig() at sleepq_wait_sig+0xe
msleep() at msleep+0x21b
kern_sigsuspend() at kern_sigsuspend+0xb1
sigsuspend() at sigsuspend+0x40
syscall() at syscall+0x642
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (341, FreeBSD ELF64, sigsuspend), rip = 0x800684cdc, rsp = 0x7fffffffed28, rbp = 0xffffffffffffe3e0 ---
(null)() at 0x800684cdc
*** error reading from address ffffffffffffe3e8 ***
db> ps
pid proc uid ppid pgrp flag stat wmesg wchan cmd
398 ffffff00c4586680 0 392 73 0004002 [CPU 1] quotaon
392 ffffff011f3899c0 0 73 73 0000002 [SLPQ wait 0xffffff011f3899c0]
[SLP] sh
349 ffffff011f1e8340 0 1 349 0000000 [RUNQ] syslogd
330 ffffff011ef44680 0 0 0 0000204 [SLPQ - 0xffffffff80870300]
[SLP] accounting
310 ffffff00c45659c0 0 1 310 0000000 [SLPQ select
0xffffffff80881150][SLP] devd
177 ffffff011f1c7340 0 1 177 0000000 [SLPQ pause
0xffffff011f1c73a8][SLP] adjkerntz
73 ffffff00a987d9c0 0 1 73 0004002 [SLPQ wait 0xffffff00a987d9c0]
[SLP] sh
72 ffffff00aa33e000 0 0 0 0000204 [SLPQ m:w1 0xffffff00c5322000]
[SLP] g_mirror gm2
71 ffffff00aa33e340 0 0 0 0000204 [SLPQ m:w1 0xffffff011f17ba00]
[SLP] g_mirror gm1
70 ffffff00aa33e680 0 0 0 0000204 [SLPQ m:w2 0xffffff00010ad400]
[SLP] g_mirror gm0
69 ffffff011f389000 0 0 0 0000204 [SLPQ - 0xffffffffb3bd4be4]
[SLP] schedcpu
68 ffffff011d71c000 0 0 0 0000204 [SLPQ - 0xffffffff8088c3f8]
[SLP] nfsiod 3
67 ffffff011d71c340 0 0 0 0000204 [SLPQ - 0xffffffff8088c3f0]
[SLP] nfsiod 2
66 ffffff011d71c680 0 0 0 0000204 [SLPQ - 0xffffffff8088c3e8]
[SLP] nfsiod 1
65 ffffff011d71c9c0 0 0 0 0000204 [SLPQ - 0xffffffff8088c3e0]
[SLP] nfsiod 0
64 ffffff011e6df000 0 0 0 0000204 [SLPQ syncer
0xffffffff80873d40][SLP] syncer
63 ffffff011e6df340 0 0 0 0000204 [SLPQ vlruwt
0xffffff011e6df340][SLP] vnlru
62 ffffff011e6df680 0 0 0 0000204 [SLPQ psleep
0xffffffff80881a18][SLP] bufdaemon
61 ffffff011e6df9c0 0 0 0 000020c [SLPQ pgzero
0xffffffff808948a0][SLP] pagezero
60 ffffff011e6a0000 0 0 0 0000204 [SLPQ psleep
0xffffffff80893f6c][SLP] vmdaemon
59 ffffff011e6a0340 0 0 0 0000204 [SLPQ psleep
0xffffffff80893f1c][SLP] pagedaemon
58 ffffff011e6a0680 0 0 0 0000204 [SLPQ - 0xffffff011eec3248]
[SLP] fdc0
57 ffffff011e6a09c0 0 0 0 0000204 [IWAIT] swi0: sio
56 ffffff011f5b9340 0 0 0 0000204 [SLPQ idle 0xffffffff8860d000]
[SLP] aic_recovery1
55 ffffff011f5b9680 0 0 0 0000204 [SLPQ idle 0xffffffff88609000]
[SLP] aic_recovery0
54 ffffff011f5b99c0 0 0 0 0000204 [SLPQ usbevt
0xffffffff88607420][SLP] usb1
53 ffffff011f5de000 0 0 0 0000204 [SLPQ usbtsk
0xffffffff8086eed0][SLP] usbtask
52 ffffff011f5de340 0 0 0 0000204 [SLPQ usbevt
0xffffffff88605420][SLP] usb0
51 ffffff011f5de680 0 0 0 0000204 [IWAIT] swi6: task queue
50 ffffff011f5de9c0 0 0 0 0000204 [IWAIT] swi6:+
9 ffffff011f5df000 0 0 0 0000204 [SLPQ - 0xffffff0000caa900]
[SLP] thread taskq
49 ffffff011f5df340 0 0 0 0000204 [IWAIT] swi5:+
8 ffffff011f5df680 0 0 0 0000204 [SLPQ - 0xffffff0000caac00]
[SLP] acpi_task2
7 ffffff011f5df9c0 0 0 0 0000204 [SLPQ - 0xffffff0000caac00]
[SLP] acpi_task1
6 ffffff011f5879c0 0 0 0 0000204 [SLPQ - 0xffffff0000caac00]
[SLP] acpi_task0
5 ffffff011f577000 0 0 0 0000204 [SLPQ - 0xffffff0000caad00]
[SLP] kqueue taskq
48 ffffff011f577340 0 0 0 0000204 [IWAIT] swi2: cambio
47 ffffff011f577680 0 0 0 0000204 [SLPQ - 0xffffffff8086caa0]
[SLP] yarrow
4 ffffff011f5779c0 0 0 0 0000204 [SLPQ - 0xffffffff8086f808]
[SLP] g_down
3 ffffff011f598000 0 0 0 0000204 [SLPQ - 0xffffffff8086f800]
[SLP] g_up
2 ffffff011f598340 0 0 0 0000204 [SLPQ - 0xffffffff8086f7f0]
[SLP] g_event
46 ffffff011f598680 0 0 0 0000204 [IWAIT] swi3: vm
45 ffffff011f5989c0 0 0 0 000020c [CPU 0] swi4: clock sio
44 ffffff011f5b9000 0 0 0 0000204 [IWAIT] swi1: net
43 ffffff011f597680 0 0 0 0000204 [IWAIT] irq31:
42 ffffff011f5979c0 0 0 0 0000204 [IWAIT] irq30:
41 ffffff011f584000 0 0 0 0000204 [IWAIT] irq29:
40 ffffff011f584340 0 0 0 0000204 [IWAIT] irq28:
39 ffffff011f584680 0 0 0 0000204 [IWAIT] irq27: bge1
38 ffffff011f5849c0 0 0 0 0000204 [IWAIT] irq26: bge0
37 ffffff011f587000 0 0 0 0000204 [IWAIT] irq25: ahd1
36 ffffff011f587340 0 0 0 0000204 [IWAIT] irq24: ahd0
35 ffffff011f587680 0 0 0 0000204 [IWAIT] irq23:
34 ffffff011f58a680 0 0 0 0000204 [IWAIT] irq22:
33 ffffff011f58a9c0 0 0 0 0000204 [IWAIT] irq21:
32 ffffff011f5b5000 0 0 0 0000204 [IWAIT] irq20:
31 ffffff011f5b5340 0 0 0 0000204 [IWAIT] irq19: ohci0 ohci1
30 ffffff011f5b5680 0 0 0 0000204 [IWAIT] irq18:
29 ffffff011f5b59c0 0 0 0 0000204 [IWAIT] irq17:
28 ffffff011f597000 0 0 0 0000204 [IWAIT] irq16:
27 ffffff011f597340 0 0 0 0000204 [IWAIT] irq15: ata1
26 ffffff011f5da9c0 0 0 0 0000204 [IWAIT] irq14: ata0
25 ffffff011f588000 0 0 0 0000204 [IWAIT] irq13:
24 ffffff011f588340 0 0 0 0000204 [IWAIT] irq12:
23 ffffff011f588680 0 0 0 0000204 [IWAIT] irq11:
22 ffffff011f5889c0 0 0 0 0000204 [IWAIT] irq10:
21 ffffff011f58a000 0 0 0 0000204 [IWAIT] irq9: acpi0
20 ffffff011f58a340 0 0 0 0000204 [IWAIT] irq8:
19 ffffff011f5b6340 0 0 0 0000204 [IWAIT] irq7: ppc0
18 ffffff011f5b6680 0 0 0 0000204 [IWAIT] irq6: fdc0
17 ffffff011f5b69c0 0 0 0 0000204 [IWAIT] irq5:
16 ffffff011f5da000 0 0 0 0000204 [IWAIT] irq4: sio0
15 ffffff011f5da340 0 0 0 0000204 [IWAIT] irq3:
14 ffffff011f5da680 0 0 0 0000204 [IWAIT] irq0:
13 ffffff011f5af000 0 0 0 0000204 [IWAIT] irq1: atkbd0
12 ffffff011f5af340 0 0 0 000020c [Can run] idle: cpu0
11 ffffff011f5af680 0 0 0 000020c [Can run] idle: cpu1
1 ffffff011f5af9c0 0 0 1 0004200 [SLPQ wait 0xffffff011f5af9c0]
[SLP] init
10 ffffff011f5b6000 0 0 0 0000204 [SLPQ ktrace
0xffffffff80870900][SLP] ktrace
0 ffffffff8086f960 0 0 0 0000200 [IWAIT] swapper
db> show lockedvnods
Locked vnodes
db>
The kernel config is:
ident OFFICE
include SMP
options QUOTA
options ALT_BREAK_TO_DEBUGGER
options KDB # Enable kernel debugger support.
options DDB # Support DDB.
makeoptions DEBUG=-g
options DEBUG_VFS_LOCKS
options DEBUG_LOCKS
I think that it will boot properly without DEBUG_VFS_LOCKS, and I can test
this if needed.
Niki