kern/148676: kernel panic lockmgr: locking against myself

Markus Wild m.wild at virtualtec.ch
Fri Jul 16 07:40:02 UTC 2010


>Number:         148676
>Category:       kern
>Synopsis:       kernel panic lockmgr: locking against myself
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Jul 16 07:40:01 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Markus Wild
>Release:        7.3-STABLE
>Organization:
VirtualTec Solutions AG
>Environment:
FreeBSD zrhcz-ux1.virtualtec.ch 7.3-STABLE FreeBSD 7.3-STABLE #5: Thu Jul 15 19:04:12 CEST 2010     mw at zrhcz-ux1.virtualtec.ch:/usr/obj/usr/src/sys/VTMASTER  amd64

>Description:
This system keeps crashing about every 4 months or so with the following 
panic:

lockmgr: locking against myself

Yesterday I replaced SCHED_ULE with SCHED_4BSD, but this time the system
crashed after only about 6 hours. The system runs 20 jails, each with Apache
httpd and MySQL servers, and my gut feeling is that the crash is somehow
related to increased disk activity. With SCHED_ULE, a panic usually leaves
soft updates inconsistencies that need to be fixed with fsck -y. After this
morning's crash with SCHED_4BSD, the filesystem had no soft updates issues.

I had updated the system with cvs right before compiling and installing the 
4BSD scheduler kernel, so kernel sources are current (for RELENG_7).

Unfortunately, most of the dmesg output is not usable because it's cluttered
with ipfw logs. Here's some kgdb output with those ipfw lines omitted:

Unread portion of the kernel message buffer:
panic: lockmgr: locking against myself
cpuid = 1
Uptime: 6h20m11s
Physical memory: 16374 MB
Dumping 3213 MB:

..

Reading symbols from /boot/kernel/accf_data.ko...Reading symbols from /boot/kernel/accf_data.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/accf_data.ko
Reading symbols from /boot/kernel/accf_http.ko...Reading symbols from /boot/kernel/accf_http.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/accf_http.ko
Reading symbols from /boot/kernel/coretemp.ko...Reading symbols from /boot/kernel/coretemp.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/coretemp.ko
#0  doadump () at pcpu.h:196
196     pcpu.h: No such file or directory.
        in pcpu.h

(kgdb) where
#0  doadump () at pcpu.h:196
#1  0x0000000000000004 in ?? ()
#2  0xffffffff802e5b29 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#3  0xffffffff802e5f32 in panic (fmt=0x104 <Address 0x104 out of bounds>) at /usr/src/sys/kern/kern_shutdown.c:574
#4  0xffffffff802d1bda in _lockmgr (lkp=0xffffff0144b00c68, flags=0, interlkp=Variable "interlkp" is not available.
) at /usr/src/sys/kern/kern_lock.c:366
#5  0xffffffff80548b55 in VOP_LOCK1_APV (vop=0xffffffff806eb360, a=0xffffff81d47e6760) at vnode_if.c:1618
#6  0xffffffff803776d5 in _vn_lock (vp=0xffffff0144b00bd0, flags=4098, td=0xffffff02fe701ab0, file=0xffffffff80586b82 "/usr/src/sys/kern/vfs_subr.c", line=2062) at vnode_if.h:851
#7  0xffffffff8036a3cf in vget (vp=0xffffff0144b00bd0, flags=4098, td=0xffffff02fe701ab0) at /usr/src/sys/kern/vfs_subr.c:2062
#8  0xffffffff804ed7c7 in vm_object_reference (object=Variable "object" is not available.
) at /usr/src/sys/vm/vm_object.c:411
#9  0xffffffff802bce64 in kern_execve (td=0xffffff02fe701ab0, args=0xffffff81d47e6b00, mac_p=Variable "mac_p" is not available.
) at /usr/src/sys/kern/kern_exec.c:404
#10 0xffffffff802bdf07 in execve (td=0xffffff02fe701ab0, uap=Variable "uap" is not available.
) at /usr/src/sys/kern/kern_exec.c:207
#11 0xffffffff80518277 in syscall (frame=0xffffff81d47e6c80) at /usr/src/sys/amd64/amd64/trap.c:920
#12 0xffffffff80500a5b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:339
#13 0x00000008008b748c in ?? ()
Previous frame inner to this frame (corrupt stack?)
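
For readers unfamiliar with this panic: as far as I understand it, frame #4
fires when a thread asks for an exclusive lock that it already holds itself
and the request does not allow recursion. The following is only a minimal
user-space sketch of that idea, not the code in kern_lock.c. The names
toy_lock, toy_lock_exclusive, toy_lock_release and TOY_CANRECURSE are
hypothetical, and the sketch returns EDEADLK where the kernel would panic.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical toy lock for illustration; NOT the kernel's struct lock. */
struct toy_lock {
    const void *owner;  /* identity of the exclusive holder, NULL if free */
    int depth;          /* how many times the owner has taken the lock */
};

/* Hypothetical flag: the caller explicitly allows recursive acquisition. */
#define TOY_CANRECURSE 0x1

/*
 * Try to take the lock exclusively on behalf of `self`.
 * Returns 0 on success, EBUSY if someone else holds the lock, and
 * EDEADLK in the "locking against myself" case (same owner, no recursion
 * allowed), which is where the real kernel would panic instead.
 */
int toy_lock_exclusive(struct toy_lock *lk, const void *self, int flags)
{
    assert(lk != NULL && self != NULL);

    if (lk->owner == NULL) {            /* free: take it */
        lk->owner = self;
        lk->depth = 1;
        return 0;
    }
    if (lk->owner != self)              /* held by another thread */
        return EBUSY;
    if ((flags & TOY_CANRECURSE) == 0)  /* held by us, recursion not allowed */
        return EDEADLK;

    lk->depth++;                        /* permitted recursion */
    return 0;
}

/* Drop one level of ownership; the lock becomes free at depth 0. */
int toy_lock_release(struct toy_lock *lk, const void *self)
{
    assert(lk != NULL && self != NULL);

    if (lk->owner != self)
        return EPERM;
    if (--lk->depth == 0)
        lk->owner = NULL;
    return 0;
}
```

In this model, a second toy_lock_exclusive() call by the same owner with
flags 0 returns EDEADLK, while the same call with TOY_CANRECURSE succeeds
and increments the depth. In the trace above, every frame shows the same
td (0xffffff02fe701ab0), which would be consistent with the execve path
already holding the vnode lock when vget() is called, but I have not
confirmed that.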


The system runs mostly a GENERIC kernel with the following additions:

options         HZ=1000
options         MSGBUF_SIZE=40960

options         IPFIREWALL              #firewall
options         IPFIREWALL_VERBOSE      #enable logging to syslogd(8)
options         IPFIREWALL_DEFAULT_TO_ACCEPT    #allow everything by default
options         IPFIREWALL_FORWARD      #packet destination changes
options         DUMMYNET
options         IPDIVERT

options         CONSPEED=115200         # speed for serial console              

options         ROUTETABLES=10


Since this system is in production and already busy, I can't really enable
WITNESS or similar debugging options that would slow it down considerably.
I'll revert to SCHED_ULE tonight, but would consider upgrading to FreeBSD 8
if there's a really good chance that this will solve the issue.



>How-To-Repeat:
I can't provide a recipe. This is a live server with live traffic and thus
all kinds of interactions between applications. From what I could determine,
the main backup cycle had completed before the crash, so at least the
additional disk I/O of that cycle didn't directly contribute to the crash.
>Fix:
n/a

>Release-Note:
>Audit-Trail:
>Unformatted:

