RELENG_6 panic under heavy load

Gleb Smirnoff glebius at FreeBSD.org
Thu Nov 16 03:15:28 PST 2006


On Thu, Nov 16, 2006 at 01:24:36PM +0300, Gleb Smirnoff wrote:
T>   I wonder why UMA was suspected to be the problem. Dima gave
T> me access to the core. Here are more details from the trace:

It looks like a race between two threads in one process. Look here:

(kgdb) frame 12
#12 0xd05f4fc1 in _mtx_lock_sleep (m=0xd5dd5498, tid=3583683968, opts=0, file=0x12 <Address 0x12 out of bounds>, line=18) at /usr/src/sys/kern/kern_mutex.c:579
579                     turnstile_wait(&m->mtx_object, mtx_owner(m));
(kgdb) p *m
$10 = {mtx_object = {lo_class = 0xd084e224, lo_name = 0xd080508c "process lock", lo_type = 0xd080508c "process lock", lo_flags = 4390912, lo_list = {
      tqe_next = 0xd5dd56b0, tqe_prev = 0xd5dd5290}, lo_witness = 0xd088a100}, mtx_lock = 3611674882, mtx_recurse = 0}
(kgdb) p ((struct thread *)tid)                
$15 = (struct thread *) 0xd59aad80
(kgdb) p ((struct thread *)(m->mtx_lock & ~(0x1 | 0x2)))
$17 = (struct thread *) 0xd745c900
(kgdb) p ((struct thread *)(m->mtx_lock & ~(0x1 | 0x2)))->td_proc
$18 = (struct proc *) 0xd5dd5430
(kgdb) p ((struct thread *)tid)->td_proc
$19 = (struct proc *) 0xd5dd5430
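
The owner lookup in the kgdb expressions above can be reproduced by hand. The following is only a minimal sketch, assuming 32-bit pointers (the dump is from i386) and that the two low bits of mtx_lock are flag bits, as the mask ~(0x1 | 0x2) used above implies. The macro and function names are hypothetical and are not taken from the kernel headers.

```c
#include <stdint.h>

/*
 * Hypothetical names for the two low bits that the kgdb session masks
 * off with ~(0x1 | 0x2). Their meaning in the kernel is not stated in
 * the session, so they are deliberately left generic here.
 */
#define LOCK_FLAG_BIT0  0x1u
#define LOCK_FLAG_BIT1  0x2u
#define LOCK_FLAG_BITS  (LOCK_FLAG_BIT0 | LOCK_FLAG_BIT1)

/* Owner thread address: the lock word with the flag bits cleared. */
uint32_t
lock_owner(uint32_t lock_word)
{
	return (lock_word & ~LOCK_FLAG_BITS);
}

/* Flag bits currently set in the lock word. */
uint32_t
lock_flags(uint32_t lock_word)
{
	return (lock_word & LOCK_FLAG_BITS);
}
```

With mtx_lock = 3611674882 (0xd745c902), lock_owner() gives 0xd745c900, which is the td of the other nagios thread, and lock_flags() gives 0x2. The tid argument 3583683968 is 0xd59aad80, so the thread waiting in _mtx_lock_sleep() is not the owner.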

So we see that one thread blocks on a lock that is held by
another thread of the same process. Here they are:

* 134 Thread 100198 (PID=47872: nagios)  doadump () at pcpu.h:165
  133 Thread 100147 (PID=47872: nagios)  sched_switch (td=0xd745c900, newtd=0xd51f7a80, flags=2) at /usr/src/sys/kern/sched_4bsd.c:980

Let's look at the second one:

(kgdb) thread 133
[Switching to thread 133 (Thread 100147)]#0  sched_switch (td=0xd745c900, newtd=0xd51f7a80, flags=2) at /usr/src/sys/kern/sched_4bsd.c:980
980             sched_lock.mtx_lock = (uintptr_t)td;
(kgdb) bt
#0  sched_switch (td=0xd745c900, newtd=0xd51f7a80, flags=2) at /usr/src/sys/kern/sched_4bsd.c:980
#1  0xd0607f46 in mi_switch (flags=2, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:420
#2  0xd0615ecf in maybe_preempt_in_ksegrp (td=0xd59aad80) at kern_switch.c:467
#3  0xd06160c8 in setrunqueue (td=0xd59aad80, flags=0) at kern_switch.c:585
#4  0xd06151e7 in sched_wakeup (td=0xd59aad80) at /usr/src/sys/kern/sched_4bsd.c:996
#5  0xd0608025 in setrunnable (td=0xd59aad80) at /usr/src/sys/kern/kern_synch.c:483
#6  0xd060d78e in thread_unsuspend_one (td=0xd59aad80) at /usr/src/sys/kern/kern_thread.c:972
#7  0xd060d584 in thread_suspend_check (return_instead=0) at /usr/src/sys/kern/kern_thread.c:935
#8  0xd0628a88 in userret (td=0xd745c900, frame=0xf5dd4d38, oticks=1) at /usr/src/sys/kern/subr_trap.c:116
#9  0xd07a6e16 in syscall (frame=
      {tf_fs = 134938683, tf_es = 59, tf_ds = -809566149, tf_edi = 134997504, tf_esi = 134998528, tf_ebp = -813707944, tf_isp = -170046108, tf_ebx = 672261300, tf_edx = 0, tf_ecx = 134969072, tf_eax = 1, tf_trapno = 0, tf_err = 2, tf_eip = 672832335, tf_cs = 51, tf_eflags = 646, tf_esp = -813707972, tf_ss = 59})
    at /usr/src/sys/i386/i386/trap.c:1034
#10 0xd078f38f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE

