amd64/105437: 6.2-BETA3 crashes on amd64

Ruslan Ermilov ru at FreeBSD.org
Sun Nov 12 09:11:47 PST 2006


The following reply was made to PR amd64/105437; it has been noted by GNATS.

From: Ruslan Ermilov <ru at FreeBSD.org>
To: Wojciech Puchar <wojtek at 3miasto.net>
Cc: bug-followup at FreeBSD.org
Subject: Re: amd64/105437: 6.2-BETA3 crashes on amd64
Date: Sun, 12 Nov 2006 20:07:12 +0300

 On Sun, Nov 12, 2006 at 05:09:22PM +0100, Wojciech Puchar wrote:
 > #0  doadump () at pcpu.h:172
 > #1  0x0000000000000004 in ?? ()
 > #2  0xffffffff8025deb3 in boot (howto=260) at 
 > ../../../kern/kern_shutdown.c:409
 > #3  0xffffffff8025e4b6 in panic (fmt=0xffffff003d8fa980 "°\226\217=")
 >     at ../../../kern/kern_shutdown.c:565
 > #4  0xffffffff803e87f2 in trap_fatal (frame=0xffffff003d8fa980, 
 > eva=18446742975230744240)
 >     at ../../../amd64/amd64/trap.c:660
 > #5  0xffffffff803e8d16 in trap (frame=
 >       {tf_rdi = -1098993325056, tf_rsi = 4, tf_rdx = -1098478802560, 
 > tf_rcx = 4, tf_r8 = -1098478802496, tf_r9 = -1098993325056, tf_rax = 2, 
 > tf_rbx = -1098478802560, tf_rbp = 4, tf_r10 = -1098993325056, tf_r11 = 
 > -1264970144, tf_r12 = -1098478802560, tf_r13 = -1098993325056, tf_r14 = 
 > -2141357264, tf_r15 = -1098758394592, tf_trapno = 12, tf_addr = 212, 
 > tf_flags = -2144054761, tf_err = 0, tf_rip = -2144839500, tf_cs = 8, 
 > tf_rflags = 65543, tf_rsp = -1264969928, tf_ss = 16})
 >     at ../../../amd64/amd64/trap.c:238
 > #6  0xffffffff803d640b in calltrap () at 
 > ../../../amd64/amd64/exception.S:168
 > #7  0xffffffff802858b4 in turnstile_setowner (ts=0xffffff001ee4ac00, 
 > owner=0x4)
 >     at ../../../kern/subr_turnstile.c:432
 > #8  0xffffffff80285ebb in turnstile_wait (lock=0xffffff002ce56d20, 
 > owner=0x4)
 >     at ../../../kern/subr_turnstile.c:591
 > #9  0xffffffff80252f39 in _mtx_lock_sleep (m=0xffffff002ce56d20, 
 > tid=18446742975230749056,
 >     opts=1032825216, file=0x4 <Address 0x4 out of bounds>, 
 > line=1032825280)
 >     at ../../../kern/kern_mutex.c:579
 > 
 Line 579 of kern_mutex.c has:
 
 :                 turnstile_wait(&m->mtx_object, mtx_owner(m));
 
 Some references:
 
 : /*
 :  * Internal utility macros.
 :  */
 : #define mtx_unowned(m)  ((m)->mtx_lock == MTX_UNOWNED)
 :  
 : #define mtx_owner(m)    (mtx_unowned((m)) ? NULL \
 :         : (struct thread *)((m)->mtx_lock & MTX_FLAGMASK))
 
 : /*
 :  * State bits kept in mutex->mtx_lock, for the DEFAULT lock type. None of this,
 :  * with the exception of MTX_UNOWNED, applies to spin locks.
 :  */
 : #define MTX_RECURSED    0x00000001      /* lock recursed (for MTX_DEF only) */
 : #define MTX_CONTESTED   0x00000002      /* lock contested (for MTX_DEF only) */
 : #define MTX_UNOWNED     0x00000004      /* Cookie for free mutex */
 : #define MTX_FLAGMASK    ~(MTX_RECURSED | MTX_CONTESTED)
 
 mtx_owner(m) returns the value "4", which is MTX_UNOWNED.
 But if mtx_lock were exactly MTX_UNOWNED, mtx_unowned() would be
 true and mtx_owner() would return NULL.  This means that mtx_lock
 has some other flag bit set in addition to MTX_UNOWNED, which is
 illegal; MTX_FLAGMASK strips that bit and leaves 4.  Most likely
 the value is MTX_DESTROYED (which is defined as
 (MTX_CONTESTED | MTX_UNOWNED)).  You should print the mutex to be
 sure.  So it looks like the code is trying to lock a corrupt
 mutex.
 
 Please recompile your kernel with the following options:
 
 options         INVARIANTS              # Enable calls of extra sanity checking
 options         INVARIANT_SUPPORT       # Extra sanity checks of internal structures, required by INVARIANTS
 options         WITNESS                 # Enable checks to detect deadlocks and cycles
 options         WITNESS_SKIPSPIN        # Don't run witness on spinlocks for speed
 
 The kernel will run more slowly, but these checks may catch the bug earlier.
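
 To make the mtx_owner() reasoning above concrete, here is a small
 userland sketch of the arithmetic.  It only assumes the macro values
 quoted earlier; struct fake_mtx, owner_of() and self_check() are
 hypothetical stand-ins for illustration and are not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as quoted above from the 6.x mutex header. */
#define MTX_RECURSED    0x00000001
#define MTX_CONTESTED   0x00000002
#define MTX_UNOWNED     0x00000004
#define MTX_FLAGMASK    ~(MTX_RECURSED | MTX_CONTESTED)
#define MTX_DESTROYED   (MTX_CONTESTED | MTX_UNOWNED)

/* Hypothetical stand-in for struct mtx: only the lock word matters here. */
struct fake_mtx {
	uintptr_t mtx_lock;
};

/*
 * Same logic as the mtx_owner() macro, but returning the raw
 * "owner" value as an integer instead of a struct thread pointer.
 */
static uintptr_t
owner_of(const struct fake_mtx *m)
{
	if (m->mtx_lock == MTX_UNOWNED)
		return (0);		/* NULL owner for a free mutex */
	return (m->mtx_lock & (uintptr_t)MTX_FLAGMASK);
}

/* Illustrative checks: a free mutex vs. a destroyed one. */
static void
self_check(void)
{
	struct fake_mtx unowned = { MTX_UNOWNED };
	struct fake_mtx destroyed = { MTX_DESTROYED };

	assert(owner_of(&unowned) == 0);	/* free mutex: no owner */
	assert(owner_of(&destroyed) == 4);	/* bogus "owner" 0x4 */
}
```

 With mtx_lock equal to MTX_DESTROYED (6), the unowned test fails and
 masking off the two low bits leaves 4, which matches the owner=0x4
 seen in frames #7 and #8.  Other lock values (for example
 MTX_RECURSED | MTX_UNOWNED) would give the same result, so printing
 the real mutex is still needed.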
 
 It could turn out to be a problem with the IPv6 routing code.
 
 > #10 0xffffffff8033c7ab in nd6_output (ifp=0xffffff003063c000, 
 > origifp=0xffffff003063c000,
 >     m0=0xffffff0001cd6400, dst=0xffffff002e437a60, rt0=0xffffff002b96f630)
 >     at ../../../netinet6/nd6.c:2004
 > #11 0xffffffff80338c12 in ip6_output (m0=0x100010170400120, opt=0x500, 
 > ro=0xffffffffb49a1a00,
 >     flags=0, im6o=0x0, ifpp=0x0, inp=0xffffff0001c304c0) at 
 > ../../../netinet6/ip6_output.c:994
 > 
 I don't understand why "ro" is not NULL here, because tcp_output()
 below calls ip6_output() with a NULL argument; this is probably an
 artifact of compiling with -O2.
 
 > #12 0xffffffff80315a6d in tcp_output (tp=0xffffff0010b165e0) at 
 > ../../../netinet/tcp_output.c:1059
 > #13 0xffffffff8031c6a5 in tcp_timer_rexmt (xtp=0xffffff001ee4ac00)
 >     at ../../../netinet/tcp_timer.c:537
 > #14 0xffffffff8026d02a in softclock (dummy=0xffffff001ee4ac00) at 
 > ../../../kern/kern_timeout.c:290
 > #15 0xffffffff802442b6 in ithread_loop (arg=0xffffff00000053c0) at 
 > ../../../kern/kern_intr.c:682
 > #16 0xffffffff80242d03 in fork_exit (callout=0xffffffff80244170 
 > <ithread_loop>,
 >     arg=0xffffff00000053c0, frame=0xffffffffb49a1c50) at 
 > ../../../kern/kern_fork.c:821
 > #17 0xffffffff803d676e in fork_trampoline () at 
 > ../../../amd64/amd64/exception.S:394
 > #18 0x0000000000000000 in ?? ()
 > #19 0x0000000000000000 in ?? ()
 > #20 0x0000000000000001 in ?? ()
 > #21 0x0000000000000000 in ?? ()
 > #22 0x0000000000000000 in ?? ()
 > #23 0x0000000000000000 in ?? ()
 > #24 0x0000000000000000 in ?? ()
 > #25 0x0000000000000000 in ?? ()
 > #26 0x0000000000000000 in ?? ()
 > #27 0x0000000000000000 in ?? ()
 > #28 0x0000000000000000 in ?? ()
 > #29 0x0000000000000000 in ?? ()
 > #30 0x0000000000000000 in ?? ()
 > #31 0x0000000000000000 in ?? ()
 > #32 0x0000000000000000 in ?? ()
 > #33 0x0000000000000000 in ?? ()
 > #34 0x0000000000000000 in ?? ()
 > #35 0x0000000000000000 in ?? ()
 > #36 0x0000000000000000 in ?? ()
 > #37 0x0000000000000000 in ?? ()
 > #38 0x0000000000000000 in ?? ()
 > #39 0x0000000000000000 in ?? ()
 > #40 0x0000000000000000 in ?? ()
 > #41 0x0000000000000000 in ?? ()
 > #42 0x0000000000000000 in ?? ()
 > #43 0x0000000000000000 in ?? ()
 > #44 0x0000000000000000 in ?? ()
 > #45 0x0000000000000000 in ?? ()
 > #46 0x0000000000000000 in ?? ()
 > #47 0x0000000000000000 in ?? ()
 > #48 0x0000000000000000 in ?? ()
 > #49 0x0000000000000000 in ?? ()
 > #50 0x00000000007b4000 in ?? ()
 > #51 0xffffff003d8fa980 in ?? ()
 > #52 0xffffff00000053c0 in ?? ()
 > #53 0x0000000000000001 in ?? ()
 > #54 0xffffff003d8f96b0 in ?? ()
 > #55 0xffffff001ffa4980 in ?? ()
 > #56 0xffffffffb49a1b58 in ?? ()
 > #57 0xffffff003d8fa980 in ?? ()
 > #58 0xffffffff802734db in sched_switch (td=0xffffff00000053c0, newtd=0x0, 
 > flags=0)
 > 
 > then zeroes up to #130
 
 
 Cheers,
 -- 
 Ruslan Ermilov
 ru at FreeBSD.org
 FreeBSD committer

