Strange panic on ppc64
Justin Hibbits
jhibbits at freebsd.org
Sun Jun 9 21:21:15 UTC 2013
On Sun, Jun 9, 2013 at 8:47 AM, Nathan Whitehorn <nwhitehorn at freebsd.org> wrote:
> On 06/08/13 17:33, Justin Hibbits wrote:
>
>
>
>
> On Sat, Jun 8, 2013 at 7:54 AM, Nathan Whitehorn <nwhitehorn at freebsd.org> wrote:
>
>> On 06/08/13 09:21, Justin Hibbits wrote:
>>
>>
>>
>>
>> On Wed, Jun 5, 2013 at 9:47 AM, Justin Hibbits <jhibbits at freebsd.org> wrote:
>>
>>> Will do, when I get it panicking again.
>>>
>>> - Justin
>>> On Jun 5, 2013 9:46 AM, "Nathan Whitehorn" <nwhitehorn at freebsd.org>
>>> wrote:
>>>
>>>> On 06/04/13 22:35, Justin Hibbits wrote:
>>>>
>>>>> After a string of seemingly random hangs, I added invariants (but not
>>>>> witness) to my custom kernel config, and I get the following panic,
>>>>> recreated from a fuzzy cell phone picture:
>>>>>
>>>>>
>>>>> [thread pid -1 tid 1006665719 ]
>>>>> Stopped at 0: illegal instruction 0
>>>>> db> panic: mutex ohci1 owned at
>>>>> /usr/home/chmeee/freebsd/head/sys/dev/usb/usb_transfer.c:2280
>>>>> cpuid = 0
>>>>> Uptime: 9h8m1s
>>>>> <my dump code>
>>>>> ...
>>>>> panic: msleep1
>>>>> cpu = 0
>>>>> KDB: enter: panic
>>>>> [ thread pid -1 tid 100665719 ]
>>>>> ....
>>>>>
>>>>> The first question I have is how the hell it got such a strange PID/TID.
>>>>> My guess is memory corruption: something is stomping on the pcpu or
>>>>> similar. I think these hangs have only happened since I added a lot
>>>>> more memory (up to 12G from 4G; Andreas Tobler was seeing hangs as
>>>>> well), so it might be something in the moea64 pmap code, but that's
>>>>> pure speculation on my part. Then there are the other panic messages:
>>>>> the owned mutex and the panic in msleep1. I enabled more trace code,
>>>>> so hopefully the next time it panics I can collect better data.
>>>>>
>>>>> - Justin
>>>>>
>>>>
>>>> Could you post the output from show reg? It looks like it tried to jump
>>>> to a null pointer there.
>>>> -Nathan
>>>>
>>>
>> Well, it's hard to get that output, because I just hit that 'mutex
>> owned' panic, and here's the backtrace:
>>
>>
>>
>> The mutex thing is spurious -- it was already panicking and then panicked
>> again trying to panic. Can you get the backtrace for the original panic (it
>> should be different) and the values of the registers?
>> -Nathan
>>
>
> Here you go:
>
> [ thread pid -1 tid 1006665719 ]
> Stopped at 0: illegal instruction 0
> db:0:kdb.enter.default> show reg
> r0 0
> r1 0
> r2 0xab63d0 M_MACTEMP
> r3 0xbb12e0
> r4 0x741f18 .ofwcall+0xa8
> r5 0
> r6 0xa4f1a8
> r7 0x1
> r8 0x1
> r9 0xc10500 __pcpu
> r10 0x1c35ec0
> r11 0
> r12 0x2000d032
> r13 0x342eb000
> r14 0x10014200
> r15 0xffffffffffffcb58
> r16 0x2
> r17 0x2
> r18 0xffffffffffffcb50
> r19 0
> r20 0xc000000013231478
> r21 0xc00000014c0ce200
> r22 0
> r23 0x64 dbsize+0x10
> r24 0xc00000014c0cdf70
> r25 0xb62cb8 smp_no_rendevous_barrier
> r26 0
> r27 0x741f18 .ofwcall+0xa8
> r28 0x741f18 .ofwcall+0xa8
> r29 0x2000d032
> r30 0x9000000000001032
> r31 0xc0cad8 mac_labeled
> srr0 0x102ca4 k_trap+0x28
> srr1 0x9000000000001032
> lr 0x102c74 u_trap+0x10
> ctr 0xff846d78
> cr 0x2000f1b0
> xer 0
> dar 0xfffffffffffffd60
> dsisr 0x42000000
> 0: illegal instruction 0
> db:0:kdb.enter.default> bt
> Tracing pid -1 tid 1006665719 td 0
> (nothing)
>
>
> Well, that is all kinds of messed up. It appears to have halted while
> handling a userland trap due to an implicit branch caused by bad
> translations when it restores the kernel SRs. Could you see what 'show
> pcpu' does? Does that information look valid at all? I suspect it has
> become corrupted somehow.
> -Nathan
>
>
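As a cross-check on the register dump quoted above, the srr1, dsisr and dar
values can be decoded by hand. The sketch below is only an illustration: the
bit masks are the usual 64-bit PowerPC MSR and DSISR definitions as I
understand them, and the names (MSR_BITS, DSISR_BITS, set_flags, to_signed64)
are made up for this example rather than taken from the kernel sources.

```python
# Illustrative decoder for the values printed by 'show reg'.
# Assumption: masks follow the common 64-bit PowerPC MSR/DSISR layout.
MSR_BITS = {
    "SF": 1 << 63,  # 64-bit mode
    "HV": 1 << 60,  # hypervisor state
    "EE": 1 << 15,  # external interrupts enabled
    "PR": 1 << 14,  # problem (user) state
    "FP": 1 << 13,  # floating point available
    "ME": 1 << 12,  # machine check enable
    "IR": 1 << 5,   # instruction address translation on
    "DR": 1 << 4,   # data address translation on
    "RI": 1 << 1,   # recoverable interrupt
}

DSISR_BITS = {
    "NOTFOUND": 0x40000000,  # no translation found for the address
    "PROTECT": 0x08000000,   # protection violation
    "STORE": 0x02000000,     # access was a store
}


def set_flags(value, table):
    """Return the names of the bits from 'table' that are set in 'value'."""
    return [name for name, mask in table.items() if value & mask]


def to_signed64(value):
    """Reinterpret an unsigned 64-bit value as a signed offset."""
    return value - (1 << 64) if value & (1 << 63) else value


srr1 = 0x9000000000001032
dsisr = 0x42000000
dar = 0xFFFFFFFFFFFFFD60

print("srr1 :", set_flags(srr1, MSR_BITS))
print("dsisr:", set_flags(dsisr, DSISR_BITS))
print("dar  :", hex(to_signed64(dar)))
```

Read this way, srr1 has PR and EE clear with IR and DR set, dsisr reports a
store to an address with no translation, and dar is 0x2a0 below zero. With
r1 showing 0 in the dump, that would be consistent with a store relative to
a zeroed stack pointer, though that is a guess from the numbers alone.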
Here's the full log from dconschat, from bootup to panic. Unfortunately,
not everything I wanted to print would print, and I can't type anything
once it panics, because it panics when reading the keyboard, so I have to
add everything as a ddb enter script. Here's what I've added so far
(doesn't do everything as you can see from the transcript):
script kdb.enter.default=show reg; bt; show pcpu; ps; run lockinfo; alltrace; show all procs; show files; show malloc; show allchains
- Justin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: zhabar.dcons
Type: application/octet-stream
Size: 17373 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-ppc/attachments/20130609/b210e5a4/attachment.obj>