A good backtrace from a head -r317820 powerpc random/periodic panic: execution of garbage at 0x0090a030 (in .hash section) [better bt; more]

Mark Millard markmi at dsl-only.net
Sat May 27 08:17:26 UTC 2017


[I suggest a patch this time.]

On 2017-May-27, at 12:42 AM, Mark Millard <markmi at dsl-only.net> wrote:

> [Top post of the answer to what is wrong. I
> have submitted 219589 for this.]
> 
> TARGET_ARCH=powerpc64 got a fix to
> bugzilla 205458, avoiding inappropriate
> restoration of openfirmware's sprg0
> value.
> 
> It turns out that TARGET_ARCH=powerpc
> needs to detect when it is running on
> powerpc64 and do the same thing if it
> is to avoid trashing memory and running
> unreliably on PowerMac G5's.
> 
> The issue is that for powerpc64 it is
> inappropriate to restore the sprg0
> value to its openfirmware value. This
> is because the FreeBSD real-mode
> handling is involved instead of the
> openfirmware's original virtual mode,
> making openfirware's value simply
> inappropriate.
> 
> Quoting Nathan W. from Comment
> #4 of 205458:
> 
>> Where this explodes is if OF uses an unmapped SLB entry.
>> The SLB fault handler runs in real mode and refers to the
>> PCPU pointer in SPRG0, which blows up the kernel. Having
>> a value of SPRG0 that works for the kernel is less fatal
>> than preserving OF's value in this case.
> 
> I know that part of the code does detect
> the powerpc64 context vs. not and does
> things differently to emulate being powerpc
> like on powerpc64 (such as limiting
> RAM use as a consequence).
> 
> The powerpc64 vs. not status needs to be
> recorded and used to control a sprg0
> restoration choice: avoid restoring
> openfirmware's value on powerpc64;
> otherwise restore it.

I expect that the following patch would
fix the problem:

# svnlite diff /usr/src/sys/powerpc/ofw/ofw_machdep.c
Index: /usr/src/sys/powerpc/ofw/ofw_machdep.c
===================================================================
--- /usr/src/sys/powerpc/ofw/ofw_machdep.c	(revision 317820)
+++ /usr/src/sys/powerpc/ofw/ofw_machdep.c	(working copy)
@@ -147,7 +147,8 @@
 	 * PCPU data cannot be used until this routine is called !
 	 */
 #ifndef __powerpc64__
-	__asm __volatile("mtsprg0 %0" :: "r"(ofw_sprg0_save));
+	if (cpu_features & PPC_FEATURE_64 != PPC_FEATURE_64)
+		__asm __volatile("mtsprg0 %0" :: "r"(ofw_sprg0_save));
 #endif
 }
 #endif

This is based on cpu_features already having had
PPC_FEATURE_64 masked in before this if things
are running on a PowerMac G5 or other powerpc64.


===
Mark Millard
markmi at dsl-only.net

[The original (better) panic evidence. . .]

On 2017-May-26, at 10:29 PM, Mark Millard <markmi at dsl-only.net> wrote:

> [Additional information that does not need to
> interlace with the prior material, so see
> after. A somewhat better backtrace reported
> by ddb. And so on.]
> 
> 
> On 2017-May-26, at 7:14 PM, Mark Millard <markmi at dsl-only.net> wrote:
> 
>> I lucked out and got a vmcore.9 for a random
>> panic that I could manage to backtrace for
>> one of my test builds of -r317820. It appears
>> that not all that much happened before it got
>> the panic so much context was better preserved
>> this time.
>> 
>> (I do not explore from ddb as I've had that
>> panic and mess up the dump just made by
>> replacing it. So this is a manual backtrace
>> from the debug.minidump=0 style vmcore.9
>> file. objdump was used on the
>> /boot/kernel/kernel to find code.)
>> 
>> Being able to see the problem is very
>> sensitive to kernel memory layout. This
>> is why I'm sticking with -r317820 built
>> production style: the kind of context
>> the problem was first observed in.
>> Attempting a debug kernel build simply
>> did not repeat the problem for days
>> (vs. the usual hours for builds like
>> this).
>> 
>> 
>> So below is the backtrace:
>> 
>> (I do not show what trap_fatal calls:
>> starting with trap_fatal and going toward
>> larger memory addresses. . .)
>> 
>> [vmcore.9's
>> offset
>> in file
>> when no
>> 0x prefix]
>> 
>> 013e83b0  df 5e 55 50 00 8f 34 5c  fa 50 05 af fa 50 05 af  |.^UP..4\.P...P..|
>> 0x008f3454 <trap+0x1228> mr      r3,r26
>> 0x008f3458 <trap+0x122c> bl      008f2030 <trap_fatal>
>> 0x008f345c <trap+0x1230> b       008f34c8 <trap+0x129c>
>> 
>> 013e83c0  fa 50 05 af fa 50 05 af  fa 50 05 af fa 50 05 af  |.P...P...P...P..|
>> *
>> 013e83e0  df 5e 54 00 fa 50 05 af  fa 50 05 af fa 50 05 af  |.^T..P...P...P..|
>> 013e83f0  fa 50 05 af fa 50 05 af  00 d1 ca ac df 5e 54 00  |.P...P.......^T.|
>> 013e8400  df 5e 54 20 00 53 3a d0  fa 50 05 af fa 50 05 af  |.^T .S:..P...P..|
>> 013e8410  fa 50 05 af fa 50 05 af  00 d1 ca ac df 5e 54 20  |.P...P.......^T |
>> 013e8420  df 5e 54 d0 00 53 3a d0  00 00 00 00 00 00 00 00  |.^T..S:.........|
>> 013e8430  00 00 00 00 00 00 00 00  00 d1 ca ac 00 00 00 0f  |................|
>> 013e8440  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
>> *
>> 013e8460  00 00 00 54 7f ff ff ff  00 00 00 00 ff ff ff aa  |...T............|
>> 013e8470  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
>> *
>> 013e8490  00 d4 c4 5c 00 d4 c4 5c  00 17 99 dd 00 17 9a 5c  |...\...\.......\|
>> 013e84a0  00 00 17 9a 00 d4 c4 6c  df 5e 54 ec 00 00 00 54  |.......l.^T....T|
>> 013e84b0  df 5e 54 d0 7f ff ff ff  05 aa 36 c0 05 aa 39 e8  |.^T.......6...9.|
>> 013e84c0  00 00 00 00 00 00 00 00  00 d1 ca ac df 5e 54 d0  |.............^T.|
>> 013e84d0  df 5e 55 80 00 53 3a d0  05 91 d0 00 05 91 d3 28  |.^U..S:........(|
>> 013e84e0  df 5e 55 00 00 00 00 00  00 d1 ca ac 00 00 00 0f  |.^U.............|
>> 013e84f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
>> 013e8500  00 00 00 00 00 00 00 00  00 d4 bd ec 00 cb 98 98  |................|
>> 013e8510  00 d4 c4 5c 00 d4 c4 5c  00 17 99 f8 00 17 99 f8  |...\...\........|
>> 013e8520  00 00 17 99 fa 69 b8 79  00 17 9a 0a 00 00 17 99  |.....i.y........|
>> 013e8530  f8 1c 2f cc df 5e 55 88  01 47 d6 c0 01 43 a2 00  |../..^U..G...C..|
>> 013e8540  41 eb 30 00 0a 00 00 00  00 d2 6e 4c df 5e 55 50  |A.0.......nL.^UP|
>> 
>> 013e8550  df 5e 55 80 00 8e 7d 40  df 5e 55 9c 00 00 00 28  |.^U...}@.^U....(|
>> 0x008e7d28 <powerpc_interrupt+0x184> mfmsr   r0
>> 0x008e7d2c <powerpc_interrupt+0x188> or      r0,r0,r9
>> 0x008e7d30 <powerpc_interrupt+0x18c> mtmsr   r0
>> 0x008e7d34 <powerpc_interrupt+0x190> isync
>> 0x008e7d38 <powerpc_interrupt+0x194> mr      r3,r25
>> 0x008e7d3c <powerpc_interrupt+0x198> bl      008f222c <trap>
>> 0x008e7d40 <powerpc_interrupt+0x19c> lwz     r11,0(r1)
>> 
>> 013e8560  00 00 00 00 00 00 17 9a  06 40 c2 a8 01 43 a2 00  |......... at ...C..|
>> 013e8570  41 eb 30 00 0a 00 00 00  00 00 00 00 00 08 10 32  |A.0............2|
>> 
>>                                    r0          r1
>> 013e8580  df 5e 56 40 00 10 08 f8  00 00 00 04 df 5e 56 40  |.^V at .........^V@|
>> The struct trapframe starts is 0x 013e8588 in the
>> vmcore.9 file. The 0x00108f8 is as shown below:
>> 
>> 0x001008ec <k_trap+0x118> isync
>> 0x001008f0 <trapagain> addi    r3,r1,8
>> 0x001008f4 <trapagain+0x4> bl      008e7ba4 <powerpc_interrupt>
>> 0x001008f8 <trapexit> mfmsr   r3
>> 0x001008fc <trapexit+0x4> andi.   r3,r3,32767
>> 
>> 013e8590  01 47 d6 c0 00 00 00 28  01 47 c6 c0 00 00 00 04  |.G.....(.G......|
>>             r2          r3           r4         r5
>> 
>> 013e85a0  00 00 00 04 00 00 00 0f  00 00 00 00 00 d4 c0 3c  |...............<|
>>             r6          r7           r8         r9
>> 
>> 013e85b0  01 47 d6 c0 df 5e 56 80  14 3d 60 da 00 00 00 00  |.G...^V..=`.....|
>>            r10         r11          r12         r13
>> 
>> 013e85c0  00 d4 bd ec 00 cb 98 98  00 d4 c4 5c 00 d4 c4 5c  |...........\...\|
>>            r14         r15          r16         r17
>> 
>> 013e85d0  00 17 99 f8 00 17 99 f8  00 00 17 99 fa 69 b8 79  |.............i.y|
>>            r18         r19          r20         r21
>> 
>> 013e85e0  00 17 9a 0a 00 00 17 99  f8 1c 2f cc 00 00 17 9a  |........../.....|
>>            r22         r32          r24         r25
>> 
>> 013e85f0  06 40 c2 a8 01 43 a2 00  00 eb a7 80 01 47 d6 c0  |. at ...C.......G..|
>>            r26         r27          r28         r29
>> 
>> 013e8600  00 d1 ca ac df 5e 56 40  00 53 5a d0 20 00 90 44  |.....^V at .SZ. ..D|
>>            r30         r31           lr          cr
>> 
>>            xer         ctr          srr0     srr1
>> 013e8610  00 00 00 00 00 00 00 00  00 90 a0 30 00 08 10 32  |...........0...2|
>> Note: objdump shows no code at 0x0090a030. 0x0090a030 is
>> inside what objdump -x reports as the section .hash (Idx
>> 2).
>> 
>>            exc         dar         dsisr    (booke dbcer0)
>> 013e8620  00 00 07 00 41 eb 30 00  0a 00 00 00 01 47 c6 c0  |....A.0......G..|
>> The 0x00007000 above is the framep->exc (exception code)
>> (program in this case).
>> 
>> 013e8630  00 eb a7 80 01 47 60 10  00 d1 ca ac df 5e 56 40  |.....G`......^V@|
>> 
>> At this point the above does not match the below
>> part of the stack trace.
>> 
>> The lr part of struct trapframe was: 0x00535ad0 so
>> showing around that:
>> 
>> 00535ab8 <sched_affinity> stwu    r1,-32(r1)
>> 00535abc <sched_affinity+0x4> mflr    r0
>> 00535ac0 <sched_affinity+0x8> stw     r29,20(r1)
>> 00535ac4 <sched_affinity+0xc> stw     r30,24(r1)
>> 00535ac8 <sched_affinity+0x10> stw     r31,28(r1)
>> 00535acc <sched_affinity+0x14> stw     r0,36(r1)
>> 00535ad0 <sched_affinity+0x18> mr      r31,r1
>> 00535ad4 <sched_affinity+0x1c> mr      r29,r3
>> 
>> Back to the stack backtrace. . .
>> 
>> 013e8640  df 5e 56 80 00 53 59 dc  00 17 99 f8 00 17 99 f8  |.^V..SY.........|
>> 0x005359c8 <sched_add+0x18c> bl      008ea420 <spinlock_exit>
>> 0x005359cc <sched_add+0x190> mr      r3,r28
>> 0x005359d0 <sched_add+0x194> mr      r4,r27
>> 0x005359d4 <sched_add+0x198> mr      r5,r25
>> 0x005359d8 <sched_add+0x19c> bl      005356ec <tdq_add>
>> 0x005359dc <sched_add+0x1a0> mfsprg  r9,0
>> 
>> I show around 0x005356ec
>> from the sched_add+0x19c bl that is above
>> because the routine is not referenced
>> in the stack tracce but the above indicates
>> that it should have been called:
>> 0x005356ec <tdq_add> stwu    r1,-32(r1)
>> 0x005356f0 <tdq_add+0x4> mflr    r0
>> 0x005356f4 <tdq_add+0x8> stw     r28,16(r1)
>> 0x005356f8 <tdq_add+0xc> stw     r29,20(r1)
>> 0x005356fc <tdq_add+0x10> stw     r30,24(r1)
>> 0x00535700 <tdq_add+0x14> stw     r31,28(r1)
>> 0x00535704 <tdq_add+0x18> stw     r0,36(r1)
>> . . .
>> 
>> Back to the stack backtrace again. . .
>> 
>> 013e8650  00 00 17 99 fa 69 b8 79  00 17 9a 0a 00 00 00 04  |.....i.y........|
>> 013e8660  f8 1c 2f cc 00 00 17 9a  06 40 c2 a8 01 43 a2 00  |../...... at ...C..|
>> 013e8670  01 47 c6 c0 01 47 60 10  00 d1 b4 30 df 5e 56 80  |.G...G`....0.^V.|
>> 
>> 013e8680  df 5e 56 b0 00 4a 87 8c  00 d2 5b 10 00 00 00 04  |.^V..J....[.....|
>> 0x004a8780 <intr_event_schedule_thread+0xc4> mr      r3,r28
>> 0x004a8784 <intr_event_schedule_thread+0xc8> li      r4,4
>> 0x004a8788 <intr_event_schedule_thread+0xcc> bl      0053583c <sched_add>
>> 0x004a878c <intr_event_schedule_thread+0xd0> lwz     r9,0(r28)
>> 
>> 013e8690  df 5e 56 b0 00 00 17 9a  06 40 c2 a8 01 43 a2 00  |.^V...... at ...C..|
>> 013e86a0  00 00 00 00 01 46 81 c0  00 d1 b4 30 df 5e 56 b0  |.....F.....0.^V.|
>> 
>> 013e86b0  df 5e 56 f0 00 4a 97 0c  00 00 00 00 00 00 00 04  |.^V..J..........|
>> 0x004a9700 <swi_sched+0xa4> bl      005000ec <critical_exit>
>> 0x004a9704 <swi_sched+0xa8> mr      r3,r27
>> 0x004a9708 <swi_sched+0xac> bl      004a86bc <intr_event_schedule_thread>
>> 0x004a970c <swi_sched+0xb0> lwz     r11,0(r1)
>> 
>> 013e86c0  df 5e 56 e0 01 47 d6 c0  01 47 d6 c0 01 45 4d 40  |.^V..G...G...EM@|
>> 013e86d0  df 5e 56 f0 00 8e a4 44  06 40 c2 a8 00 00 17 9a  |.^V....D. at ......|
>> 013e86e0  78 00 00 00 00 e9 56 00  00 d1 c8 20 df 5e 56 f0  |x.....V.... .^V.|
>> 
>> 013e86f0  df 5e 57 50 00 51 79 6c  df 5e 58 78 01 47 d6 c0  |.^WP.Qyl.^Xx.G..|
>> 0x00517960 <callout_process+0x420> lwz     r3,264(r29)
>> 0x00517964 <callout_process+0x424> li      r4,0
>> 0x00517968 <callout_process+0x428> bl      004a965c <swi_sched>
>> 0x0051796c <callout_process+0x42c> lwz     r11,0(r1)
>> 
>> 013e8700  01 47 d7 b8 00 00 00 00  00 d1 ab 24 00 00 00 04  |.G.........$....|
>> 013e8710  00 c9 66 bc 00 c4 5d 48  00 c9 66 bc 00 d4 c5 3c  |..f...]H..f....<|
>> 013e8720  00 d0 53 00 00 eb a7 80  00 00 00 01 00 00 00 00  |..S.............|
>> 013e8730  df 5e 59 8c 00 00 00 00  df 5e 58 78 00 00 17 99  |.^Y......^Xx....|
>> 013e8740  f8 1c 2f cc d0 01 dd 00  00 d2 5b 10 df 5e 57 50  |../.......[..^WP|
>> 
>> 013e8750  df 5e 57 a0 00 8a b2 70  df 5e 57 60 df 5e 57 60  |.^W....p.^W`.^W`|
>> 0x008ab264 <handleevents+0x2a4> mr      r3,r27
>> 0x008ab268 <handleevents+0x2a8> mr      r4,r28
>> 0x008ab26c <handleevents+0x2ac> bl      00517540 <callout_process>
>> 0x008ab270 <handleevents+0x2b0> li      r3,0
>> 
>> 013e8760  df 5e 57 a0 df 5e 58 78  05 86 37 00 00 00 00 04  |.^W..^Xx..7.....|
>> 013e8770  00 00 00 00 05 9b f2 00  00 c9 66 bc 01 47 d6 c0  |..........f..G..|
>> 013e8780  df 5e 59 8c 00 f6 1d 10  00 00 17 99 f8 1c 2f cc  |.^Y.........../.|
>> 013e8790  d0 01 dd 00 d0 01 dd 30  00 d2 5b 10 df 5e 57 a0  |.......0..[..^W.|
>> 
>> 013e87a0  df 5e 58 20 00 8a d1 10  00 d2 6e 5c df 5e 57 b0  |.^X ......n\.^W.|
>> 0x008ad100 <timercb+0x4b8> mr      r3,r26
>> 0x008ad104 <timercb+0x4bc> mr      r4,r27
>> 0x008ad108 <timercb+0x4c0> li      r5,0
>> 0x008ad10c <timercb+0x4c4> bl      008aafc0 <handleevents>
>> 0x008ad110 <timercb+0x4c8> lwz     r11,0(r1)
>> 
>> 013e87b0  df 5e 57 e0 00 4a 96 00  00 00 17 99 00 00 00 00  |.^W..J..........|
>> 013e87c0  f8 1c 2f cc 0a 3f a0 12  df 5e 58 78 05 86 37 00  |../..?...^Xx..7.|
>> 013e87d0  01 48 b1 00 05 86 37 80  00 d4 bd ec 00 cb 98 98  |.H....7.........|
>> 013e87e0  00 c9 66 bc 00 c4 5d 48  00 c9 66 bc 00 d4 c5 3c  |..f...]H..f....<|
>> 013e87f0  df 5e 59 e0 00 eb a7 80  00 c9 66 bc 01 47 d6 c0  |.^Y.......f..G..|
>> 013e8800  df 5e 59 8c df 5e 58 78  01 47 d6 c0 00 00 00 00  |.^Y..^Xx.G......|
>> 013e8810  00 f6 1d 10 00 00 00 01  00 d2 6b c8 df 5e 58 20  |..........k..^X |
>> 
>> 013e8820  df 5e 58 40 00 8e 1e 48  00 00 00 00 00 eb a7 80  |.^X at ...H........|
>> So 0x13e8820 from vmcore.9 has start of struct trapeframe.
>> The 0x8e1e48 is from:
>> 0x008e1e34 <decr_intr+0xe0> lwz     r0,56(r28)
>> 0x008e1e38 <decr_intr+0xe4> mtctr   r0
>> 0x008e1e3c <decr_intr+0xe8> mr      r3,r28
>> 0x008e1e40 <decr_intr+0xec> lwz     r4,64(r28)
>> 0x008e1e44 <decr_intr+0xf0> bctrl
>> 0x008e1e48 <decr_intr+0xf4> addi    r29,r29,-1
>> 
>> 013e8830  01 47 d7 94 00 00 00 01  00 d2 6e 4c df 5e 58 40  |.G........nL.^X@|
>> 013e8840  df 5e 58 70 00 8e 7c 9c  00 d1 ca ac df 5e 58 50  |.^Xp..|......^XP|
>> 013e8850  00 cd f0 74 00 00 00 01  00 00 00 01 00 eb a7 80  |...t............|
>> 013e8860  41 eb 30 00 0a 00 00 00  00 00 00 00 00 00 90 32  |A.0............2|
>> 013e8870  df 5e 59 30 00 10 08 f8  00 04 90 32 df 5e 59 30  |.^Y0.......2.^Y0|
>> 013e8880  01 47 d6 c0 00 00 00 00  0d 0b bf e3 00 00 00 00  |.G..............|
>> 013e8890  0d 0b bf e3 00 19 eb 7c  00 00 00 00 00 00 00 44  |.......|.......D|
>> 013e88a0  01 fc a0 55 00 00 90 32  d0 01 dd 00 00 00 00 00  |...U...2........|
>> 013e88b0  00 d4 bd ec 00 cb 98 98  00 c9 66 bc 00 c4 5d 48  |..........f...]H|
>> 013e88c0  00 c9 66 bc 00 d4 c5 3c  df 5e 59 e0 00 eb a7 80  |..f....<.^Y.....|
>> 013e88d0  00 c9 66 bc 01 47 d6 c0  df 5e 59 8c 00 00 00 01  |..f..G...^Y.....|
>> 013e88e0  00 00 00 01 00 eb a7 80  00 00 00 00 00 8e 3b f8  |..............;.|
>> 
>>                                     srr0
>> 013e88f0  00 d2 6b f0 df 5e 59 30  00 8e 3c 14 40 00 00 42  |..k..^Y0..<. at ..B|
>> The 0x00833c14 from the trap frame is the srr0 (conceptual lr):
>> 0x008e3c04 <cpu_idle_60x+0xc> stw     r31,28(r1)
>> 0x008e3c08 <cpu_idle_60x+0x10> stw     r0,36(r1)
>> 0x008e3c0c <cpu_idle_60x+0x14> mr      r31,r1
>> 0x008e3c10 <cpu_idle_60x+0x18> bcl-    20,4*cr7+so,008e3c14 <cpu_idle_60x+0x1c>
>> 0x008e3c14 <cpu_idle_60x+0x1c> mflr    r30
>> 0x008e3c18 <cpu_idle_60x+0x20> lwz     r0,-32(r30)
>> 
>> 013e8900  20 00 00 00 00 8e 3b f8  00 8e 3c 80 00 00 90 32  | .....;...<....2|
>> 
>>            exc
>> 013e8910  00 00 09 00 41 eb 30 00  0a 00 00 00 00 00 00 00  |....A.0.........|
>> The 0x00009000 above is the framep->exc (exception code)
>> (decrementer in this case).
>> 
>> 013e8920  eb 10 25 b5 55 7e c4 22  00 00 00 00 00 00 00 04  |..%.U~."........|
>> 
>> 013e8930  df 5e 59 50 00 00 00 01  00 00 00 01 00 eb a7 80  |.^YP............|
>> (trap [above] means no lr filled in here)
>> 
>> 013e8940  00 00 00 00 00 d4 ca 34  00 d2 6b f0 df 5e 59 50  |.......4..k..^YP|
>> 
>> 013e8950  df 5e 59 70 00 8e 31 9c  00 00 00 02 00 eb a7 80  |.^Yp..1.........|
>> 0x008e318c <cpu_idle+0x48> bl      008ad8b8 <cpu_idleclock>
>> 0x008e3190 <cpu_idle+0x4c> lwz     r29,0(r29)
>> 0x008e3194 <cpu_idle+0x50> mtctr   r29
>> 0x008e3198 <cpu_idle+0x54> bctrl
>> 0x008e319c <cpu_idle+0x58> bl      008ad7b8 <cpu_activeclock>
>> 
>> 013e8960  00 f2 d5 fc 00 00 00 01  00 d1 ca ac df 5e 59 70  |.............^Yp|
>> 
>> 013e8970  df 5e 5a 50 00 53 6e 7c  fa 50 05 af fa 50 05 af  |.^ZP.Sn|.P...P..|
>> 0x00536e78 <sched_idletd+0x4d0> bl      008e3144 <cpu_idle>
>> 0x00536e7c <sched_idletd+0x4d4> stw     r28,136(r27)
>> 
>> 
>> 013e8980  fa 50 05 af fa 50 05 af  fa 50 05 af ff ff ff fe  |.P...P...P......|
>> 013e8990  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
>> 013e89a0  ff ff ff ff ff ff ff ff  ff ff ff ff fa 50 05 af  |.............P..|
>> 013e89b0  fa 50 05 af 00 00 00 02  ff ff ff ff 00 00 01 f0  |.P..............|
>> 013e89c0  ff ff ff fe ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
>> 013e89d0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
>> 013e89e0  ff ff ff fe ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
>> 013e89f0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
>> 013e8a00  df 5e 5a 20 fa 50 05 af  00 00 00 00 00 00 00 00  |.^Z .P..........|
>> 013e8a10  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
>> *
>> 013e8a30  00 00 00 00 00 53 69 a8  df 5e 5a 98 00 00 00 00  |.....Si..^Z.....|
>> 013e8a40  01 47 96 e0 01 47 d6 c0  00 d1 b3 70 df 5e 5a 50  |.G...G.....p.^ZP|
>> 
>> 013e8a50  df 5e 5a 80 00 4a 3c b4  df 5e 5a 60 df 5e 5a 60  |.^Z..J<..^Z`.^Z`|
>> 0x004a3ca0 <fork_exit+0xe4> bl      008ea420 <spinlock_exit>
>> 0x004a3ca4 <fork_exit+0xe8> mr      r3,r27
>> 0x004a3ca8 <fork_exit+0xec> mr      r4,r26
>> 0x004a3cac <fork_exit+0xf0> mtctr   r25
>> 0x004a3cb0 <fork_exit+0xf4> bctrl
>> 0x004a3cb4 <fork_exit+0xf8> lwz     r0,108(r28)
>> 
>> 013e8a60  df 5e 5a 80 00 00 00 00  00 00 00 00 00 00 00 00  |.^Z.............|
>> 013e8a70  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
>> 
>> 013e8a80  00 00 00 00 00 8f 18 d0  00 53 69 a8 00 00 00 00  |.........Si.....|
>> 0x008f18c0 <fork_trampoline> lwz     r3,8(r1)
>> 0x008f18c4 <fork_trampoline+0x4> lwz     r4,12(r1)
>> 0x008f18c8 <fork_trampoline+0x8> lwz     r5,16(r1)
>> 0x008f18cc <fork_trampoline+0xc> bl      004a3bbc <fork_exit>
>> 0x008f18d0 <fork_trampoline+0x10> addi    r1,r1,16
>> 0x008f18d4 <fork_trampoline+0x14> b       001008f8 <trapexit>
>> 
>> 
>> So that is an example context for the failure.
>> 
>> It has taken weeks to get this. It may be some
>> time before I get another for comparison/contrast.
>> 
>> And sticking with the same build vs. trying some more
>> to find a better way to get more evidence is not
>> obvious to me at this point.
> 
> I should have mentioned that this is
> TARGET_ARCH=powerpc but used on a so-called PowerMac
> G5 "Quad Core".
> 
> I've had 2 more examples, with the same 0x0090a030
> srr0 and such (vmcore.0 and vmcore.1) (I have added
> one more little block of code for detecting an earlier
> problem symptom not being currently seen so the build
> is slight different from the earlier report.)
> 
> Looking around in vmcore.0 I find 3 examples of
> "00 90 a0 30" in areas overlapping with objdump -x's
> coverage. . .
> 
> 00c77fd0  00 00 00 00 00 00 00 00  00 90 a0 30 00 08 10 32  |...........0...2|
>                                  [         ]
> 
> (from sorted objdump -x output:)
> 00c775a8 l     O .data  00000dc0 seqprog
> 00c78368 l     O .data  0000000c seeprom_long_ewen
> 
>                                  [         ]
> 00f65820  41 eb 30 00 0a 00 00 00  00 90 a0 30 00 08 10 32  |A.0........0...2|
> . . .
> 00f65870  00 90 a0 30 00 08 10 32  00 00 00 00 df 5d 30 00  |...0...2.....]0.|
>         [         ]
> 
> (from sorted objdump -x output:)
> 00f64780 g     O .bss   00020000 __pcpu
> 00f84780 l     O .bss   00000004 ap_letgo
> 
> 
> For vmcore.1 I went ahead and tried exploring
> some with ddb --and had no new panics. . .
> (Hand transcriptions of pictures)
> 
> fatal kernel trap:
> exception = 0x700 (program)
> srr0      = 0x90a030
> srr1      = 0x81032
> lr        = 0x535ad0
> curthread = 0x147d6c0
> pic = 11, comm = idle: cpu0
> 
> [ thread pid 11 tid 100003 ]
> Stopped at _etext+0xb8fc: illegal instruction 0
> db> bt
> 0xdf5e55d0: at sched_wakeup+0xa4
> 0xdf5e55f0: at setrunnable+0x9c
> 0xdf5e5610: at sleepq_resume_thread+0x17c
> 0xdf5e5640: at sleepq_timeout+0xc8
> 0xdf5e5680: at softclock_call_cc+0x1f0
> 0xdf5e56f0: at callout_process+0x27c
> 0xdf5e57a0: at timercb+0x4c4
> 0xdf5e5820: at decr_intr+0xf0
> 0xdf5e5840: at powerpc_interrupt_0xf4
> 0xdf5e5870: at kernel DECR trap
>              by cpu_idle_60x+0x88 (so: srr0)
>              srr1=0x9032
>              r1  =0xdf5e5930
>              cr  =0x40000042
>              xer =0x20000000
>              ctr =0x8e3bf8
> saved LR(0xfffffffd) is invalid.
> 
> db> show reg (but reformatted
> r0   =0x4
> r1   =0xdf5e5590
> r2   =0x147d6c0
> r3   =0x54       testppc64size+0x34
> r4   =0x591d000
> r5   =0
> r6   =0
> r7   =0xf
> r8   =0
> r9   =0xd4c03c   cold
> r10  =0x147d6c0
> r11  =0xdf5e55d0
> r12  =0
> r13  =0
> r14  =0xd4bdec   sdt_probe_func
> r15  =0xcb9898   std_lockstat___spin__release
> r16  =0xd4c45c   callwheelmask
> r17  =0xd4c45c   callwheelmask
> r18  =0x55925
> r19  =0x559a4
> r20  =0x559      dsmisssize+0x469
> r21  =0x591d000
> r22  =0x566430   sleeppq_timeout
> r23  =0x114      dsmisssize+0x24
> r24  =0
> r25  =0
> r26  =0x1
> r27  =0
> r28  =0xeba780   tdq_cpu
> r29  =0x147d6c0
> r30  =0xd1caac
> r31  =0xdf5e5590
> srr0 =0x90a030
> srr1 =0x81032
> lr   =0x535ad0   shed_affinity+0x18
> ctr  =0
> cr   =0x20009034
> xer  =0
> dar  =0x419df5d4
> dsisr=0x24000000
> _etext+0xb8fc: illegal instruction 0
> 
> 
> Just for completeness: acttrace also showed:
> 
> Tracing command less pid 1144 tid 100150 td 0x5bc8a20 (CPU 3)
> 0xd25a59f0: at powrpc_dispatch_intr+0xc8
> 0xd25a5a20: at openpic_dispatch+0x90
> 0xd25a5a50: at powerpc_interrupt+0xc0
> 0xd25a5a80: at user EXI trap
>              by 0x181c68c (so: ssr0)
>              r1 =0xffffdb30
>              cr =0x44020624
>              xer=0
>              ctr=0x41989570
> 
> Tracing command idle pid 11 tid 100004 td 0x147d360 (CPU 1)
> saved LR(0x4c) in invalid
> 
> Tracing command idle pid 11 tid 100005 td 0x147d360 (CPU 2)
> saved LR(0x4c) in invalid




More information about the freebsd-ppc mailing list