Instability likely related to new pmap on Cubieboard A10

Svatopluk Kraus onwahe at gmail.com
Wed Sep 2 14:47:39 UTC 2015


On Tue, Sep 1, 2015 at 3:01 PM, Dmitry Marakasov <amdmi3 at amdmi3.ru> wrote:
> * Svatopluk Kraus (onwahe at gmail.com) wrote:
>
>> >> Thanks. Meantime, I tried most recent HEAD on pandaboard and
>> >> beaglebone black and no problem there. Do you have enabled INVARIANTS
>> >> and INVARIANT_SUPPORT in your config?
>> >
>> > I've enabled them at some point - at least last two runs had these
>> > enabled. Any other way I could help? Maybe I should check if it was
>> > new pmap commit which caused this, and if not, bisect it?
>> >
>>
>> Can you try attached semi-debug patch, please? I want to be sure that
>> problem is not on patched place.
>
> Sorry for delay, I was short on time last week, and then I was busy with
> setting up tftp/nfs netboot for my cubieboard. Now it finally works
> and I'd say it's pretty cool when I can test another build without
> plugging sd card around. Unfortunately, with this setup panic doesn't
> reproduce: there are just around 10 sh(1) segfaults during init, and
> then it boots into somehow usable state. Only once I've had panic with
> your latest patch applied:
>
> https://people.freebsd.org/~amdmi3/pmap4.log
>
> With my new netboot, I plan to try to bisect it; for panic debugging I
> guess I'll have to get back to plugging SD around. If you want me to do
> more panic tests, could we please revisit which patches should be
> applied cause I'm kinda lost in them.
>


Okay then, here is my summary: I thought that the segmentation faults
and the panic(s) were caused by two different things. Now it looks like
they are not. The logs you provided from the old configuration with the
SD card show that something is corrupting kernel memory: three logs and
three quite different corruptions. So now I would like to focus on the
segmentation faults, which point to some corruption too. As you are
the only one who reports this kind of problem, it's probably related to
your hardware or to the way your system is booted.

Thus, if you would be so kind:

(A1) Use only one hardware configuration now. I would like to
investigate the segmentation faults, so you may use netboot.
(A2) Start with a clean, up-to-date kernel without any patches to learn
how the system behaves.
(A3) You can try the old pmap; however, the result has no relevance.
Note that with the old pmap, the system memory layout is different, the
system timing is different (for example, when a process is forking),
and pointless cache and TLB operations are done in addition.
(A4) Apply the attached patch, enable KTR, set KTR_MASK to KTR_TRAP,
and when the system breaks into the debugger, send me the output of the
following commands:

"show ktr" ... at least 10 lines but more is better ;)
"show pmap /u"

(A5) If you type "continue", you can repeat step A4 and send me info
from more segmentation faults at once.
(A6) If you get a panic, send me the kernel file together with the
panic backtrace.

(B) Another thing you could try is to omit as many devices as possible
from your configuration, mainly the ones which use DMA. For example,
boot from the SD card without the network driver compiled in, and vice
versa.

(C) If you think that there was a kernel revision including the new
pmap which worked without problems, please confirm that and tell me
which one it was.
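
For step (A4), a kernel config fragment along the following lines
should be enough. This is only a sketch: the KTR options are the ones
described in ktr(4), but the KTR_ENTRIES value is an arbitrary choice,
KTR_COMPILE is listed only on the assumption that you want nothing but
the trap class compiled in, and KDB/DDB are listed on the assumption
that they are not already in your config (the breakpoint() call in the
patch needs a debugger to break into).

```
options 	KDB			# assumed: kernel debugger framework
options 	DDB			# assumed: provides the "show" commands
options 	KTR			# enable kernel tracing
options 	KTR_COMPILE=KTR_TRAP	# assumed: compile in the trap class only
options 	KTR_MASK=KTR_TRAP	# record the trap class from boot
options 	KTR_ENTRIES=8192	# arbitrary trace buffer size
```

If KTR is already compiled in, the mask can probably be changed at run
time through the debug.ktr.mask sysctl instead of rebuilding.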

Svata
-------------- next part --------------
Index: sys/arm/arm/trap-v6.c
===================================================================
--- sys/arm/arm/trap-v6.c	(revision 287394)
+++ sys/arm/arm/trap-v6.c	(working copy)
@@ -167,7 +167,27 @@
 	{abort_fatal,	"Undefined Code (0x40F)"}
 };
 
+static void
+cpu_tracesigexit(struct thread *td)
+{
+	struct trapframe *tf;
 
+	tf = td->td_frame;
+	if (tf == NULL)
+		return;
+
+	CTR3(KTR_TRAP, "pc 0x%08x usr_lr 0x%08x usr_sp 0x%08x",
+	    tf->tf_pc, tf->tf_usr_lr, tf->tf_usr_sp);
+	CTR3(KTR_TRAP, "spsr 0x%08x svc_lr 0x%08x svc_sp 0x%08x",
+	    tf->tf_spsr, tf->tf_svc_lr, tf->tf_svc_sp);
+	CTR5(KTR_TRAP, "r0 0x%08x r1 0x%08x r2 0x%08x r3 0x%08x r4 0x%08x",
+	    tf->tf_r0, tf->tf_r1, tf->tf_r2, tf->tf_r3, tf->tf_r4);
+	CTR5(KTR_TRAP, "r5 0x%08x r6 0x%08x r7 0x%08x r8 0x%08x r9 0x%08x",
+	    tf->tf_r5, tf->tf_r6, tf->tf_r7, tf->tf_r8, tf->tf_r9);
+	CTR3(KTR_TRAP, "r10 0x%08x r11 0x%08x r12 0x%08x",
+	    tf->tf_r10, tf->tf_r11, tf->tf_r12);
+}
+
 static __inline void
 call_trapsignal(struct thread *td, int sig, int code, vm_offset_t addr)
 {
@@ -176,6 +196,9 @@
 	CTR4(KTR_TRAP, "%s: addr: %#x, sig: %d, code: %d",
 	   __func__, addr, sig, code);
 
+	cpu_tracesigexit(td);
+	breakpoint();
+
 	/*
 	 * TODO: some info would be nice to know
 	 * if we are serving data or prefetch abort.

