[Bug 225321] dtrace/powerpc64: System crash

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Wed Feb 21 00:28:26 UTC 2018


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225321

--- Comment #3 from Breno Leitao <breno.leitao at gmail.com> ---
I did some further debugging, and this is what I found:

1) The very first problem is that we are not able to leave kdb. If we just
enter kdb and continue with 'c', we get:

db> c   

fatal kernel trap:

   exception       = 0x400 (instruction storage interrupt)
   virtual address = 0x426f6f7420666c60
   srr0            = 0x426f6f7420666c60 (0x426f6f7420666c60)
   srr1            = 0x8000000040001032
   lr              = 0x426f6f7420666c61 (0x426f6f7420666c61)
   curthread       = 0x1441460
          pid = 0, comm = 


We are jumping to an address whose bytes (taken from lr) spell the ASCII
string "Boot fla", which is clearly wrong.
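
The reading of the faulting address as text can be checked with a few lines
of C. This is only a minimal sketch: u64_to_ascii() is a hypothetical helper
name, not something from the FreeBSD tree, and it assumes the value is
interpreted most significant byte first, as on big-endian powerpc64.

```c
#include <ctype.h>
#include <stdint.h>

/*
 * Write the eight bytes of `value`, most significant byte first, into
 * `out` as characters. Non-printable bytes are shown as '.'.
 * `out` must have room for 8 characters plus the terminating NUL.
 * (Hypothetical helper, for illustration only.)
 */
void
u64_to_ascii(uint64_t value, char out[9])
{
        for (int i = 0; i < 8; i++) {
                unsigned char byte = (unsigned char)(value >> (56 - 8 * i));

                out[i] = isprint(byte) ? (char)byte : '.';
        }
        out[8] = '\0';
}
```

Passing the lr value 0x426f6f7420666c61 leaves "Boot fla" in the buffer.
srr0 differs only in the last byte (0x60 instead of 0x61), because the two
low-order bits of a branch target are ignored.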

Debugging the code, I found exactly what is happening.

On dbtrap->dbleave, we call FRAME_LEAVE(), which leaves SRR0 = 0x661658
and SRR1 = 0x8000000000001032, and then calls rfid. SRR0 and SRR1 are sane.
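
As a cross-check on why that SRR1 value looks sane, the MSR bits it carries
can be decoded. The sketch below is illustrative only: msr_describe() and the
table are hypothetical names, and the masks are the standard Power ISA MSR
bit positions rather than anything copied from the kernel headers.

```c
#include <stdint.h>
#include <stdio.h>

/* A subset of the Power ISA MSR bits (hypothetical table, for illustration). */
struct msr_bit {
        uint64_t    mask;
        const char *name;
};

static const struct msr_bit msr_bits[] = {
        { 0x8000000000000000ULL, "SF" }, /* 64-bit mode */
        { 0x0000000000008000ULL, "EE" }, /* external interrupts enabled */
        { 0x0000000000004000ULL, "PR" }, /* problem (user) state */
        { 0x0000000000001000ULL, "ME" }, /* machine check enabled */
        { 0x0000000000000020ULL, "IR" }, /* instruction relocation */
        { 0x0000000000000010ULL, "DR" }, /* data relocation */
        { 0x0000000000000002ULL, "RI" }, /* recoverable interrupt */
};

/* Write the names of the bits set in `msr` into `out`, space separated. */
void
msr_describe(uint64_t msr, char *out, size_t len)
{
        size_t used = 0;

        if (len == 0)
                return;
        out[0] = '\0';
        for (size_t i = 0; i < sizeof(msr_bits) / sizeof(msr_bits[0]); i++) {
                int n;

                if ((msr & msr_bits[i].mask) == 0)
                        continue;
                n = snprintf(out + used, len - used, "%s%s",
                    used != 0 ? " " : "", msr_bits[i].name);
                if (n < 0 || (size_t)n >= len - used)
                        break;          /* output truncated, stop here */
                used += (size_t)n;
        }
}
```

For 0x8000000000001032 this yields "SF ME IR DR RI": 64-bit mode, kernel
state (PR clear), translation on, external interrupts off. The srr1 printed
with the trap, 0x8000000040001032, differs only in bit 0x40000000, which
should be interrupt status reported for the instruction storage interrupt
rather than saved MSR state.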

SRR0 is part of the kdb_enter function, as shown:

0x0000000000661658 in kdb_enter (why=0xf68968 "bootflags", msg=0x80 "")

The kdb_enter() function follows:

void
kdb_enter(const char *why, const char *msg)
{

        if (kdb_dbbe != NULL && kdb_active == 0) {
                if (msg != NULL)
                        printf("KDB: enter: %s\n", msg);
                kdb_why = why;
                breakpoint();
                kdb_why = KDB_WHY_UNSET;
        }
}

After the rfid we return into kdb_enter, which then executes:

   0x0000000000661668 <+112>:   ld      r0,16(r1)
   0x000000000066166c <+116>:   mtlr    r0
   ...
   0x000000000066167c <+132>:   blr

In this case, lr has 0x426f6f7420666c61, which is clearly wrong and causes the
trap.

It appears that r1 does not point to the proper stack, since it contains:

r1 = 0xf68958

0xf68958:       0x6b65726e      0x656c6e61      0x6d650000      0x00000000
0xf68968:       0x626f6f74      0x666c6167      0x73000000      0x00000000
0xf68978:       0x426f6f74      0x20666c61

Either way, kdb_enter() clearly has a wrong stack pointer (r1) after calling
breakpoint().
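
The effect of the epilogue load on this memory can be replayed with a small
model. This is a sketch under stated assumptions: the dump is taken to be
big-endian 32-bit words in address order starting at 0xf68958, and
dump_load64() is a hypothetical helper that mimics a doubleword load such as
`ld r0,16(r1)`.

```c
#include <stddef.h>
#include <stdint.h>

#define DUMP_BASE 0xf68958ULL   /* address of the first dumped word */

/* The 32-bit words from the memory dump, in address order. */
static const uint32_t dump_words[] = {
        0x6b65726e, 0x656c6e61, 0x6d650000, 0x00000000, /* 0xf68958 */
        0x626f6f74, 0x666c6167, 0x73000000, 0x00000000, /* 0xf68968 */
        0x426f6f74, 0x20666c61,                         /* 0xf68978 */
};

/*
 * Emulate a big-endian doubleword load from `addr` within the dump.
 * Returns 0 on success, -1 if the address is not word aligned relative
 * to the dump or the 8 bytes are not fully covered by it.
 */
int
dump_load64(uint64_t addr, uint64_t *value)
{
        size_t nwords = sizeof(dump_words) / sizeof(dump_words[0]);
        size_t idx;

        if (addr < DUMP_BASE || (addr - DUMP_BASE) % 4 != 0)
                return (-1);
        idx = (size_t)((addr - DUMP_BASE) / 4);
        if (idx + 1 >= nwords)
                return (-1);
        *value = ((uint64_t)dump_words[idx] << 32) | dump_words[idx + 1];
        return (0);
}
```

With the dump as shown, a load from 0xf68958 + 16 returns 0x626f6f74666c6167
("bootflag"), while the lr value seen in the trap, 0x426f6f7420666c61, is the
doubleword at 0xf68978. In both cases the "return address" is string data,
which is consistent with r1 not pointing at a real stack frame.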
