how to find out what the other CPU is doing
Randall Stewart
rrs at cisco.com
Wed Feb 7 20:03:17 UTC 2007
John Baldwin wrote:
> On Sunday 28 January 2007 07:38, Randall Stewart wrote:
>> All:
>>
>> Ok, I did not get an answer to this.. and of course
>> I hit the bug again (which I now figured out how to
>> fix :-D)
>>
>> So let me explain what I did.. so that way I
>> can come back and find this email later when it
>> someday happens again ;-) (and for anyone else
>> curious).
>>
>> 1) I had to do this from DDB ... I could not find a
>> way in kgdb.
>>
>> 2) When you stop the machine in ddb (at least on i386) it
>> dumps BOTH CPUs' info in something called
>> stoppcbs[num-cpus]
>> 3) It's an array of struct pcb .. which has all the info
>> you need to get started.
>> 4) With a trusty x/ stoppcbs you can work your way through
>> and gather the info you need.. For x86 the second CPU
>> started at stoppcbs+0x270 .. if you don't want to look
>> at all those 0's (of course the offset could change and
>> will vary from CPU type to CPU type :-D)
>> 5) Dig out the ebp from here. You can look at the IP
>> but it will be in some NMI stop CPU routine.
>> 6) You can use the bp to trace backward through the stack
>> and figure out the running stack trace... I went back
>> to kgdb after getting the ebp (with CPU still spinning away).
>> 7) You have to go several frames back to get by all the NMI
>> stuff before you find your guilty party :-)
>>
>> There might be a better way to do this.. and I am thinking
>> about adding a machine dependent trace that can take
>> an ebp argument (if one does not already exist in kgdb.. I
>> suppose I need to poke around in the macros a bit).. anyway
>> it's primitive .. but it allows you to find that spinning
>> kernel routine :-)
>
> When you use 'thread/tid/proc' in kgdb it uses stoppcbs[] automatically, so
> you can do 'proc 437' and do 'bt' to get a trace as I explained earlier. ddb
> can also do this for you as 'tr' in ddb can take a pid or tid as an argument,
> so in ddb you can do 'tr 437' to trace proc 437. Note if you want to use
> the 'tid' in kgdb you use 'tid <tid>'. 'proc' takes PIDs not TIDs in kgdb.
>
Hmm..
I tried that the first time I had a crash in kgdb (I did not do anything
in DDB) and it did not work for me... I flustered around with it
for a very very long time too.
Maybe I have an old kgdb or something but I could not get it to work :-(
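
For anyone following the manual route in steps 5 to 7 quoted above, the
walk amounts to following the saved-ebp chain. Here is a rough sketch in
Python, not anything taken from ddb or kgdb. It assumes the usual i386
frame layout (saved ebp at [ebp], return address at [ebp+4]). The
read_word callback and the memory image are made up for illustration; in
practice you would read those words by hand with x/ in the debugger.

```python
def walk_frames(read_word, ebp, max_frames=32):
    """Follow the saved-ebp chain; return a list of (ebp, return_address).

    read_word is a hypothetical callback that returns the 32-bit word
    stored at a given address (assumption: you supply this yourself).
    """
    frames = []
    seen = set()
    # Stop on a NULL frame pointer, a loop in the chain, or too many frames.
    while ebp != 0 and ebp not in seen and len(frames) < max_frames:
        seen.add(ebp)
        saved_ebp = read_word(ebp)      # caller's frame pointer
        ret_addr = read_word(ebp + 4)   # return address into the caller
        frames.append((ebp, ret_addr))
        ebp = saved_ebp
    return frames


# Made-up memory image: three frames, the last one ending the chain with 0.
memory = {
    0x1000: 0x1020, 0x1004: 0xC0500010,
    0x1020: 0x1040, 0x1024: 0xC0500020,
    0x1040: 0x0,    0x1044: 0xC0600030,
}

frames = walk_frames(memory.__getitem__, 0x1000)
for frame_ebp, ret in frames:
    print(f"ebp={frame_ebp:#x} ret={ret:#x}")
```

The first few return addresses will land in the NMI / stop-CPU code, so
the interesting caller is several entries down the list.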
R
--
Randall Stewart
NSSTG - Cisco Systems Inc.
803-345-0369 <or> 803-317-4952 (cell)