Without ddb and kdb set, 7.3-RELEASE amd64 does not boot.

Garrett Cooper gcooper at FreeBSD.org
Fri Jan 7 22:26:10 UTC 2011


On Fri, Jan 7, 2011 at 2:22 PM, Mark Saad <nonesuch at longcount.org> wrote:
> On Fri, Jan 7, 2011 at 4:56 PM, Garrett Cooper <gcooper at freebsd.org> wrote:
>> On Fri, Jan 7, 2011 at 1:20 PM, Mark Saad <nonesuch at longcount.org> wrote:
>>> Hello hackers@,
>>>  I have a question that I can't find an answer to. I believe I have
>>> found a kernel bug in 7.3-RELEASE that prevents me from booting 64-bit
>>> kernels on HP's DL360 G4p. The kernel dies with "Fatal trap 12: page
>>> fault while in kernel mode". The hardware works fine in 7.2-RELEASE
>>> amd64, 7.1-RELEASE amd64, and 6.4-RELEASE amd64.
>>>
>>> In 7.3-RELEASE amd64 I cannot boot from CD or PXE correctly using the
>>> stock 7.3-RELEASE amd64 kernel; however, i386 works fine. To see if this
>>> issue was somehow fixed in 7.3-RELEASE-p4 amd64, I rebuilt a GENERIC
>>> kernel using patched sources and tried to boot, and I got the same
>>> crash.
>>>
>>>  Next I rebuilt the kernel with KDB and DDB to see if I could get a
>>> core dump of the system. I also set the following in loader.conf:
>>>
>>> kernel="kernel.DEBUG"
>>> kern.dumpdev="/dev/da0s1b"
>>>
>>> Next I PXE-booted the box, and the system did not crash on boot; it
>>> loaded an NFS root and worked fine. So I copied my debug
>>> kernel and loader.conf to the local disk and rebooted, and it booted
>>> fine from the local disk.
>>>
>>> After rebooting the server and running off the local disks with the
>>> debug kernel, I can't find any issues.
>>>
>>> When I reboot the box into a GENERIC 7.3-RELEASE-p4 kernel, it crashes
>>> with this error:
>>>
>>> Fatal trap 12: page fault while in kernel mode
>>> cpuid = 0; apic id = 00
>>> fault virtual address   = 0x0
>>> fault code              = supervisor write data, page not present
>>> instruction pointer     = 0x8:0xffffffff800070fa
>>> stack pointer           = 0x10:0xffffffff8153cbe0
>>> frame pointer           = 0x10:0xffffffff8153cc50
>>> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>>> processor eflags        = interrupt enabled, resume, IOPL = 0
>>> current process         = 0 (swapper)
>>> [thread pid 0 tid 100000 ]
>>> Stopped at      bzero+0xa:     repe stosq       %es:(%rdi)
>>>
>>>
>>> What should I do? Has anyone else seen anything like this?
>>
>>    What are the messages before that on the kernel console, and which
>> drivers are loaded on a stable system?
>> Thanks,
>> -Garrett
>>
> Garrett,
>  The last 4 lines of the verbose boot of the GENERIC kernel are
> all from sio1

    Is sio1 pointing to a generic UART, or is it something more
special like the HP lights-out SOL interface?
    A simple test might be to disable the sio/uart driver in the kernel
and see whether things work.
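    One way to try this without rebuilding, sketched under the assumption
that sio1 attaches through the stock ISA device hints (it may not if
ACPI enumerates the port on this box), is to disable the unit from the
loader prompt for a single boot:

```
set hint.sio.1.disabled=1
boot
```

    If that gets the GENERIC kernel past the trap, the same hint can be
made persistent in /boot/device.hints as hint.sio.1.disabled="1".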
Thanks,
-Garrett

