Enabling DDB prevents kernel from panicking

Mark Saad nonesuch at longcount.org
Mon Jan 10 22:59:33 UTC 2011


All
This was originally posted to hackers@

I have a good question that I can't find an answer for. I believe I have
found a kernel bug in 7.3-RELEASE that prevents me from booting 64-bit
kernels on HP's DL360 G4p. The kernel dies with "Fatal trap 12: page
fault while in kernel mode". The hardware works fine in 7.2-RELEASE
amd64, 7.1-RELEASE amd64, and 6.4-RELEASE amd64.

In 7.3-RELEASE amd64 I cannot boot from CD or PXE using the
stock 7.3-RELEASE amd64 kernel; however, i386 works fine. To see if this
issue was somehow fixed in 7.3-RELEASE-p4 amd64, I rebuilt a GENERIC
kernel using patched sources and tried to boot, and I got the same
crash.

Next I rebuilt the kernel with KDB and DDB to see if I could get a
core dump of the system. I also set loader.conf to

kernel="kernel.DEBUG"
kern.dumpdev="/dev/da0s1b"
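
For reference, a debug kernel like the one described above can be
built from a small config file on top of GENERIC. This is only a
sketch: the config name DEBUG is an assumption, since the actual
config file used for kernel.DEBUG is not shown in this post.

```
# Hypothetical kernel config, e.g. /usr/src/sys/amd64/conf/DEBUG
include GENERIC        # start from the stock GENERIC configuration
ident   DEBUG          # assumed name, not taken from the original post

options KDB            # kernel debugger framework
options DDB            # interactive in-kernel debugger backend
```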

Next I PXE-booted the box and the system did not crash on boot; it
easily loads an NFS root and works fine. So I copied my debug
kernel and loader.conf to the local disk and rebooted, and it boots
fine from the local disk.

Rebooting the server and running off the local disks with the debug
kernel, I can't find any issues.

When I reboot the box into a GENERIC 7.3-RELEASE-p4 kernel, it crashes
with this error:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code                 = supervisor write data, page not present
instruction pointer     = 0x8:0xffffffff800070fa
stack pointer            = 0x10:0xffffffff8153cbe0
frame pointer            = 0x10:0xffffffff8153cc50
code segment          = base 0x0, limit 0xfffff, type 0x1b
                             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags      = interrupt enabled, resume, IOPL = 0
current process       = 0 (swapper)
[thread pid 0 tid 100000 ]
Stopped at      bzero+0xa:     repe stosq       %es:(%rdi)


It was recommended that I comment out the sio hints in /boot/device.hints.
I did this and I can now properly boot a GENERIC 7.3-RELEASE kernel.
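
For anyone wanting to try the same workaround, the change amounts to
prefixing the hint.sio.* lines with '#'. The entries below follow the
stock 7.x GENERIC hints as best recalled; the exact lines and values
on a given system may differ, so treat them as an assumption rather
than a copy of the file on this machine.

```
# /boot/device.hints (excerpt), sio hints commented out
#hint.sio.0.at="isa"
#hint.sio.0.port="0x3F8"
#hint.sio.0.flags="0x10"
#hint.sio.0.irq="4"
#hint.sio.1.at="isa"
#hint.sio.1.port="0x2F8"
#hint.sio.1.irq="3"
# ...any hint.sio.2 and hint.sio.3 entries are commented out the same way
```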

I reran this same test using 7.4-RC1, and the system boots without any
changes to anything.

So my questions: does anyone know what changed in stable/7 after the
creation of 7.3-RELEASE that could have fixed this, or does anyone know
what could be causing this issue? The sio code does not look like it has
been changed in a long while. Do we still need the hints for the sio
ports anyway? Does omitting them from the hints file cause any major
issues? I can use the serial port for a console and to connect to other
serial devices without any issues.

-- 

mark saad | nonesuch at longcount.org

