Without ddb and kdb set, 7.3-RELEASE amd64 does not boot.
Mark Saad
nonesuch at longcount.org
Fri Jan 7 22:22:07 UTC 2011
On Fri, Jan 7, 2011 at 4:56 PM, Garrett Cooper <gcooper at freebsd.org> wrote:
> On Fri, Jan 7, 2011 at 1:20 PM, Mark Saad <nonesuch at longcount.org> wrote:
>> Hello hackers@,
>> I have a question that I can't find an answer for. I believe I
>> found a kernel bug in 7.3-RELEASE that prevents me from booting 64-bit
>> kernels on HP's DL360 G4p. The kernel dies with "Fatal trap 12: page
>> fault while in kernel mode". The hardware works fine in 7.2-RELEASE
>> amd64, 7.1-RELEASE amd64, and 6.4-RELEASE amd64.
>>
>> In 7.3-RELEASE amd64 I cannot boot from CD or PXE correctly using the
>> stock 7.3-RELEASE amd64 kernel; however, i386 works fine. To see if this
>> issue was somehow fixed in 7.3-RELEASE-p4 amd64, I rebuilt a GENERIC
>> kernel using patched sources and tried to boot, and I got the same
>> crash.
>>
>> Next I rebuilt the kernel with KDB and DDB to see if I could get a
>> core dump of the system. I also set the following in loader.conf:
>>
>> kernel="kernel.DEBUG"
>> kern.dumpdev="/dev/da0s1b"
>>
>> Next I pxebooted the box and the system did not crash on boot; it
>> loads an NFS root and works fine. So I copied my debug
>> kernel and loader.conf to the local disk and rebooted, and it boots
>> fine from the local disk.
>>
>> Rebooting the server and running off the local disks with the debug
>> kernel, I can't find any issues.
>>
>> When I reboot the box into a GENERIC 7.3-RELEASE-p4 kernel, it crashes
>> with this error:
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 0; apic id = 00
>> fault virtual address = 0x0
>> fault code = supervisor write data, page not present
>> instruction pointer = 0x8:0xffffffff800070fa
>> stack pointer = 0x10:0xffffffff8153cbe0
>> frame pointer = 0x10:0xffffffff8153cc50
>> code segment = base 0x0, limit 0xfffff, type 0x1b
>> = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags = interrupt enabled, resume, IOPL = 0
>> current process = 0 (swapper)
>> [thread pid 0 tid 100000 ]
>> Stopped at bzero+0xa: repe stosq %es:(%rdi)
>>
>>
>> What do I do? Has anyone else seen anything like this?
>
> What are the messages before that on the kernel console, and which
> drivers are loaded on a stable system?
> Thanks,
> -Garrett
>
Garrett,
The last four lines of the verbose boot of the GENERIC kernel are
all from sio1:
sio1: port may not be enabled
sio1: irq maps: 0 0 0 0
sio1: probe failed test(s): 4
sio1 at port 0x2f8-0x2ff irq 3 on isa0
Then the crash.
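For reference, the fields of the trap message quoted earlier can be pulled apart mechanically. The sketch below is only an illustration: the helper names `parse_trap` and `looks_like_null_write` are made up for this example, and the trap text is a trimmed copy of the quoted message with the ">> " markers removed. A write fault at virtual address 0x0 while stopped in bzero on `repe stosq %es:(%rdi)` is consistent with bzero having been handed a NULL destination pointer, though the message alone does not say which caller did it.

```python
import re

# Trimmed copy of the trap message quoted in this thread (assumption:
# quoting markers removed, whitespace normalised).
TRAP = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x0
fault code = supervisor write data, page not present
instruction pointer = 0x8:0xffffffff800070fa
stack pointer = 0x10:0xffffffff8153cbe0
frame pointer = 0x10:0xffffffff8153cc50
current process = 0 (swapper)
Stopped at bzero+0xa: repe stosq %es:(%rdi)
"""


def parse_trap(text):
    """Collect 'key = value' lines plus the 'Stopped at' location."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        stopped = re.match(r"Stopped at\s+(\S+?):\s*(.*)", line)
        if stopped:
            fields["stopped at"] = stopped.group(1)
            fields["instruction"] = stopped.group(2)
        elif " = " in line:
            key, value = line.split(" = ", 1)
            fields[key.strip()] = value.strip()
    return fields


def looks_like_null_write(fields):
    """True when the fault is a write to virtual address 0."""
    address = int(fields["fault virtual address"], 16)
    return address == 0 and "write" in fields["fault code"]


if __name__ == "__main__":
    trap = parse_trap(TRAP)
    print(trap["stopped at"], looks_like_null_write(trap))
```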
No extra kernel modules are loaded.
Here is my pciconf output:
hostb0 at pci0:0:0:0: class=0x060000 card=0x32000e11 chip=0x35908086
rev=0x0c hdr=0x00
vendor = 'Intel Corporation'
device = 'E7520 Server Memory Controller Hub'
class = bridge
subclass = HOST-PCI
pcib1 at pci0:0:2:0: class=0x060400 card=0x00000000 chip=0x35958086
rev=0x0c hdr=0x01
vendor = 'Intel Corporation'
device = 'E752x Memory Controller Hub PCIe Port A0'
class = bridge
subclass = PCI-PCI
pcib2 at pci0:0:4:0: class=0x060400 card=0x00000000 chip=0x35978086
rev=0x0c hdr=0x01
vendor = 'Intel Corporation'
device = 'E752x Memory Controller Hub PCIe Port B0'
class = bridge
subclass = PCI-PCI
pcib5 at pci0:0:6:0: class=0x060400 card=0x00000000 chip=0x35998086
rev=0x0c hdr=0x01
vendor = 'Intel Corporation'
device = 'E752x Memory Controller Hub PCIe Port C0'
class = bridge
subclass = PCI-PCI
pcib6 at pci0:0:28:0: class=0x060400 card=0x00000000 chip=0x25ae8086
rev=0x02 hdr=0x01
vendor = 'Intel Corporation'
device = 'Hub Interface to PCI-X Bridge (6300ESB)'
class = bridge
subclass = PCI-PCI
uhci0 at pci0:0:29:0: class=0x0c0300 card=0x32010e11 chip=0x25a98086
rev=0x02 hdr=0x00
vendor = 'Intel Corporation'
device = 'USB 1.1 UHCI Controller *1 (6300ESB)'
class = serial bus
subclass = USB
uhci1 at pci0:0:29:1: class=0x0c0300 card=0x32010e11 chip=0x25aa8086
rev=0x02 hdr=0x00
vendor = 'Intel Corporation'
device = 'USB 1.1 UHCI Controller *2 (6300ESB)'
class = serial bus
subclass = USB
none0 at pci0:0:29:4: class=0x088000 card=0x32010e11 chip=0x25ab8086
rev=0x02 hdr=0x00
vendor = 'Intel Corporation'
device = 'Watchdog Timer (6300ESB)'
class = base peripheral
ioapic0 at pci0:0:29:5: class=0x080020 card=0x32010e11 chip=0x25ac8086
rev=0x02 hdr=0x00
vendor = 'Intel Corporation'
device = '6300ESB I/O Advanced Programmable Interrupt Controller'
class = base peripheral
subclass = interrupt controller
ehci0 at pci0:0:29:7: class=0x0c0320 card=0x32010e11 chip=0x25ad8086
rev=0x02 hdr=0x00
vendor = 'Intel Corporation'
device = 'USB 2.0 EHCI Controller (6300ESB)'
class = serial bus
subclass = USB
pcib7 at pci0:0:30:0: class=0x060400 card=0x00000000 chip=0x244e8086
rev=0x0a hdr=0x01
vendor = 'Intel Corporation'
device = '82801 Family (ICH2/3/4/5/6/7/8/9,63xxESB) Hub Interface to PCI Bridge'
class = bridge
subclass = PCI-PCI
isab0 at pci0:0:31:0: class=0x060100 card=0x00000000 chip=0x25a18086
rev=0x02 hdr=0x00
vendor = 'Intel Corporation'
device = '6300ESB LPC Inteface Controller'
class = bridge
subclass = PCI-ISA
atapci0 at pci0:0:31:1: class=0x01018a card=0x32010e11 chip=0x25a28086
rev=0x02 hdr=0x00
vendor = 'Intel Corporation'
device = 'IDE Controller (6300ESB)'
class = mass storage
subclass = ATA
pcib3 at pci0:6:0:0: class=0x060400 card=0x00000000 chip=0x03298086
rev=0x09 hdr=0x01
vendor = 'Intel Corporation'
device = 'PCI Express-to-PCI Express Bridge A (6700PXH)'
class = bridge
subclass = PCI-PCI
pcib4 at pci0:6:0:2: class=0x060400 card=0x00000000 chip=0x032a8086
rev=0x09 hdr=0x01
vendor = 'Intel Corporation'
device = 'PCI Express-to-PCI Express Bridge B (6700PXH)'
class = bridge
subclass = PCI-PCI
ciss0 at pci0:2:1:0: class=0x010400 card=0x40910e11 chip=0x00460e11
rev=0x01 hdr=0x00
vendor = 'Compaq Computer Corp (Now owned by Hewlett-Packard)'
device = 'Smart Array 64xx/6i Controller'
class = mass storage
subclass = RAID
bge0 at pci0:2:2:0: class=0x020000 card=0x00d00e11 chip=0x164814e4
rev=0x10 hdr=0x00
vendor = 'Broadcom Corporation'
device = 'NetXtreme Dual Gigabit Adapter (BCM5704)'
class = network
subclass = ethernet
bge1 at pci0:2:2:1: class=0x020000 card=0x00d00e11 chip=0x164814e4
rev=0x10 hdr=0x00
vendor = 'Broadcom Corporation'
device = 'NetXtreme Dual Gigabit Adapter (BCM5704)'
class = network
subclass = ethernet
vgapci0 at pci0:1:3:0: class=0x030000 card=0x001e0e11 chip=0x47521002
rev=0x27 hdr=0x00
vendor = 'ATI Technologies Inc. / Advanced Micro Devices, Inc.'
device = 'ATI On-Board VGA for HP Proliant 350 G3 (Rage XL PCI)'
class = display
subclass = VGA
none1 at pci0:1:4:0: class=0x088000 card=0xb2060e11 chip=0xb2030e11
rev=0x01 hdr=0x00
vendor = 'Compaq Computer Corp (Now owned by Hewlett-Packard)'
device = 'Integrated Lights Out Processor (iLo)'
class = base peripheral
none2 at pci0:1:4:2: class=0x088000 card=0xb2060e11 chip=0xb2040e11
rev=0x01 hdr=0x00
vendor = 'Compaq Computer Corp (Now owned by Hewlett-Packard)'
device = 'Integrated Lights Out Processor (iLo)'
class = base peripheral
And here is my /var/run/dmesg.boot:
Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.3-RELEASE-p4 #1: Fri Jan 7 18:24:07 UTC 2011
root at about-bsd:/usr/obj/usr/src/sys/DEBUG amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 3.60GHz (3600.15-MHz K8-class CPU)
Origin = "GenuineIntel" Id = 0xf43 Stepping = 3
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Features2=0x659d<SSE3,DTES64,MON,DS_CPL,EST,TM2,CNXT-ID,CX16,xTPR>
AMD Features=0x20000800<SYSCALL,LM>
TSC: P-state invariant
Logical CPUs per core: 2
usable memory = 4214386688 (4019 MB)
avail memory = 4051181568 (3863 MB)
ACPI APIC Table: <HP 00000083>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
cpu0 (BSP): APIC ID: 0
cpu1 (AP/HT): APIC ID: 1
cpu2 (AP): APIC ID: 6
cpu3 (AP/HT): APIC ID: 7
ioapic1: Changing APIC ID to 9
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
ioapic2 <Version 2.0> irqs 48-71 on motherboard
ioapic3 <Version 2.0> irqs 72-95 on motherboard
kbd1 at kbdmux0
acpi0: <HP P54> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x908-0x90b on acpi0
pcib0: <ACPI Host-PCI bridge> on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 2.0 on pci0
pci13: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> at device 4.0 on pci0
pci6: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> at device 0.0 on pci6
pci7: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> at device 0.2 on pci6
pci10: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> at device 6.0 on pci0
pci3: <ACPI PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> at device 28.0 on pci0
pci2: <ACPI PCI bus> on pcib6
ciss0: <HP Smart Array 6i> port 0x4000-0x40ff mem
0xfdff0000-0xfdff1fff,0xfdf80000-0xfdfbffff irq 24 at device 1.0 on
pci2
ciss0: [ITHREAD]
bge0: <HP NC7782 Gigabit Server Adapter, ASIC rev. 0x002100> mem
0xfdf70000-0xfdf7ffff irq 25 at device 2.0 on pci2
miibus0: <MII bus> on bge0
brgphy0: <BCM5704 10/100/1000baseTX PHY> PHY 1 on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-FDX, auto
bge0: Ethernet address: 00:17:a4:a7:a3:fc
bge0: [ITHREAD]
bge1: <HP NC7782 Gigabit Server Adapter, ASIC rev. 0x002100> mem
0xfdf60000-0xfdf6ffff irq 26 at device 2.1 on pci2
miibus1: <MII bus> on bge1
brgphy1: <BCM5704 10/100/1000baseTX PHY> PHY 1 on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-FDX, auto
bge1: Ethernet address: 00:17:a4:a7:a3:fb
bge1: [ITHREAD]
uhci0: <UHCI (generic) USB controller> port 0x2000-0x201f irq 16 at
device 29.0 on pci0
uhci0: [GIANT-LOCKED]
...skipping...
device_attach: est2 attach returned 6
p4tcc2: <CPU Frequency Thermal Control> on cpu2
cpu3: <ACPI CPU> on acpi0
est3: <Enhanced SpeedStep Frequency Control> on cpu3
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 122900001229
device_attach: est3 attach returned 6
p4tcc3: <CPU Frequency Thermal Control> on cpu3
orm0: <ISA Option ROMs> at iomem
0xc0000-0xc7fff,0xc8000-0xcbfff,0xee000-0xeffff on isa0
ppc0: cannot reserve I/O port range
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
sio1: [FILTER]
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
acd0: CDROM <HL-DT-ST GCR-8240N/2.03> at ata0-master PIO4
SMP: AP CPU #1 Launched!
da0 at ciss0 bus 0 target 0 lun 0
da0: <COMPAQ RAID 1 VOLUME OK> Fixed Direct Access SCSI-4 device
da0: 135.168MB/s transfers
da0: Command Queueing Enabled
SMP: AP CPU #2 Launched!
da0: 69459MB (142253280 512 byte sectors: 255H 32S/T 17433C)
SMP: AP CPU #3 Launched!
Trying to mount root from ufs:/dev/da0s1a
WARNING: /var was not properly dismounted
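In the raw capture of this boot, the AP launch messages and the da0 probe lines reach the console at the same moment and come out mixed character by character, for example `SMPd:a 0A Pa tC PcUi s#s01 bLuasu nc0h etda!rg` followed by `et 0 lun 0`. A minimal sketch for checking a proposed reading of such a line is below. The function name `is_interleaving` is made up for this example, and the two candidate messages are an assumed reading of the capture, not something the kernel reports. Whitespace is ignored because the archived text appears to have lost a space where two were adjacent.

```python
def strip_ws(s):
    """Drop all whitespace so collapsed spaces do not affect the check."""
    return "".join(s.split())


def is_interleaving(mixed, a, b):
    """True if `mixed` merges `a` and `b`, keeping each one's character order."""
    mixed, a, b = strip_ws(mixed), strip_ws(a), strip_ws(b)
    if len(mixed) != len(a) + len(b):
        return False
    # reachable[j] is True when mixed[:i + j] is a merge of a[:i] and b[:j].
    reachable = [False] * (len(b) + 1)
    for i in range(len(a) + 1):
        for j in range(len(b) + 1):
            if i == 0 and j == 0:
                reachable[j] = True
                continue
            c = mixed[i + j - 1]
            from_a = i > 0 and reachable[j] and a[i - 1] == c
            from_b = j > 0 and reachable[j - 1] and b[j - 1] == c
            reachable[j] = from_a or from_b
    return reachable[len(b)]


# Raw console text from the capture, joined across the line break.
GARBLED = "SMPd:a 0A Pa tC PcUi s#s01 bLuasu nc0h etda!rg et 0 lun 0"

if __name__ == "__main__":
    print(is_interleaving(GARBLED,
                          "SMP: AP CPU #1 Launched!",
                          "da0 at ciss0 bus 0 target 0 lun 0"))
```

The same check can be run on the second mixed line, pairing `SMP: AP CPU #2 Launched!` with the start of the da0 size line.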
--
mark saad | nonesuch at longcount.org