kern/74156: SMP crashes
O. Hartmann
ohartman at web.de
Sat Nov 20 04:30:29 PST 2004
>Number: 74156
>Category: kern
>Synopsis: SMP crashes
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sat Nov 20 12:30:28 GMT 2004
>Closed-Date:
>Last-Modified:
>Originator: O. Hartmann
>Release: FreeBSD 5.3-RELEASE/FreeBSD 5.3-STABLE
>Organization:
Department for Geophysic Johannes Gutenberg-Universitaet Mainz
>Environment:
FreeBSD edda.geo.uni-mainz.de 5.3-RELEASE-p1 FreeBSD 5.3-RELEASE-p1 #74: Fri Nov 19 17:05:11 UTC 2004 root at edda.geo.uni-mainz.de:/usr/obj/usr/src/sys/EDDA i386
>Description:
While running in SMP mode with two 1 GHz Intel PIII CPUs, FreeBSD crashes after
a while. I have reported this kind of crash several times before and was advised
to provide more information about the error, so here is a full report again.
The crash only occurs when both CPUs are in use on this hardware. Disabling SMP
in /boot/loader.conf.local via kern.smp.disabled="1" keeps the system stable for
days and weeks (longest uptime: 13 days under load with FreeBSD 5.3-RELEASE). My
first reports on this crash concerned two 866 MHz CPUs with different steppings;
switching to two 1 GHz P3s with the same stepping results in the same crash
behaviour. The output of mptable -verbose -dmesg is appended below.
This is the crash message I caught:
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic = 00
fault virtual address = 0x1c
fault code = supervisor write, page not present
instruction pointer = 0x8:0xc062ac76
stack pointer = 0x10:0xe4e2d7ac
frame pointer = 0x10:0xe4e2d7c4
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 44 (swi5: clock sio)
[thread 100042]
Stopped at ref +0x16: lock cmpxchgl %edx, 0x1c(%edx)
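A note on reading the trap above: the faulting instruction uses the operand 0x1c(%edx), and the reported fault virtual address is also 0x1c. Assuming a flat data segment with base 0x0 (the trap only shows this for the code segment), that would imply %edx held 0 at the time, i.e. a write through a NULL base pointer at offset 0x1c. This is an inference from the trap output, not something confirmed by a backtrace. The arithmetic can be sketched as follows; the function names are illustrative only and do not come from the kernel source.

```c
#include <stdint.h>

/*
 * Effective address of a base+displacement operand such as 0x1c(%edx),
 * assuming a flat segment with base 0x0 (an assumption, see above).
 */
static uint32_t effective_address(uint32_t base, uint32_t disp)
{
    return base + disp;
}

/*
 * Recover the value the base register must have held, given the
 * fault virtual address and the displacement encoded in the instruction.
 */
static uint32_t implied_base(uint32_t fault_addr, uint32_t disp)
{
    return fault_addr - disp;
}
```

With the values from the trap, implied_base(0x1c, 0x1c) gives 0, and effective_address(0, 0x1c) gives back the reported fault address 0x1c.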
mptable -verbose -dmesg:
===============================================================================
MPTable, version 2.0.15
looking for EBDA pointer @ 0x040e, found, searching EBDA @ 0x0009f000
searching CMOS 'top of mem' @ 0x0009ec00 (635K)
searching default 'top of mem' @ 0x0009fc00 (639K)
searching BIOS @ 0x000f0000
MP FPS found in BIOS @ physical addr: 0x000f5270
-------------------------------------------------------------------------------
MP Floating Pointer Structure:
location: BIOS
physical address: 0x000f5270
signature: '_MP_'
length: 16 bytes
version: 1.4
checksum: 0xe3
mode: Virtual Wire
-------------------------------------------------------------------------------
MP Config Table Header:
physical address: 0x000f4e60
signature: 'PCMP'
base table length: 276
version: 1.4
checksum: 0x0d
OEM ID: 'OEM00000'
Product ID: 'PROD00000000'
OEM table pointer: 0x00000000
OEM table size: 0
entry count: 26
local APIC address: 0xfee00000
extended table length: 124
extended table checksum: 198
-------------------------------------------------------------------------------
MP Config Base Table Entries:
--
Processors: APIC ID Version State Family Model Step Flags
3 0x11 BSP, usable 6 8 6 0x387fbff
0 0x11 AP, usable 6 8 6 0x387fbff
--
Bus: Bus ID Type
0 PCI
1 PCI
2 ISA
--
I/O APICs: APIC ID Version State Address
2 0x11 usable 0xfec00000
3 0x11 usable 0xfec01000
--
I/O Ints: Type Polarity Trigger Bus ID IRQ APIC ID PIN#
ExtINT conforms conforms 2 0 2 0
INT conforms conforms 2 1 2 1
INT conforms conforms 2 0 2 2
INT conforms conforms 2 3 2 3
INT conforms conforms 2 4 2 4
INT conforms conforms 2 6 2 6
INT conforms conforms 2 7 2 7
INT conforms conforms 2 8 2 8
INT conforms conforms 2 12 2 12
INT conforms conforms 2 13 2 13
INT conforms conforms 2 14 2 14
INT conforms conforms 2 15 2 15
INT active-lo level 0 15:A 3 14
INT active-lo level 2 9 2 9
INT active-lo level 1 3:A 3 6
INT active-lo level 1 5:A 3 8
INT active-lo level 1 5:B 3 9
--
Local Ints: Type Polarity Trigger Bus ID IRQ APIC ID PIN#
ExtINT active-hi edge 2 0 255 0
NMI active-hi edge 2 0 255 1
-------------------------------------------------------------------------------
MP Config Extended Table Entries:
--
System Address Space
bus ID: 0 address type: I/O address
address base: 0x0
address range: 0x10000
--
System Address Space
bus ID: 0 address type: memory address
address base: 0x40000000
address range: 0xbebe0000
--
System Address Space
bus ID: 0 address type: prefetch address
address base: 0xfebe0000
address range: 0xe9420000
--
System Address Space
bus ID: 0 address type: memory address
address base: 0xe8000000
address range: 0x18000000
--
System Address Space
bus ID: 0 address type: memory address
address base: 0xa0000
address range: 0x20000
--
Bus Heirarchy
bus ID: 2 bus info: 0x01 parent bus ID: 0
--
Compatibility Bus Address
bus ID: 0 address modifier: add
predefined range: 0x00000000
--
Compatibility Bus Address
bus ID: 0 address modifier: add
predefined range: 0x00000001
-------------------------------------------------------------------------------
dmesg output:
Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.3-RELEASE-p1 #74: Fri Nov 19 17:05:11 UTC 2004
root at edda.geo.uni-mainz.de:/usr/obj/usr/src/sys/EDDA
ACPI APIC Table: <ASUS CUR-DLS >
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel Pentium III (1000.04-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x686 Stepping = 6
Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
real memory = 1073721344 (1023 MB)
avail memory = 1041166336 (992 MB)
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
cpu0 (BSP): APIC ID: 3
cpu1 (AP): APIC ID: 0
ioapic0 <Version 1.1> irqs 0-15 on motherboard
ioapic1 <Version 1.1> irqs 16-31 on motherboard
netsmb_dev: loaded
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <ASUS CUR-DLS> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <32-bit timer at 3.579545MHz> port 0xe408-0xe40b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pci0: <display, VGA> at device 7.0 (no driver attached)
isab0: <PCI-ISA bridge> port 0xe800-0xe80f at device 15.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <ServerWorks ROSB4 UDMA33 controller> port 0xd400-0xd40f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 15.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
ohci0: <OHCI (generic) USB controller> mem 0xfc000000-0xfc000fff irq 9 at device 15.2 on pci0
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: <OHCI (generic) USB controller> on ohci0
usb0: USB revision 1.0
uhub0: (0x1166) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
ugen0: OmniVision OV511+ Camera, rev 1.00/1.00, addr 2
pcib1: <ACPI Host-PCI bridge> on acpi0
pci1: <ACPI PCI bus> on pcib1
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.35> port 0xd000-0xd03f mem 0xfb800000-0xfb81ffff irq 22 at device 3.0 on pci1
em0: Ethernet address: 00:07:e9:14:8f:7b
em0: Speed:N/A Duplex:N/A
sym0: <1010-33> port 0xb800-0xb8ff mem 0xfa800000-0xfa801fff,0xfb000000-0xfb0003ff irq 24 at device 5.0 on pci1
sym0: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking
sym0: open drain IRQ line driver, using on-chip SRAM
sym0: using LOAD/STORE-based firmware.
sym0: handling phase mismatch from SCRIPTS.
sym0: [GIANT-LOCKED]
sym1: <1010-33> port 0xb400-0xb4ff mem 0xf9800000-0xf9801fff,0xfa000000-0xfa0003ff irq 25 at device 5.1 on pci1
sym1: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking
sym1: open drain IRQ line driver, using on-chip SRAM
sym1: using LOAD/STORE-based firmware.
sym1: handling phase mismatch from SCRIPTS.
sym1: [GIANT-LOCKED]
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse, device ID 3
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
ppc0: <ECP parallel printer port> port 0x778-0x77a,0x378-0x37f irq 7 drq 3 flags 0x8 on acpi0
ppc0: Generic chipset (ECP-only) in ECP mode
ppc0: FIFO with 16/16/8 bytes threshold
ppbus0: <Parallel port bus> on ppc0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
fdc0: <floppy drive controller> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
orm0: <ISA Option ROMs> at iomem 0xd0000-0xd3fff,0xc0000-0xca7ff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <8 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
fb0 at vga0
Timecounters tick every 2.000 msec
acd0: DVDR <NEC DVD RW ND-3500AG/2.16> at ata0-master UDMA33
Waiting 5 seconds for SCSI devices to settle
(noperiph:sym0:0:-1:-1): SCSI BUS reset delivered.
(noperiph:sym1:0:-1:-1): SCSI BUS reset delivered.
da0 at sym0 bus 0 target 0 lun 0
da0: <IBM IC35L018UWD210-0 S5BS> Fixed Direct Access SCSI-3 device
da0: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing Enabled
da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da1 at sym0 bus 0 target 1 lun 0
da1: <IBM DDYS-T18350N S96H> Fixed Direct Access SCSI-3 device
da1: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing Enabled
da1: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da2 at sym0 bus 0 target 2 lun 0
da2: <FUJITSU MAJ3182MP 5207> Fixed Direct Access SCSI-3 device
da2: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing Enabled
da2: 17429MB (35694904 512 byte sectors: 255H 63S/T 2221C)
cd0 at ata0 bus 0 target 0 lun 0
cd0: <_NEC DVD_RW ND-3500AG 2.16> Removable CD-ROM SCSI-0 device
cd0: 33.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present
SMP: AP CPU #1 Launched!
Mounting root from ufs:/dev/da0s1a
em0: Link is up 100 Mbps Full Duplex
pflog0: promiscuous mode enabled
===============================================================================
>How-To-Repeat:
Run FreeBSD 5.3 on an ASUS CUR-DLS mainboard with both CPUs enabled, using the built-in VGA (ATI RAGE XL) at 16-bit colour depth with Xorg 6.7.0.
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted: