kern/73228: FreeBSD 5.3-RC1 hangs w/ SMP

O. Hartmann ohartman at mail.uni-mainz.de
Thu Oct 28 01:00:49 PDT 2004


>Number:         73228
>Category:       kern
>Synopsis:       FreeBSD 5.3-RC1 hangs w/ SMP
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Oct 28 08:00:48 GMT 2004
>Closed-Date:
>Last-Modified:
>Originator:     O. Hartmann
>Release:        FreeBSD 5.3-RC1 i386
>Organization:
Department of Geophysics, Universitaet Mainz
>Environment:
System: FreeBSD edda.geo.uni-mainz.de 5.3-RC1 FreeBSD 5.3-RC1 #44: Thu Oct 28 06:20:39 UTC 2004 root at edda.geo.uni-mainz.de:/usr/obj/usr/src/sys/EDDA i386


	Machine is SMP, SCHED_4BSD, with PREEMPTION either on or off. NICs are fxp0 and em0; em0 is
	disabled due to problems. ALTQ code is enabled in the kernel but no queues are used in pf; pf is
	the firewall. Dmesg output follows. Mainboard is an Asus CUR-DLS with BIOS rev. 1009 (most
	recent). X11 is Xorg (most recent from CVS), 1280x1024x16 (4 MB built-in video memory).
>Description:
	Since BETA4, FreeBSD 5.3 crashes on my dual PIII/866 MHz machine, both under heavy load and
	spontaneously. To rule out the power supply, I replaced the PSU with a brand-new 460 W
	Enermax unit.

	Crashes or freezes always follow the same pattern. Either the graphics get distorted or stuck
	while the mouse pointer remains movable; switching to the console via ALT-CTRL-F1 does not
	work, and this state persists indefinitely.
	Or the screen freezes, stays frozen for a while, and then the machine reboots. In very rare
	cases I get a trap message, but it disappears very quickly.

	These hangs only occur when both CPUs are in use. Leaving one CPU unused leaves the system
	operational for days (with or without PREEMPTION enabled, with SMP, with apic).

	SCHED_ULE triggers a crash very quickly.
	SCHED_4BSD with SMP and PREEMPTION also seems to trigger a crash rapidly.

	When the serial ports and the parallel port are disabled in the BIOS, the FreeBSD kernel
	hangs after printing these messages:
	Timecounters tick every 1.000 msec

	Waiting 5 seconds for SCSI devices to settle
	(noperiph:sym0:0:-1:-1): SCSI BUS reset delivered.
	(noperiph:sym1:0:-1:-1): SCSI BUS reset delivered.
	<freeze>

	The frozen state is uninterruptible! The freeze goes away when I re-enable the serial
	and parallel ports!

	I would like to mention that OHCI crashes in both SMP and UP with a weird error message
	reporting a likely hardware failure, yet Win2000 works properly with this device
	and mainboard!

	On the BIOS startup screen I noticed that both mass storage controllers (built-in SCSI)
	use IRQ 9; the em0 (64-bit PCI Intel EtherPro NIC, see dmesg) also shares the same IRQ.

	FreeBSD 5.3-BETA4 with the GENERIC SMP kernel worked properly on this hardware; problems
	occurred later, after regular cvsup updates. See other error reports. The base installation
	was done from the BETA4 installation CD-ROMs.

	This is the dmesg output:

Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD 5.3-RC1 #44: Thu Oct 28 06:20:39 UTC 2004
    root at edda.geo.uni-mainz.de:/usr/obj/usr/src/sys/EDDA
ACPI APIC Table: <ASUS   CUR-DLS >
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel Pentium III (866.70-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x68a  Stepping = 10
  Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
real memory  = 1073721344 (1023 MB)
avail memory = 1041166336 (992 MB)
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  3
 cpu1 (AP): APIC ID:  0
ioapic0 <Version 1.1> irqs 0-15 on motherboard
ioapic1 <Version 1.1> irqs 16-31 on motherboard
netsmb_dev: loaded
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <ASUS CUR-DLS> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <32-bit timer at 3.579545MHz> port 0xe408-0xe40b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
fxp0: <Intel 82559 Pro/100 Ethernet> port 0xd800-0xd83f mem 0xfd800000-0xfd8fffff,0xfe000000-0xfe000fff irq 20 at device 2.0 on pci0
miibus0: <MII bus> on fxp0
inphy0: <i82555 10/100 media interface> on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp0: Ethernet address: 00:e0:18:05:73:f4
ahc0: <Adaptec 2940 Ultra SCSI adapter> port 0xd400-0xd4ff mem 0xfd000000-0xfd000fff irq 17 at device 4.0 on pci0
ahc0: [GIANT-LOCKED]
aic7880: Ultra Wide Channel A, SCSI Id=7, 16/253 SCBs
pcm0: <AudioPCI ES1373-8> port 0xd000-0xd03f irq 19 at device 6.0 on pci0
pcm0: <Cirrus Logic CS4297A AC97 Codec>
pcm0: [GIANT-LOCKED]
pci0: <display, VGA> at device 7.0 (no driver attached)
isab0: <PCI-ISA bridge> port 0xe800-0xe80f at device 15.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <ServerWorks ROSB4 UDMA33 controller> port 0xb400-0xb40f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 15.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
pcib1: <ACPI Host-PCI bridge> on acpi0
pci1: <ACPI PCI bus> on pcib1
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.35> port 0xb000-0xb03f mem 0xfa800000-0xfa81ffff irq 22 at device 3.0 on pci1
em0: Ethernet address: 00:07:e9:14:8f:7b
em0:  Speed:N/A  Duplex:N/A
sym0: <1010-33> port 0xa800-0xa8ff mem 0xf9800000-0xf9801fff,0xfa000000-0xfa0003ff irq 24 at device 5.0 on pci1
sym0: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking
sym0: open drain IRQ line driver, using on-chip SRAM
sym0: using LOAD/STORE-based firmware.
sym0: handling phase mismatch from SCRIPTS.
sym0: [GIANT-LOCKED]
sym1: <1010-33> port 0xa400-0xa4ff mem 0xf8800000-0xf8801fff,0xf9000000-0xf90003ff irq 25 at device 5.1 on pci1
sym1: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking
sym1: open drain IRQ line driver, using on-chip SRAM
sym1: using LOAD/STORE-based firmware.
sym1: handling phase mismatch from SCRIPTS.
sym1: [GIANT-LOCKED]
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse, device ID 3
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
ppc0: <ECP parallel printer port> port 0x778-0x77a,0x378-0x37f irq 7 drq 3 flags 0x8 on acpi0
ppc0: Generic chipset (ECP-only) in ECP mode
ppc0: FIFO with 16/16/8 bytes threshold
ppbus0: <Parallel port bus> on ppc0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
fdc0: <floppy drive controller> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
orm0: <ISA Option ROMs> at iomem 0xd0000-0xd3fff,0xc0000-0xca7ff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <8 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
fb0 at vga0
Timecounters tick every 1.000 msec
Waiting 5 seconds for SCSI devices to settle
(noperiph:sym0:0:-1:-1): SCSI BUS reset delivered.
(noperiph:sym1:0:-1:-1): SCSI BUS reset delivered.
da0 at sym0 bus 0 target 0 lun 0
da0: <IBM IC35L018UWD210-0 S5BS> Fixed Direct Access SCSI-3 device 
da0: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing Enabled
da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da1 at sym0 bus 0 target 1 lun 0
da1: <IBM DDYS-T18350N S96H> Fixed Direct Access SCSI-3 device 
da1: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing Enabled
da1: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da2 at sym0 bus 0 target 2 lun 0
da2: <FUJITSU MAJ3182MP 5207> Fixed Direct Access SCSI-3 device 
da2: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing Enabled
da2: 17429MB (35694904 512 byte sectors: 255H 63S/T 2221C)
cd0 at ahc0 bus 0 target 0 lun 0
cd0: <TEAC CD-ROM CD-532S 1.0A> Removable CD-ROM SCSI-2 device 
cd0: 20.000MB/s transfers (20.000MHz, offset 15)
cd0: Attempt to query device size failed: NOT READY, Medium not present
cd1 at ahc0 bus 0 target 1 lun 0
cd1: <PLEXTOR CD-R   PX-R412C 1.07> Removable CD-ROM SCSI-2 device 
cd1: 10.000MB/s transfers (10.000MHz, offset 8)
cd1: Attempt to query device size failed: NOT READY, Medium not present
SMP: AP CPU #1 Launched!
Mounting root from ufs:/dev/da0s1a
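
	The IRQ sharing seen on the BIOS screen can be cross-checked against the dmesg output
	above. The following is only a minimal sketch in Python: the function names irq_map and
	shared_irqs are made up for illustration, and the pattern assumes device lines of the
	form "name0: ... irq N ..." as printed above. The sample lines are copied from this dmesg.

```python
import re
from collections import defaultdict

# Matches boot-message lines such as
#   "sym0: <1010-33> port ... irq 24 at device 5.0 on pci1"
# Assumption: the device name is lowercase letters followed by a unit number.
IRQ_LINE = re.compile(r"^(?P<dev>[a-z]+\d+): .*\birq (?P<irq>\d+)\b")


def irq_map(dmesg_text):
    """Return {irq: [device, ...]} for every device line that reports an IRQ."""
    table = defaultdict(list)
    for line in dmesg_text.splitlines():
        m = IRQ_LINE.match(line.strip())
        if m:
            table[int(m.group("irq"))].append(m.group("dev"))
    return dict(table)


def shared_irqs(dmesg_text):
    """Return only the IRQs that are reported by more than one device."""
    return {
        irq: devs
        for irq, devs in irq_map(dmesg_text).items()
        if len(set(devs)) > 1
    }


# Three lines taken verbatim from the dmesg output in this report.
SAMPLE = """\
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.35> port 0xb000-0xb03f mem 0xfa800000-0xfa81ffff irq 22 at device 3.0 on pci1
sym0: <1010-33> port 0xa800-0xa8ff mem 0xf9800000-0xf9801fff,0xfa000000-0xfa0003ff irq 24 at device 5.0 on pci1
sym1: <1010-33> port 0xa400-0xa4ff mem 0xf8800000-0xf8801fff,0xf9000000-0xf90003ff irq 25 at device 5.1 on pci1
"""

if __name__ == "__main__":
    print(irq_map(SAMPLE))
    print(shared_irqs(SAMPLE))
```

	For these three lines the kernel reports em0 on irq 22, sym0 on irq 24 and sym1 on
	irq 25, so with the APIC enabled the IRQ 9 sharing shown by the BIOS does not appear in
	the dmesg. When run over a full dmesg, parent/child pairs such as atkbdc0 and atkbd0
	would also be listed, since both lines report the same IRQ.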

>How-To-Repeat:
	Install FreeBSD 5.3-RC1 on an Asus CUR-DLS with SCSI equipment. Use an SMP/SCHED_4BSD
	kernel with PREEMPTION and ALTQ code compiled in, but without queues in pf.conf.
	I don't know how to describe more precisely how to trigger the hangs.
	Running UP (SMP/apic still enabled in the kernel, SMP disabled by an oid in
	/boot/loader.conf.local) clears the problem!
>Fix:

	


>Release-Note:
>Audit-Trail:
>Unformatted:

