i386/73244: [SMP] single user mode hangs, SMP freezes

O. Hartmann ohartman at web.de
Thu Oct 28 08:50:30 PDT 2004


>Number:         73244
>Category:       i386
>Synopsis:       [SMP] single user mode hangs, SMP freezes
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-i386
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Oct 28 15:50:29 GMT 2004
>Closed-Date:
>Last-Modified:
>Originator:     O. Hartmann
>Release:        FreeBSD 5.3-RC1
>Organization:
Department of Geophysics, Johannes Gutenberg-Universitaet, Mainz
>Environment:
FreeBSD edda.geo.uni-mainz.de 5.3-RC1 FreeBSD 5.3-RC1 #47: Thu Oct 28 10:30:05 UTC 2004 root at edda.geo.uni-mainz.de:/usr/obj/usr/src/sys/EDDA  i386

>Description:
FreeBSD 5.3-RC1 seems to have severe problems with SMP, possibly only on my specific hardware (ASUS CUR-DLS main PCB).
Enabling SMP, PREEMPTION, ADAPTIVE_GIANT, APIC and the ALTQ-related options in the kernel config file _and_ enabling SMP at boot time causes the machine to freeze after several minutes or hours, depending on load. The symptoms are weird: first the GUI freezes (I tried several window managers, windowmaker and fvwm2, based on the most recent Xorg from ports). Sometimes everything gets stuck; sometimes I can still move the mouse pointer around, open existing xterm icons (but no new xterm) and type in commands, but they do not get executed(!). Then, after a while, all of this gets stuck for good. Sometimes this final hang occurs immediately and everything freezes. Earlier (BETA6-7) the computer rebooted after a while; this disappeared with RC1. I can trigger this behaviour very quickly by running xlock -remote -mod atlantis or by doing several tasks on very large maps (170 MBytes) in ImageMagick.

None of the problems described above occur when SMP is disabled with

kern.smp.disabled="1"

set in loader.conf.local! Without SMP, FreeBSD is really stable for days (even under heavy load), w/ or w/o PREEMPTION, w/ or w/o ACPI enabled, w/ or w/o ALTQ code enabled. All those severe problems occur only when both CPUs are in use!
I swapped the two CPUs, changed the memory and changed the power supply, but none of this had any effect.

When both serial ports are disabled, the FreeBSD kernel (UP and SMP, w/ and w/o ACPI) does not boot past the SymbiosLogic bus reset message and gets stuck!

I also have no chance of running in single user mode! Every time I try to do the updating/upgrading steps in single user mode as recommended, filesystem operations get stuck after a while. Ctrl-t reports that some process is in state (biord); this is sometimes sh, make or cc, depending on what I am doing at the moment. Often fsck gets stuck as well! In multiuser mode all these operations work fine, especially with UP, but also for a while with SMP.

My equipment (see also dmesg/mptable output that follows):

ASUS CUR-DLS main PCB, two PIII/866, 2x 512 MB reg. ECC (working well; I ran a lot of memtest and burnCPU tests). I also used an Intel NIC (em0, 64-bit, GBit), but this adapter seems to trigger a crash very quickly, though again only with SMP.

The boot problems mentioned above, which occur when the serial ports are disabled, seem very similar to earlier problems of FreeBSD 4.X with a TYAN 2500 main PCB, which also uses the RCC/ServerWorks LE/HE3 chipsets. There I ran into massive problems using an AMI MEGA RAID 1600 controller together with a GBit NIC from Intel (the same one I now use in the CUR-DLS). The problem was solved by using one specific PCI slot for the NIC (weird!).

Here is the output of 
mptable -dmesg -verbose -grope

MPTable, version 2.0.15
mptable: illegal option -- f
usage: mptable [-dmesg] [-verbose] [-grope] [-help]
edda# mptable -dmesg -verbose -grope 

===============================================================================

MPTable, version 2.0.15

 looking for EBDA pointer @ 0x040e, found, searching EBDA @ 0x0009f000
 searching CMOS 'top of mem' @ 0x0009ec00 (635K)
 searching default 'top of mem' @ 0x0009fc00 (639K)
 searching BIOS @ 0x000f0000

 MP FPS found in BIOS @ physical addr: 0x000f5270

-------------------------------------------------------------------------------

MP Floating Pointer Structure:

  location:                     BIOS
  physical address:             0x000f5270
  signature:                    '_MP_'
  length:                       16 bytes
  version:                      1.4
  checksum:                     0xe3
  mode:                         Virtual Wire

-------------------------------------------------------------------------------

MP Config Table Header:

  physical address:             0x000f4e60
  signature:                    'PCMP'
  base table length:            300
  version:                      1.4
  checksum:                     0x75
  OEM ID:                       'OEM00000'
  Product ID:                   'PROD00000000'
  OEM table pointer:            0x00000000
  OEM table size:               0
  entry count:                  29
  local APIC address:           0xfee00000
  extended table length:        124
  extended table checksum:      200

-------------------------------------------------------------------------------

MP Config Base Table Entries:

--
Processors:     APIC ID Version State           Family  Model   Step    Flags
                 3       0x11    BSP, usable     6       8       10      0x387fbff
                 0       0x11    AP, usable      6       8       6       0x387fbff
--
Bus:            Bus ID  Type
                 0       PCI   
                 1       PCI   
                 2       ISA   
--
I/O APICs:      APIC ID Version State           Address
                 2       0x11    usable          0xfec00000
                 3       0x11    usable          0xfec01000
--
I/O Ints:       Type    Polarity    Trigger     Bus ID   IRQ    APIC ID PIN#
                ExtINT   conforms    conforms        2     0          2    0
                INT      conforms    conforms        2     1          2    1
                INT      conforms    conforms        2     0          2    2
                INT      conforms    conforms        2     3          2    3
                INT      conforms    conforms        2     4          2    4
                INT      conforms    conforms        2     6          2    6
                INT      conforms    conforms        2     7          2    7
                INT      conforms    conforms        2     8          2    8
                INT      conforms    conforms        2    12          2   12
                INT      conforms    conforms        2    13          2   13
                INT      conforms    conforms        2    14          2   14
                INT      conforms    conforms        2    15          2   15
                INT     active-lo       level        0   2:A          3    4
                INT     active-lo       level        0   4:A          3    1
                INT     active-lo       level        0   6:A          3    3
                INT     active-lo       level        0  15:A          3   14
                INT     active-lo       level        2     9          2    9
                INT     active-lo       level        1   3:A          3    6
                INT     active-lo       level        1   5:A          3    8
                INT     active-lo       level        1   5:B          3    9
--
Local Ints:     Type    Polarity    Trigger     Bus ID   IRQ    APIC ID PIN#
                ExtINT  active-hi        edge        2     0        255    0
                NMI     active-hi        edge        2     0        255    1

-------------------------------------------------------------------------------

MP Config Extended Table Entries:

--
System Address Space
 bus ID: 0 address type: I/O address
 address base: 0x0
 address range: 0x10000
--
System Address Space
 bus ID: 0 address type: memory address
 address base: 0x40000000
 address range: 0xbebc0000
--
System Address Space
 bus ID: 0 address type: prefetch address
 address base: 0xfebc0000
 address range: 0xe9440000
--
System Address Space
 bus ID: 0 address type: memory address
 address base: 0xe8000000
 address range: 0x18000000
--
System Address Space
 bus ID: 0 address type: memory address
 address base: 0xa0000
 address range: 0x20000
--
Bus Heirarchy
 bus ID: 2 bus info: 0x01 parent bus ID: 0
--
Compatibility Bus Address
 bus ID: 0 address modifier: add
 predefined range: 0x00000000
--
Compatibility Bus Address
 bus ID: 0 address modifier: add
 predefined range: 0x00000001

-------------------------------------------------------------------------------

dmesg output:

Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.3-RC1 #47: Thu Oct 28 10:30:05 UTC 2004
    root at edda.geo.uni-mainz.de:/usr/obj/usr/src/sys/EDDA
ACPI APIC Table: <ASUS   CUR-DLS >
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel Pentium III (866.71-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x68a  Stepping = 10
  Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
real memory  = 1073721344 (1023 MB)
avail memory = 1041166336 (992 MB)
ioapic0 <Version 1.1> irqs 0-15 on motherboard
ioapic1 <Version 1.1> irqs 16-31 on motherboard
netsmb_dev: loaded
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <ASUS CUR-DLS> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <32-bit timer at 3.579545MHz> port 0xe408-0xe40b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
fxp0: <Intel 82559 Pro/100 Ethernet> port 0xd800-0xd83f mem 0xfd800000-0xfd8fffff,0xfe000000-0xfe000fff irq 20 at device 2.0 on pci0
miibus0: <MII bus> on fxp0
inphy0: <i82555 10/100 media interface> on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp0: Ethernet address: 00:e0:18:05:73:f4
ahc0: <Adaptec 2940 Ultra SCSI adapter> port 0xd400-0xd4ff mem 0xfd000000-0xfd000fff irq 17 at device 4.0 on pci0
ahc0: [GIANT-LOCKED]
aic7880: Ultra Wide Channel A, SCSI Id=7, 16/253 SCBs
pcm0: <AudioPCI ES1373-8> port 0xd000-0xd03f irq 19 at device 6.0 on pci0
pcm0: <Cirrus Logic CS4297A AC97 Codec>
pcm0: [GIANT-LOCKED]
pci0: <display, VGA> at device 7.0 (no driver attached)
isab0: <PCI-ISA bridge> port 0xe800-0xe80f at device 15.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <ServerWorks ROSB4 UDMA33 controller> port 0xb400-0xb40f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 15.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
pcib1: <ACPI Host-PCI bridge> on acpi0
pci1: <ACPI PCI bus> on pcib1
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.35> port 0xb000-0xb03f mem 0xfa800000-0xfa81ffff irq 22 at device 3.0 on pci1
em0: Ethernet address: 00:07:e9:14:8f:7b
em0:  Speed:N/A  Duplex:N/A
sym0: <1010-33> port 0xa800-0xa8ff mem 0xf9800000-0xf9801fff,0xfa000000-0xfa0003ff irq 24 at device 5.0 on pci1
sym0: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking
sym0: open drain IRQ line driver, using on-chip SRAM
sym0: using LOAD/STORE-based firmware.
sym0: handling phase mismatch from SCRIPTS.
sym0: [GIANT-LOCKED]
sym1: <1010-33> port 0xa400-0xa4ff mem 0xf8800000-0xf8801fff,0xf9000000-0xf90003ff irq 25 at device 5.1 on pci1
sym1: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking
sym1: open drain IRQ line driver, using on-chip SRAM
sym1: using LOAD/STORE-based firmware.
sym1: handling phase mismatch from SCRIPTS.
sym1: [GIANT-LOCKED]
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse, device ID 3
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
ppc0: <ECP parallel printer port> port 0x778-0x77a,0x378-0x37f irq 7 drq 3 flags 0x8 on acpi0
ppc0: Generic chipset (ECP-only) in ECP mode
ppc0: FIFO with 16/16/8 bytes threshold
ppbus0: <Parallel port bus> on ppc0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
fdc0: <floppy drive controller> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
orm0: <ISA Option ROMs> at iomem 0xd0000-0xd3fff,0xc0000-0xca7ff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <8 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
fb0 at vga0
Timecounter "TSC" frequency 866705726 Hz quality 800
Timecounters tick every 1.000 msec
Waiting 5 seconds for SCSI devices to settle
(noperiph:sym0:0:-1:-1): SCSI BUS reset delivered.
(noperiph:sym1:0:-1:-1): SCSI BUS reset delivered.
da0 at sym0 bus 0 target 0 lun 0
da0: <IBM IC35L018UWD210-0 S5BS> Fixed Direct Access SCSI-3 device 
da0: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing Enabled
da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da1 at sym0 bus 0 target 1 lun 0
da1: <IBM DDYS-T18350N S96H> Fixed Direct Access SCSI-3 device 
da1: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing Enabled
da1: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da2 at sym0 bus 0 target 2 lun 0
da2: <FUJITSU MAJ3182MP 5207> Fixed Direct Access SCSI-3 device 
da2: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing Enabled
da2: 17429MB (35694904 512 byte sectors: 255H 63S/T 2221C)
cd0 at ahc0 bus 0 target 0 lun 0
cd0: <TEAC CD-ROM CD-532S 1.0A> Removable CD-ROM SCSI-2 device 
cd0: 20.000MB/s transfers (20.000MHz, offset 15)
cd0: Attempt to query device size failed: NOT READY, Medium not present
cd1 at ahc0 bus 0 target 1 lun 0
cd1: <PLEXTOR CD-R   PX-R412C 1.07> Removable CD-ROM SCSI-2 device 
cd1: 10.000MB/s transfers (10.000MHz, offset 8)
cd1: Attempt to query device size failed: NOT READY, Medium not present
Mounting root from ufs:/dev/da0s1a
pflog0: promiscuous mode enabled
fxp0: promiscuous mode enabled
fxp0: promiscuous mode disabled

===============================================================================
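
As a cross-check of the interrupt routing, the PCI entries of the "I/O Ints" table above can be compared with the irq numbers in the dmesg output. The following Python sketch does this under a few assumptions that the output itself does not state: that the I/O APIC with ID 2 is ioapic0 (irqs 0-15) and the one with ID 3 is ioapic1 (irqs 16-31), that MP table bus IDs 0 and 1 correspond to pci0 and pci1, and that PCI function 0 uses INTA and function 1 uses INTB. The table and function names are made up for illustration.

```python
# Cross-check of MP table PCI interrupt entries against dmesg irq numbers.
# All data is copied from the output in this report; the mapping rules are
# assumptions (see the text above), not something the tools state directly.

# Assumed first irq handled by each I/O APIC, keyed by its MP table APIC ID.
APIC_IRQ_BASE = {2: 0, 3: 16}

# PCI rows of "I/O Ints": (bus ID, device, INTx line, APIC ID, APIC pin).
MPTABLE_PCI_INTS = [
    (0, 2, "A", 3, 4),
    (0, 4, "A", 3, 1),
    (0, 6, "A", 3, 3),
    (0, 15, "A", 3, 14),
    (1, 3, "A", 3, 6),
    (1, 5, "A", 3, 8),
    (1, 5, "B", 3, 9),
]

# From dmesg: (driver, PCI bus, device, function, irq).
DMESG_IRQS = [
    ("fxp0", 0, 2, 0, 20),
    ("ahc0", 0, 4, 0, 17),
    ("pcm0", 0, 6, 0, 19),
    ("em0", 1, 3, 0, 22),
    ("sym0", 1, 5, 0, 24),
    ("sym1", 1, 5, 1, 25),
]


def expected_irq(bus, device, function):
    """Return the irq the MP table implies for a device, or None if no row matches."""
    intx = "ABCD"[function % 4]  # assumption: function N uses INTA + N
    for row_bus, row_dev, row_intx, apic_id, pin in MPTABLE_PCI_INTS:
        if (row_bus, row_dev, row_intx) == (bus, device, intx):
            return APIC_IRQ_BASE[apic_id] + pin
    return None


def cross_check():
    """Return (driver, irq from dmesg, irq implied by the MP table) per device."""
    return [(name, irq, expected_irq(bus, dev, fn))
            for name, bus, dev, fn, irq in DMESG_IRQS]


if __name__ == "__main__":
    for name, seen, implied in cross_check():
        status = "ok" if seen == implied else "MISMATCH"
        print(f"{name}: dmesg irq {seen}, MP table implies {implied} ({status})")
```

With the values from this report all six devices match, so the MP table and the irq assignments shown in dmesg at least appear consistent with each other. This does not by itself rule out an interrupt-related cause for the freezes.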




>How-To-Repeat:
Use FreeBSD 5.3-BETA4 up to FreeBSD 5.3-RC1 with SMP enabled on an ASUS CUR-DLS.
>Fix:
Don't use SMP on the above-mentioned main PCB.
>Release-Note:
>Audit-Trail:
>Unformatted:

