kern/108651: option PREEMPTION causes machine hangs

Brendon Meyer bmeyer at mesoft.com.au
Thu Feb 1 09:40:20 UTC 2007


>Number:         108651
>Category:       kern
>Synopsis:       option PREEMPTION causes machine hangs
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Feb 01 09:40:19 GMT 2007
>Closed-Date:
>Last-Modified:
>Originator:     Brendon Meyer
>Release:        6.2-STABLE
>Organization:
MicroElectronic SOFTworks Pty Ltd
>Environment:
FreeBSD tarja.sydney.mesoft.com.au 6.2-STABLE FreeBSD 6.2-STABLE #0: Tue Jan 30 08:13:02 EST 2007     bmeyer at tarja.sydney.mesoft.com.au:/usr/src/sys/i386/compile/TARJANPE  i386
>Description:
With 'option PREEMPTION' in the kernel configuration, a machine built on a TYAN 2462 based board with an Intel gigabit NIC and a 3WARE 9500S-8 quickly locks up, with "watchdog timeout" errors reported on the NIC.

A snippet of the dmesg output is as follows:
Jan 30 05:30:09 tarja kernel: em0: watchdog timeout -- resetting

If left alone, the machine becomes catatonic soon after.  
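
To put numbers on "quickly" and "soon after", the watchdog messages can be pulled out of the syslog output along with their timestamps. The following is only a minimal sketch, assuming log lines in the same format as the snippet above; the function name watchdog_events and the regular expression are illustrative and not part of the original report.

```python
import re

# Matches lines such as:
#   Jan 30 05:30:09 tarja kernel: em0: watchdog timeout -- resetting
# The pattern is an assumption based on the single sample line in this report.
WATCHDOG = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(?P<nic>\w+):\s+watchdog timeout"
)


def watchdog_events(lines):
    """Return (timestamp, interface) pairs for each watchdog timeout line."""
    events = []
    for line in lines:
        match = WATCHDOG.match(line)
        if match:
            events.append((match.group("stamp"), match.group("nic")))
    return events
```

Passing the lines of the system log (commonly /var/log/messages) to watchdog_events gives the time of the first timeout and how often the resets repeat before the machine stops responding.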

The machine is based on a TYAN 2462 dual-processor AMD Athlon board with both processors installed. The system runs from a generic SiI 0680 based ATA RAID controller, primary data storage is handled by a 3WARE 9500S-8 SATA RAID controller, and primary network I/O is handled by an Intel gigabit NIC.

Output from 'pciconf -lv':

agp0 at pci0:0:0:  class=0x060000 card=0x00000000 chip=0x700c1022 rev=0x11 hdr=0x00
    vendor   = 'Advanced Micro Devices (AMD)'
    device   = 'AMD-762 CPU to PCI Bridge (SMP chipset)'
    class    = bridge
    subclass = HOST-PCI
pcib1 at pci0:1:0: class=0x060400 card=0x00000000 chip=0x700d1022 rev=0x00 hdr=0x01
    vendor   = 'Advanced Micro Devices (AMD)'
    device   = 'AMD-762 CPU to PCI Bridge (AGP 4x)'
    class    = bridge
    subclass = PCI-PCI
isab0 at pci0:7:0: class=0x060100 card=0x00000000 chip=0x74101022 rev=0x02 hdr=0x00
    vendor   = 'Advanced Micro Devices (AMD)'
    device   = 'AMD-766 PCI to ISA/LPC Bridge'
    class    = bridge
    subclass = PCI-ISA
atapci0 at pci0:7:1:       class=0x01018a card=0x00000000 chip=0x74111022 rev=0x01 hdr=0x00
    vendor   = 'Advanced Micro Devices (AMD)'
    device   = 'AMD-766 Enhanced IDE Controller'
    class    = mass storage
    subclass = ATA
none0 at pci0:7:3: class=0x068000 card=0x00000000 chip=0x74131022 rev=0x01 hdr=0x00
    vendor   = 'Advanced Micro Devices (AMD)'
    device   = 'AMD-766 Power Management Controller'
    class    = bridge
ohci0 at pci0:7:4: class=0x0c0310 card=0x00000000 chip=0x74141022 rev=0x07 hdr=0x00
    vendor   = 'Advanced Micro Devices (AMD)'
    device   = 'AMD-766 USB OpenHCI Host Controller'
    class    = serial bus
    subclass = USB
twa0 at pci0:9:0:  class=0x010400 card=0x100213c1 chip=0x100213c1 rev=0x00 hdr=0x00
    vendor   = '3ware Inc.'
    device   = '9000 series SATA/PATA Storage Controller'
    class    = mass storage
    subclass = RAID
em0 at pci0:11:0:  class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = 'PRO/1000 GT'
    class    = network
    subclass = ethernet
atapci1 at pci0:12:0:      class=0x010400 card=0x36801095 chip=0x06801095 rev=0x02 hdr=0x00
    vendor   = 'Silicon Image Inc (Was: CMD Technology Inc)'
    device   = 'SiI 0680 (Was: PCI-0680) Ultra ATA133 EIDE Controller'
    class    = mass storage
    subclass = RAID
ahc0 at pci0:13:0: class=0x010000 card=0x246210f1 chip=0x00cf9005 rev=0x01 hdr=0x00
    vendor   = 'Adaptec Inc'
    device   = 'AIC-7899P Ultra160 SCSI Host Adapter'
    class    = mass storage
    subclass = SCSI
ahc1 at pci0:13:1: class=0x010000 card=0x246210f1 chip=0x00cf9005 rev=0x01 hdr=0x00
    vendor   = 'Adaptec Inc'
    device   = 'AIC-7899P Ultra160 SCSI Host Adapter'
    class    = mass storage
    subclass = SCSI
none1 at pci0:14:0:        class=0x030000 card=0x80081002 chip=0x47521002 rev=0x27 hdr=0x00
    vendor   = 'ATI Technologies Inc'
    device   = 'Rage XL PCI'
    class    = display
    subclass = VGA
xl0 at pci0:15:0:  class=0x020000 card=0x246210f1 chip=0x980010b7 rev=0x78 hdr=0x00
    vendor   = '3COM Corp, Networking Division'
    device   = '3C980-TX Fast EtherLink XL Server Adapter2'
    class    = network
    subclass = ethernet
xl1 at pci0:16:0:  class=0x020000 card=0x246210f1 chip=0x980010b7 rev=0x78 hdr=0x00
    vendor   = '3COM Corp, Networking Division'
    device   = '3C980-TX Fast EtherLink XL Server Adapter2'
    class    = network
    subclass = ethernet


>How-To-Repeat:
Build a generic kernel, connect to, say, a remote NFS server, and start copying files with pax, cpio, or any similar tool capable of generating large amounts of traffic.

Initially all will be fine. Not long after, though, all network I/O stops and watchdog timeout messages for the NIC start appearing on the console.

At this point, the NIC is dead.  It isn't revivable.  

If you *leave* the machine on its own, it will actually become completely catatonic soon afterwards.  

If you try to shut the machine down, the shutdown process will attempt to flush buffers but *will* fail miserably. It will eventually give up, but when the machine reboots you have all the joys of fsck.
>Fix:
No fix is known at this stage. Building a kernel without 'option PREEMPTION' appears to avoid the problem: to date, I have not been able to re-create it without the PREEMPTION option.
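
As a rough sketch of that workaround, assuming the custom kernel configuration started as a copy of GENERIC and that the option appears there as a line like the one below, the change amounts to commenting that line out and rebuilding. The file name MYKERNEL is a placeholder, not the configuration used on the affected machine.

```
# /usr/src/sys/i386/conf/MYKERNEL  (hypothetical copy of GENERIC)
#
# Workaround: leave kernel thread preemption disabled.
#options 	PREEMPTION		# Enable kernel thread preemption
```

The kernel can then be rebuilt in the usual way, for example with 'make buildkernel KERNCONF=MYKERNEL' followed by 'make installkernel KERNCONF=MYKERNEL' from /usr/src.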


>Release-Note:
>Audit-Trail:
>Unformatted:

