Patch available for shared em interrupts (Re: em, bge, network problems survey.)

Scott Long scottl at samsco.org
Sat Oct 14 04:43:08 UTC 2006


Mike Tancsa wrote:
> At 10:34 PM 10/5/2006, Kris Kennaway wrote:
> 
>> Based on successful testing on a machine with shared em interrupt, the
>> following patch should work around the problem *in that case*.
>>
>> Note that this patch will not help you if you are not using the em
>> driver, or if you are seeing the problem with non-shared em interrupt
> >> (I have investigated one such outlier, which seems to be a problem with
>> a particular model of em hardware and not a generic problem with the
>> driver).
>>
> >> Please let Scott and me know whether or not this patch works for you
>> (in addition to the information previously requested, if you have not
>> already sent it).  Unfortunately it is only a workaround, but it
>> points to an underlying problem with fast interrupt handlers on a
>> shared irq that can be studied separately.
> 
> I ran into an em0 timeout on a box I just started testing. The patch 
> seems to fix the issue.
> (before the patch)
> Oct 13 21:42:56 am64 kernel: em0: watchdog timeout -- resetting
> Oct 13 21:42:56 am64 kernel: em0: link state changed to DOWN
> Oct 13 21:42:58 am64 kernel: em0: link state changed to UP
> 
> dmesg with patch
> 
> Copyright (c) 1992-2006 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>         The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 6.2-PRERELEASE #2: Fri Oct 13 22:28:38 EDT 2006
>     mdtancsa at am64.sentex.ca:/usr/obj/usr/src/sys/up
> ACPI APIC Table: <A M I  OEMAPIC >
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz (2992.71-MHz K8-class CPU)
>   Origin = "GenuineIntel"  Id = 0xf43  Stepping = 3
>   Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>   Features2=0x649d<SSE3,RSVD2,MON,DS_CPL,EST,CNTX-ID,CX16,<b14>>
>   AMD Features=0x20000800<SYSCALL,LM>
>   Logical CPUs per core: 2
> real memory  = 3481198592 (3319 MB)
> avail memory = 3360186368 (3204 MB)
> ioapic0 <Version 2.0> irqs 0-23 on motherboard
> ioapic1 <Version 2.0> irqs 24-47 on motherboard
> ioapic2 <Version 2.0> irqs 48-71 on motherboard
> kbd1 at kbdmux0
> acpi0: <A M I 7221BK1E> on motherboard
> acpi_bus_number: can't get _ADR
> acpi_bus_number: can't get _ADR
> acpi0: Power Button (fixed)
> acpi0: reservation of 500, 10 (4) failed
> acpi0: reservation of 560, 20 (4) failed
> Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> cpu0: <ACPI CPU> on acpi0
> acpi_throttle0: <ACPI CPU Throttling> on cpu0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> pci0: <display, VGA> at device 2.0 (no driver attached)
> pcib1: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
> pci2: <ACPI PCI bus> on pcib1
> pcib2: <ACPI PCI-PCI bridge> at device 0.0 on pci2
> pci4: <ACPI PCI bus> on pcib2
> pcib3: <ACPI PCI-PCI bridge> at device 0.2 on pci2
> pci3: <ACPI PCI bus> on pcib3
> 3ware device driver for 9000 series storage controllers, version: 
> 3.60.02.012
> twa0: <3ware 9000 series Storage Controller> port 0xef80-0xefbf mem 
> 0xfebff000-0xfebfffff irq 53 at device 2.0 on pci3
> twa0: [GIANT-LOCKED]
> twa0: INFO: (0x15: 0x1300): Controller details:: Model 9550SX-4LP, 4 
> ports, Firmware FE9X 3.01.01.028, BIOS BE9X 3.01.00.024
> uhci0: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-A> port 
> 0xcc00-0xcc1f irq 23 at device 29.0 on pci0
> uhci0: [GIANT-LOCKED]
> usb0: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-A> on uhci0
> usb0: USB revision 1.0
> uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub0: 2 ports with 2 removable, self powered
> uhci1: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-B> port 
> 0xcc80-0xcc9f irq 19 at device 29.1 on pci0
> uhci1: [GIANT-LOCKED]
> usb1: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-B> on uhci1
> usb1: USB revision 1.0
> uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub1: 2 ports with 2 removable, self powered
> uhci2: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-C> port 
> 0xcd00-0xcd1f irq 18 at device 29.2 on pci0
> uhci2: [GIANT-LOCKED]
> usb2: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-C> on uhci2
> usb2: USB revision 1.0
> uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub2: 2 ports with 2 removable, self powered
> ehci0: <Intel 82801FB (ICH6) USB 2.0 controller> mem 
> 0xfe9ff800-0xfe9ffbff irq 23 at device 29.7 on pci0
> ehci0: [GIANT-LOCKED]
> usb3: EHCI version 1.0
> usb3: companion controllers, 2 ports each: usb0 usb1 usb2
> usb3: <Intel 82801FB (ICH6) USB 2.0 controller> on ehci0
> usb3: USB revision 2.0
> uhub3: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
> uhub3: 6 ports with 6 removable, self powered
> pcib4: <ACPI PCI-PCI bridge> at device 30.0 on pci0
> pci1: <ACPI PCI bus> on pcib4
> em0: <Intel(R) PRO/1000 Network Connection Version - 6.1.4> port 
> 0xdf80-0xdfbf mem 0xfeae0000-0xfeafffff irq 18 at device 3.0 on pci1
> em0: Ethernet address: 00:0e:0c:4b:15:eb
> isab0: <PCI-ISA bridge> at device 31.0 on pci0
> isa0: <ISA bus> on isab0
> atapci0: <Intel ICH6 UDMA100 controller> port 
> 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376 at device 31.1 on pci0
> ata0: <ATA channel 0> on atapci0
> ata1: <ATA channel 1> on atapci0
> atapci1: <Intel ICH6 SATA150 controller> port 
> 0xcf80-0xcf87,0xcf00-0xcf03,0xce80-0xce87,0xce00-0xce03,0xcd80-0xcd8f 
> mem 0xfe9ffc00-0xfe9fffff irq 19 at device 31.2 on pci0
> ata2: <ATA channel 0> on atapci1
> ata3: <ATA channel 1> on atapci1
> pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
> acpi_button0: <Power Button> on acpi0
> atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
> atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> sio0: configured irq 4 not in bitmap of probed irqs 0
> sio0: port may not be enabled
> sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on 
> acpi0
> sio0: type 16550A
> sio1: configured irq 3 not in bitmap of probed irqs 0
> sio1: port may not be enabled
> sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
> sio1: type 16550A
> fdc0: <floppy drive controller (FDE)> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 
> on acpi0
> fdc0: [FAST]
> fd0: <1440-KB 3.5" drive> on fdc0 drive 0
> orm0: <ISA Option ROMs> at iomem 
> 0xc9800-0xcafff,0xcb000-0xcbfff,0xcc000-0xccfff,0xdc000-0xdffff on isa0
> ppc0: cannot reserve I/O port range
> sc0: <System console> at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> Timecounter "TSC" frequency 2992709460 Hz quality 800
> Timecounters tick every 1.000 msec
> ad0: 38166MB <Seagate ST340014A 3.06> at ata0-master UDMA100
> acd0: DVDR <AOPEN 8X8 DVD Dual AAN/1.4A> at ata0-slave UDMA33
> da0 at twa0 bus 0 target 0 lun 0
> da0: <AMCC 9550SX-4LP DISK 3.01> Fixed Direct Access SCSI-3 device
> da0: 100.000MB/s transfers
> da0: 152566MB (312455168 512 byte sectors: 255H 63S/T 19449C)
> Trying to mount root from ufs:/dev/ad0s1a
> [am64]# vmstat -i
> interrupt                          total       rate
> irq1: atkbd0                           4          0
> irq6: fdc0                             9          0
> irq14: ata0                         6274          1
> irq18: em0 uhci2                  127128         25
> irq53: twa0                       188226         37
> cpu0: timer                      9911543       1999
> Total                           10233184       2064
> [am64]#
> 
> em0 at pci1:3:0:   class=0x020000 card=0x34448086 chip=0x10768086 rev=0x05 
> hdr=0x00
>     vendor   = 'Intel Corporation'
>     device   = '82547EI Gigabit Ethernet Controller'
>     class    = network
>     subclass = ethernet
> 
> The Intel board has the latest BIOS update as well, HTT disabled in the 
> BIOS.  If helpful, I can hook this box up to the netperf cluster which 
> has remote power and serial console access (including to the BIOS)
> 
>         ---Mike

Mike,

I have a new patch that I hope addresses the actual bug, instead of 
shuffling the timing.  Would you be willing to test it?  I can't 
guarantee that it's safe for production use yet, though.  It seems
to work, but it might set your dog on fire too.

Scott
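
For anyone who wants to check whether their own em interface sits on a
shared interrupt (as em0 and uhci2 do on irq18 in the vmstat -i output
quoted above), here is a minimal sketch in Python. It is not part of
either patch. The function name shared_irqs and the SAMPLE text are
illustrative assumptions; the sample rows are copied from Mike's output,
and the parser assumes the same "irqN: devices total rate" layout.

```python
import re

def shared_irqs(vmstat_output):
    """Return {irq: [devices]} for every IRQ line that lists more than one device.

    Hypothetical helper: assumes lines shaped like the `vmstat -i` output
    quoted in this thread, i.e. "irqN: dev [dev ...] <total> <rate>".
    """
    shared = {}
    for line in vmstat_output.splitlines():
        # Drop mail-quote prefixes ("> ") and surrounding whitespace.
        line = line.lstrip("> ").strip()
        m = re.match(r"^(irq\d+):\s+(.*?)\s+(\d+)\s+(\d+)$", line)
        if not m:
            continue  # header, cpu timer, and Total lines are skipped
        devices = m.group(2).split()
        if len(devices) > 1:
            shared[m.group(1)] = devices
    return shared

# Sample rows copied from the vmstat -i output earlier in this message.
SAMPLE = """\
interrupt                          total       rate
irq1: atkbd0                           4          0
irq6: fdc0                             9          0
irq14: ata0                         6274          1
irq18: em0 uhci2                  127128         25
irq53: twa0                       188226         37
cpu0: timer                      9911543       1999
Total                           10233184       2064
"""

if __name__ == "__main__":
    for irq, devs in shared_irqs(SAMPLE).items():
        print(irq, devs)
```

On the sample above this reports only irq18, shared by em0 and uhci2.
Feeding it the text of vmstat -i from another machine should work the
same way, provided the column layout matches.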



More information about the freebsd-stable mailing list