Weird "ignoring syn" problem

Bill Moran wmoran at collaborativefusion.com
Tue Jun 12 15:19:58 UTC 2007


Brief update to add another item to the list of things I've tried:
*) The problem occurs whether the em device is polling or not.

In response to Bill Moran <wmoran at collaborativefusion.com>:
>
> This one has got me pretty befuddled.
> 
> We're seeing some really odd behaviour with FreeBSD ignoring SYN packets.
> I've been trying to diagnose this for a couple of weeks now, and my current
> guess is that there's something wrong with the em driver.  Here's a short
> list of what I've ruled out or established so far:
> *) I've done my best to eliminate other network components as the problem.
>    My theory at this point is that it can't possibly be any other network
>    hardware, based on the tcpdump output shown below.
> *) The problem occurred on both FreeBSD 6.1 and FreeBSD 6.2-p3.
> *) The problem does not appear to be tied to CPU usage -- the CPU is nearly
>    idle when the problem occurs.
> *) I can now reproduce it pretty easily, so I'll know when it's fixed.
> *) The system exhibiting the problem is running 15 jails, but they are
>    idle 95% of the time.  The problem initially occurred inside one of
>    the jails, but I just recreated it outside the jail (on the host) and
>    it's _easier_ to reproduce outside the jail.
> *) The problem occurred with both the GENERIC and the SMP kernels (this is
>    a dual-CPU, hyperthreaded system).
> *) I've tested and the behavior occurs both with a dynamically generated
>    file (from PHP) and with a static file.
> 
> The nature of the beast is that we've got a SOAP application running under
> Apache and PHP.  This application is subject to many requests in rapid
> succession, such that load can be simulated by the following loop:
> 
> while true; do fetch http://192.168.121.250/test.php; done
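
[One rough way to spot the stalls while a loop like the one above is
running is to time each TCP handshake.  The Python sketch below is only
an illustration: the helper names time_connect and find_stalls and the
1-second threshold are assumptions of mine, not part of the original
test, and it opens a bare TCP connection rather than fetching test.php.]

```python
import socket
import time


def time_connect(host, port, timeout=60.0):
    """Return the seconds taken to complete a TCP handshake with host:port."""
    start = time.monotonic()
    # create_connection() returns once the three-way handshake has completed.
    with socket.create_connection((host, port), timeout=timeout):
        return time.monotonic() - start


def find_stalls(host, port, attempts, threshold=1.0):
    """Connect `attempts` times; return (attempt index, seconds) for slow ones.

    The 1.0 second default threshold is an assumption: it only needs to sit
    well above a normal LAN handshake and below the first SYN retransmit.
    """
    stalls = []
    for i in range(attempts):
        elapsed = time_connect(host, port)
        if elapsed >= threshold:
            stalls.append((i, elapsed))
    return stalls
```

[Something like find_stalls("192.168.121.250", 80, 1000) would then list
the attempts whose handshake took a second or longer, which presumably
means at least one SYN went unanswered.]
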
> 
> The problem is that occasionally, the Apache server machine just ignores
> SYN packets.  Take the following tcpdump output for example:
> 
> 13:34:17.312296 IP web04-v100.cust00.pitbpa1.priv.collaborativefusion.com.54808 > anchor-is00.is.pitbpa1.priv.collaborativefusion.com.http: S 2645061726:2645061726(0) win 65535 <mss 1380,nop,wscale 1,nop,nop,timestamp 2690201156 0,sackOK,eol>
> 13:34:20.312398 IP web04-v100.cust00.pitbpa1.priv.collaborativefusion.com.54808 > anchor-is00.is.pitbpa1.priv.collaborativefusion.com.http: S 2645061726:2645061726(0) win 65535 <mss 1380,nop,wscale 1,nop,nop,timestamp 2690204156 0,sackOK,eol>
> 13:34:23.512626 IP web04-v100.cust00.pitbpa1.priv.collaborativefusion.com.54808 > anchor-is00.is.pitbpa1.priv.collaborativefusion.com.http: S 2645061726:2645061726(0) win 65535 <mss 1380,nop,wscale 1,nop,nop,timestamp 2690207356 0,sackOK,eol>
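
[The spacing of the three SYNs above can be worked out from the capture
times and from the TSval fields of the timestamp option.  This is a
small Python sketch of that arithmetic; the function name
retransmit_gaps is hypothetical, and the values are copied from the
trace.]

```python
from datetime import datetime


def retransmit_gaps(timestamps):
    """Return the seconds between consecutive tcpdump times (HH:MM:SS.ffffff)."""
    times = [datetime.strptime(t, "%H:%M:%S.%f") for t in timestamps]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


# Capture times of the three SYNs in the trace above.
syn_times = ["13:34:17.312296", "13:34:20.312398", "13:34:23.512626"]
# TSval fields of the same three packets (ticks of the sender's clock).
tsvals = [2690201156, 2690204156, 2690207356]

gaps = retransmit_gaps(syn_times)
tsval_deltas = [b - a for a, b in zip(tsvals, tsvals[1:])]

print(gaps)          # [3.000102, 3.200228]
print(tsval_deltas)  # [3000, 3200]
```

[So the client resent the same SYN after about 3.0 and then 3.2 seconds,
and the TSval deltas agree with the capture times if the sender's
timestamp clock ticks once per millisecond.]
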
> 
> This is the _only_ traffic on port 80 during the test.  It looks like the
> kernel has ignored the initial SYN packet and two duplicates.  I've seen it
> take as long as 45 seconds to establish a connection, and this causes
> ugly performance problems, as well as frequent timeouts on the client end.
> The only clue I've found so far is this output from netstat -s.
> 
>         153099 syncache entries added
>                 6184 retransmitted
>                 6491 dupsyn
>                 0 dropped
>                 150923 completed
>                 0 bucket overflow
>                 0 cache overflow
>                 235 reset
>                 1941 stale
>                 0 aborted
>                 0 badack
>                 0 unreach
>                 0 zone failures
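
[To put the syncache counters above in proportion, here is a small
Python sketch that expresses them as percentages of the entries added.
The dictionary layout and the helper name percent_of_added are
hypothetical; the numbers are copied from the netstat -s output.  It
only does arithmetic and does not establish how these counters relate
to the ignored SYNs.]

```python
# Counters copied from the netstat -s output above.
syncache = {
    "added": 153099,
    "retransmitted": 6184,
    "dupsyn": 6491,
    "completed": 150923,
    "reset": 235,
    "stale": 1941,
}


def percent_of_added(stats, key):
    """Return stats[key] as a percentage of stats['added'], to 2 decimals."""
    return round(100.0 * stats[key] / stats["added"], 2)


# In this sample, completed + reset + stale adds up to the entries added.
accounted = syncache["completed"] + syncache["reset"] + syncache["stale"]
print(accounted == syncache["added"])  # True

for key in ("retransmitted", "dupsyn", "stale", "completed"):
    print(key, percent_of_added(syncache, key))
```

[That works out to roughly 4.04% retransmitted, 4.24% dupsyn and 1.27%
stale, with 98.58% of entries completed.]
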
> 
> Unfortunately, I've been unable to determine how to fix the problem.  Any
> advice is welcome.
> 
> Details:
> Copyright (c) 1992-2007 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>         The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 6.2-RELEASE-p3 #2: Thu Jun  7 21:37:54 UTC 2007
>     root at is00:/usr/obj/usr/src/sys/SMP
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Xeon(TM) CPU 3.00GHz (2992.71-MHz 686-class CPU)
>   Origin = "GenuineIntel"  Id = 0xf43  Stepping = 3
>   Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>   Features2=0x641d<SSE3,RSVD2,MON,DS_CPL,CNTX-ID,CX16,<b14>>
>   AMD Features=0x20100000<NX,LM>
>   Logical CPUs per core: 2
> real memory  = 2147221504 (2047 MB)
> avail memory = 2096107520 (1999 MB)
> ACPI APIC Table: <DELL   PE BKC  >
> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>  cpu0 (BSP): APIC ID:  0
>  cpu1 (AP): APIC ID:  1
>  cpu2 (AP): APIC ID:  6
>  cpu3 (AP): APIC ID:  7
> ioapic0: Changing APIC ID to 8
> ioapic1: Changing APIC ID to 9
> ioapic1: WARNING: intbase 32 != expected base 24
> ioapic2: Changing APIC ID to 10
> ioapic2: WARNING: intbase 64 != expected base 56
> ioapic0 <Version 2.0> irqs 0-23 on motherboard
> ioapic1 <Version 2.0> irqs 32-55 on motherboard
> ioapic2 <Version 2.0> irqs 64-87 on motherboard
> kbd1 at kbdmux0
> ath_hal: 0.9.17.2 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
> acpi0: <DELL PE BKC> on motherboard
> acpi0: Power Button (fixed)
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> cpu0: <ACPI CPU> on acpi0
> cpu1: <ACPI CPU> on acpi0
> cpu2: <ACPI CPU> on acpi0
> cpu3: <ACPI CPU> on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> pcib1: <ACPI PCI-PCI bridge> at device 2.0 on pci0
> pci1: <ACPI PCI bus> on pcib1
> pcib2: <ACPI PCI-PCI bridge> at device 0.0 on pci1
> pci2: <ACPI PCI bus> on pcib2
> amr0: <LSILogic MegaRAID 1.53> mem 0xd80f0000-0xd80fffff,0xdfde0000-0xdfdfffff irq 46 at device 14.0 on pci2
> amr0: delete logical drives supported by controller
> amr0: <LSILogic PERC 4e/Si> Firmware 521X, BIOS H430, 256MB RAM
> pcib3: <ACPI PCI-PCI bridge> at device 0.2 on pci1
> pci3: <ACPI PCI bus> on pcib3
> em0: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0xecc0-0xecff mem 0xdfbe0000-0xdfbfffff irq 37 at device 11.0 on pci3
> em0: Ethernet address: 00:04:23:c8:ff:f4
> em1: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0xec80-0xecbf mem 0xdfbc0000-0xdfbdffff irq 38 at device 11.1 on pci3
> em1: Ethernet address: 00:04:23:c8:ff:f5
> pcib4: <ACPI PCI-PCI bridge> at device 4.0 on pci0
> pci4: <ACPI PCI bus> on pcib4
> pcib5: <ACPI PCI-PCI bridge> at device 5.0 on pci0
> pci5: <ACPI PCI bus> on pcib5
> pcib6: <ACPI PCI-PCI bridge> at device 0.0 on pci5
> pci6: <ACPI PCI bus> on pcib6
> em2: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0xdcc0-0xdcff mem 0xdf8e0000-0xdf8fffff irq 64 at device 7.0 on pci6
> em2: Ethernet address: 00:13:72:4f:71:23
> pcib7: <ACPI PCI-PCI bridge> at device 0.2 on pci5
> pci7: <ACPI PCI bus> on pcib7
> em3: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0xccc0-0xccff mem 0xdf6e0000-0xdf6fffff irq 65 at device 8.0 on pci7
> em3: Ethernet address: 00:13:72:4f:71:24
> pcib8: <ACPI PCI-PCI bridge> at device 6.0 on pci0
> pci8: <ACPI PCI bus> on pcib8
> uhci0: <Intel 82801EB (ICH5) USB controller USB-A> port 0xace0-0xacff irq 16 at device 29.0 on pci0
> uhci0: [GIANT-LOCKED]
> usb0: <Intel 82801EB (ICH5) USB controller USB-A> on uhci0
> usb0: USB revision 1.0
> uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub0: 2 ports with 2 removable, self powered
> uhci1: <Intel 82801EB (ICH5) USB controller USB-B> port 0xacc0-0xacdf irq 19 at device 29.1 on pci0
> uhci1: [GIANT-LOCKED]
> usb1: <Intel 82801EB (ICH5) USB controller USB-B> on uhci1
> usb1: USB revision 1.0
> uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub1: 2 ports with 2 removable, self powered
> uhci2: <Intel 82801EB (ICH5) USB controller USB-C> port 0xaca0-0xacbf irq 18 at device 29.2 on pci0
> uhci2: [GIANT-LOCKED]
> usb2: <Intel 82801EB (ICH5) USB controller USB-C> on uhci2
> usb2: USB revision 1.0
> uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub2: 2 ports with 2 removable, self powered
> ehci0: <Intel 82801EB/R (ICH5) USB 2.0 controller> mem 0xdff00000-0xdff003ff irq 23 at device 29.7 on pci0
> ehci0: [GIANT-LOCKED]
> usb3: EHCI version 1.0
> usb3: companion controllers, 2 ports each: usb0 usb1 usb2
> usb3: <Intel 82801EB/R (ICH5) USB 2.0 controller> on ehci0
> usb3: USB revision 2.0
> uhub3: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
> uhub3: 6 ports with 6 removable, self powered
> uhub4: vendor 0x413c product 0xa001, class 9/0, rev 2.00/0.00, addr 2
> uhub4: multiple transaction translators
> uhub4: 2 ports with 2 removable, self powered
> pcib9: <ACPI PCI-PCI bridge> at device 30.0 on pci0
> pci9: <ACPI PCI bus> on pcib9
> pci9: <unknown> at device 5.0 (no driver attached)
> pci9: <unknown> at device 5.1 (no driver attached)
> pci9: <unknown> at device 5.2 (no driver attached)
> atapci0: <SiI 0680 UDMA133 controller> port 0xbcf0-0xbcf7,0xbce4-0xbce7,0xbcd8-0xbcdf,0xbcd0-0xbcd3,0xbc70-0xbc7f mem 0xdf3fec00-0xdf3fecff irq 23 at device 6.0 on pci9
> ata2: <ATA channel 0> on atapci0
> ata3: <ATA channel 1> on atapci0
> pci9: <display, VGA> at device 13.0 (no driver attached)
> isab0: <PCI-ISA bridge> at device 31.0 on pci0
> isa0: <ISA bus> on isab0
> atapci1: <Intel ICH5 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 31.1 on pci0
> ata0: <ATA channel 0> on atapci1
> ata1: <ATA channel 1> on atapci1
> fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
> fdc0: [FAST]
> sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
> sio0: type 16550A
> pmtimer0 on isa0
> orm0: <ISA Option ROMs> at iomem 0xc0000-0xcafff,0xec000-0xeffff on isa0
> atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> ppc0: parallel port not found.
> sc0: <System console> at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> sio1: configured irq 3 not in bitmap of probed irqs 0
> sio1: port may not be enabled
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> ukbd0: Dell DRAC4, rev 1.10/0.00, addr 2, iclass 3/1
> kbd2 at ukbd0
> ums0: Dell DRAC4, rev 1.10/0.00, addr 2, iclass 3/1
> ums0: 3 buttons and Z dir.
> Timecounters tick every 1.000 msec
> acd0: CDROM <TEAC CD-ROM CD-224E-N/3.AB> at ata0-master UDMA33
> device_attach: afd0 attach returned 6
> acd1: CDROM <VIRTUALCDROM DRIVE/> at ata2-slave PIO3
> amr0: delete logical drives supported by controller
> amrd0: <LSILogic MegaRAID logical drive> on amr0
> amrd0: 34680MB (71024640 sectors) RAID 1 (optimal)
> SMP: AP CPU #3 Launched!
> SMP: AP CPU #1 Launched!
> SMP: AP CPU #2 Launched!
> Trying to mount root from ufs:/dev/amrd0s1a
> 
> 
> -- 
> Bill Moran
> Collaborative Fusion Inc.
> http://people.collaborativefusion.com/~wmoran/
> 
> wmoran at collaborativefusion.com
> Phone: 412-422-3463x4023
> _______________________________________________
> freebsd-net at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"


-- 
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran at collaborativefusion.com
Phone: 412-422-3463x4023

