Strange things on GBit / 1000->100 / net.inet.tcp.inflight.*

Raphael H. Becker rabe at
Fri Sep 17 01:44:03 PDT 2004

Hi *,

One of our subnets has been on a GBit switch since last week.
The nodes on the subnet are:

2x Dell PE350,  RELENG_4_10, fxp{0,1}, 100baseTX <full-duplex>
3x Dell PE2650, RELENG_5 (BETA4), bge0, 1000baseTX <full-duplex>
1x Dell PE2650, RELENG_4_10, bge1, 1000baseTX <full-duplex> 

The switch is a "NETGEAR Model GS516T Copper Gigabit Switch" [1]

To test transfer and throughput, every system runs an ftpd and serves a
1GByte file in /pub/1GB (or a 250M file on the small boxes).

Every system is able to send and receive data at full speed
(>10.5MBytes/sec on 100MBit, >70-90MBytes/sec(!) on GBit)

I use wget for testing:
wget -O - --proxy=off >/dev/null

The 3 5.x-Boxes on GBit transfer up to ~93MBytes(!) per second to 
each other (serving the file from cache, 2 parallel sessions).

The two PE350 boxes transfer data with >10MBytes/sec to each other.
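
For orientation, the reported rates can be put next to the raw capacity of each link. This is only a rough sketch: it assumes 1 MByte = 10**6 bytes and ignores all protocol overhead, and the function names are made up for illustration.

```python
# Compare the reported transfer rates with the raw link capacity.
# Assumptions: 1 MByte = 10**6 bytes, protocol overhead is ignored.

def line_rate_mbytes(mbit_per_sec):
    """Raw link capacity in MBytes/sec (8 bits per byte)."""
    return mbit_per_sec / 8.0

def utilization(measured_mbytes, mbit_per_sec):
    """Fraction of the raw link capacity used by a measured transfer."""
    return measured_mbytes / line_rate_mbytes(mbit_per_sec)

if __name__ == "__main__":
    # 10.5 and 93 MBytes/sec are the figures reported above.
    for label, measured, link in [("100MBit", 10.5, 100), ("GBit", 93.0, 1000)]:
        print(f"{label}: {measured} of {line_rate_mbytes(link)} MBytes/sec "
              f"= {utilization(measured, link):.0%}")
```

So both link types get reasonably close to wire speed when the peers run at the same speed, which is what makes the 1000->100 case below stand out.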

FTP from a 5.3 box (PE2650, GBit) to a 4.10 box (PE350, 100MBit) fails:
throughput is only around 200kBytes to 750kBytes/sec!

Same two hosts, FTP in the other direction (100->1000) works.

I also tested with another PE2650 running 4.10-RELEASE: FTP 1000->100 works
fine there, >10MBytes/sec and stable!

The difference must be the OS; the hardware is more or less the same.

the 4.10-BOX:
bge1: <Broadcom BCM5701 Gigabit Ethernet, ASIC rev. 0x105> mem 0xfcd00000-0xfcd0ffff irq 17 at device 8.0 on pci3
bge1: Ethernet address: 00:06:5b:f7:f9:00
miibus1: <MII bus> on bge1

one of the 5.3-Boxes:
bge0: <Broadcom BCM5703 Gigabit Ethernet, ASIC rev. 0x1002> mem 0xfcf10000-0xfcf1ffff irq 28 at device 6.0 on pci3
miibus0: <MII bus> on bge0
bge0: Ethernet address: 00:0d:56:bb:9c:25

My guess: the 5.3 boxes send bigger TCP windows than our switch has
buffer for on each port, resulting in massive packet loss or something like
that. The sender is "too fast" for the switch, or the switch isn't able
to convert from 1000MBit to 100MBit under heavy load.

I fiddled around with net.inet.tcp.inflight.max. A freshly rebooted system
has a value of "net.inet.tcp.inflight.max: 1073725440". I trimmed that
down in steps, testing and looking for effects.

A value below ~75000 for ~.max limits the throughput 1000->1000MBit.
The 1000->100MBit transfer works for values <=11583 (around 7MBytes/sec);
at >=11584 the throughput collapses to about 200kBytes/sec.
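
The position of that edge may not be arbitrary. The sketch below checks how many full TCP segments fit into each value. It assumes a 1500-byte Ethernet MTU, IPv4 and the TCP timestamp option, which leaves 1448 payload bytes per segment; none of that was measured here, and the names are made up for illustration.

```python
# How many full TCP segments fit into a given inflight.max value?
# Assumptions (not measured on these hosts): 1500-byte MTU, IPv4 without
# options, TCP timestamp option enabled.
MTU = 1500
IP_HEADER = 20
TCP_HEADER = 20
TCP_TIMESTAMP_OPTION = 12  # 10 bytes of option plus 2 bytes of padding

def payload_per_segment():
    """Payload bytes carried by one full-sized segment."""
    return MTU - IP_HEADER - TCP_HEADER - TCP_TIMESTAMP_OPTION

def full_segments(inflight_max):
    """Return (whole segments, leftover bytes) for inflight_max bytes."""
    return divmod(inflight_max, payload_per_segment())

if __name__ == "__main__":
    for value in (11583, 11584):
        segs, rest = full_segments(value)
        print(f"inflight.max={value}: {segs} full segments + {rest} bytes")
```

Under those assumptions 11584 is exactly 8 segments of 1448 bytes, while 11583 is one byte short of the eighth. That would fit the idea that the number of back-to-back segments matters, but it is only an observation, not a diagnosis.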

The maximum 1000->100MBit throughput is reached with ~.max around 7800-8200.
With this value the GBit-to-GBit transfer is around 18.5MBytes/sec and

Using the "edge" of ~.max=11583 the GBit-to-GBit transfer is at 31MBytes/sec.
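
If a transfer is limited purely by the amount of data in flight, throughput is roughly window / round-trip time, so the figures above imply a round-trip time. The sketch below does that arithmetic. It assumes the transfers really were window-limited and that 1 MByte = 10**6 bytes; the function name is made up for illustration.

```python
# Back-of-the-envelope: RTT implied by a window size and a measured rate,
# using throughput ~= window / RTT.
# Assumptions: the transfer is limited only by the window, 1 MByte = 10**6 bytes.

def implied_rtt_ms(window_bytes, throughput_mbytes_per_sec):
    """Round-trip time in milliseconds implied by window and throughput."""
    bytes_per_sec = throughput_mbytes_per_sec * 1e6
    return window_bytes / bytes_per_sec * 1e3

if __name__ == "__main__":
    # ~.max=11583 gave about 7 MBytes/sec (1000->100) and 31 MBytes/sec (GBit-to-GBit)
    for label, rate in [("1000->100", 7.0), ("1000->1000", 31.0)]:
        print(f"{label}: ~{implied_rtt_ms(11583, rate):.2f} ms")
```

Read that way, the 1000->100 path would have a several times longer effective round trip than the GBit-to-GBit path, which might be queueing in the switch, but this is a rough estimate only.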

I have no idea what is wrong or broken. Maybe it is the switch (too small a
buffer), the "inflight bandwidth delay" algorithm, or something else. I guess
there's no physical problem with cables, connectors, or ports on the switch
(1000MBit works great as long as only 1000MBit hosts are involved).

I'm willing to test patches or other cases as long as I don't need to
change hardware.

Need more detailed info on a subject?
Any idea? Tuning? Patches? Pointers?

Raphael Becker

PS: [1]

More information about the freebsd-current mailing list