excessive TCP duplicate acks revisited

Gregory Wright gwright at antiope.com
Mon Nov 19 10:54:55 PST 2007

> Gregory Wright wrote:
>> On Nov 11, 2007, at 5:23 PM, Andre Oppermann wrote:
>>> Gregory Wright wrote:
>>>> On Nov 10, 2007, at 10:28 AM, Andre Oppermann wrote:
>>>> Hi Andre,
>>>> I also took a look at the bge(4) driver in 7.0-BETA2.  As far as I can
>>>> tell, it does not support TSO (there is no ioctl supporting TSO
>>>> enable/disable as there is for the em(4) driver).
>>>> Might the chip --- a BCM5704_B0 --- not be completely initialized?
>>>> This might explain why the machine with the BCM5714_B3 chips works,
>>>> while the other machine shows the duplicate ACK bug.
>>> Perhaps.  Do you see the duplicate ACKs in a tcpdump on both the sender
>>> and the receiver?  If you see it on the sender too, then it must be a
>>> bug in our network stack or the driver (by requeuing the same packet
>>> over and over again).
>>> --Andre
>> The logs show that the duplicate ACKs are generated only by the
>> receiver.  I suspect a bug in the driver, perhaps the ACK packet
>> is not being removed from the TX buffer ring.  Examining the transmitted
>> packets should be enough to rule out a network stack problem.  Is
>> there any debugging infrastructure I can use or do I just have to
>> hack it in on my own?
> We don't have an infrastructure to deal with this kind of driver
> problem.  You have to instrument the driver code to report stuck
> mbufs.

Hi Andre,

I have some additional information that indicates this is a driver bug.
There was a report on one of the Gentoo Linux mailing lists of the same
problem with BCM5704s, in which everything worked at 1 Gb/s, but
duplicate ACKs were seen at 100 Mb/s.  Link to the message:


The report said that the problem was solved by upgrading the Linux
kernel from 2.6.17 to 2.6.18.  I've compared the tg3 drivers in the two
releases; there were quite a few changes, so it will take a while to track
down what the key fix was.

So the bug in the bge driver for these chips can likely be fixed.
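
For anyone comparing captures from the sender and the receiver, one rough
way to quantify the duplicate ACKs is to count pure ACKs that repeat the
previous ACK number in the same direction.  Below is a minimal Python
sketch, assuming the tcpdump output has already been parsed into records.
The Segment type and its field names are hypothetical (they are not
tcpdump columns), and the check ignores window updates and SACK, so it
may over-count somewhat.

```python
from collections import namedtuple

# Hypothetical record for one captured TCP segment; the field names are
# assumptions made for this sketch, not tcpdump output columns.
Segment = namedtuple("Segment", "src dst ack payload_len")


def count_duplicate_acks(segments):
    """Return {(src, dst): duplicate ACK count} for a parsed capture.

    A segment is counted as a duplicate ACK when it carries no payload
    and repeats the ACK number of the previous segment seen in the same
    direction. Window updates and SACK blocks are not considered.
    """
    last_ack = {}  # most recent ACK number seen per direction
    dups = {}      # duplicate ACK count per direction
    for seg in segments:
        flow = (seg.src, seg.dst)
        if seg.payload_len == 0 and last_ack.get(flow) == seg.ack:
            dups[flow] = dups.get(flow, 0) + 1
        last_ack[flow] = seg.ack
    return dups


# Made-up capture: the receiver ("rx") sends the same ACK three times.
capture = [
    Segment("rx", "tx", 1000, 0),
    Segment("rx", "tx", 1000, 0),
    Segment("rx", "tx", 1000, 0),
    Segment("tx", "rx", 1, 1448),
    Segment("rx", "tx", 2448, 0),
]
print(count_duplicate_acks(capture))
```

Running the same count on the capture taken at each end shows whether the
repeats are already present where the receiver transmits them.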

Thanks for your help.

Best Wishes,

