slow writes on nfs with bge devices

Mon Jan 22 10:50:54 UTC 2007

On Mon, 22 Jan 2007, Bruce Evans wrote:

> % ...
>
> Then it got to here with only relatively minor glitches:
>
> % 16:41:12.947085 IP phiplex.bde.org.1735978781 > besplex.bde.org.nfs: 1472 
> write fh 1028,575312/94220 8192 bytes @ 0x2d6000
> % 16:41:12.947086 IP phiplex.bde.org > besplex.bde.org: udp
> % 16:41:12.947087 IP phiplex.bde.org > besplex.bde.org: udp
> % 16:41:12.947088 IP phiplex.bde.org > besplex.bde.org: udp
> % 16:41:12.947089 IP phiplex.bde.org > besplex.bde.org: udp
> % 16:41:12.947090 IP phiplex.bde.org > besplex.bde.org: udp
> % 16:41:12.947861 IP phiplex.bde.org.1735978421 > besplex.bde.org.nfs: 1472 
> write fh 1028,575312/94220 8192 bytes @ 0x6000
>
> It's having problems writing 0x6000.  This is a retry.  Just for unreported
> packet loss?  Perhaps my switch or cable is the problem.

Nevermind / 2.  The switch or cable certainly has problems.  The only problem
in FreeBSD or the bge driver, if any, may be that errors are not detected
except as completely lost packets.

Swapping cables gives the following strange behaviours:
- fxp <-> bge (5705) works perfectly (at only 100 Mbps of course) with
   a direct link (11.83e6 B/S), but not through the switch (cheap 1Gbps
   switch).
- fxp <-> bge (5701) works better with a direct link, but imperfectly
   with the same cables that work for the 5705.  nfs write speed increases
   from ~1 MB/S to ~8 MB/S.  The link is autonegotiated correctly as
   100baseTX full-duplex.  With this configuration, errors seem to be
   detected correctly.  nfs writes cause input errors (fxp -> bge) at
   an almost constant rate of almost exactly 1 per 500 packets.  (500
   may be significant since the bge rx ring size is 512.)
- on reconnecting the 5701 through the switch, no errors were detected
   when the link is autonegotiated to 1000baseTX full-duplex, but when it
   is manually configured to 100baseTX full-duplex, there are some input
   errors, at a highly variable rate never nearly as high as 1 in 500.  A
   few packet blasting tests on the 5701 didn't cause any input errors
   at 100baseTX.  I haven't figured out the type of the input errors.
   No output errors were reported for any of this.
- when the 5701 was in the server, exhaustive packet blasting tests at
   1000baseTX caused a large number of input errors whenever the rx
   interrupt moderation was configured to be just a little more aggressive
   than the default, in conditions where the tx is very active.  (The
   rx ring is supposed to have size 512, but had an effective size of
   only 20 under tx load, apparently due to misconfiguration of contention
   with tx resources.)  Packets were just dropped on input.
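
For anyone wanting to redo the arithmetic behind the numbers above, here
is a rough sketch.  It is not taken from any driver or tool: the function
names and the sample counter values are made up for illustration, and the
framing overhead assumes standard Ethernet with a 1500-byte MTU and IPv4
without options.  It puts an upper bound on the IP payload rate at
100 Mbps, for comparison with the measured 11.83e6 B/S, and turns two
samples of input packet and input error counters (such as the Ipkts and
Ierrs columns of netstat -i) into a "1 error per N packets" figure.

```python
# Hypothetical helpers; names and sample counters are illustrative only.

# Per-frame bytes on the wire beyond the MTU-sized payload:
# preamble/SFD (8) + Ethernet header (14) + FCS (4) + inter-frame gap (12).
ETH_OVERHEAD = 8 + 14 + 4 + 12
MTU = 1500        # assumed standard Ethernet MTU, bytes
IP_HEADER = 20    # assumed IPv4 header without options, bytes


def max_ip_payload_rate(link_bits_per_sec):
    """Upper bound on IP payload bytes/s when every frame is full-size."""
    wire_bytes_per_frame = MTU + ETH_OVERHEAD
    payload_per_frame = MTU - IP_HEADER
    return link_bits_per_sec / 8 * payload_per_frame / wire_bytes_per_frame


def packets_per_error(ipkts_before, ierrs_before, ipkts_after, ierrs_after):
    """Average number of input packets per input error between two samples."""
    errs = ierrs_after - ierrs_before
    if errs <= 0:
        return None  # no new errors in the interval
    return (ipkts_after - ipkts_before) / errs


if __name__ == "__main__":
    bound = max_ip_payload_rate(100e6)
    print(f"100 Mbps IP payload bound: {bound:.4e} B/s")
    print(f"11.83e6 B/s is {11.83e6 / bound:.1%} of that bound")
    # Made-up counter samples, chosen to mimic the 1-in-500 case.
    print(packets_per_error(1_000_000, 10, 1_250_000, 510))
```

The bound ignores UDP, RPC and NFS headers, so the achievable nfs data
rate is slightly lower still.  Whether errored packets are also counted
as input packets depends on the driver, so the per-error figure is only
approximate.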

Bruce