serious networking (em) performance (ggate and NFS)
gurney_j at resnet.uoregon.edu
Mon Nov 22 13:31:11 PST 2004
Sean McNeil wrote this message on Mon, Nov 22, 2004 at 12:14 -0800:
> On Mon, 2004-11-22 at 11:34 +0000, Robert Watson wrote:
> > On Sun, 21 Nov 2004, Sean McNeil wrote:
> > > I have to disagree. Packet loss is likely according to some of my
> > > tests. With the re driver, changing nothing except moving from a 100BT
> > > setup with no packet loss to a gigE setup (both Linksys switches)
> > > causes serious packet loss at 20 Mbps data rates. I have discovered
> > > the only way to get good performance with no packet loss was to
> > >
> > > 1) Remove interrupt moderation
> > > 2) Defrag each mbuf chain that comes into the driver.
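For anyone following along, fix (2) amounts to collapsing a long mbuf
chain into fewer, larger mbufs before it gets mapped onto the TX
descriptor ring.  A rough sketch of the idea (not the actual if_re code;
the function name and segment budget below are made up):

#include <sys/param.h>
#include <sys/mbuf.h>

#define EX_TX_MAXSEGS	32	/* hypothetical per-packet segment budget */

/* Coalesce the chain if it would consume too many TX descriptors. */
static struct mbuf *
ex_tx_coalesce(struct mbuf *m)
{
	struct mbuf *n;
	int nsegs = 0;

	for (n = m; n != NULL; n = n->m_next)
		nsegs++;

	if (nsegs > EX_TX_MAXSEGS) {
		/* M_DONTWAIT: we're in the transmit path and can't sleep. */
		n = m_defrag(m, M_DONTWAIT);
		if (n == NULL) {
			m_freem(m);	/* couldn't coalesce; drop the packet */
			return (NULL);
		}
		m = n;
	}
	return (m);
}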
> > Sounds like you're bumping into a queue limit that is made worse by
> > interrupting less frequently, resulting in bursts of packets that are
> > relatively large, rather than a trickle of packets at a higher rate.
> > Perhaps a limit on the number of outstanding descriptors in the driver or
> > hardware and/or a limit in the netisr/ifqueue queue depth. You might try
> > changing the default IFQ_MAXLEN from 50 to 128 to increase the size of the
> > ifnet and netisr queues. You could also try setting net.isr.enable=1 to
> > enable direct dispatch, which in the in-bound direction would reduce the
> > number of context switches and queueing. It sounds like the device driver
> > has a limit of 256 receive and transmit descriptors, which one supposes is
> > probably derived from the hardware limit, but I have no documentation on
> > hand so can't confirm that.
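As a point of reference, the send-queue depth Robert mentions is just a
field a driver sets at attach time, and the stock default comes from
IFQ_MAXLEN.  A rough sketch (driver prefix and depth made up, not actual
if_re code):

#include <sys/param.h>
#include <sys/socket.h>
#include <net/if.h>
#include <net/if_var.h>

#define EX_TX_QLEN	128	/* assumed depth, up from the stock IFQ_MAXLEN of 50 */

/* Typically done once in the driver's attach routine. */
static void
ex_set_txq_depth(struct ifnet *ifp)
{
	ifp->if_snd.ifq_maxlen = EX_TX_QLEN;
}

net.isr.enable is an ordinary sysctl, so direct dispatch can be toggled
at runtime with sysctl net.isr.enable=1.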
> I've tried bumping IFQ_MAXLEN and it made no difference. I could rerun
And the default for if_re is RL_IFQ_MAXLEN, which is already 512... As
is mentioned below, the card can do 64 segments (which usually means 32
packets, since each packet usually has a header + payload in separate
mbufs)...
> this test to be 100% certain I suppose. It was done a while back. I
> haven't tried net.isr.enable=1, but packet loss is in the transmission
> direction. The device driver has been modified to have 1024 transmit
> and receive descriptors each as that is the hardware limitation. That
> didn't matter either. With 1024 descriptors I still lost packets
> without the m_defrag.
Hmmm... you know, I wonder if this is a problem with if_re not pulling
enough data from memory before starting the transmit... though we
currently have that set to unlimited, so it doesn't seem like that would
be it...
> The most difficult thing for me to understand is: if this is some sort
> of resource limitation, why does it work perfectly with a slower phy
> layer and not with gigE? The only thing I could think of was
> that the old driver was doing m_defrag calls when it filled the transmit
> descriptor queues up to a certain point. Understanding the effects of
> m_defrag would be helpful in figuring this out I suppose.
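The old behaviour Sean describes (only defragging once the ring starts
to fill) would look roughly like this in outline; the softc field and
low-water mark here are hypothetical:

#include <sys/param.h>
#include <sys/mbuf.h>

#define EX_TX_DESC_CNT	1024	/* hardware limit mentioned above */
#define EX_TX_LOWAT	64	/* assumed low-water mark */

struct ex_softc {
	int	tx_free;	/* free TX descriptors remaining */
};

/* Only coalesce when the ring is nearly full, not on every packet. */
static struct mbuf *
ex_maybe_defrag(struct ex_softc *sc, struct mbuf *m)
{
	struct mbuf *n;

	if (sc->tx_free < EX_TX_LOWAT) {
		n = m_defrag(m, M_DONTWAIT);
		if (n != NULL)
			m = n;	/* use the coalesced chain if we got one */
	}
	return (m);
}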
Maybe the chip just can't keep the transmit FIFO loaded at the higher
speeds... is it possible vls is doing a writev for a multi-segment UDP
packet? I'll have to look at this again...
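For reference, this is what that would look like from the application
side: writev() on a connected UDP socket sends the iovec pieces as one
datagram (how the pieces end up split across mbufs is then up to
sosend).  The address, port, and sizes below are made up:

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <err.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	struct sockaddr_in sin;
	struct iovec iov[2];
	char hdr[16], payload[1316];
	int s;

	memset(hdr, 'H', sizeof(hdr));
	memset(payload, 'P', sizeof(payload));

	if ((s = socket(AF_INET, SOCK_DGRAM, 0)) == -1)
		err(1, "socket");

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_len = sizeof(sin);
	sin.sin_port = htons(1234);			/* made-up destination */
	sin.sin_addr.s_addr = inet_addr("192.0.2.1");

	if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) == -1)
		err(1, "connect");

	iov[0].iov_base = hdr;
	iov[0].iov_len = sizeof(hdr);
	iov[1].iov_base = payload;
	iov[1].iov_len = sizeof(payload);

	/* Both pieces leave as a single UDP datagram. */
	if (writev(s, iov, 2) == -1)
		err(1, "writev");

	close(s);
	return (0);
}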
> > It would be interesting on the send and receive sides to inspect the
> > counters for drops at various points in the network stack; i.e., are we
> > dropping packets at the ifq handoff because we're overfilling the
> > descriptors in the driver, are packets dropped on the inbound path going
> > into the netisr due to over-filling before the netisr is scheduled, etc.
> > And, it's probably interesting to look at stats on filling the socket
> > buffers for the same reason: if bursts of packets come up the stack, the
> > socket buffers could well be being over-filled before the user thread can
> > run.
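For what it's worth, a couple of those counters are reachable from
userland: the "dropped due to full socket buffers" line of
netstat -s -p udp is udps_fullsock, and the netisr input-queue drops for
IP show up as net.inet.ip.intr_queue_drops.  A small hedged example of
reading both (assuming that era's struct udpstat layout):

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/sysctl.h>
#include <netinet/in.h>
#include <netinet/in_systm.h>
#include <netinet/ip.h>
#include <netinet/ip_var.h>
#include <netinet/udp.h>
#include <netinet/udp_var.h>
#include <err.h>
#include <stdio.h>

int
main(void)
{
	struct udpstat ust;
	int qdrops;
	size_t len;

	len = sizeof(ust);
	if (sysctlbyname("net.inet.udp.stats", &ust, &len, NULL, 0) == -1)
		err(1, "net.inet.udp.stats");

	len = sizeof(qdrops);
	if (sysctlbyname("net.inet.ip.intr_queue_drops", &qdrops, &len,
	    NULL, 0) == -1)
		err(1, "net.inet.ip.intr_queue_drops");

	printf("udp dropped, socket buffer full: %lu\n",
	    (unsigned long)ust.udps_fullsock);
	printf("ip netisr queue drops: %d\n", qdrops);
	return (0);
}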
> Yes, this would be very interesting and should point out the problem. I
> would do such a thing if I had enough knowledge of the network pathways.
> Alas, I am very green in this area. The receive side has no issues,
> though, so I would focus on transmit counters (with assistance).
John-Mark Gurney Voice: +1 415 225 5579
"All that I will do, has been done, All that I have, has not."