cvs commit: src/sys/dev/bge if_bge.c

Bruce Evans bde at zeta.org.au
Sun Dec 24 01:43:32 PST 2006


On Sun, 24 Dec 2006, Scott Long wrote:

> Bruce Evans wrote:
>> On Sat, 23 Dec 2006, Robert Watson wrote:
>> 
>>> On Sat, 23 Dec 2006, John Polstra wrote:
>>> 
>>>>> That said, dropping and regrabbing the driver lock in the rxeof routine 
>>>>> of any driver is bad.  It may be safe to do, but it incurs horrible 
>>>>> performance penalties.  It essentially allows the time-critical, high 
>>>>> priority RX path to be constantly preempted by the lower priority 
>>>>> if_start or if_ioctl paths.  Even without this preemption and priority 
>>>>> inversion, you're doing an excessive number of expensive lock ops in the 
>>>>> fast path.
>> 
>> It's not very time-critical or high priority for bge or any other device
>> that has a reasonably large rx ring.  With a ring size of 512 and an rx
>> interrupt occurring not too near the end (say at half way), you have 256
>> packet times to finish processing the interrupt.  For normal 1518 byte
>> packets at 1Gbps, 256 packet times is about 3 ms.  bge's rx ring size
>> is actually larger than 512 for most hardware.
>
> Speed testing full size packets doesn't tax the hardware or the OS, nor
> does it represent real world scenarios.  Testing minimum sized packets
> isn't terribly real-world either, but it represents the worst-case for
> the hardware and the OS and thus is a good standard for extrapolating
> performance potential.  And, 1Gb isn't terribly interesting these days
> either, 10Gb is.

I left calculating the timing for minimum size packets as an exercise.
It's also not very interesting, because there are no device/bus/CPU
combinations that can keep up with it, even at 1Gbps, due to overhead
issues that push the latency issues further off.  The maximum packet
rate in theory for 1Gbps is 2700 kpps (46 byte packets).  I believe
the maximum packet rate in practice at any link speed with a single 3-4GHz
CPU is about 1000 kpps.  With an rx ring size of 1024 and an interrupt
when the ring has 24 descriptors in it (the default is 10, which is
far too small, but that is another bug), this leaves 1 ms of latency
to spare.  3 ms was an eternity, and 1 ms is still an eternity.
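
The arithmetic behind both figures can be checked with a rough sketch
like the one below.  The function names are made up for illustration,
and the 20 bytes of per-frame wire overhead (8 bytes of preamble plus
12 bytes of inter-frame gap) is an assumption that is not stated in the
thread; it changes the full-size result by only about 1%.

```python
def wire_rate_pps(frame_bytes, link_bps=1e9, overhead_bytes=20):
    """Packets per second on the wire for a given frame size.

    overhead_bytes is an assumption: 8 bytes of preamble plus
    12 bytes of inter-frame gap per Ethernet frame.
    """
    return link_bps / ((frame_bytes + overhead_bytes) * 8)


def latency_budget_ms(ring_size, descs_at_interrupt, packets_per_sec):
    """Milliseconds from the rx interrupt until the rx ring would fill."""
    free_descriptors = ring_size - descs_at_interrupt
    return free_descriptors / packets_per_sec * 1e3


# Full-size case: 512-entry ring, interrupt taken half way through it,
# 1518 byte packets arriving at wire rate on a 1Gbps link.
full_size_ms = latency_budget_ms(512, 256, wire_rate_pps(1518))

# Small-packet case: 1024-entry ring, interrupt after 24 descriptors,
# limited by the estimated practical ceiling of 1000 kpps.
small_packet_ms = latency_budget_ms(1024, 24, 1_000_000)

print(f"full-size packets: {full_size_ms:.2f} ms")
print(f"small packets:     {small_packet_ms:.2f} ms")
```

The second case uses the 1000 kpps practical ceiling instead of the
theoretical wire rate, because that is the rate at which the ring
would actually fill.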

Bruce

