svn commit: r357051 - head/sys/dev/bge

Jeff Roberson jroberson at jroberson.net
Fri Jan 24 03:09:20 UTC 2020


On Thu, 23 Jan 2020, Gleb Smirnoff wrote:

> On Thu, Jan 23, 2020 at 08:33:58PM -0500, Ryan Stone wrote:
> R> > Because at the interrupt level we can batch multiple packets in a single epoch.
> R> > This speeds up unfiltered packet forwarding performance by 5%.
> R> >
> R> > With driver-level pfil hooks I would claim even more improvement, because before
> R> > the change we needed to enter the epoch twice - once for filtering, then once
> R> > for ether_input.
> R> >
> R> > Epoch isn't a layer, it is a synchronisation primitive, so I disagree with the
> R> > statement about a layering violation.
> R>
> R> Epoch is a synchronization primitive, but the net_epoch is absolutely
> R> a part of the networking layer.  If we need better batching, then the
> R> correct solution is to introduce a batched interface for drivers to
> R> push packets up the stack, not to mess around at the interrupt layer.
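To make Ryan's proposal concrete: a batched input method might look
roughly like the sketch below.  This is illustrative only; if_input_batch
is a hypothetical name, not an existing ifnet method.

/*
 * Hypothetical batched input entry point: one epoch section covers
 * every packet the driver harvested in this rx pass.
 * Requires sys/mbuf.h (struct mbufq) and net/if_var.h (NET_EPOCH_*).
 */
void
if_input_batch(struct ifnet *ifp, struct mbufq *mq)
{
	struct epoch_tracker et;
	struct mbuf *m;

	NET_EPOCH_ENTER(et);
	while ((m = mbufq_dequeue(mq)) != NULL)
		ifp->if_input(ifp, m);
	NET_EPOCH_EXIT(et);
}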
>
> Such an interface would of course be valuable, but it will not cover the
> case when an interrupt arrives during processing of the previous one. So
> its batching possibilities are limited compared to interrupt-level
> batching.
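For reference, the shape of the change as I read it: the epoch is entered
once per interrupt rather than once per packet, so a multi-packet rx burst
(and any interrupt that arrives while the handler is still running) is
processed under a single epoch section.  The wrapper below is only a
sketch; net_intr_wrapper is an illustrative name, not the actual
intr_event code.

/* Requires sys/epoch.h and net/if_var.h for NET_EPOCH_*. */
static void
net_intr_wrapper(driver_intr_t *handler, void *arg)
{
	struct epoch_tracker et;

	NET_EPOCH_ENTER(et);
	handler(arg);		/* driver rx loop; many ether_input() calls */
	NET_EPOCH_EXIT(et);
}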
>
> And I already noted that ether_input() isn't the only way to enter
> the network stack.
>
> R> Note that the only reason this works for mlx4/mlx5 is that
> R> linuxkpi *always* requests an INTR_TYPE_NET interrupt no matter what
> R> driver is running.  This means that all drm graphics driver interrupts
> R> are now running under the net_epoch:
> R>
> R> https://svnweb.freebsd.org/base/head/sys/compat/linuxkpi/common/include/linux/interrupt.h?revision=352205&view=markup#l103

The historical reason is that linuxkpi was originally made to support OFED,
and there was no real way to get this information from the driver.
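For anyone following along, the linked linuxkpi code boils down to roughly
this (paraphrased, not the verbatim source):

	error = bus_setup_intr(dev, res, INTR_TYPE_NET | INTR_MPSAFE,
	    NULL, linux_irq_handler, irqe, &irqe->tag);

Every LinuxKPI consumer's interrupt - network card or GPU alike - gets
registered as INTR_TYPE_NET, because request_irq() gives the shim no way
to know what kind of device is behind it.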

>
> Well, it is not my fault that a video driver requests an INTR_TYPE_NET
> interrupt. You can't put this forward as a rationale against using the
> network epoch for all interrupts that declare themselves network
> interrupts. In any case, this is harmless.

While we don't have a policy strictly requiring reviews, it is the norm to 
have substantial changes socialized and reviewed.  I appreciate the work 
that you are doing, but it likely should have been discussed somewhere 
more public.  I apologize if I missed it, but I don't see a reference to 
any prior discussion.

Architecturally, I am more concerned with the coarseness of net_epoch and 
the duration of the hold becoming a resource utilization problem in 
high-turnover workloads, like short-connection TCP.  Has anyone done 
substantial testing here?  epoch as it is today will hold every free 
callback for a minimum of several clock ticks and a maximum of 2x the 
duration of the longest epoch section.  With preemption, etc., this could 
mean hundreds of milliseconds' worth of PCBs held.
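To spell out the pattern I am worried about, here is the usual
deferred-free idiom as a minimal sketch (foo_pcb and the function names
are hypothetical):

/* Requires sys/epoch.h; __containerof() is from sys/cdefs.h. */
struct foo_pcb {
	struct epoch_context	fp_epoch_ctx;
	/* ... protocol state ... */
};

static void
foo_pcb_free_deferred(epoch_context_t ctx)
{
	struct foo_pcb *fp;

	fp = __containerof(ctx, struct foo_pcb, fp_epoch_ctx);
	free(fp, M_PCB);
}

void
foo_pcb_free(struct foo_pcb *fp)
{
	/* Not freed here: parked until a grace period elapses. */
	NET_EPOCH_CALL(foo_pcb_free_deferred, &fp->fp_epoch_ctx);
}

Every PCB released this way sits on the epoch callback list until the
grace period expires, so the backlog scales with connection turnover
multiplied by the epoch section length.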

Thanks,
Jeff

>
> -- 
> Gleb Smirnoff
>

