svn commit: r304436 - in head: . sys/netinet

Slawa Olhovchenkov slw at zxy.spb.ru
Fri Aug 26 15:13:27 UTC 2016


On Fri, Aug 26, 2016 at 04:01:14PM +0100, Bruce Simpson wrote:

> Slawa,
> 
> I'm afraid this may be a bit of a non-sequitur. Sorry.. I seem to be 
> missing something. As I understand it this thread is about Ryan's change 
> to netinet for broadcast.
> 
> On 26/08/16 15:49, Slawa Olhovchenkov wrote:
> > On Sun, Aug 21, 2016 at 03:04:00AM +0300, Slawa Olhovchenkov wrote:
> >> On Sun, Aug 21, 2016 at 12:25:46AM +0100, Bruce Simpson wrote:
> >>> Whilst I agree with your concerns about multipoint, I support the
> >>> motivation behind Ryan's original change: optimize the common case.
> >>
> >> Oh, common case...
> >> I have pmc profiling data for TCP output, and looking at the SVG picture
> >> I don't see any simple path.
> >> Do you want to look too?
> >
> > At peak network traffic (more than 25K connections, about 20Gbit
> > total traffic), half of the cores are fully utilised by the network stack.
> >
> > This is a flamegraph from one core: http://zxy.spb.ru/cpu10.svg
> > This is the same, but with the stack cut off at ixgbe_rxeof for a more
> > unified TCP/IP stack view: http://zxy.spb.ru/cpu10u.svg
> ...
> 
> I appreciate that you've taken the time to post a flamegraph (a 
> fashionable visualization) of relative performance in the FreeBSD 
> networking stack.
> 
> Sadly, I am mostly out of my depth when it comes to stack-wide performance
> for the moment; for the things I look at involving FreeBSD at work, I
> would not generally go down there except for specific performance issues
> (e.g. with IEEE 1588).
> 
> It sounds as though perhaps you should raise a wider discussion about
> your results on -net. I would caution you, however, that the Function
> Boundary Tracing (FBT) provider for DTrace can introduce a fair amount of
> noise into the raw performance data because of the trap mechanism it uses.
> This ruled it out for one of my own studies requiring packet-level accuracy.
> 
> Whilst raw pmc(4) profiles may require more post-processing, they will
> provide less equivocal data (and a better fix) on the hot path, in part
> because they are effectively sampled on a PMC interrupt (a gather stage:
> poll core+uncore MSRs), not purely on a software timer interrupt.

Thanks for the answer. I will now try to start a discussion on -net.

