svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys

Tue Jan 20 15:01:36 UTC 2015

On Tue, Jan 20, 2015 at 10:37:52AM +0100, Hans Petter Selasky wrote:
> On 01/20/15 10:00, Konstantin Belousov wrote:
> > On Tue, Jan 20, 2015 at 08:58:34AM +0100, Hans Petter Selasky wrote:
> >> On 01/20/15 08:51, Konstantin Belousov wrote:
> >>> On Tue, Jan 20, 2015 at 05:30:25AM +0100, Hans Petter Selasky wrote:
> >>>> On 01/19/15 22:59, Adrian Chadd wrote:
> >>>>> Hi,
> >>>>>
> >>>>> Would you please check what the results of this are with CPU specific
> >>>>> callwheels?
> >>>>>
> >>>>> I'm doing some 10+ gig traffic testing on -HEAD with RSS enabled (on
> >>>>> ixgbe) and with this setup, the per-CPU TCP callwheel stuff is
> >>>>> enabled. But all the callwheels are now back on clock(0) and so is the
> >>>>> lock contention. :(
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>
> >>>> Hi,
> >>>>
> >>>> Like stated in the manual page, callout_reset_curcpu/on() does not work
> >>>> with MPSAFE callouts any more!
> >>> I.e. you 'fixed' some undeterminate bugs in callout migration by not
> >>> doing migration at all anymore.
> >>>
> >>>>
> >>>> You need to use callout_init_{mtx,rm,rw} and remove the custom locking
> >>>> inside the callback in the TCP stack to get it working like before!
> >>>
> >>> No, you need to do this, if you think that whole callout KPI must be
> >>> rototiled.  It is up to the person who modifies the KPI, to ensure that
> >>> existing code is not broken.
> 
> Hi,
> 
> It is not very hard to update existing callout clients and you can do it 
> too, if you need the extra bits of performance.
I want to avoid regressions, and avoid breaking other' people work.

> 
> Are there more API's than the TCP stack which you think needs an update 
> and are performance critical?
I did not performed any analysis.  More, I naturally expect that such
analysis and demonstration that there is no regression, is the duty
of the person who proposes the change.

> 
> >>>
> >>> As I understand, currently we are back to the one-cpu callouts.
> >>> Do other people consider this situation acceptable ?
> 
> For the TCP stack - yes, but not for other clients like cv_timedwait() 
> and such.
> 
> If you think you have a better way to solve the callout problems, please 
> tell me! In order for a callout to change its CPU you need a lock to 
> protect which CPU the callout is on. Instead of introducing a third lock 
> in the callout path, which will be a congestion point, to protect 
> against changing the CPU number, I decided that we will use the client's 
> mutex and the MPSAFE implies the client doesn't have any mutex. So it 
> won't work with callout clients which use the CALLOUT_MPSAFE flag. 
> Honestly CALLOUT_MPSAFE should not be used, because it leads to extra 
> complexity in the clients catching the race when tearing down the 
> callouts and any pending callbacks.
This is your opinion.

I did fixed some bugs in the callout migration code, and I am not
sure that requiring rototiling of almost all KPI consumers (and leaving
unconverted consumers to pre-cpu state) is the only possible solution.
But again, since it is you who brought the change into the tree, it is
your duty to present a valid proof why this is the only possible way
to solve bugs (which bugs ?).

> 
> >>
> >> Please read the callout 9 manual page first.
> >
> > Assume I read it.  How this changes any of my points above ?
> > """
> > A change in the CPU selection cannot happen if this function is
> > re-scheduled inside a callout function. Else the callback function given
> > by the func argument will be executed on the same CPU like previously
> > done.
> > """
> > You cannot do this without fixing consumers.
> >
> 
> The code simply needs an update. It is not broken in any ways - right? 
> If it is not broken, fixing it is not that urgent.
Isn't it obvious ?  If callouts no longer migrate to non-BSP, this is
the regression. I am sorry for you attitude.