11.0 stuck on high network load

Slawa Olhovchenkov slw at zxy.spb.ru
Fri Oct 14 10:02:25 UTC 2016


On Fri, Oct 14, 2016 at 11:48:38AM +0200, Julien Charbon wrote:

> >>> Also, using dtrace is too complex in production (it needs a complex
> >>> startup under screen and output capture), and too complex for many people.
> >>> kdb_backtrace() has much less administrative overhead.
> >>
> >>  I still think it is overkill.  The main goal of this change is to fix a
> >> quite tricky and old TCP stack locking issue.  Let's try to do that
> >> first, it is complex enough by itself.
> >>
> >>  Once the fix is validated and pushed, feel free to propose your own
> >> patch/review to add kdb_backtrace(), log(), etc.. to get other devs
> >> point of view.
> >>
> >>  I don't remember who said: "Never ever optimize error cases"...
> > 
> > This is not optimizing error cases; this is error recovery and
> > diagnostics for error cases in other subsystems.
> 
>  Sure, I guess this quote is more geared toward:  "Always spend 50x more
> time on improving the main path than the error path".
> 
> > Currently FreeBSD internals are too complex to just always trust the
> > correctness of other subsystems, or to panic on any inconsistency.
> > 
> > INVARIANTS is too expensive now (20Gbit/s drops to 8Gbit/s).
> 
>  I do agree.  I am not expert enough to see all the side effects of
> calling kdb_backtrace() from the TCP stack, might be way too slow,
> tricky in interrupt context, etc.  You can see that kdb_backtrace()

I have thought about this. The example is taken from netgraph, which is a
similar case (with respect to interrupt context, etc.). The occurrence is
too rare (once per day, maybe once every two hours) for any overhead to matter.
OK, I see your point: your experience doesn't allow you to add this code, and
it needs a separate review and commit. Right, np.

> is rarely called in the kernel source.  That's why it is better if you
> propose a review on adding this line to get comments from other devs on
> just this question.
> 
> > PS: I am applying the patch. Wait until Monday.
> > 
> > Thanks very much for this hard work!
> 
>  No problem, thanks for your time.  But it is not over yet:  We have to
> wait for final test.

Currently the system doesn't use Chelsio TOE; after Monday I will update the
system with Chelsio TOE. With Chelsio I see this occurrence very
rarely, once every few months.

