svn commit: r295136 - in head: sys/kern sys/netinet sys/sys usr.bin/netstat

Adrian Chadd adrian.chadd at gmail.com
Tue Feb 2 22:18:54 UTC 2016


On 2 February 2016 at 13:21, Alfred Perlstein <alfred at freebsd.org> wrote:
>
>
> On 2/2/16 1:09 PM, Slawa Olhovchenkov wrote:
>>
>> On Tue, Feb 02, 2016 at 12:35:47PM -0800, Alfred Perlstein wrote:
>>
>>>> I would second John's comment on the necessity of the change though:
>>>> if one already has 32K of *backlogged* connections, it's probably not
>>>> very useful to allow more coming in.  It sounds like the application
>>>> itself is seriously broken, and unless expanding the field has some
>>>> performance benefit, I don't think it should stay.
>>>
>>> Imagine a hugely busy image board like 2ch.net: if there is a single
>>> hiccup, it's very possible to start dropping connections.
>>
>> In reality you start dropping connections in any case: nobody will
>> wait forever in accept (the user closes the browser and goes away,
>> etc.).
>>
>> Also, if you have more than 4K backlogged connections, you have a
>> problem: you can't process all the connection requests, and in the
>> next second you will have 8K, the second after that 12K, and so on.
>>
> Thank you Slawa,
>
> I am pretty familiar with what you are describing, which is a "cascade
> failure".  However, to explain why such a change makes sense, let me give
> you a little history lesson about a project I developed under FreeBSD, and
> then explain why such a project would probably not work with FreeBSD as a
> platform today (we would have to use Linux or custom patches).
>
> Here is that use case:
>
> Back in 1999 I wrote a custom webserver using FreeBSD that was processing
> over 1500 connections per second.
>
> What we were doing was tracking web hits using "hidden gifs".  Now, this
> was 1999, with only 100 Mbit hardware and a 400 MHz Pentium.  Mind you, I
> was doing this with CPU to spare, so having an influx of additional hits
> was OK.
>
> Meaning I could easily deal with backlog.
>
> Now, what was important about this case was that EVERY time we served the
> data we were able to monetize it and pay my salary, which at the time went
> toward working on SMP for FreeBSD and a bunch of other patches.  Any lost
> hits / broken connections would easily cost us money, which in turn meant
> less time on FreeBSD and less time fixing things to scale.
>
> In our case the user would not really know if our "page" didn't load because
> we were just an invisible gif.
>
> So back to the example, let's scale that out to today's numbers.
>
> 100 Mbps -> 10 GigE, so that would be 1500 conn/sec -> 150,000 conn/sec.
> So basically, with about 0.20 seconds of any sort of latency I will be
> overflowing the listen queue and dropping connections.
>
> Now, when you still have CPU to spare, and because connections *are*
> precious, it makes sense to slightly over-provision the servers so that
> some backlog can be processed.
>
> So, in this day and age, it really does make sense to allow for buffering
> more than 32K connections, particularly if the developer knows what he is
> doing.
>
> Does this help explain the reasoning?

Just to add to this: the VM system under ridiculous load (like, say,
deciding it can dirty most of your half-terabyte of RAM and getting behind
in writing stuff to disk) can cause the system to pause for short stretches
of time. It sucks, but it happens. 0.20 seconds isn't all that long.
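
To spell out where that 0.20 second figure comes from (back-of-the-envelope,
assuming a queue cap around 32K and nothing being accept()ed out of the
queue during the stall):

    32768 queue slots / 150,000 conn/sec  ~=  0.22 seconds to fill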

And that's at 150,000 conn/sec.  There's TCP locking work in progress that
will hopefully increase that value.
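
For anyone who wants to experiment with this, here is a minimal sketch of
the application side on FreeBSD.  listen(2) silently clamps the backlog
argument to the kern.ipc.somaxconn sysctl, so that has to be raised first
(e.g. "sysctl kern.ipc.somaxconn=65536"); the port number and backlog value
below are only illustrative, and a backlog past 32K only takes effect on a
kernel whose queue-length fields are wide enough to hold it.

    /*
     * Sketch: ask the kernel for a deep accept queue.  The backlog passed
     * to listen() is silently clamped to kern.ipc.somaxconn, so that
     * sysctl must be raised before this does anything interesting.
     */
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <err.h>

    int
    main(void)
    {
            struct sockaddr_in sin;
            int s;

            s = socket(AF_INET, SOCK_STREAM, 0);
            if (s < 0)
                    err(1, "socket");

            memset(&sin, 0, sizeof(sin));
            sin.sin_family = AF_INET;
            sin.sin_port = htons(8080);              /* illustrative port */
            sin.sin_addr.s_addr = htonl(INADDR_ANY);
            if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0)
                    err(1, "bind");

            /* Ask for a backlog well past the old 32K limit. */
            if (listen(s, 65536) < 0)
                    err(1, "listen");

            /* accept() loop would go here. */
            return (0);
    }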


-a

