cvs commit: src/sys/netinet ip_icmp.c tcp.h tcp_input.c tcp_subr.c tcp_usrreq.c tcp_var.h

Fri Jan 9 10:10:12 PST 2004

On Fri, 9 Jan 2004, Andre Oppermann wrote:
> silby at silby.com wrote:
> >
> > > David Xu wrote:
> > >>
> > >> I got the following messages while running the MySQL stress test
> > >> suite, and the test cannot be completed.
> > >>
> > >> "too many small tcp packets from 128.0.0.1:20672, av. 91byte/packet,
> > >> dropping connection"
> > >
> > > You can set net.inet.tcp.minmssoverload to a higher value than the
> > > default of 1,000.  I suggest trying 2,000 as a next step to see if
> > > it still overloads.
> > >
> > > Apparently my default of 1,000 pps is fine for normal use but too low
> > > for some edge cases.
> > >
> > > Could you check the MySQL source code if it has a setsockopt() setting
> > > the TCP_NODELAY option?  That would help to explain a lot.
> >
> > This might nerf the protection a bit, but could reduce the packet counter
> > once for each socket write the local machine does?  That should protect
> > chatty applications, but still detect those that are just flooding data to
> > a bulk service like ftp or smtp.

This is exactly what I was worried about.  I know of several applications
that send/receive lots of small packets as a control connection,
especially over localhost.  Most are a sort of RPC mechanism where
TCP_NODELAY is set to make sure the request gets to the server
immediately and is not queued according to Nagle.

> It doesn't help in this case as we don't have any control over the sender
> and thus don't know whether he has set TCP_NODELAY.

Perhaps you didn't understand Mike?  You don't care if TCP_NODELAY is set
on their side, all you care about is the packet equilibrium.  If you send
data in response to receiving a segment, the net equilibrium is preserved.
The real behavior you want to detect is someone sending a lot of small
chunks of data that the application could process as larger chunks.  If
the application waits until it has a full "record" before responding, you
can distinguish the degenerate case by the application's response rate.

> I suspect that the database(s) are setting TCP_NODELAY and do a write()
> for every record they have retrieved from the query result.  Yet one more
> who has been fooled by the name "TCP_NODELAY".  The database would be
> better off and have more performance not using nodelay and let Nagle do
> its work.

In the case above, the small packets are coming from an ephemeral port so
they are likely the query packets, not the response.  But if you
subtracted one from the counter each time the database responded with
data, it's likely the request/response rate would be roughly constant.
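
The counting idea above can be sketched roughly as follows.  This is only
an illustration, not the code in tcp_input.c: the struct, the function
names and the "small" size cutoff are all hypothetical, and the sketch
leaves out the per-second window that the real pps check uses.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-connection state; these are NOT real tcpcb fields. */
struct smallpkt_state {
	int    balance;     /* small segments received minus local writes */
	int    limit;       /* balance above which the peer looks abusive */
	size_t small_bytes; /* segments shorter than this count as small */
};

static void
smallpkt_init(struct smallpkt_state *s, int limit, size_t small_bytes)
{
	s->balance = 0;
	s->limit = limit;
	s->small_bytes = small_bytes;
}

/*
 * Account for one received data segment of len bytes.
 * Returns true when the balance exceeds the limit, i.e. the peer has
 * sent many small segments that the local side never answered.
 */
static bool
smallpkt_on_receive(struct smallpkt_state *s, size_t len)
{
	if (len > 0 && len < s->small_bytes)
		s->balance++;
	return (s->balance > s->limit);
}

/*
 * Account for one write by the local application.  Each response
 * cancels one small segment, so request/response traffic stays near
 * zero no matter how fast it runs.
 */
static void
smallpkt_on_send(struct smallpkt_state *s)
{
	if (s->balance > 0)
		s->balance--;
}
```

With this scheme a chatty client that gets an answer to every query keeps
the balance at or near zero, while a sender that pushes small segments
without ever drawing a response climbs past the limit.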

The database did not set TCP_NODELAY, the client did.  Since the query is
a small request and you need a response before you can send the next
request (assuming it's not doing transaction logging), you do want
TCP_NODELAY on the client.
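
For reference, a minimal sketch of how a client would typically turn on
TCP_NODELAY for a connected socket.  The helper name set_nodelay() is
made up for illustration; it is not taken from the MySQL source.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/*
 * Disable Nagle on fd so each small request is sent immediately.
 * Returns 0 on success, -1 on failure with errno set by setsockopt().
 */
static int
set_nodelay(int fd)
{
	int on = 1;

	return (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)));
}
```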

-Nate