BAD state/State failure with large number of requests
Rolf Grossmann
rg at progtech.net
Thu Sep 28 16:01:02 PDT 2006
Hi,
thank you very much for your fast response.
Daniel Hartmeier wrote:
> The client is not honouring the 2MSL quiet period, the time it should
> wait before re-using the same source port to connect to the same
> destination address/port, as required by the TCP RFCs.
>
> The reason for that is quite likely that it has run out of random high
> source ports. The range used should be about 49152-65535 (sysctl
> net.inet.ip.portrange.*), and 10,000 connections is getting close. The
> client stack can either make ap fail in connect(2), or re-use source ports
> and violate the RFCs in this case.
You're absolutely correct, that seems to be my problem. Increasing the
range allows me to get a lot more requests through.
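
As a rough check on why widening the range helps, here is a minimal Python sketch of the arithmetic. It assumes the quiet period is 60 seconds (twice an assumed MSL of 30 seconds; check net.inet.tcp.msl on the actual system) and that every connection goes to the same destination address/port. The function names and the widened 1024-65535 range are only illustrative, not values taken from this thread.

```python
def ephemeral_port_count(first: int, last: int) -> int:
    """Number of source ports in the inclusive range first..last."""
    return last - first + 1


def max_sustained_rate(first: int, last: int, quiet_seconds: float) -> float:
    """Highest connection rate (per second) to a single destination
    address/port that never re-uses a source port within quiet_seconds."""
    return ephemeral_port_count(first, last) / quiet_seconds


if __name__ == "__main__":
    QUIET = 60.0  # assumed 2MSL, in seconds

    # Range quoted above, and a hypothetical widened range.
    for first, last in ((49152, 65535), (1024, 65535)):
        ports = ephemeral_port_count(first, last)
        rate = max_sustained_rate(first, last, QUIET)
        print(f"{first}-{last}: {ports} ports, about {rate:.0f} conn/s")
```

Under these assumptions a burst of 10,000 connections inside one quiet period already uses well over half of the default high range, which fits the behaviour described above.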
> Not sure if this is a realistic test, i.e. whether you see the very same
> problem in production (with 'BAD state' messages for SYN packets), it
> would only occur if one client is establishing connections to the same
> server port at high concurrency and/or rate. If not, I'd say the test is
> simply flawed, and you need multiple clients to simulate realistically.
I've been suspecting that the test is flawed, but I couldn't put my
finger on it. However, I also need a way to actually test my
application with a lot of requests and I wouldn't want to buy another
server farm for that ;)
> pf keeps state entries around for a while after a connection has been
> closed (to catch packets related to the old connection that might arrive
> late), the timeout is tcp.closed, 90s by default. You can make pf purge
> such state entries sooner by lowering this timeout.
That timeout seems awfully long to me. Is there some standard that
mandates such a long timeout? At least for testing I will definitely
lower that, too.
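
To get a feeling for what the tcp.closed timeout means for a test like mine, here is a small Python sketch. It is only a back-of-the-envelope model: it assumes a constant connection rate, that the client cycles evenly through its source ports, and that each closed connection leaves exactly one state entry behind for the full timeout. The function names and the example rates are made up for illustration.

```python
def lingering_closed_states(rate_per_s: float, closed_timeout_s: float) -> float:
    """Approximate number of closed-connection state entries held at
    steady state when each one lingers for closed_timeout_s."""
    return rate_per_s * closed_timeout_s


def seconds_until_port_reuse(port_count: int, rate_per_s: float) -> float:
    """Time before the client comes back to the same source port,
    assuming it cycles evenly through port_count ports."""
    return port_count / rate_per_s


def reuse_hits_old_state(port_count: int, rate_per_s: float,
                         closed_timeout_s: float) -> bool:
    """True if a source port is re-used while the old state entry
    for it could still be present."""
    return seconds_until_port_reuse(port_count, rate_per_s) < closed_timeout_s


if __name__ == "__main__":
    PORTS = 16384          # 49152-65535
    TIMEOUT = 90.0         # tcp.closed default quoted above, in seconds

    for rate in (100.0, 512.0):  # hypothetical connection rates
        print(rate,
              lingering_closed_states(rate, TIMEOUT),
              seconds_until_port_reuse(PORTS, rate),
              reuse_hits_old_state(PORTS, rate, TIMEOUT))
```

With these assumptions, lowering the timeout or widening the port range both push the re-use interval back above the time a stale entry can survive.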
Thanks again, Rolf.