BAD state/State failure with large number of requests

Thu Sep 28 16:17:32 PDT 2006

On 9/28/06, Daniel Hartmeier <daniel at benzedrine.cx> wrote:
>
> On Thu, Sep 28, 2006 at 11:30:48PM +0200, Rolf Grossmann wrote:
>
> > Sep 28 23:56:56 balancer kernel: pf: BAD state: TCP 10.1.1.2:8080 10.25.0.41:8080 10.25.0.100:52209 [lo=2341692840 high=2341759447 win=33304 modulator=0 wscale=1] [lo=2919421554 high=2919488162 win=33304 modulator=0 wscale=1] 9:9 S seq=2345137961 ack=2919421554 len=0 ackskew=0 pkts=6:5 dir=in,fwd
> > Sep 28 23:56:56 balancer kernel: pf: State failure on: 1       | 5
>
> This means there is an existing state entry from an old (and already
> closed) connection, and the client is re-using its source port 52209 for
> a new connection attempt (it's a SYN packet that triggered the log
> message).
>
> The client is not honouring the 2MSL quiet period, the time it should
> wait before re-using the same source port to connect to the same
> destination address/port, as required by the TCP RFCs.
>
> The reason for that is quite likely that it has run out of random high
> source ports. The range used should be about 49152-65535 (sysctl
> net.inet.ip.portrange.*), and 10,000 connections is getting close. The
> client stack can either make the application fail in connect(2), or
> re-use source ports and violate the RFCs in this case.
>
> Not sure if this is a realistic test, i.e. whether you see the very same
> problem in production (with 'BAD state' messages for SYN packets). It
> would only occur if one client is establishing connections to the same
> server port at high concurrency and/or rate. If not, I'd say the test is
> simply flawed, and you need multiple clients to simulate realistically.
>
> pf keeps state entries around for a while after a connection has been
> closed (to catch packets related to the old connection that might arrive
> late). The timeout is tcp.closed, 90s by default. You can make pf purge
> such state entries sooner by lowering this timeout.
>
> This most likely has nothing to do with rdr and load-balancing. The
> difference between enabling and disabling your rdr rule is basically
> that of filtering statefully vs. statelessly. Your 'pass all' rule does
> not create state, while the rdr will automatically create state.
>
> Daniel

I ran into this problem with PHP applications connecting to a remote MySQL
server (both hosts running FreeBSD).  The scripts ran roughly every 60s and
opened far too many connections (bad code).  With tcp.closed at 90s, once a
script had used up the random high source ports, anything else that tried to
connect from a reused port would hit a state that hadn't expired yet.
Lowering tcp.closed made perfect sense in this case...but the scripts were
rewritten too.
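
As a rough back-of-the-envelope check of why 90s can be too long, here is a
small Python sketch. It is only arithmetic under stated assumptions, not a
model of what pf or the FreeBSD stack actually do: it assumes the client draws
source ports from 49152-65535, that all connections go to the same destination
address/port, and that every closed connection's state entry lingers for the
full timeout. The function names are made up for illustration.

```python
def port_count(port_lo, port_hi):
    """Number of source ports in an inclusive range."""
    return port_hi - port_lo + 1


def max_sustained_rate(port_lo, port_hi, hold_seconds):
    """Highest new-connection rate (per second) to one destination
    address/port that never needs a source port whose old state entry
    is still being held (simplifying assumption: each port is blocked
    for exactly hold_seconds after use)."""
    return port_count(port_lo, port_hi) / hold_seconds


def max_hold_seconds(port_lo, port_hi, rate_per_second):
    """Longest hold time that still avoids port re-use at a given rate."""
    return port_count(port_lo, port_hi) / rate_per_second


if __name__ == "__main__":
    lo, hi = 49152, 65535  # assumed high port range
    print(port_count(lo, hi))                        # 16384
    print(round(max_sustained_rate(lo, hi, 90), 1))  # 182.0
    print(round(max_sustained_rate(lo, hi, 30), 1))  # 546.1
```

Under these assumptions, a single client that averages more than about 182
new connections per second to the same server port will eventually have to
re-use a port whose state is still held at the 90s default. Short bursts
matter too, since a burst can consume a large share of the 16384 ports well
before the old entries expire.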

The part that confused me was that the connections failed immediately -- it
turns out that PF sends a RST upon state mismatch during the initial
handshake, as opposed to dropping the packets and letting the connection
time out.

If I understood something I heard before correctly, FreeBSD's networking
stack does something special when the connection rate gets really high to
avoid such re-use, but it's been a while since I read about it so I can't
recall the details of the adaptive behavior.