rdr pass for proto tcp sometimes creates states with expire time zero and so breaking connections

Sat Oct 27 12:22:39 UTC 2018

Thanks very much for answer especially for the hint to openbsd.

> I wonder if there’s an integer overflow in the of_state_expires()
> calculation.
> The OpenBSD people have a cast to u_int64_t in their version:
> 
> |timeout = (u_int64_t)timeout * (end - states) / (end - start);
> |
> 
> Perhaps this would fix your problem? (Untested, not even compiled)
> 
> |        if (end && states > start && start < end) {
>                 if (states < end) {
>                     timeout = (uint64_t)timeout * (end - states) / (end - start);
>                         return (state->expire + timeout;
>                 }
>                 else
>                         return (time_uptime);
>         }
>         return (state->expire + timeout);

I can confirm the patch of the openbsd people adding the uint64_t cast
makes sense. If
        timeout * (end - states)
becomes bigger than UINT32_MAX (I am on i386) the cast prevents the
overflow of this product and the result of the adaptive calculation will
always be correct.

Example: start=6000, end=12000, timeout=86400 * 5 (5 days), states=100
         result 140972, result with cast patch 856800.

In the problem I have reported for states of "rdr pass" rules I see
start=6000, end=12000, timeout=86400 and (obviously erroneous, probably
negative) states=0xffffffd0. Therefore the condition "states < end" is
false and instead of doing the adaptive calculation the function
pf_states_expires() returns time_update.

In the code snippet

  start = state->rule.ptr->timeout[PFTM_ADAPTIVE_START];
  if (start) {
         end = state->rule.ptr->timeout[PFTM_ADAPTIVE_END];
         states = counter_u64_fetch(state->rule.ptr->states_cur);
  } else {
         start = V_pf_default_rule.timeout[PFTM_ADAPTIVE_START];
         end = V_pf_default_rule.timeout[PFTM_ADAPTIVE_END];
         states = V_pf_status.states;
  }

I see start=6000, so the variable states with value 0xffffffd0 is hold
in a counter variable. Based on these facts there are two questions
(please notice I have not any adaptive statements in my pf.conf):

1. Why I see start=6000 for states of "rdr pass" rules ? For all states
of filter rules I see always start=0 and therefore a counter variable is
never used.

2, The counter variable states_cur is changed exclusiv by the macros
STATE_INC/DEC_COUNTERS, why one of them can be negative ?

The answer of question 1 I can give myself: All states of filter rules
include a pointer to the underlying rule which includes counter
variables too. A state of a "rdr pass" rule does not have such an
underlying rule, so the global pf_default_rule is used. That means in
the first line of the code snippet above we have

    state->rule.ptr == pf_default_rule

and the value of timeout[PFTM_ADAPTIVE_END] indeed is initialized with
6000. Further the counter variable for states_cur of pf_default_rule is
used für all "rdr/nat/binat pass" rules together. This was a little bit
suprising for me, but I think this is intended behaviour. Correct ?

I still don't have an answer for question 2. My problem is best
described in the log of commit 263029:

  Once pf became not covered by a single mutex, many counters in it
  became race prone. Some just gather statistics, but some are later
  used in different calculations.

  A real problem was the race provoked underflow of the states_cur
  counter on a rule. Once it goes below zero, it wraps to UINT32_MAX.
  Later this value is used in pf_state_expires() and any state created
  by this rule is immediately expired.

  Thus, make fields states_cur, states_tot and src_nodes of struct
  pf_rule be counter(9)s.

I run FreeBSD 10 r338093 and of course have this and the following
commits included.

Are there any hints why the counter pf_default_rule->states_cur
could get a negative value ?

 ---
Andreas Longwitz