pfsync after reboot does not synchronize
Scott Ullrich
sullrich at gmail.com
Mon Jun 5 18:51:05 PDT 2006
On 6/5/06, David DeSimone <fox at verio.net> wrote:
> I tried posting some messages about PF to the freebsd-net mailing list,
> but they seemed to be ignored. So I thought I would try sending my
> questions here.
>
> I am trying to figure out why pfsync does not seem to work correctly
> when one of my cluster nodes reboots.
>
> When I reboot one of the cluster members, the state tables do appear to
> synchronize, sort of, and populate with some of the same connection
> states, but not all of them.
>
> That is, "pfctl -ss" shows a different number of state entries on each
> cluster member, and vastly different if the new member has only been up
> for a minute or two.
>
> In particular, long-lived, extant connections (such as IRC server
> connections) seem to never show up in the rebooted member's state table,
> even though the connections continue to update their state on the
> current carp master.
>
> I figured that doing ifconfig down/up would send some sort of "full
> sync" message between the two members, to cause the entire state table
> to be sent in bulk. Eventually I learned that the method to do this is
> to use "ifconfig syncdev" to force a bulk update:
>
> ifconfig pfsync0 syncdev fxp0 # $pfsync_syncdev
>
> When I perform the above command, I see the following debug output (when
> PF is configured at "misc" or "loud" debug level):
>
> On the cluster member receiving the requests:
>
> pfsync: received bulk update request
> pfsync: received bulk update request
> pfsync: received bulk update request
> pfsync: received bulk update request
> pfsync: received bulk update request
> pfsync: received bulk update request
> pfsync: received bulk update request
> pfsync: received bulk update request
> pfsync: received bulk update request
> pfsync: received bulk update request
> pfsync: received bulk update request
> pfsync: received bulk update request
> pfsync: received bulk update request
>
> On the cluster member making the request (where syncdev was just
> ifconfig'd):
>
> pfsync: requesting bulk update
> pfsync: received bulk update start
> pfsync: received bulk update start
> pfsync: received bulk update start
> pfsync: received bulk update start
> pfsync: received bulk update start
> pfsync: received bulk update start
> pfsync: received bulk update start
> pfsync: received bulk update start
> pfsync: received bulk update start
> pfsync: received bulk update start
> pfsync: received bulk update start
> pfsync: received bulk update start
> pfsync: received bulk update start
> pfsync: failed to receive bulk update status
>
> After performing this manual action, I find the state table is much
> better populated, and the two firewalls appear to be synchronized.
> However, the messages above bother me. It looks to me like the cluster
> member making the request repeats it over and over again, and finally
> gives up after PFSYNC_MAX_BULKTRIES (12) attempts. Shouldn't that be
> something that only happens in exceptional conditions? Yet, I can make
> it happen every time, even on a test cluster with no traffic (and thus
> an almost empty state table).
>
> Does anyone have any insight as to why I see these problems?
>
> 1. Why does pfsync synchronize the state tables when I use the
> "ifconfig syncdev" trick to force a bulk update, yet it does
> not do this when the system is booting up?
>
> 2. Why does pfsync keep repeating the bulk update request and then give
> up? What message is not getting through?
>
>
> The two cluster members have a direct cross-cable between them. My PF
> policy has these settings:
>
> set skip on pfsync0
>
> pass quick on fxp0 proto pfsync # $pfsync_syncdev
I have also seen this problem with pfSense. To work around it, I set
the advskew to 200 on the host and wait 30 seconds to give everything
time to sync. I am not sure what causes it, but it may be related to
the pfsync hold-down timer. In any case, we worked around the problem,
and I plan to revisit it after our 1.0 release. I am glad someone else
is seeing it too.
Let me know if anyone needs more information.
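For anyone who wants to try the same workaround, here is a rough sketch
of it as a shell script. The interface name carp0, the normal advskew
of 0, and restoring the skew once the wait is over are assumptions for
illustration, not something confirmed above, so adjust them for your
setup. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of the advskew workaround. Assumed (hypothetical) values:
# carp0 as the CARP interface, 0 as the normal advskew, and restoring
# the skew once the wait is over.
CARP_IF="${CARP_IF:-carp0}"
NORMAL_SKEW="${NORMAL_SKEW:-0}"
SYNC_WAIT="${SYNC_WAIT:-30}"

# Print the command when DRY_RUN is 1 (the default); otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Count state entries from "pfctl -ss" output supplied on stdin
# (one non-empty line per state).
count_states() {
    grep -c .
}

# Demote this node, give pfsync time to catch up, then restore the skew.
hold_back_carp() {
    run ifconfig "$CARP_IF" advskew 200
    run sleep "$SYNC_WAIT"
    run ifconfig "$CARP_IF" advskew "$NORMAL_SKEW"
}
```

After sourcing the file, "DRY_RUN=0 hold_back_carp" (as root) would run
the commands for real, and "pfctl -ss | count_states" on both members
gives a quick way to compare the state tables afterwards.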
Scott