[Bug 235005] r342051 "pfsync: Performance improvement" breaks CARP when used with pfsync
bugzilla-noreply at freebsd.org
Wed Jan 16 19:16:13 UTC 2019
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235005
Bug ID: 235005
Summary: r342051 "pfsync: Performance improvement" breaks CARP
when used with pfsync
Product: Base System
Version: 12.0-STABLE
Hardware: Any
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: kern
Assignee: bugs at FreeBSD.org
Reporter: thomas at gibfest.dk
After quite a few buildworld+kernel cycles and reboots, I've managed to isolate
base r342051 "pfsync: Performance improvement" as the reason why carp stopped
working for me.
I've been building a couple of carp+pf routers/firewalls, originally with
12-BETA2. They were recently upgraded to 12-STABLE base r342254, which is when
both carp nodes started being MASTER instead of one MASTER and one BACKUP
node.
The notes from my bisecting are below. All tests are with the same
configuration. As you can see, base r342051 is the commit where it broke.
12-STABLE base r339946 MASTER/BACKUP
12-STABLE base r341100 MASTER/BACKUP
12-STABLE base r341677 MASTER/BACKUP
12-STABLE base r341965 MASTER/BACKUP
12-STABLE base r342037 MASTER/BACKUP
12-STABLE base r342050 MASTER/BACKUP
12-STABLE base r342051 MASTER/MASTER
12-STABLE base r342055 MASTER/MASTER
12-STABLE base r342073 MASTER/MASTER
12-STABLE base r342109 MASTER/MASTER
12-STABLE base r342254 MASTER/MASTER
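To make the reading of the notes above explicit, here is a minimal sketch in
Python that picks out the last good and first bad revision. The revision
numbers and states are copied from the list above; the names NOTES and
first_bad are only illustrative and not part of any FreeBSD tooling.

```python
# Bisect notes from the report: (base revision, observed CARP states).
NOTES = [
    (339946, "MASTER/BACKUP"),
    (341100, "MASTER/BACKUP"),
    (341677, "MASTER/BACKUP"),
    (341965, "MASTER/BACKUP"),
    (342037, "MASTER/BACKUP"),
    (342050, "MASTER/BACKUP"),
    (342051, "MASTER/MASTER"),
    (342055, "MASTER/MASTER"),
    (342073, "MASTER/MASTER"),
    (342109, "MASTER/MASTER"),
    (342254, "MASTER/MASTER"),
]


def first_bad(notes, good="MASTER/BACKUP"):
    """Return (last_good_rev, first_bad_rev), scanning in revision order.

    first_bad_rev is None if every tested revision behaved correctly.
    """
    last_good = None
    for rev, state in sorted(notes):
        if state != good:
            return last_good, rev
        last_good = rev
    return last_good, None


if __name__ == "__main__":
    good_rev, bad_rev = first_bad(NOTES)
    print(f"last good: r{good_rev}, first bad: r{bad_rev}")
```

Since r342050 is good and r342051 is bad, no untested revision remains between
them.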
I've further confirmed pfsync to be at fault: when pfsync is not enabled, the
two nodes are MASTER and BACKUP as they should be. Immediately after I start
pfsync, the BACKUP node becomes MASTER and logs these messages:
Jan 16 16:34:56 fwclu2b kernel: carp: demoted by -240 to -240 (pfsync bulk done)
Jan 16 16:34:56 fwclu2b kernel: carp: 1@lagg2.52: BACKUP -> MASTER (preempting a slower master)
Jan 16 16:34:56 fwclu2b kernel: carp: 1@lagg2.51: BACKUP -> MASTER (preempting a slower master)
Jan 16 16:34:56 fwclu2b kernel: carp: 1@lagg3: BACKUP -> MASTER (preempting a slower master)
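For anyone comparing their own logs, a rough Python sketch that extracts the
carp state transitions from messages like the ones above. The log text is
copied from this report; the regular expression and the name transitions are
my own assumptions. It accepts both "1@lagg3" and the list archive's
"1 at lagg3" rendering, and the leading number is presumably the vhid.

```python
import re

# Log excerpt from the report, with wrapped lines rejoined.
LOG = """\
Jan 16 16:34:56 fwclu2b kernel: carp: demoted by -240 to -240 (pfsync bulk done)
Jan 16 16:34:56 fwclu2b kernel: carp: 1@lagg2.52: BACKUP -> MASTER (preempting a slower master)
Jan 16 16:34:56 fwclu2b kernel: carp: 1@lagg2.51: BACKUP -> MASTER (preempting a slower master)
Jan 16 16:34:56 fwclu2b kernel: carp: 1@lagg3: BACKUP -> MASTER (preempting a slower master)
"""

# Assumed message shape: "carp: <id>@<interface>: <OLD> -> <NEW>".
TRANSITION = re.compile(r"carp: (\d+)(?:@| at )([^\s:]+): (\w+) -> (\w+)")


def transitions(text):
    """Return a list of (id, interface, old_state, new_state) tuples."""
    return [
        (int(m.group(1)), m.group(2), m.group(3), m.group(4))
        for m in TRANSITION.finditer(text)
    ]


if __name__ == "__main__":
    for ident, ifname, old, new in transitions(LOG):
        print(f"{ident}@{ifname}: {old} -> {new}")
```

The demotion line is skipped on purpose, since it does not describe a
per-interface state change.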
...but the original MASTER also stays MASTER, so both nodes are MASTER at the
same time and nothing works on the network. Stopping pfsync doesn't resolve the
situation; only a reboot with pfsync disabled restores normal carp functionality.
I suggest maybe backing out base r342051 while we investigate the cause, if a
fix can't be found quickly. I suspect it could have something to do with the
pfsync carp demotion code, which the log messages above seem to confirm, but I
don't know.
Let me know if further info is needed about my configuration or anything. See
also this thread on -stable
https://lists.freebsd.org/pipermail/freebsd-stable/2019-January/090421.html
which confirms I am not the only one experiencing this.
--
You are receiving this mail because:
You are the assignee for the bug.