kern/161123: CARP - when preemption is enabled carp interface assumes MASTERship immediately even with higher advbase/advskew

Damien Fleuriot dam at my.gd
Thu Sep 29 15:10:10 UTC 2011


>Number:         161123
>Category:       kern
>Synopsis:       CARP - when preemption is enabled carp interface assumes MASTERship immediately even with higher advbase/advskew
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Sep 29 15:10:09 UTC 2011
>Closed-Date:
>Last-Modified:
>Originator:     Damien Fleuriot
>Release:        8.2-RELEASE
>Organization:
Hi-Media
>Environment:
FreeBSD pf2.multiprojet 8.2-RELEASE FreeBSD 8.2-RELEASE #1: Thu Sep 29 16:11:04 CEST 2011     root at pf2.multiprojet:/usr/obj/usr/src/sys/MULTI  amd64
>Description:
Under normal operating circumstances, a CARP interface goes through the following states:
- INIT: the interface is down.
- BACKUP: immediately upon being brought up, the interface assumes the BACKUP role and starts a timer to decide whether it should claim mastership.
- MASTER: if the delay (advbase * 3) expires without the interface seeing another master, it assumes mastership.
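
The BACKUP timer described above can be sketched as a small calculation. This is only an illustration, not kernel code: the helper name master_down_timeout_us is hypothetical, and the additional advskew/256 second term is an assumption about how ip_carp.c arms the timer, on top of the advbase * 3 figure quoted above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of ip_carp.c): returns how long a BACKUP
 * interface waits without seeing a master before claiming mastership,
 * in microseconds.
 *
 * Assumed formula: 3 * advbase seconds plus advskew / 256 seconds.
 */
uint64_t
master_down_timeout_us(uint8_t advbase, uint8_t advskew)
{
	/* An advbase of 0 would make the timer fire immediately. */
	assert(advbase >= 1);

	return (3ULL * advbase * 1000000ULL +
	    (uint64_t)advskew * 1000000ULL / 256);
}
```

Under these assumptions, a box with advbase 1 and advskew 100 would wait a little under 3.4 seconds, slightly longer than a peer with advskew 0, which is why the higher-skew box should normally stay BACKUP.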


BUG: When preemption is enabled (net.inet.carp.preempt=1), the CARP interface assumes MASTERship immediately upon being brought up, regardless of its advbase and advskew values.

This causes a CARP switchover whenever, for example, a firewall in a CARP cluster is rebooted.

In our case, this led to lost client connections, lost database sessions, and crashes of developers' daemons caused by dropped Java/DB connections.



This is a known problem in the CARP implementation of OpenBSD 3.8 and earlier.
It has been fixed as of OpenBSD 3.9.

Reference: my post on freebsd-stable
http://docs.freebsd.org/cgi/getmsg.cgi?fetch=368260+0+current/freebsd-stable

>How-To-Repeat:
Set up 2 boxes with a shared CARP IP.
Enable CARP preemption.

Bring down your CARP interface on the BACKUP box.
Bring it up again.
Notice that the interface assumes MASTERship for a short time.
Check dmesg, which confirms that the box actually preempted.
>Fix:
The fix lies in sys/netinet/ip_carp.c, in the function carp_setrun(struct carp_softc *sc, sa_family_t af).

All that is needed is to remove the code that instructs the CARP interface to transition immediately from INIT to MASTER when preemption is enabled.
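
The behavioural change can be modelled outside the kernel as follows. This is a simplified sketch: the enum, the two functions, and the self-check are hypothetical names introduced for illustration, and only the INIT decision is modelled (no advertisements, ARP, or routes are sent).

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified states, named after those used in this report. */
enum carp_state { CARP_INIT, CARP_BACKUP, CARP_MASTER };

/*
 * Current behaviour: an interface leaving INIT jumps straight to MASTER
 * when preemption is enabled and not suppressed.
 */
enum carp_state
init_transition_unpatched(bool preempt, bool suppress_preempt)
{
	if (preempt && !suppress_preempt)
		return (CARP_MASTER);
	return (CARP_BACKUP);
}

/*
 * Proposed behaviour: an interface leaving INIT always goes to BACKUP and
 * lets the BACKUP timer decide whether it should become MASTER.
 */
enum carp_state
init_transition_patched(bool preempt, bool suppress_preempt)
{
	(void)preempt;
	(void)suppress_preempt;
	return (CARP_BACKUP);
}

/* Small self-check contrasting the two behaviours with preemption on. */
void
check_init_transitions(void)
{
	assert(init_transition_unpatched(true, false) == CARP_MASTER);
	assert(init_transition_patched(true, false) == CARP_BACKUP);
}
```

With preemption disabled, both variants return BACKUP, so in this model the change only affects boxes that run with net.inet.carp.preempt=1.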

Patch attached.


Patch attached with submission follows:

--- sys/netinet/ip_carp.c	2011-09-29 15:00:07.000000000 +0200
+++ sys/netinet/ip_carp.c	2011-09-29 15:01:37.000000000 +0200
@@ -1390,22 +1390,10 @@
 
 	switch (sc->sc_state) {
 	case INIT:
-		if (carp_opts[CARPCTL_PREEMPT] && !carp_suppress_preempt) {
-			carp_send_ad_locked(sc);
-			carp_send_arp(sc);
-#ifdef INET6
-			carp_send_na(sc);
-#endif /* INET6 */
-			CARP_LOG("%s: INIT -> MASTER (preempting)\n",
-			    SC2IFP(sc)->if_xname);
-			carp_set_state(sc, MASTER);
-			carp_setroute(sc, RTM_ADD);
-		} else {
-			CARP_LOG("%s: INIT -> BACKUP\n", SC2IFP(sc)->if_xname);
-			carp_set_state(sc, BACKUP);
-			carp_setroute(sc, RTM_DELETE);
-			carp_setrun(sc, 0);
-		}
+		CARP_LOG("%s: INIT -> BACKUP\n", SC2IFP(sc)->if_xname);
+		carp_set_state(sc, BACKUP);
+		carp_setroute(sc, RTM_DELETE);
+		carp_setrun(sc, 0);
 		break;
 	case BACKUP:
 		callout_stop(&sc->sc_ad_tmo);


>Release-Note:
>Audit-Trail:
>Unformatted:

