openbgpds not talking to each other since 8.2-STABLE upgrade
dougb at FreeBSD.org
Tue Jan 3 03:53:38 UTC 2012
We have a pair of physical FreeBSD systems configured as routers
designed to operate in an active/standby CARP configuration. Everything
used to work fine, but since an upgrade to 8.2-STABLE on December 29th
the two routers don't speak BGP to each other anymore. They both
function fine individually, and failover works. It is only the openbgpd
communication between them that's not flowing.
They have OpenBGPd (openbgpd-4.9.20110612_1 from ports) installed. The
active router takes full BGP route feeds from our peers and *should*
feed them to the standby router via a direct connection (crossover cable
between physical em2 ports).
The respective "bgpctl show" reports:
10.0.0.2 12345 0 0 0 Never Active
10.0.0.2 12345 0 0 0 Never Connect
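For anyone wanting to watch for this condition, here is a minimal sketch
that picks out sessions that have never come up. It assumes the summary
layout shown above, where the last two columns are Up/Down and State; the
function name never_up is made up for illustration, and only the first
word of the neighbor column is printed.

```shell
# never_up: read "bgpctl show" summary output on stdin and print the
# neighbor and state of every session whose Up/Down column is "Never".
# Assumes Up/Down and State are the last two whitespace-separated fields.
never_up() {
    awk '$(NF-1) == "Never" { print $1, $NF }'
}
```

Usage would be something like "bgpctl show | never_up" on each router.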
The bgp daemon for the active server periodically reports:
bgpd: neighbor 10.0.0.2: socket error: Operation timed out
There is not a connectivity problem between the two hosts; ssh for
example works fine. Telnet'ing to the bgp port times out, even from the
There is no firewall configured on that interface.
TCP-MD5 is *not* configured on the bgpd side. We did try enabling it
(properly) between the two machines via /etc/ipsec.conf to see if it
would make a difference, but that also had no effect on this problem.
We've tried tcpdump, and both machines can clearly see the TCP SYN and
SYN-ACK setup packets flowing in both directions, but the ACK packet
never happens. In netstat -an, the opening side gets:
tcp4 0 0 10.0.0.2.16797 10.0.0.1.179 SYN_SENT
and the receiving side gets:
tcp4 0 0 10.0.0.1.179 10.0.0.2.16797 SYN_RCVD
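To keep an eye on the half-open sessions while testing, a small filter
over netstat output can help. This is only a sketch: it assumes the
"netstat -an" column order shown above (local address in column 4,
foreign address in column 5, state in column 6), and the function name
stuck_bgp_handshakes is made up for illustration.

```shell
# stuck_bgp_handshakes: read "netstat -an" output on stdin and print the
# local address, foreign address and state of every TCP socket that
# involves port 179 and is still in SYN_SENT or SYN_RCVD.
stuck_bgp_handshakes() {
    awk '$1 ~ /^tcp/ &&
         ($4 ~ /\.179$/ || $5 ~ /\.179$/) &&
         ($6 == "SYN_SENT" || $6 == "SYN_RCVD") { print $4, $5, $6 }'
}
```

Run as "netstat -an | stuck_bgp_handshakes" on either router; no output
means no BGP socket is currently stuck mid-handshake.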
Just to make sure pf can't possibly be affecting this, right at the top
of pf.conf on both machines:
## Pass inter-router traffic
pass quick on em2 from 10.0.0.2 to 10.0.0.1
pass quick on em2 from 10.0.0.1 to 10.0.0.2
This is sufficient because we can connect to bgpd with nc:
$ nc -S 10.0.0.2 179
$ netstat -an | fgrep 10.0.0.2
tcp4 0 0 10.0.0.1.25711 10.0.0.2.179 ESTABLISHED
$ netstat -an | fgrep 10.0.0.1
tcp4 0 0 10.0.0.2.179 10.0.0.1.25711 ESTABLISHED
So this appears to be some sort of weird problem specific to openbgpd
and the updated kernel.
At this point I'm at a loss as to how to proceed, so any suggestions on
how to fix, or even debug, this will be greatly appreciated.