Incorrect ARP table entries

Peter Jeremy peter at rulingia.com
Wed Aug 22 04:02:10 UTC 2012


I've run into a problem where the ARP table on several of my hosts is
apparently spontaneously replacing correct entries with incorrect MAC
addresses.  I've done some digging with tcpdump and can't identify the
cause.  I've tried to look in the code but lost my way since ARP and
IP routing seem to be closely intermingled.  I'm hoping someone might
be able to shed some light on why it is behaving the way it is.

A rough diagram of the relevant part of the network is:

 +---------+            +---------+
 |         | [1]        |         |
 |  local  |============|         |
 |         |            |         |        [3] +---------+
 +----+----+            |         |------------|         |
      H [2]             | switch  |            |  remot  |
 +----+----+            |         |============|         |
 |         |            |         |        [4] +---------+
 |  peer   |============|         |
 |         |            |         |
 +---------+            +---------+

The hosts seeing the problem are running 8.2p2/amd64 ("local") and
8.3p3/i386 ("peer") using pf/carp for failover.  The remote host
("remot") is an HP DL380 running FreeBSD (various versions between 7
and 10).
My problem is that the entry for the DL380 iLO ([3], in vlan 157) is
randomly having the correct MAC address replaced with the MAC address
of the main interface ([4]).  Note that the iLO is a physically
separate NIC.

'=' are all dual GigE links bonded using lagg/lacp with about 73 vlans
inside them.

[1] MAC "local-nic", IP addresses "local-157" in the iLO vlan (157) and
    "local-ip" in vlan 91.

[2] cross-over cables joining dual FastE NICs bonded using lagg/lacp
    carrying pfsync traffic only.

[3] DL380 iLO connected to a non-trunked switch port in vlan 157.
    MAC "remot-ilo", IP "remo-mgmt".

[4] DL380 bge interfaces.  MAC "remot-nic" and "remot-ip" in vlan 91.
    vlan 157 is configured at the switch end but not used at the host
    end (no interface in vlan 157 is created).

When I run:
 tcpdump -e -i lagg0 'ether host remot-nic or arp or (vlan and arp)'
on "local", the only references to remot-nic or remot-ilo are as
follows (the capture starts immediately after an "arp -d remo-mgmt"):

12:15:41.481523 local-nic > broadcast, vlan 157, ARP, Request who-has remo-mgmt tell local-157
12:15:41.481853 remot-ilo > local-nic, vlan 157, ARP, Reply remo-mgmt is-at remot-ilo

12:15:43.337303 remot-nic > local-nic, vlan  91, IPv4, remot-ip.123 > local-ip.123: NTPv4
12:15:43.337854 local-nic > remot-nic, vlan  91, IPv4, local-ip.123 > remot-ip.123: NTPv4

12:16:46.338434 remot-nic > local-nic, vlan  91, IPv4, remot-ip.123 > local-ip.123: NTPv4
12:16:46.338968 local-nic > remot-nic, vlan  91, IPv4, local-ip.123 > remot-ip.123: NTPv4

12:17:50.338196 remot-nic > local-nic, vlan  91, IPv4, remot-ip.123 > local-ip.123: NTPv4
12:17:50.339027 local-nic > broadcast, vlan  91, ARP, Request who-has remot-ip tell local-ip
12:17:50.339253 remot-nic > local-nic, vlan  91, ARP, Reply remot-ip is-at remot-nic
12:17:50.339272 local-nic > remot-nic, vlan  91, IPv4, local-ip.123 > remot-ip.123: NTPv4

12:17:52.338532 remot-nic > broadcast, vlan  91, ARP, Request who-has other1 tell remot-ip

12:18:01.338517 remot-nic > broadcast, vlan  91, ARP, Request who-has other2 tell remot-ip

12:18:53.337620 remot-nic > local-nic, vlan  91, IPv4, remot-ip.123 > local-ip.123: NTPv4
12:18:53.338145 local-nic > remot-nic, vlan  91, IPv4, local-ip.123 > remot-ip.123: NTPv4

12:19:12.499330 remot-ilo > broadcast, vlan 157, ARP, Request who-has local-157 (local-nic) tell remo-mgmt
12:19:12.499353 local-nic > remot-nic, vlan 157, ARP, Reply local-157 is-at local-nic

And I find in /var/log/messages:
Aug 22 12:19:12 local kernel: arp: remo-mgmt moved from remot-ilo to remot-nic on vlan157
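
For anyone wanting to check how often this happens on their own hosts,
here is a minimal sketch that pulls the "moved" events out of a log.
It assumes the message format is exactly as quoted above; the function
name find_arp_moves is hypothetical and only for illustration.

```python
import re

# Assumed message format, taken from the log line quoted above:
#   arp: <ip> moved from <old-mac> to <new-mac> on <ifname>
ARP_MOVED = re.compile(
    r"arp: (?P<ip>\S+) moved from (?P<old>\S+) to (?P<new>\S+) on (?P<ifname>\S+)"
)


def find_arp_moves(lines):
    """Return (ip, old_mac, new_mac, ifname) for each 'moved' message."""
    moves = []
    for line in lines:
        m = ARP_MOVED.search(line)
        if m:
            moves.append((m["ip"], m["old"], m["new"], m["ifname"]))
    return moves


if __name__ == "__main__":
    sample = [
        "Aug 22 12:19:12 local kernel: arp: remo-mgmt moved from "
        "remot-ilo to remot-nic on vlan157",
    ]
    for ip, old, new, ifname in find_arp_moves(sample):
        print(f"{ip}: {old} -> {new} ({ifname})")
```

Feeding it the lines of /var/log/messages instead of the sample list
would show whether the moves always go in the same direction.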

The ARP mapping for remo-mgmt to remot-ilo was correct following the
ARP exchange at 12:15:41, but at 12:19:12 "local" sends its ARP reply
to remot-nic instead of remot-ilo, which is the MAC address that sent
the request.  In the intervening period, there are no references to
"remot-nic" in vlan 157, nor any ARP requests mentioning remo-mgmt.
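
The symptom can be spotted mechanically: an ARP reply should go back
to the MAC address that sent the matching request.  The following is a
rough sketch of that check, assuming the simplified, name-substituted
line format shown in the capture above (raw "tcpdump -e" output has
more fields and would need different patterns).  The function name
find_misdirected_replies is hypothetical.

```python
import re

# Patterns for the simplified capture lines quoted in this post (assumption).
REQUEST = re.compile(
    r"^(?P<ts>\S+) (?P<src>\S+) > (?P<dst>\S+), vlan\s+(?P<vlan>\d+), ARP, "
    r"Request who-has (?P<target>\S+)(?: \(\S+\))? tell (?P<sender>\S+)$"
)
REPLY = re.compile(
    r"^(?P<ts>\S+) (?P<src>\S+) > (?P<dst>\S+), vlan\s+(?P<vlan>\d+), ARP, "
    r"Reply (?P<ip>\S+) is-at (?P<mac>\S+)$"
)


def find_misdirected_replies(lines):
    """Return (ts, vlan, ip, expected_dst, actual_dst) for each ARP reply
    whose destination MAC differs from the source MAC of the request."""
    pending = {}  # (vlan, requested ip) -> source MAC of the requester
    bad = []
    for raw in lines:
        line = raw.strip()
        req = REQUEST.match(line)
        if req:
            pending[(req["vlan"], req["target"])] = req["src"]
            continue
        rep = REPLY.match(line)
        if rep:
            expected = pending.pop((rep["vlan"], rep["ip"]), None)
            if expected is not None and expected != rep["dst"]:
                bad.append((rep["ts"], rep["vlan"], rep["ip"], expected, rep["dst"]))
    return bad


if __name__ == "__main__":
    capture = [
        "12:19:12.499330 remot-ilo > broadcast, vlan 157, ARP, "
        "Request who-has local-157 (local-nic) tell remo-mgmt",
        "12:19:12.499353 local-nic > remot-nic, vlan 157, ARP, "
        "Reply local-157 is-at local-nic",
    ]
    for ts, vlan, ip, expected, actual in find_misdirected_replies(capture):
        print(f"{ts} vlan {vlan}: reply for {ip} sent to {actual}, expected {expected}")
```

This only detects the misdirected reply; it does not explain why the
table entry changed in the first place.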

--
Peter Jeremy