multi-homed systems stop answering ARP on local addresses w/ifconfig aliases

Chris Buechler cbuechler at gmail.com
Mon Apr 26 22:20:48 UTC 2010


2010/4/26 Александр <der at btsig.ru>:
> On Sun, May 17, 2009 at 4:35 PM, Steven Hartland
> <killing at multiplay.co.uk> wrote:
>
> > Silly question, but something else on the network isn't doing an ARP
> > spoof attack, is it?
>
> No, there isn't any ARP at all on that address on the network when
> this is a problem, verified with tcpdump. That also shouldn't impact
> the system's ability to talk to its own IPs.
>
> thanks for the response though!
>
>
> > ----- Original Message -----
> > From: "Chris Buechler" <freebsd at chrisbuechler.com>
> > To: <net at freebsd.org>
> > Sent: Sunday, May 17, 2009 9:08 PM
> > Subject: multi-homed systems stop answering ARP on local addresses
> > w/ifconfig aliases
> >
> >> There seems to be a regression between 6.x and 7.0 and 7.1 related to
> >> ifconfig aliases on multi-homed hosts. Not sure on anything newer
> >> than 7.1 (this is pfSense, we're just starting to test 7.2 builds).
> >> For periods of time, the system will stop answering ARP on some of
> >> its own addresses and hence anything on that network completely
> >> stops functioning. The same setup worked fine on 6.2.
> >>
> >> The particular system illustrated here is a router on part of an
> >> ISP's network. IPs are all public, in the info provided here they've
> >> been replaced with 10. IPs. The subnets on the inside interfaces are
> >> routed to the outside interface. When this problem occurs, the IPs
> >> assigned locally on the system will still respond from the Internet,
> >> but the system itself loses all connectivity with that subnet and
> >> nothing on that subnet can communicate with the host due to the lack
> >> of ARP. That makes some sense, I presume when routing to a locally
> >> assigned address via another interface, the system doesn't need ARP
> >> on the address to respond. But while it still responds from the
> >> Internet, even the host itself can't initiate a ping to that IP. It
> >> behaves the same whether pf is enabled or disabled.
> >>
> >> I see two similar issues in the past, one with a PR:
> >> http://www.freebsd.org/cgi/query-pr.cgi?pr=121437&cat=
> >> that's exactly the same issue, it's not limited to VLANs, any
> >> multi-homed host is affected.
> >>
> >> And another:
> >> http://thread.gmane.org/gmane.os.freebsd.stable/57125
> >>
> >> fxp0 is the outside interface. It doesn't make any difference whether
> >> the ifconfig aliases are on the em0 or fxp1 interfaces, they both
> >> behave the same if they have any ifconfig aliases assigned.
> >>
> >> # ifconfig
> >> fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> >>       options=8<VLAN_MTU>
> >>       ether 00:90:27:86:8b:9d
> >>       inet6 fe80::290:27ff:fe86:8b9d%fxp0 prefixlen 64 scopeid 0x1
> >>       inet 10.11.185.146 netmask 0xfffffff8 broadcast 10.11.185.151
> >>       media: Ethernet 100baseTX <full-duplex>
> >>       status: active
> >> em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> >>       options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
> >>       ether 00:11:43:2c:62:03
> >>       inet 10.10.0.1 netmask 0xffffff00 broadcast 10.10.0.255
> >>       inet6 fe80::211:43ff:fe2c:6203%em0 prefixlen 64 scopeid 0x2
> >>       inet 10.13.40.1 netmask 0xffffff00 broadcast 10.13.40.255
> >>       inet 10.13.41.1 netmask 0xffffff00 broadcast 10.13.41.255
> >>       inet 10.13.42.1 netmask 0xffffff00 broadcast 10.13.42.255
> >>       inet 10.13.43.1 netmask 0xffffff00 broadcast 10.13.43.255
> >>       inet 10.13.44.1 netmask 0xffffff00 broadcast 10.13.44.255
> >>       inet 10.13.45.1 netmask 0xffffff00 broadcast 10.13.45.255
> >>       inet 10.13.46.1 netmask 0xffffff00 broadcast 10.13.46.255
> >>       inet 10.13.47.1 netmask 0xffffff00 broadcast 10.13.47.255
> >>       media: Ethernet autoselect (100baseTX <full-duplex>)
> >>       status: active
> >> fxp1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> >>       options=8<VLAN_MTU>
> >>       ether 00:d0:b7:5d:25:9f
> >>       inet 10.1.242.1 netmask 0xffffff00 broadcast 10.1.242.255
> >>       inet6 fe80::2d0:b7ff:fe5d:259f%fxp1 prefixlen 64 scopeid 0x3
> >>       inet 10.1.243.1 netmask 0xffffff00 broadcast 10.1.243.255
> >>       media: Ethernet autoselect (100baseTX <full-duplex>)
> >>       status: active
> >>
> >> When the problem is occurring, you can't even ping the affected
> >> locally assigned addresses from the box itself:
> >> # ping 10.10.0.1
> >> PING 10.10.0.1 (10.10.0.1): 56 data bytes
> >> ping: sendto: Network is unreachable
> >> ping: sendto: Network is unreachable
> >> ping: sendto: Network is unreachable
> >>
> >> And when trying to ping something on one of the affected attached
> >> subnets, you get:
> >> # ping 10.10.0.30
> >> PING 10.10.0.30 (10.10.0.30): 56 data bytes
> >> ping: sendto: Invalid argument
> >> ping: sendto: Invalid argument
> >>
> >> In the logs, you get a flood of these messages:
> >> May 14 02:55:12    kernel: arpresolve: can't allocate route for 10.10.0.1
> >> May 14 02:55:12    kernel: arplookup 10.10.0.1 failed: host is not on local network
> >> May 14 02:55:12    kernel: arpresolve: can't allocate route for 10.10.0.1
> >> May 14 02:55:12    kernel: arplookup 10.10.0.1 failed: host is not on local network
> >>
> >> It happens both with the primary IP assigned to the interface, and
> >> the aliases assigned, but not all at once. Some of the addresses will
> >> continue to work when others are failing. Somehow it thinks IPs that
> >> are locally assigned are not on a local network... after a couple
> >> minutes, it just starts working again without making any changes or
> >> even touching the system.
> >>
> >> If I can provide any additional information, please let me know.
> >>
> >> thanks,
> >> Chris
> >>
>
> The same thing happened to me: FreeBSD 8.0-RELEASE, bge0-bge4, with bge0
> configured with two subnets. ARP for the primary subnet disappears
> randomly (~once a day).
>
>
> Do you have any resolution for this?
>

I've yet to test it on 8, but I never got a resolution to the issue
described above on 7.2. For the scenario where that box was deployed,
parts of its functionality were replaced by a Linux box, and the part
of the network where it resides was changed so that IP aliases are no
longer in use.
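
For anyone trying to narrow down which addresses are affected and how
often, the two kernel messages quoted above can be tallied per address
from a saved log. This is only a rough sketch in Python, assuming the
message formats are exactly as quoted in the original report; the
function name count_arp_failures is made up for illustration and is not
part of any FreeBSD tool.

```python
import re
from collections import Counter

# Patterns copied from the two kernel messages quoted in this thread.
# Assumption: other releases may word these messages differently.
PATTERNS = (
    re.compile(r"arpresolve: can't allocate route for (\d+\.\d+\.\d+\.\d+)"),
    re.compile(r"arplookup (\d+\.\d+\.\d+\.\d+) failed: host is not on local network"),
)


def count_arp_failures(lines):
    """Return a Counter mapping each IP address to its number of matching messages."""
    counts = Counter()
    for line in lines:
        for pattern in PATTERNS:
            match = pattern.search(line)
            if match:
                counts[match.group(1)] += 1
                break  # a log line matches at most one of the two patterns
    return counts


# The four log lines from the original report, rejoined onto single lines.
sample_log = [
    "May 14 02:55:12    kernel: arpresolve: can't allocate route for 10.10.0.1",
    "May 14 02:55:12    kernel: arplookup 10.10.0.1 failed: host is not on local network",
    "May 14 02:55:12    kernel: arpresolve: can't allocate route for 10.10.0.1",
    "May 14 02:55:12    kernel: arplookup 10.10.0.1 failed: host is not on local network",
]

for address, hits in count_arp_failures(sample_log).most_common():
    print(f"{address}\t{hits}")
```

Passing an open log file instead of sample_log works the same way,
since the function only iterates over lines. Grouping by timestamp as
well might help show whether the primary address and the aliases fail
at the same moments, but that is left out here.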

