setfib/arpresolve behaviour bug?

Nikolay Denev ndenev at gmail.com
Wed May 16 04:00:30 UTC 2012


Filed as misc/167947


On May 12, 2012, at 10:21 AM, Nikolay Denev wrote:

> On Jan 21, 2010, at 6:16 PM, Matt Burke wrote:
> 
>> Box is running 8.0-RELEASE-p2 cvsupped two days ago.
>> 
>> NICs are em bonded with lagg failover and running a few vlan interfaces.
>> 
>> net.my_fibnum: 0
>> net.add_addr_allfibs: 1
>> net.fibs: 4
>> 
>> This is reproducible, but with the lack of (accessible?) documentation on
>> multiple routing tables, I don't know if this is intended behaviour or a bug.
>> 
>> It seems processes using a non-default fib cannot perform arp lookups
>> unless the fib 0 has a routing table entry for the attached network:
>> 
>> [root at host ~]# ifconfig vlan11 a.a.a.92/27
>> [root at host ~]# route delete -net a.a.a.64/27
>> delete net a.a.a.64
>> [root at host ~]# setfib 1 ping a.a.a.65
>> PING a.a.a.65 (a.a.a.65): 56 data bytes
>> ping: sendto: Invalid argument
>> ^C
>> --- a.a.a.65 ping statistics ---
>> 1 packets transmitted, 0 packets received, 100.0% packet loss
>> [root at host ~]# dmesg |tail -1
>> arpresolve: can't allocate llinfo for a.a.a.65
>> 
>> 
>> Putting the entry into the arp cache before removing the route results in
>> success:
>> 
>> [root at host ~]# ifconfig vlan11 a.a.a.92/27
>> [root at host ~]# setfib 1 ping a.a.a.65
>> PING a.a.a.65 (a.a.a.65): 56 data bytes
>> 64 bytes from a.a.a.65: icmp_seq=0 ttl=255 time=1.437 ms
>> ^C
>> --- a.a.a.65 ping statistics ---
>> 1 packets transmitted, 1 packets received, 0.0% packet loss
>> round-trip min/avg/max/stddev = 1.437/1.437/1.437/0.000 ms
>> [root at host ~]# route delete -net a.a.a.64/27
>> delete net a.a.a.64
>> [root at host ~]# setfib 1 ping a.a.a.65
>> PING a.a.a.65 (a.a.a.65): 56 data bytes
>> 64 bytes from a.a.a.65: icmp_seq=0 ttl=255 time=0.762 ms
>> ^C
>> --- a.a.a.65 ping statistics ---
>> 1 packets transmitted, 1 packets received, 0.0% packet loss
>> round-trip min/avg/max/stddev = 0.762/0.762/0.762/0.000 ms
>> 
>> 
>> and deleting it again results in failure:
>> 
>> [root at host ~]# arp -an
>> ? (a.a.a.92) at 00:11:27:00:d7:c4 on vlan11 permanent [vlan]
>> ? (a.a.a.65) at 00:1a:e4:00:60:bf on vlan11 [vlan]
>> ...
>> [root at host ~]# arp -d a.a.a.65
>> delete: cannot locate a.a.a.65
>> [root at host ~]# setfib 1 arp -d a.a.a.65
>> a.a.a.65 (a.a.a.65) deleted
>> [root at host ~]# setfib 1 ping -c1 a.a.a.65
>> PING a.a.a.65 (a.a.a.65): 56 data bytes
>> ping: sendto: Invalid argument
>> ^C
>> --- a.a.a.65 ping statistics ---
>> 1 packets transmitted, 0 packets received, 100.0% packet loss
>> 
>> 
>> This behaviour seems a little inconsistent, with fib 1 requesting arp
>> lookups, fib 0 performing and displaying them, but fib 1 needing to delete
>> them...
>> 
>> 
>> 
> 
> I've encountered exactly the same problem today.
> 
> I have a machine with public addresses and an interface with a private address for out-of-band management, and I wanted to use
> a separate FIB for the private interface and its routes.
> 
> When I deleted the routes for the private interface from the main FIB, arpresolve stopped working.
> 
> Then I patched sys/netinet/in.c with the following patch:
> 
> --- sys/netinet/in.c.orig	2012-05-12 08:57:17.000000000 +0200
> +++ sys/netinet/in.c	2012-05-12 08:56:43.000000000 +0200
> @@ -1418,21 +1418,21 @@
> 
> static int
> in_lltable_rtcheck(struct ifnet *ifp, u_int flags, const struct sockaddr *l3addr)
> {
> 	struct rtentry *rt;
> 
> 	KASSERT(l3addr->sa_family == AF_INET,
> 	    ("sin_family %d", l3addr->sa_family));
> 
> 	/* XXX rtalloc1 should take a const param */
> -	rt = rtalloc1(__DECONST(struct sockaddr *, l3addr), 0, 0);
> +	rt = rtalloc1_fib(__DECONST(struct sockaddr *, l3addr), 0, 0, ifp->if_fib);
> 
> 	if (rt == NULL)
> 		return (EINVAL);
> 
> 	/*
> 	 * If the gateway for an existing host route matches the target L3
> 	 * address, which is a special route inserted by some implementation
> 	 * such as MANET, and the interface is of the correct type, then
> 	 * allow for ARP to proceed.
> 	 */
> 
> 
> And this seems to fix the issue.
> 
> Now that the multi-FIB code is in GENERIC, this (or a similar fix) should probably be committed.
> 
> P.S.: I also wonder why the loopback route for an interface address is also installed explicitly in the default FIB?
> 
> 
> 
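To illustrate the behaviour discussed above, here is a minimal userland sketch in C. It is not kernel code and does not call rtalloc1() or rtalloc1_fib(); every name in it (toy_fibs, toy_route_add, toy_rtcheck) is hypothetical and exists only for this example. It models several independent routing tables and a route check that returns EINVAL when the chosen table has no entry covering the target address, which is roughly what the rt == NULL branch in the patch above does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_FIBS   4	/* mirrors net.fibs: 4 from the report */
#define TOY_MAX_ROUTES 8	/* arbitrary limit for this sketch */

/* One connected-network route; addresses are in host byte order. */
struct toy_route {
	uint32_t net;
	uint32_t mask;
};

/* One routing table (FIB), modelled as a flat array of routes. */
struct toy_fib {
	struct toy_route routes[TOY_MAX_ROUTES];
	size_t count;
};

static struct toy_fib toy_fibs[TOY_NUM_FIBS];

/* Add a network route to exactly one FIB. Returns 0 or ENOSPC. */
int
toy_route_add(unsigned fib, uint32_t net, uint32_t mask)
{
	struct toy_fib *f;

	assert(fib < TOY_NUM_FIBS);
	f = &toy_fibs[fib];
	if (f->count == TOY_MAX_ROUTES)
		return (ENOSPC);
	f->routes[f->count].net = net & mask;
	f->routes[f->count].mask = mask;
	f->count++;
	return (0);
}

/*
 * Hypothetical stand-in for the route check: return 0 if the given
 * FIB has a route covering addr, EINVAL otherwise.
 */
int
toy_rtcheck(unsigned fib, uint32_t addr)
{
	const struct toy_fib *f;
	size_t i;

	assert(fib < TOY_NUM_FIBS);
	f = &toy_fibs[fib];
	for (i = 0; i < f->count; i++) {
		if ((addr & f->routes[i].mask) == f->routes[i].net)
			return (0);
	}
	return (EINVAL);
}
```

With a /27 route added only to table 1, toy_rtcheck(0, peer) returns EINVAL while toy_rtcheck(1, peer) returns 0. That is consistent with the "Invalid argument" from ping when the check always consults the default FIB, and with the check succeeding once it is given the FIB that actually holds the route.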


