svn commit: r191305 - head/usr.sbin/ppp

Bjoern A. Zeeb bz at FreeBSD.org
Tue Apr 21 10:45:08 UTC 2009


On Mon, 20 Apr 2009, Qing Li wrote:

Hi,

> Are you really sure backing this change out is the right thing to do ??

yes, I am.

1) if you do not set RTA_GATEWAY but still copy the adress, your kernel
    FIB possibly ends up with an incorrect address for the next field,
    like RTA_NETMASK, RTA_IFP, ..) for example.
    See sys/net/rtsock.c:rt_xaddrs()

2) I had this sample around for a few days now as I have been lerning
    more rtsock.c stuff with the help of Kip and finding bugs (ignore
    those for a moment):

dopt# ifconfig tun2 create inet 192.0.2.100 192.0.2.101 up
dopt# ifconfig tun2
tun2: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
         inet6 fe80::2e0:81ff:fe31:db8c%tun2 prefixlen 64 scopeid 0x8
         inet 192.0.2.100 --> 192.0.2.101 netmask 0xffffff00 
dopt# netstat -rn | grep tun2
192.0.2.101        link#8             UH          0        0   tun2
fe80::%tun2/64                    link#8                        U tun2
ff01:8::/32                       fe80::2e0:81ff:fe31:db8c%tun2 U tun2
ff02::%tun2/32                    fe80::2e0:81ff:fe31:db8c%tun2 U tun2

So querying the FIB for the IPv4 (ignoring the v6 for the moment for
simplicity) route (with something handcrafted) I get back:

rtm_msglen      252
rtm_version     5
rtm_type        RTM_GET
rtm_index       8
rtm_flags       =5<UP,HOST>
rtm_addrs       =b3<DST,GATEWAY,IFP,IFA,BRD>
rtm_pid         0
rtm_seq         0
rtm_errno       0
rtm_fmask       =0<>
rtm_inits       =0<>
rt_metrics rtm_rmx:
  rmx_locks      0
  rmx_mtu        1500
  rmx_hopcount   0
  rmx_expire     0
  rmx_recvpipe   0
  rmx_sendpipe   0
  rmx_ssthresh   0
  rmx_rtt        0
  rmx_rttvar     0
  rmx_pksent     0
  rmx_filler     { 0, 0, 0, 0 }
addresses:
            16          (inet) 192.0.2.101                                 =1<DST>

vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv

            54          (link)                                             =2<GATEWAY>
0000 36 12 08 00 17 00 00 00 00 00 00 00 00 00 00 00  |6...............|
0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0030 00 00 00 00 00 00                                |......          |

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

            56          (link) tun2                                        =10<IFP>
0000 38 12 08 00 17 04 00 00 74 75 6e 32 00 00 00 00  |8.......tun2....|
0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  |................|
0030 00 00 00 00 00 00 00 00                          |........        |
            16          (inet) 192.0.2.100                                 =20<IFA>
                               (RTA_BRD)                                   =80<BRD>


As you can see, we do get an RTA_GATEWAY from the kernel with AF_LINK.

Decoding the sockaddr_dl:

.sdl_len	= 0x36 (54)
.sdl_family	= 0x12 (18, AF_LINK)
.sdl_index	= 8
.sdl_type	= 0x17 (23, IFT_PPP)
.sdl_nlen	= 0
.sdl_alen	= 0
.sdl_slen	= 0
.sdl_data	= { 0, .. }

So we are clearly getting back an AF_LINK from the kernel and ought to
be able to just stick it back in there as is.


3) To show you that it still works as expected:

: netstat -Wanrf inet | egrep '(^Dest|tun2)'
Destination        Gateway            Flags    Refs      Use    Mtu    Netif Expire
192.0.2.101        link#8             UH          0        0   1500     tun2

: route change 192.0.2.101 -mtu 1300
change host 192.0.2.101
: netstat -Wanrf inet | egrep '(^Dest|tun2)'
Destination        Gateway            Flags    Refs      Use    Mtu    Netif Expire
192.0.2.101        link#8             UH          0        0   1300     tun2

and:

: route change 192.0.2.101 -link -mtu 1400
change host 192.0.2.101
: netstat -Wanrf inet | egrep '(^Dest|tun2)'
Destination        Gateway            Flags    Refs      Use    Mtu    Netif Expire
192.0.2.101        link#8             UH          0        0   1400     tun2

both work (note, even with -link I _think_ route does not pass down
the sockaddr_dl) so that's a bit confusing or rather a bug in route(8)
as we'll see later.

But

: route change 192.0.2.101 192.0.2.100 -mtu 1500
change host 192.0.2.101: gateway 192.0.2.100
: netstat -Wanrf inet | egrep '(^Dest|tun2)'
Destination        Gateway            Flags    Refs      Use    Mtu    Netif Expire
192.0.2.101        192.0.2.100        UGH         0        0   1500     tun2

that changed the gateway address, which is correct,
but also set RTF_GATEWAY, which is not as 192.0.2.101 is directly
connected on the PtP interface.
The same would happen if passing down the sockaddr_dl, just that
the gateway address would stay link#8.


4) RTA_GATEWAY != RTF_GATEWAY

The RTF_GATEWAY cannot be set from userspace for an RTM_CHANGE.
Even if given it would be filtered out because of RTF_FMASK.

So what your commit message in r186308 was saying and trying to work
around the RTF_GATEWAY problem by not setting RTA_GATEWAY this only
broke rt_xaddrs() parsing (see 1)) .

The fact that after an RTM_CHANGE with an RTA_GATEWAY set, RTF_GATEWAY
appears, comes from the change in r167797 (ignore the ordering
change, only the rt->rt_flags |= RTF_GATEWAY; there is relevant).
Which is under discussion as my follow-up commit said:

--------------------
r186308, backed out in r191305, already tried to do that, and in addition
ignore AF_LINK types of gateway addresses to work around a problem that
r167797 had introduced on the kernel side always setting RTF_GATEWAY if a
gateway address was passed into the kernel.
The proper solution for this is still under discussion so I am hesitant to
re-add the special AF_LINK treatment for now.
--------------------


One possible solution I had been theoretically pondering first would
have been:

Index: sys/net/rtsock.c
===================================================================
--- sys/net/rtsock.c    (revision 190341)
+++ sys/net/rtsock.c    (working copy)
@@ -673,25 +673,27 @@ route_output(struct mbuf *m, struct sock
                         if (info.rti_info[RTAX_GATEWAY] != NULL) {
                                 RT_UNLOCK(rt);
                                 RADIX_NODE_HEAD_LOCK(rnh);
                                 RT_LOCK(rt);

                                 error = rt_setgate(rt, rt_key(rt),
                                     info.rti_info[RTAX_GATEWAY]);
                                 RADIX_NODE_HEAD_UNLOCK(rnh);
                                 if (error != 0) {
                                         RT_UNLOCK(rt);
                                         senderr(error);
                                 }
-                               rt->rt_flags |= RTF_GATEWAY;
+                               if (info.rti_info[RTAX_GATEWAY]->sa_family !=
+                                   AF_LINK)
+                                       rt->rt_flags |= RTF_GATEWAY;
                         }
                         if (info.rti_ifa != NULL &&
                             info.rti_ifa != rt->rt_ifa) {
                                 IFAREF(info.rti_ifa);
                                 rt->rt_ifa = info.rti_ifa;
                                 rt->rt_ifp = info.rti_ifp;
                         }
                         /* Allow some flags to be toggled on change. */
                         if (rtm->rtm_fmask & RTF_FMASK)
                                 rt->rt_flags = (rt->rt_flags &
                                     ~rtm->rtm_fmask) |
                                     (rtm->rtm_flags & rtm->rtm_fmask);


But I am almost pretty sure that would not be right in all cases either.


Our routing sockets are currently a bit .. uhm .. fragile so a small
change to fix one thing can break other stuff easily.


If we do not want any AF_LINK addresses to be set in RTM_CHANGE as
RTA_GATEWAY anymore, we should first make sure that we cannot add
them in first place and thus the kernel would not report those back.
But I am pretty sure this will simply break all interfaces.

Also it would not easily be possible to fix the example from 3) anymore.
BTW neither:

: route change 192.0.2.101 -mtu 1500
change host 192.0.2.101
: netstat -Wanrf inet | egrep '(^Dest|tun2)'
Destination        Gateway            Flags    Refs      Use    Mtu    Netif Expire
192.0.2.101        192.0.2.100        UGH         0        0   1500     tun2

nor:

: route change 192.0.2.101 -link -mtu 1500
change host 192.0.2.101
: netstat -Wanrf inet | egrep '(^Dest|tun2)'
Destination        Gateway            Flags    Refs      Use    Mtu    Netif Expire
192.0.2.101        192.0.2.100        UGH         0        0   1500     tun2

does make it a link#%d again. Say what you want, this is a route(8) bug as
a handcrafted update does:

: netstat -Wanrf inet | egrep '(^Dest|tun2)'
Destination        Gateway            Flags    Refs      Use    Mtu    Netif Expire
192.0.2.101        192.0.2.100        UGH         0        0   1500     tun2
: sudo ./a.out
: netstat -Wanrf inet | egrep '(^Dest|tun2)'
Destination        Gateway            Flags    Refs      Use    Mtu    Netif Expire
192.0.2.101        link#8             UGH         0        0   1500     tun2

what it is supposed to do.  The only thing left is the RTM_GATEWAY that the
kernel set, unfortunately, which is NOT the fault of ppp.

So the question is, can we somehow clear it?  The answer to that is "no".



>> -----Original Message-----
>> From: owner-src-committers at FreeBSD.org
>> [mailto:owner-src-committers at FreeBSD.org] On Behalf Of Bjoern A. Zeeb
>> Sent: Monday, April 20, 2009 4:23 AM
>> To: src-committers at freebsd.org; svn-src-all at freebsd.org;
>> svn-src-head at freebsd.org
>> Subject: svn commit: r191305 - head/usr.sbin/ppp
>>
>> Author: bz
>> Date: Mon Apr 20 11:22:51 2009
>> New Revision: 191305
>> URL: http://svn.freebsd.org/changeset/base/191305
>>
>> Log:
>>   Back out r186308:
>>
>>   in case of AF_LINK, which the kernel still returns for an
>> RTAX_GATEWAY
>>   as an empty sockaddr_dl in the classic tun<n> case.
>>   Copying the address into the message payload, but not the
>> RTA_GATEWAY
>>   flag results in rt_xaddrs() in the kernel tripping over
>> that and parsing
>>   the next attribute set with a flag, i.e. RTA_NETMASK, with
>> the gateway
>>   address, resulting in bogus route entry.
>>
>>   MFC after:	3 days
>>
>> Modified:
>>   head/usr.sbin/ppp/route.c
>>
>> Modified: head/usr.sbin/ppp/route.c
>> ==============================================================
>> ================
>> --- head/usr.sbin/ppp/route.c	Mon Apr 20 10:40:42 2009
>> (r191304)
>> +++ head/usr.sbin/ppp/route.c	Mon Apr 20 11:22:51 2009
>> (r191305)
>> @@ -910,10 +910,8 @@ rt_Update(struct bundle *bundle, const s
>>      p += memcpy_roundup(p, dst, dst->sa_len);
>>    }
>>
>> -  if (gw != NULL && (gw->sa_family != AF_LINK))
>> -    rtmes.m_rtm.rtm_addrs |= RTA_GATEWAY;
>> +  rtmes.m_rtm.rtm_addrs |= RTA_GATEWAY;
>>    p += memcpy_roundup(p, gw, gw->sa_len);
>> -
>>    if (mask) {
>>      rtmes.m_rtm.rtm_addrs |= RTA_NETMASK;
>>      p += memcpy_roundup(p, mask, mask->sa_len);
>>
>>
>
>

-- 
Bjoern A. Zeeb                      The greatest risk is not taking one.


More information about the svn-src-all mailing list