kernel panics in in_lltable_lookup (with INVARIANTS)

Brian Somers brian at FreeBSD.org
Sat Aug 22 10:45:56 UTC 2009


On Fri, 21 Aug 2009 23:23:13 -0700 Brian Somers <brian at FreeBSD.org> wrote:
> On Fri, 21 Aug 2009 22:41:34 -0700 Brian Somers <brian at Awfulhak.org> wrote:
> > On Fri, 21 Aug 2009 21:55:03 -0700 Brian Somers <brian at FreeBSD.org> wrote:
> > > On Fri, 21 Aug 2009 17:13:45 -0700 Kip Macy <kmacy at freebsd.org> wrote:
> > > > Try this:
> > > > 
> > > > Index: sys/net/flowtable.c
> > > > ===================================================================
> > > > --- sys/net/flowtable.c (revision 196382)
> > > > +++ sys/net/flowtable.c (working copy)
> > > > @@ -688,6 +688,12 @@
> > > >                 struct rtentry *rt = ro->ro_rt;
> > > >                 struct ifnet *ifp = rt->rt_ifp;
> > > > 
> > > > +               if (ifp->if_flags & IFF_POINTOPOINT) {
> > > > +                       RTFREE(rt);
> > > > +                       ro->ro_rt = NULL;
> > > > +                       return (ENOENT);
> > > > +               }
> > > > +
> > > >                 if (rt->rt_flags & RTF_GATEWAY)
> > > >                         l3addr = rt->rt_gateway;
> > > >                 else
> > > > 
> > > > You'll need to apply this by hand as gmail munges the formatting.
> > > > 
> > > > -Kip
> > > 
> > > Hi,
> > > 
> > > That certainly stops the panic, however data routed to the tun
> > > interface doesn't come out the back end and data written
> > > to the back end doesn't come out the tun interface.
> > [.....]
> > > Maybe this problem isn't a routing problem.  I'll
> > > look into it further and figure out if the packet is getting to the tun
> > > driver and if so, what it thinks it's doing with it.
> > 
> > I wasn't correct - the data *IS* being read out of the back of
> > the tunnel device.  When I send the ICMP, it goes into the tun
> > device and comes out the back end as an AF_LINK packet.  ppp
> > silently discards this (ironically I have a comment noting
> > that I should really track unidentified packet counts).
> > 
> > I'll try to figure out what in if_tun.c is corrupting the family next...
> 
> if_tun.c is fine.  The data passed from if_output() has family
> AF_LINK - hence the original panic from flowtable_lookup().
> 
> So the question is "why is ip_output() sending AF_LINK traffic
> instead of AF_INET traffic?".
> 
> Still looking....

From what I can tell, this is what is happening:

ip_output() is called with ro == NULL.
ip_output() calls flowtable_lookup() with a zeroed 'ro'.
flowtable_lookup() calls ft->ft_rtalloc() (really rtalloc1_fib()) to
initialise 'ro' and ends up with ro->ro_rt->rt_gateway->sa_family
set to AF_LINK.

Your original patch frees ro->ro_rt and returns ENOENT before
llentry_update() can be called with
ro->ro_rt->rt_gateway->sa_family != AF_INET, which is why the
panic goes away.

Now, when flowtable_lookup() fails, ro->ro_rt is NULL and
ip_output()'s 'dst' gets set up with family AF_INET.  Unfortunately,
right after this, after checking for IP_SENDONES, IP_ROUTETOIF
and IN_MULTICAST, the ip_output() code decides to call
in_rtalloc_ign() (which eventually just calls rtalloc1_fib()) to
initialise ro->ro_rt and then sets dst to be ro->ro_rt->rt_gateway
-- which is *still* an AF_LINK address!
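
To make that sequence concrete, here is a rough userland model of it.
This is only a sketch of my reading above, not the kernel code: every
sk_* name is a made-up stand-in, the structures are cut down to the
fields that matter, and the lookup is hard-wired to return a route
whose gateway has family AF_LINK (18, as in the panic string).

```c
#include <assert.h>
#include <stddef.h>

/* FreeBSD's values; 18 matches the "sin_family 18" panic string. */
enum { SK_AF_INET = 2, SK_AF_LINK = 18 };

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct sk_sockaddr { unsigned char sa_family; };
struct sk_rtentry  { struct sk_sockaddr *rt_gateway; int pointopoint; };
struct sk_route    { struct sk_rtentry *ro_rt; struct sk_sockaddr ro_dst; };

/* The only route in this model: the tun route with an AF_LINK gateway. */
static struct sk_sockaddr sk_tun_gw = { SK_AF_LINK };
static struct sk_rtentry  sk_tun_rt = { &sk_tun_gw, 1 };

/* Stand-in for rtalloc1_fib(): always finds the tun route. */
static struct sk_rtentry *
sk_rtalloc(void)
{
	return (&sk_tun_rt);
}

/* Stand-in for the patched flowtable_lookup(): bail out on point-to-point. */
static int
sk_flowtable_lookup(struct sk_route *ro)
{
	ro->ro_rt = sk_rtalloc();
	if (ro->ro_rt->pointopoint) {
		ro->ro_rt = NULL;	/* RTFREE() + ro->ro_rt = NULL in the patch */
		return (-1);		/* ENOENT in the patch */
	}
	return (0);
}

/* Returns the family that would be handed to if_output(). */
int
sk_ip_output_family(void)
{
	struct sk_route ro = { NULL, { 0 } };
	struct sk_sockaddr *dst = &ro.ro_dst;

	if (sk_flowtable_lookup(&ro) != 0) {
		dst->sa_family = SK_AF_INET;	/* dst starts out as AF_INET ... */
		ro.ro_rt = sk_rtalloc();	/* ... then the route is looked up again */
	}
	assert(ro.ro_rt != NULL);		/* the model's lookup never fails */
	dst = ro.ro_rt->rt_gateway;		/* dst now points at the gateway */
	return (dst->sa_family);
}
```

In this model the AF_INET value written into 'dst' is simply thrown
away once dst is pointed at the gateway, so failing the flowtable
lookup avoids the KASSERT but does not change what reaches
if_output().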

Finally ip_output() calls ifp->if_output() (really tunoutput()) with
dst's family set to AF_LINK, tunoutput() queues it to the tun
character device, ppp reads it and drops it on the floor 'cos it
doesn't know what to do with AF_LINK.
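
For illustration, the reader side might look something like the
following.  It is a sketch only, not ppp's actual code: it assumes the
tun device is in the multi-af mode described in tun(4), where each
packet is preceded by a four byte address family in network byte
order, and the tunsk_* names and the stats structure are invented
here.  It also keeps the unidentified packet count that I mentioned
ppp currently lacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FreeBSD's values for the two families seen in this thread. */
enum { TUNSK_AF_INET = 2, TUNSK_AF_LINK = 18 };

/* Hypothetical counters; ppp does not have these today. */
struct tunsk_stats {
	unsigned long accepted;
	unsigned long unidentified;
};

/* Decode the four byte, network byte order family prefix. */
static uint32_t
tunsk_family(const unsigned char *buf)
{
	return (((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	    ((uint32_t)buf[2] << 8) | (uint32_t)buf[3]);
}

/*
 * Returns a pointer to the IP payload, or NULL if the packet is
 * dropped.  Anything that is too short or not AF_INET is counted as
 * unidentified rather than silently discarded.
 */
const unsigned char *
tunsk_accept(const unsigned char *buf, size_t len, struct tunsk_stats *st)
{
	assert(buf != NULL && st != NULL);

	if (len < 4 || tunsk_family(buf) != TUNSK_AF_INET) {
		st->unidentified++;
		return (NULL);
	}
	st->accepted++;
	return (buf + 4);
}
```

With a counter like this, the AF_LINK packets would at least have
shown up as unidentified instead of vanishing without a trace.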

The tun driver is more or less the same as the -stable version,
so the problem seems to lie on the ip_output() side.  The only
relevant part that seems substantially different from -stable is
rtalloc1_fib(), so right now I'm guessing that the RTF_CLONING
code in -stable always clones the route with a gateway family of
AF_INET, and that everything downstream works as expected after
that.

I'll look some more on the weekend...

> > > > On Fri, Aug 21, 2009 at 16:43, Brian Somers<brian at freebsd.org> wrote:
> > > > > Hi,
> > > > >
> > > > > I've been working on a fix to address an issue that came up with
> > > > > our update of openssh-5.  The issue is that openssh-5 now uses
> > > > > pipe() to create stdin/stdout channels between sshd and the server
> > > > > side program where it used to use socketpair().  Because it uses
> > > > > pipe(), stdin is no longer bi-directional and cannot be used for both
> > > > > input and output by a child process.  This breaks the use of ssh
> > > > > as a tunnel with ppp on either end (set device "!ssh -e none host
> > > > > ppp -direct label")
> > > > >
> > > > > I talked with des@ for a while and then with the openssh folks and
> > > > > have not been able to resolve the issues in openssh that made them
> > > > > choose to enforce the use of pipe() over socketpair().  I now have a
> > > > > patch to ppp that makes ppp detect that it's connected via pipe() and
> > > > > causes it to use stdin for input and stdout for output (usually it expects
> > > > > just one descriptor).  Although I'm happy with the patch and planned on
> > > > > requesting permission to commit, I've bumped into a show-stopper
> > > > > that seems unrelated, so I thought I'd ask here if anyone has seen
> > > > > this or has any suggestions as to what the problem might be.
> > > > >
> > > > > The issue....
> > > > >
> > > > > I'm seeing a panic when I send traffic through a ppp link:
> > > > >
> > > > > panic string is: sin_family 18
> > > > > Stack trace starts:
> > > > >    in_lltable_lookup()
> > > > >    llentry_update()
> > > > >    flowtable_lookup()
> > > > >    ip_output()
> > > > >    ....
> > > > >
> > > > > The panic is due to a KASSERT in in_lltable_lookup() that expects the
> > > > > sockaddr to be AF_INET.  Number 18 is AF_LINK.
> > > > >
> > > > > AFAICT this is happening while setting up a temporary route for the
> > > > > first outbound packet.  I haven't been able to do much investigation
> > > > > yet due to other patches in my tree that seem to have broken all my
> > > > > kernel symbols, but once I get a clean rebuild I should be back in
> > > > > business.
> > > > >
> > > > > If anyone has any suggestions, I'm all ears!
> > > > >
> > > > > Cheers.

-- 
Brian Somers                                          <brian at Awfulhak.org>
Don't _EVER_ lose your sense of humour !               <brian at FreeBSD.org>