Re: fib6_lookup() returning deleted struct ifnet

From: Zhenlei Huang <zlei_at_FreeBSD.org>
Date: Thu, 26 Oct 2023 01:49:58 UTC

> On Oct 25, 2023, at 11:27 PM, Kristof Provost <kp@FreeBSD.org> wrote:
> 
> Hi,
> 
> Several pfSense users report IPv6-related panics when an interface is deleted.
> The relevant bug reports are https://redmine.pfsense.org/issues/14164 <https://redmine.pfsense.org/issues/14164> and https://redmine.pfsense.org/issues/14431 <https://redmine.pfsense.org/issues/14431>.
> The latest report is for a build that includes commits up to 1a18383a52bc373e316d224cef1298debf6f7e25 (“libcrypto: link engines and the legacy provider to libcrypto”, September 15th).
> 
> I believe all reports are for users running PPPoE, via netgraph, but that might be coincidental, as that’s the most likely way for interfaces to be destroyed (when PPP disconnects and reconnects).
> 
> There are a few different backtraces, but they appear to have the same root cause, so I’ll focus on one of them:
> 
> db:1:pfs> bt
> Tracing pid 2 tid 100041 td 0xfffffe0085264560
> kdb_enter() at kdb_enter+0x32/frame 0xfffffe00850ad910
> vpanic() at vpanic+0x183/frame 0xfffffe00850ad960
> panic() at panic+0x43/frame 0xfffffe00850ad9c0
> trap_fatal() at trap_fatal+0x409/frame 0xfffffe00850ada20
> trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00850ada80
> calltrap() at calltrap+0x8/frame 0xfffffe00850ada80
> --- trap 0xc, rip = 0xffffffff80f5a036, rsp = 0xfffffe00850adb50, rbp = 0xfffffe00850adb80 ---
> in6_selecthlim() at in6_selecthlim+0x96/frame 0xfffffe00850adb80
> tcp_default_output() at tcp_default_output+0x1ded/frame 0xfffffe00850add70
> tcp_timer_rexmt() at tcp_timer_rexmt+0x514/frame 0xfffffe00850addd0
> tcp_timer_enter() at tcp_timer_enter+0x102/frame 0xfffffe00850ade10
> softclock_call_cc() at softclock_call_cc+0x13c/frame 0xfffffe00850adec0
> softclock_thread() at softclock_thread+0xe9/frame 0xfffffe00850adef0
> fork_exit() at fork_exit+0x7d/frame 0xfffffe00850adf30
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00850adf30
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> This happens in the TCP output path, where we look up the hop limit for a specific destination. I’ve obtained a core dump for such a crash, and I believe the panic happens on line https://cgit.freebsd.org/src/tree/sys/netinet6/in6_src.c#n861 <https://cgit.freebsd.org/src/tree/sys/netinet6/in6_src.c#n861>
> The call in tcp_default_output() is in6_selecthlim(int, NULL);, so we don’t get an ifp from the caller, but instead perform a route lookup, and try to obtain the hop limit through ND_IFINFO(nh->nh_ifp). This panics because the afdata[AF_INET6] pointer is NULL. The core dump shows a deleted structure ifnet:
> 
> 

`egrep -r 'if_afdata\[AF_INET6\]\s*[!=]=\s*NULL' sys/netinet6'` shows there're many places do the NULL check. I think we can do it in in6_selecthlim() as a workaround.
 
> (kgdb) p *(struct ifnet *)0xfffff80203712800
> $3 = {
>   if_link = {
>     cstqe_next = 0x0
>   },
>   if_clones = {
>     le_next = 0x0,
>     le_prev = 0x0
>   },
>   if_groups = {
>     cstqh_first = 0x0,
>     cstqh_last = 0xfffff80203712818
>   },
>   if_alloctype = 53 '5',
>   if_numa_domain = 255 '\377',
>   if_softc = 0xfffff80103447a00,
>   if_llsoftc = 0x0,
>   if_l2com = 0x0,
>   if_dname = 0xffffffff81492f70 "ng",
>   if_dunit = 0,
>   if_index = 14,
>   if_idxgen = 2,
>   if_xname = "pppoe0\000\000\000\000\000\000\000\000\000",
>   if_description = 0xfffff8003a5f83d0 "WAN",
>   if_flags = 2132112,
>   if_drv_flags = 0,
>   if_capabilities = 0,
>   if_capabilities2 = 0,
> …
>   if_afdata = {0x0 <repeats 44 times>},
> …
>   if_output = 0xffffffff80e29c60 <ifdead_output>,
>   if_input = 0xffffffff80e29c80 <ifdead_input>,
>   if_bridge_input = 0x0,
>   if_bridge_output = 0x0,
>   if_bridge_linkstate = 0x0,
>   if_start = 0xffffffff80e29c90 <ifdead_start>,
>   if_ioctl = 0xffffffff80e29ca0 <ifdead_ioctl>,
> …
> My understanding is that the fib table should get updated whenever we change the routing table (such as during interface cleanup in if_detach_internal()). Some quick experimentation with epair and dtrace also shows:
> 
>  20  20388           sync_algo_end_cb:entry Stage 1
>               kernel`setup_fd_instance+0x41f
>               kernel`rebuild_fd_flm+0x99
>               kernel`rebuild_fd+0x136
>               kernel`rib_notify+0x50
>               kernel`rt_delete_conditional+0xf1
>               kernel`rib_del_route+0x1fc
>               kernel`rib_handle_ifaddr_info+0xd9
>               kernel`nd6_prefix_offlink+0x1ce
>               kernel`nd6_prefix_del+0x94
>               kernel`if_purgeaddrs+0x148
>               kernel`if_detach_internal+0x1e8
>               kernel`if_detach+0x71
>               if_epair.ko`epair_clone_destroy+0x62
>               kernel`if_clone_destroyif_flags+0x6a
>               kernel`if_clone_destroy+0x100
>               kernel`ifioctl+0x8a5
>               kernel`kern_ioctl+0x286
>               kernel`sys_ioctl+0x152
>               kernel`amd64_syscall+0x153
>               kernel`0xffffffff8102315b
> In other words, when we delete the interface if_detach_internal() purges the interface addresses, which ends up rebuilding the fib (rebuild_fd()) via rib_del_route().
> That ought to ensure that we cannot end up finding this struct ifnet through fib6_lookup(), as the purging of the addresses (and thus the rebuilding of the fib) is done before we if_domdetach() at the end of if_detach_internal(), and the NULL afdata[AF_INET6] demonstrates that we’ve gotten there.
> 
> 

By intuition, fib6_lookup() should not return **INVALID** next hop (with detaching interfaces), unless explicitly requested.
> We’ve also gone through if_free(), as the ifindex_table no longer contains the struct ifnet pointer for the relevant interface.
> We appear to have not yet called if_free_deferred() (and indeed, ifp->if_refcount is 4, so we wouldn’t have called that yet).
> 
> I’m confused as to how this can happen, and would appreciate hints.
> 
> 

I believe Alexander has insight on this.
> Thanks,
> Kristof
>