Re: fib6_lookup() returning deleted struct ifnet

From: Kristof Provost <kp_at_FreeBSD.org>
Date: Thu, 26 Oct 2023 08:29:40 UTC

On 26 Oct 2023, at 3:49, Zhenlei Huang wrote:
>> On Oct 25, 2023, at 11:27 PM, Kristof Provost <kp@FreeBSD.org> wrote:
>> The call in tcp_default_output() is in6_selecthlim(inp, NULL), so we don’t get an ifp from the caller. Instead we perform a route lookup and try to obtain the hop limit through ND_IFINFO(nh->nh_ifp). This panics because the if_afdata[AF_INET6] pointer is NULL. The core dump shows a deleted struct ifnet:
>>
>>
>
> `egrep -r 'if_afdata\[AF_INET6\]\s*[!=]=\s*NULL' sys/netinet6` shows that many places already do the NULL check. I think we can do it in in6_selecthlim() as a workaround.
>
We could (either check for a NULL if_afdata[AF_INET6], or for the absence of IFF_DYING in if_flags), but that feels a lot like hiding the problem rather than fixing it.
As you say, fib6_lookup() should not be returning invalid next hops, so it might make sense to add the check there, but I still want to understand why we end up in this state in the first place.
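
For illustration only, here is a minimal userland sketch of what such a defensive check might look like. Everything prefixed toy_ or TOY_ is a hypothetical stand-in and not the real kernel definition: the structs are cut down to the fields discussed in this thread, the constants are placeholder values, and falling back to a default hop limit of 64 is an assumption about what a workaround would do, not what in6_selecthlim() does today.

```c
#include <stddef.h>

/* Toy stand-ins; NOT the real FreeBSD definitions or values. */
#define TOY_AF_INET6      28        /* index used for the IPv6 afdata slot */
#define TOY_AF_SLOTS      64        /* arbitrary array size for the sketch */
#define TOY_IFF_DYING     0x200000  /* placeholder "interface going away" flag */
#define TOY_DEFAULT_HLIM  64        /* assumed fallback hop limit */

struct toy_nd_ifinfo {
	int chlim;                      /* current hop limit for the interface */
};

struct toy_ifnet {
	int   if_flags;
	void *if_afdata[TOY_AF_SLOTS];  /* per-address-family private data */
};

struct toy_nhop {
	struct toy_ifnet *nh_ifp;       /* interface the next hop points at */
};

/* Usable only if not dying and the IPv6 afdata is still attached. */
static int
toy_ifp_usable(const struct toy_ifnet *ifp)
{
	return (ifp != NULL &&
	    (ifp->if_flags & TOY_IFF_DYING) == 0 &&
	    ifp->if_afdata[TOY_AF_INET6] != NULL);
}

/* Hop limit from the next hop's interface, or the fallback if unusable. */
static int
toy_selecthlim(const struct toy_nhop *nh)
{
	if (nh != NULL && toy_ifp_usable(nh->nh_ifp)) {
		const struct toy_nd_ifinfo *ndi =
		    nh->nh_ifp->if_afdata[TOY_AF_INET6];
		return (ndi->chlim);
	}
	return (TOY_DEFAULT_HLIM);
}
```

This avoids the NULL dereference, but it only papers over the fact that the lookup handed back a next hop pointing at a dying interface, which is the part I’d like to understand.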

>> We’ve also gone through if_free(), as the ifindex_table no longer contains the struct ifnet pointer for the relevant interface.
>> We appear to have not yet called if_free_deferred() (and indeed, ifp->if_refcount is 4, so we wouldn’t have called that yet).
>>
>> I’m confused as to how this can happen, and would appreciate hints.
>>
>
> I believe Alexander has insight on this.
>
I’m certainly hoping smarter people than me will know more :)

Best regards,
Kristof