Re: rtentry_free panic

From: Kristof Provost <kp_at_FreeBSD.org>
Date: Wed, 20 Aug 2025 20:48:49 UTC

On 20 Aug 2025, at 18:00, Mark Johnston wrote:
> On Wed, Aug 20, 2025 at 02:30:20PM +0200, Kristof Provost wrote:
>> Hi,
>>
>> Running the pf tests I very occasionally (say 1 out of 10 runs) see panics
>> freeing an rtentry.
>> This mostly manifests during bricoler test runs, and usually with the KMSAN
>> kernel config. I assume that’s because there’s a timing factor involved
>> rather than it being an issue that’s directly detected by KMSAN/KASAN.
>
> I've seen this before, but not in the past few months.  I'm running with
> the default parallelism of 4 most of the time.
>
I have the distinct impression (but no data to prove it) that it comes 
and goes.

>> We’re panicking because the V_rtzone zone has been cleaned up (in
>> vnet_rtzone_destroy()). I explicitly NULL out V_rtzone too, to make this
>> more obvious.
>> Note that we failed to completely free all rtentries (`Freed UMA keg
>> (rtentry) was not empty (2 items).  Lost 1 pages of memory.`). Presumably at
>> least one of those two gets freed later, and that’s the panic we see.
>>
>> rt_free() queues the actual delete as an epoch callback
>> (`NET_EPOCH_CALL(destroy_rtentry_epoch, &rt->rt_epoch_ctx);`), and that’s
>> what we see here: the zone is removed before we’re done freeing all of the
>> rtentries.
>>
>> vnet_rtzone_destroy() is called from rtables_destroy(), but that explicitly
>> calls NET_EPOCH_DRAIN_CALLBACKS() first, so I’d expect all of the pending
>> cleanups to have been done at that point.  The comment block above does
>> suggest that there may still be nexthop entries pending deletion even after
>> we drain the callbacks. I think I can see how that’d happen for
>> nexthops, but I do not see how it can happen for rtentries.
>
> Is it possible that if_detach_internal()->rt_flushifroutes() is running
> after the rtentry zone is being destroyed?  That is, maybe we're
> destroying interfaces too late in the jail teardown process?
>
I don’t think so. I expect all of the if_detach() calls to be done by 
the time we hit rtables_destroy() -> vnet_rtzone_destroy(), because 
rtables_destroy() runs at SI_SUB_PROTO_DOMAIN/SI_ORDER_FIRST.
We should have hit vnet_if_return() (SI_SUB_VNET_DONE/SI_ORDER_ANY) by 
then.

SI_SUB_VNET_DONE is 0xdc00000, SI_SUB_PROTO_DOMAIN is 0x8800000 and the 
vnet_uninit calls are done in descending order, so VNET_DONE should be 
first.
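
To make that ordering argument concrete, here is a minimal userspace 
sketch (not kernel code). It assumes teardown handlers run in descending 
(subsystem, order) sequence, as described above. struct model_uninit and 
model_teardown_sort() are hypothetical names for illustration only; the 
SI_SUB_*/SI_ORDER_* values are the ones from sys/kernel.h.

```c
#include <assert.h>
#include <stdlib.h>

/* Values from sys/kernel.h, as quoted in this thread. */
enum {
	SI_SUB_PROTO_DOMAIN = 0x8800000,
	SI_SUB_VNET_DONE    = 0xdc00000,
	SI_ORDER_FIRST      = 0x0000000,
	SI_ORDER_ANY        = 0xfffffff
};

/* Hypothetical stand-in for one registered teardown handler. */
struct model_uninit {
	const char   *name;
	unsigned int  subsystem;
	unsigned int  order;
};

/* Compare so that higher (subsystem, order) pairs sort first. */
int
model_uninit_cmp(const void *a, const void *b)
{
	const struct model_uninit *x = a;
	const struct model_uninit *y = b;

	if (x->subsystem != y->subsystem)
		return (x->subsystem < y->subsystem ? 1 : -1);
	if (x->order != y->order)
		return (x->order < y->order ? 1 : -1);
	return (0);
}

/* Sort handlers in place into the assumed (descending) teardown order. */
void
model_teardown_sort(struct model_uninit *v, size_t n)
{
	assert(v != NULL);
	qsort(v, n, sizeof(*v), model_uninit_cmp);
}
```

Feeding it { "rtables_destroy", SI_SUB_PROTO_DOMAIN, SI_ORDER_FIRST } and 
{ "vnet_if_return", SI_SUB_VNET_DONE, SI_ORDER_ANY } puts vnet_if_return 
ahead of rtables_destroy, which is the order I'm expecting. It only 
models the static ordering; it says nothing about callbacks that are 
still deferred when a handler returns.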

I’m going to kick off a few test runs where I assert that V_rtzone 
hasn’t been freed yet when we’re in if_detach_internal() to confirm, 
because clearly I’m missing *something*, and it could be this.

—
Kristof