Re: git: e9fc0c538264 - main - if_clone: Make ifnet_detach_sxlock opaque to consumers

From: Kristof Provost <kp_at_FreeBSD.org>
Date: Tue, 14 Apr 2026 15:57:41 UTC
On 13 Apr 2026, at 6:39, Zhenlei Huang wrote:
> The branch main has been updated by zlei:
>
> URL: 
> https://cgit.FreeBSD.org/src/commit/?id=e9fc0c538264355bd3fd9120c650078281c2a290
>
> commit e9fc0c538264355bd3fd9120c650078281c2a290
> Author:     Zhenlei Huang <zlei@FreeBSD.org>
> AuthorDate: 2026-04-13 04:38:44 +0000
> Commit:     Zhenlei Huang <zlei@FreeBSD.org>
> CommitDate: 2026-04-13 04:38:44 +0000
>
>     if_clone: Make ifnet_detach_sxlock opaque to consumers
>
>     The change e133271fc1b5e introduced ifnet_detach_sxlock, and 
> change
>     6d2a10d96fb5 widened its coverage, but there are still consumers,
>     net80211 and tuntap e.g., want it. Instead of sprinkling it 
> everywhere,
>     make it opaque to consumers.
>
>     Out of tree drivers shall also benefit from this change.
>
>     Reviewed by:    kp
>     MFC after:      2 weeks
>     Differential Revision:  https://reviews.freebsd.org/D56298

I don’t really understand why, but this commit causes bricoler test 
runs to fail for me.
Or more precisely, it looks like parallel test runs see a number of 
failures in the pf tests.

Reverting only part of this change, specifically this:

	diff --git a/sys/net/if_clone.c b/sys/net/if_clone.c
	index db3db78c1b76..d0fe54b6146b 100644
	--- a/sys/net/if_clone.c
	+++ b/sys/net/if_clone.c
	@@ -683,11 +683,12 @@ if_clone_detach(struct if_clone *ifc)
	        V_if_cloners_count--;
	        IF_CLONERS_UNLOCK();

	-       sx_xlock(&ifnet_detach_sxlock);
	        /* destroy all interfaces for this cloner */
	-       while (!LIST_EMPTY(&ifc->ifc_iflist))
	+       while (!LIST_EMPTY(&ifc->ifc_iflist)) {
	+               sx_xlock(&ifnet_detach_sxlock);
	                if_clone_destroyif_flags(ifc, 
LIST_FIRST(&ifc->ifc_iflist), IFC_F_FORCE);
	-       sx_xunlock(&ifnet_detach_sxlock);
	+               sx_xunlock(&ifnet_detach_sxlock);
	+       }

	        IF_CLONE_REMREF(ifc);
	 }

Seems to be enough to avoid the problem. So it looks like somehow 
deleting all interfaces for a given cloner under that lock (rather than 
releasing it and re-acquiring it) causes the test failures.

The failing tests I see are:

	sys/netpfil/pf/divert-to:in_div  ->  failed: Test case body returned a 
non-ok exit code, but this is not allowed  [36.591s]
	sys/netpfil/pf/divert-to:in_div_in  ->  failed: atf-check failed; see 
the output of the test for details  [30.867s]
	sys/netpfil/pf/divert-to:in_div_in_fwd_out_div_out  ->  failed: 
atf-check failed; see the output of the test for details  [39.705s]
	sys/netpfil/pf/divert-to:in_dn_in_div_in_out_div_out_dn_out  ->  
failed: atf-check failed; see the output of the test for details  
[45.059s]
	sys/netpfil/pf/divert-to:out_div  ->  failed: Test case body returned a 
non-ok exit code, but this is not allowed  [23.629s]
	sys/netpfil/pf/killstate:gateway  ->  failed: Killing with a different 
gateway removed the state.  [32.803s]
	sys/netpfil/pf/killstate:id  ->  failed: Killing a different ID removed 
the state.  [32.789s]
	sys/netpfil/pf/killstate:label  ->  failed: Killing a non-existing 
label removed the state.  [47.749s]
	sys/netpfil/pf/killstate:multilabel  ->  failed: Setting new rules 
removed the state.  [43.570s]
	sys/netpfil/pf/killstate:src_dst  ->  failed: Killing with the wrong 
destination IP removed our state.  [57.819s]
	sys/netpfil/pf/killstate:v6  ->  failed: Killing with the wrong IP 
removed our state.  [54.957s]
	sys/netpfil/pf/nat64:pool  ->  failed: atf-check failed; see the output 
of the test for details  [51.118s]
	sys/netpfil/pf/nat64:sctp_in  ->  failed: Failed to connect to SCTP 
server  [55.495s]
	sys/netpfil/pf/nat64.py:TestNAT64::test_udp_checksum  ->  failed: 
/usr/tests/sys/netpfil/pf/nat64.py:205: AttributeError  [80.001s]
	sys/netpfil/pf/pflog:matches  ->  failed: atf-check failed; see the 
output of the test for details  [61.807s]
	sys/netpfil/pf/pflog:matches_logif  ->  failed: atf-check failed; see 
the output of the test for details  [51.261s]
	sys/netpfil/pf/pflog:unspecified_v6  ->  failed: atf-check failed; see 
the output of the test for details  [36.281s]
	sys/netpfil/pf/rdr:srcport_pass  ->  failed: atf-check failed; see the 
output of the test for details  [76.815s]
	sys/netpfil/pf/sctp:pfsync  ->  failed: Initial SCTP connection failed  
[134.866s]
	sys/netpfil/pf/sctp:related_icmp  ->  failed: SCTP connection failed  
[59.620s]
	sys/netpfil/pf/table:zero_one  ->  failed: atf-check failed; see the 
output of the test for details  [64.125s]
	sys/netpfil/pf/tcp:rst  ->  failed: atf-check failed; see the output of 
the test for details  [80.261s]

Looking at it now, those runtimes seems suspiciously high as well.

Best regards,
Kristof