debug.mpsafenet=1 vs. user/group rules [Re: kern/106805: ...]

Tai-hwa Liang avatar at mmlab.cse.yzu.edu.tw
Tue Jan 9 06:23:20 PST 2007


On Fri, 29 Dec 2006, Christian S.J. Peron wrote:
> Max,
>
> I have replied to this mail and I guess it has been lost, as I have had no 
> response.  Although this technically makes
> the problem harmless, all you are doing is moving the lock order reversal 
> from pf+inpcb to pfil+inpcb. The

   I probably missed something; however, with Max's patch applied, I did not 
see any pf related LOR on a WITNESS + INVARIANT enabled -STABLE box during
last two weeks.

> only major difference is that we can have multiple readers of the pfil lock, 
> making the LOR harmless, in this path.
>
> In IPFW, I changed the chain locks to use rw_lock(9), so we can get rid of 
> the mpsafenet requirement for IPFW.
>
> I've thought about doing this in IPFW (looking up the ucred if there are any 
> uid/gid/jail rules) prior to picking up the
> chain locks, but realized we could remove the lock ordering issue all 
> together if we anihilated the pfil lock.
>
> One idea I had was introducing PFIL_STATIC which requires that modules 
> wishing to register pfil hooks did so at
> boot-time only. Something similar was done for the MAC framework to avoid 
> having to pickup policy list locks
> for each security decision.
>
> This would also have the nice side effect of eliminating a couple of atomic 
> instructions per packet in the fast path.
> Taking this approach along with moving the inpcb lookup prior to the firewall 
> chain locks allows us to actually
> eliminate the lock ordering issues.  However, the layering violation still 
> exists. But it's harmless.
>
> However, having said all that. This works, too.
>
>
> Max Laier wrote:
>> Okay, spoken too quick ... I just had an idea (enlightment you might say - 
>> given the time of year), that might finally get us rid of this symptom (not 
>> of the problem though).
>> 
>> Short recap on why this is happening:  Checking socket credentials on the 
>> IP layer (where pf lives) is a layering violation.  A useful one, you may 
>> argue, but nontheless - it causes a lock order reversal:  In order to walk 
>> the pf rules we need to hold the pf lock, in order to walk the socket hash 
>> we need to hold the "inp" lock.
>> 
>> The attached diff circumvents the problem by **always** doing the 
>> credential lookup *before* walking the pf rules.  This has the benefit, 
>> that it works (at least I think it should), but there is a price to pay. 
>> Now we have to pay for the socket lookup for *every* tcp and udp packet 
>> instead of just for those that really hit uid/gid rules.  That's why I 
>> decided to make is a config option "PF_MPFSAFE_UGID" which you can turn on 
>> if you are running a setup that will benefit.  The patch turns it on for 
>> the module-built by default.
>> 
>> A possible scenario that should benefit is a big iron SMP box running lot 
>> of services that you want to filter using *stateful* uid/gid rules.  For 
>> this setup where a huge percentage of the packets that are not captured by 
>> states eventually match a uid/gid rule, you will even get added parallelism 
>> with this patch.
>> 
>> On every other typical setup, it should be better to avoid user/group rules 
>> or to disable mpsafenet.
>> 
>> In order for this to hit the tree, I need tests confirming that it really 
>> helps and possibly benchmarks that qualify the impact of it.  Thanks.
>>
>>   ------------------------------------------------------------------------
>> 
>> Index: conf/options
>> ===================================================================
>> RCS file: /usr/store/mlaier/fcvs/src/sys/conf/options,v
>> retrieving revision 1.567
>> diff -u -r1.567 options
>> --- conf/options	10 Dec 2006 04:23:23 -0000	1.567
>> +++ conf/options	16 Dec 2006 15:36:08 -0000
>> @@ -349,6 +349,7 @@
>>  DEV_PF			opt_pf.h
>>  DEV_PFLOG		opt_pf.h
>>  DEV_PFSYNC		opt_pf.h
>> +PF_MPSAFE_UGID		opt_pf.h
>>  ETHER_II		opt_ef.h
>>  ETHER_8023		opt_ef.h
>>  ETHER_8022		opt_ef.h
>> Index: contrib/pf/net/pf.c
>> ===================================================================
>> RCS file: /usr/store/mlaier/fcvs/src/sys/contrib/pf/net/pf.c,v
>> retrieving revision 1.42
>> diff -u -r1.42 pf.c
>> --- contrib/pf/net/pf.c	22 Oct 2006 11:52:11 -0000	1.42
>> +++ contrib/pf/net/pf.c	16 Dec 2006 15:34:52 -0000
>> @@ -3032,6 +3032,12 @@
>>  		return (PF_DROP);
>>  	}
>>  +#if defined(__FreeBSD__) && defined(PF_MPSAFE_UGID)
>> +	PF_UNLOCK();
>> +	lookup = pf_socket_lookup(&uid, &gid, direction, pd, inp);
>> +	PF_LOCK();
>> +#endif
>> +
>>  	r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr);
>>   	if (direction == PF_OUT) {
>> @@ -3428,6 +3434,12 @@
>>  		return (PF_DROP);
>>  	}
>>  +#if defined(__FreeBSD__) && defined(PF_MPSAFE_UGID)
>> +	PF_UNLOCK();
>> +	lookup = pf_socket_lookup(&uid, &gid, direction, pd, inp);
>> +	PF_LOCK();
>> +#endif
>> +
>>  	r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr);
>>   	if (direction == PF_OUT) {
>> Index: modules/pf/Makefile
>> ===================================================================
>> RCS file: /usr/store/mlaier/fcvs/src/sys/modules/pf/Makefile,v
>> retrieving revision 1.12
>> diff -u -r1.12 Makefile
>> --- modules/pf/Makefile	12 Sep 2006 04:25:12 -0000	1.12
>> +++ modules/pf/Makefile	16 Dec 2006 15:41:00 -0000
>> @@ -10,7 +10,7 @@
>>  	in4_cksum.c \
>>  	opt_pf.h opt_inet.h opt_inet6.h opt_bpf.h opt_mac.h
>>  -CFLAGS+=  -I${.CURDIR}/../../contrib/pf
>> +CFLAGS+=  -I${.CURDIR}/../../contrib/pf -DPF_MPSAFE_UGID
>>   .if !defined(KERNBUILDDIR)
>>  opt_inet.h:


More information about the freebsd-pf mailing list