debug.mpsafenet=1 vs. user/group rules [Re: kern/106805: ...]

Max Laier max at love2party.net
Fri Dec 29 08:38:35 PST 2006


On Friday 29 December 2006 16:41, Christian S.J. Peron wrote:
> I have replied to this mail and I guess it has been lost, as I have had
> no response.  Although this technically makes
> the problem harmless, all you are doing is moving the lock order
> reversal from pf+inpcb to pfil+inpcb. The
> only major difference is that we can have multiple readers of the pfil
> lock, making the LOR harmless, in this path.

I don't necessarily agree that you can have a LOR with non-exclusive 
locks.  As you rightly point out, it will never result in a deadlock 
situation - as long as there isn't a code path that obtains the exclusive 
version of the rw_lock and the inpcb lock.

> In IPFW, I changed the chain locks to use rw_lock(9), so we can get rid
> of the mpsafenet requirement for IPFW.

I'm not 100% sure IPFW is safe yet.  As we configure via raw sockets there 
might be some common locks that are held when we enter ip_fw_ctl, which 
then picks up the exclusive version of the chain lock.  I'm not sure what 
kind of locks are in play really, though - CC'ing Robert.

> I've thought about doing this in IPFW (looking up the ucred if there
> are any uid/gid/jail rules) prior to picking up the
> chain locks, but realized we could remove the lock ordering issue all
> together if we anihilated the pfil lock.

As long as you are holding the IPFW rw_lock over the lookup there is a 
potential problem as described above.  The pfil rw_lock doesn't have any 
exclusive interaction with any of the inpcb locks, so there is no 
problem - as far as I understand at least.

> One idea I had was introducing PFIL_STATIC which requires that modules
> wishing to register pfil hooks did so at
> boot-time only. Something similar was done for the MAC framework to
> avoid having to pickup policy list locks
> for each security decision.

Hmmm ... how would we implement something like this?  Can we easily keep 
the non-static version?  I'd like to be able to develop using modules ;)

> This would also have the nice side effect of eliminating a couple of
> atomic instructions per packet in the fast path.
> Taking this approach along with moving the inpcb lookup prior to the
> firewall chain locks allows us to actually
> eliminate the lock ordering issues.  However, the layering violation
> still exists. But it's harmless.

From the point of view of the locks the layering violation would be gone, 
that's why it is harmless.

> However, having said all that. This works, too.
>
> Max Laier wrote:
> > Okay, spoken too quick ... I just had an idea (enlightment you might
> > say - given the time of year), that might finally get us rid of this
> > symptom (not of the problem though).
> >
> > Short recap on why this is happening:  Checking socket credentials on
> > the IP layer (where pf lives) is a layering violation.  A useful one,
> > you may argue, but nontheless - it causes a lock order reversal:  In
> > order to walk the pf rules we need to hold the pf lock, in order to
> > walk the socket hash we need to hold the "inp" lock.
> >
> > The attached diff circumvents the problem by **always** doing the
> > credential lookup *before* walking the pf rules.  This has the
> > benefit, that it works (at least I think it should), but there is a
> > price to pay. Now we have to pay for the socket lookup for *every*
> > tcp and udp packet instead of just for those that really hit uid/gid
> > rules.  That's why I decided to make is a config option
> > "PF_MPFSAFE_UGID" which you can turn on if you are running a setup
> > that will benefit.  The patch turns it on for the module-built by
> > default.
> >
> > A possible scenario that should benefit is a big iron SMP box running
> > lot of services that you want to filter using *stateful* uid/gid
> > rules.  For this setup where a huge percentage of the packets that
> > are not captured by states eventually match a uid/gid rule, you will
> > even get added parallelism with this patch.
> >
> > On every other typical setup, it should be better to avoid user/group
> > rules or to disable mpsafenet.
> >
> > In order for this to hit the tree, I need tests confirming that it
> > really helps and possibly benchmarks that qualify the impact of it. 
> > Thanks.
> >
> >
> > ---------------------------------------------------------------------
> >---
> >
> > Index: conf/options
> > ===================================================================
> > RCS file: /usr/store/mlaier/fcvs/src/sys/conf/options,v
> > retrieving revision 1.567
> > diff -u -r1.567 options
> > --- conf/options	10 Dec 2006 04:23:23 -0000	1.567
> > +++ conf/options	16 Dec 2006 15:36:08 -0000
> > @@ -349,6 +349,7 @@
> >  DEV_PF			opt_pf.h
> >  DEV_PFLOG		opt_pf.h
> >  DEV_PFSYNC		opt_pf.h
> > +PF_MPSAFE_UGID		opt_pf.h
> >  ETHER_II		opt_ef.h
> >  ETHER_8023		opt_ef.h
> >  ETHER_8022		opt_ef.h
> > Index: contrib/pf/net/pf.c
> > ===================================================================
> > RCS file: /usr/store/mlaier/fcvs/src/sys/contrib/pf/net/pf.c,v
> > retrieving revision 1.42
> > diff -u -r1.42 pf.c
> > --- contrib/pf/net/pf.c	22 Oct 2006 11:52:11 -0000	1.42
> > +++ contrib/pf/net/pf.c	16 Dec 2006 15:34:52 -0000
> > @@ -3032,6 +3032,12 @@
> >  		return (PF_DROP);
> >  	}
> >
> > +#if defined(__FreeBSD__) && defined(PF_MPSAFE_UGID)
> > +	PF_UNLOCK();
> > +	lookup = pf_socket_lookup(&uid, &gid, direction, pd, inp);
> > +	PF_LOCK();
> > +#endif
> > +
> >  	r =
> > TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr);
> >
> >  	if (direction == PF_OUT) {
> > @@ -3428,6 +3434,12 @@
> >  		return (PF_DROP);
> >  	}
> >
> > +#if defined(__FreeBSD__) && defined(PF_MPSAFE_UGID)
> > +	PF_UNLOCK();
> > +	lookup = pf_socket_lookup(&uid, &gid, direction, pd, inp);
> > +	PF_LOCK();
> > +#endif
> > +
> >  	r =
> > TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr);
> >
> >  	if (direction == PF_OUT) {
> > Index: modules/pf/Makefile
> > ===================================================================
> > RCS file: /usr/store/mlaier/fcvs/src/sys/modules/pf/Makefile,v
> > retrieving revision 1.12
> > diff -u -r1.12 Makefile
> > --- modules/pf/Makefile	12 Sep 2006 04:25:12 -0000	1.12
> > +++ modules/pf/Makefile	16 Dec 2006 15:41:00 -0000
> > @@ -10,7 +10,7 @@
> >  	in4_cksum.c \
> >  	opt_pf.h opt_inet.h opt_inet6.h opt_bpf.h opt_mac.h
> >
> > -CFLAGS+=  -I${.CURDIR}/../../contrib/pf
> > +CFLAGS+=  -I${.CURDIR}/../../contrib/pf -DPF_MPSAFE_UGID
> >
> >  .if !defined(KERNBUILDDIR)
> >  opt_inet.h:

-- 
/"\  Best regards,                      | mlaier at freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier at EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 187 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-pf/attachments/20061229/cb89b6e5/attachment.pgp


More information about the freebsd-pf mailing list