debug.mpsafenet=1 vs. user/group rules [Re: kern/106805: ...]
Max Laier
max at love2party.net
Fri Dec 29 08:38:35 PST 2006
On Friday 29 December 2006 16:41, Christian S.J. Peron wrote:
> I have replied to this mail and I guess it has been lost, as I have had
> no response. Although this technically makes
> the problem harmless, all you are doing is moving the lock order
> reversal from pf+inpcb to pfil+inpcb. The
> only major difference is that we can have multiple readers of the pfil
> lock, making the LOR harmless, in this path.
I don't necessarily agree that you can have a LOR with non-exclusive
locks. As you rightly point out, it will never result in a deadlock
situation - as long as there isn't a code path that obtains the exclusive
version of the rw_lock and the inpcb lock.
> In IPFW, I changed the chain locks to use rw_lock(9), so we can get rid
> of the mpsafenet requirement for IPFW.
I'm not 100% sure IPFW is safe yet. As we configure via raw sockets there
might be some common locks that are held when we enter ip_fw_ctl, which
then picks up the exclusive version of the chain lock. I'm not sure what
kind of locks are in play really, though - CC'ing Robert.
> I've thought about doing this in IPFW (looking up the ucred if there
> are any uid/gid/jail rules) prior to picking up the
> chain locks, but realized we could remove the lock ordering issue all
> together if we anihilated the pfil lock.
As long as you are holding the IPFW rw_lock over the lookup there is a
potential problem as described above. The pfil rw_lock doesn't have any
exclusive interaction with any of the inpcb locks, so there is no
problem - as far as I understand at least.
> One idea I had was introducing PFIL_STATIC which requires that modules
> wishing to register pfil hooks did so at
> boot-time only. Something similar was done for the MAC framework to
> avoid having to pickup policy list locks
> for each security decision.
Hmmm ... how would we implement something like this? Can we easily keep
the non-static version? I'd like to be able to develop using modules ;)
> This would also have the nice side effect of eliminating a couple of
> atomic instructions per packet in the fast path.
> Taking this approach along with moving the inpcb lookup prior to the
> firewall chain locks allows us to actually
> eliminate the lock ordering issues. However, the layering violation
> still exists. But it's harmless.
From the point of view of the locks the layering violation would be gone,
that's why it is harmless.
> However, having said all that. This works, too.
>
> Max Laier wrote:
> > Okay, spoken too quick ... I just had an idea (enlightment you might
> > say - given the time of year), that might finally get us rid of this
> > symptom (not of the problem though).
> >
> > Short recap on why this is happening: Checking socket credentials on
> > the IP layer (where pf lives) is a layering violation. A useful one,
> > you may argue, but nontheless - it causes a lock order reversal: In
> > order to walk the pf rules we need to hold the pf lock, in order to
> > walk the socket hash we need to hold the "inp" lock.
> >
> > The attached diff circumvents the problem by **always** doing the
> > credential lookup *before* walking the pf rules. This has the
> > benefit, that it works (at least I think it should), but there is a
> > price to pay. Now we have to pay for the socket lookup for *every*
> > tcp and udp packet instead of just for those that really hit uid/gid
> > rules. That's why I decided to make is a config option
> > "PF_MPFSAFE_UGID" which you can turn on if you are running a setup
> > that will benefit. The patch turns it on for the module-built by
> > default.
> >
> > A possible scenario that should benefit is a big iron SMP box running
> > lot of services that you want to filter using *stateful* uid/gid
> > rules. For this setup where a huge percentage of the packets that
> > are not captured by states eventually match a uid/gid rule, you will
> > even get added parallelism with this patch.
> >
> > On every other typical setup, it should be better to avoid user/group
> > rules or to disable mpsafenet.
> >
> > In order for this to hit the tree, I need tests confirming that it
> > really helps and possibly benchmarks that qualify the impact of it.
> > Thanks.
> >
> >
> > ---------------------------------------------------------------------
> >---
> >
> > Index: conf/options
> > ===================================================================
> > RCS file: /usr/store/mlaier/fcvs/src/sys/conf/options,v
> > retrieving revision 1.567
> > diff -u -r1.567 options
> > --- conf/options 10 Dec 2006 04:23:23 -0000 1.567
> > +++ conf/options 16 Dec 2006 15:36:08 -0000
> > @@ -349,6 +349,7 @@
> > DEV_PF opt_pf.h
> > DEV_PFLOG opt_pf.h
> > DEV_PFSYNC opt_pf.h
> > +PF_MPSAFE_UGID opt_pf.h
> > ETHER_II opt_ef.h
> > ETHER_8023 opt_ef.h
> > ETHER_8022 opt_ef.h
> > Index: contrib/pf/net/pf.c
> > ===================================================================
> > RCS file: /usr/store/mlaier/fcvs/src/sys/contrib/pf/net/pf.c,v
> > retrieving revision 1.42
> > diff -u -r1.42 pf.c
> > --- contrib/pf/net/pf.c 22 Oct 2006 11:52:11 -0000 1.42
> > +++ contrib/pf/net/pf.c 16 Dec 2006 15:34:52 -0000
> > @@ -3032,6 +3032,12 @@
> > return (PF_DROP);
> > }
> >
> > +#if defined(__FreeBSD__) && defined(PF_MPSAFE_UGID)
> > + PF_UNLOCK();
> > + lookup = pf_socket_lookup(&uid, &gid, direction, pd, inp);
> > + PF_LOCK();
> > +#endif
> > +
> > r =
> > TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr);
> >
> > if (direction == PF_OUT) {
> > @@ -3428,6 +3434,12 @@
> > return (PF_DROP);
> > }
> >
> > +#if defined(__FreeBSD__) && defined(PF_MPSAFE_UGID)
> > + PF_UNLOCK();
> > + lookup = pf_socket_lookup(&uid, &gid, direction, pd, inp);
> > + PF_LOCK();
> > +#endif
> > +
> > r =
> > TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr);
> >
> > if (direction == PF_OUT) {
> > Index: modules/pf/Makefile
> > ===================================================================
> > RCS file: /usr/store/mlaier/fcvs/src/sys/modules/pf/Makefile,v
> > retrieving revision 1.12
> > diff -u -r1.12 Makefile
> > --- modules/pf/Makefile 12 Sep 2006 04:25:12 -0000 1.12
> > +++ modules/pf/Makefile 16 Dec 2006 15:41:00 -0000
> > @@ -10,7 +10,7 @@
> > in4_cksum.c \
> > opt_pf.h opt_inet.h opt_inet6.h opt_bpf.h opt_mac.h
> >
> > -CFLAGS+= -I${.CURDIR}/../../contrib/pf
> > +CFLAGS+= -I${.CURDIR}/../../contrib/pf -DPF_MPSAFE_UGID
> >
> > .if !defined(KERNBUILDDIR)
> > opt_inet.h:
--
/"\ Best regards, | mlaier at freebsd.org
\ / Max Laier | ICQ #67774661
X http://pf4freebsd.love2party.net/ | mlaier at EFnet
/ \ ASCII Ribbon Campaign | Against HTML Mail and News
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 187 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-pf/attachments/20061229/cb89b6e5/attachment.pgp
More information about the freebsd-pf
mailing list