[CFT] SMP-friendly pf

Ermal Luçi eri at freebsd.org
Fri Jun 8 10:39:45 UTC 2012

On Fri, Jun 8, 2012 at 8:17 AM, Gleb Smirnoff <glebius at freebsd.org> wrote:
>  Hello, networkers!
>  [net@ in Cc, but further discussion should go on pf@]
>  As you probably already know, though some may not yet, the pf(4)
> subsystem in FreeBSD currently works under a single mutex. This mutex
> is acquired right at the beginning of any packet processing, and is dropped
> at the end. While one thread is in pf(4) all other threads are blocked on
> that mutex.
>  Meanwhile modern computers are getting more and more cores, and modern
> network cards are getting more MSI interrupts, each serviced by a separate
> kernel thread in FreeBSD. So the single pf lock, which I call "the pf
> Giant" :), is becoming a point of heavy contention.
>  Three and a half months ago I started a project, "SMP-friendly pf", which
> has recently entered the alpha stage. As you can see from the subject of
> this mail, this is a call for testing.
>  Willing to test?

As I already asked in private: without documentation or a schema
describing how you protect the various elements in pf(4), this is very
hard to review.
- What do you do to keep the statistics correct?
- What do you do for table protection? Are tables under the same lock as
the rules?
- How are if-bound versus floating states maintained?
- What protects the scrub ruleset?
- What protects the nat ruleset?
- How did you solve synproxy? Is it scalable?
- Do you think the taskqueues you introduce open up the possibility of
security issues?

There are many "how?" questions in this implementation that are difficult
to answer without you telling us!

>  The code lives in projects/pf/head branch in the SVN, and can be checked
> out with:
>  svn checkout http://svn.freebsd.org/base/projects/pf/head pflock
> , where argument "pflock" is just directory name for checked out sources.
>  Then you need to build world and kernel from that branch and install them.
> The branch projects/pf/head gets head merged to it quite often, so if you
> run head world with a revision equal (or at least close) to last merge, then
> you don't need to install world, however rebuilding pfctl and snmp_pf from
> that branch is necessary.
>  If you are about to run this alpha pf on any important box, then you
> definitely need to establish safety measures: have a second box running
> stable/9 or head as a carp(4) backup, ready to kick in in case the new pf
> panics. A pfsync(4) connection should also be established between the new
> and backup boxes. pfsync(4) in the new code is wire compatible with
> stable/9 or head.
>  I'm already running it on routers with 100k - 200k state entries, and
> forwarding 20k - 40k pps. If you are brave, you should try, too :) Good
> luck and report any problems to me!
>  Interested in details?
>  From the very beginning of the project it was clear that the code was
> going to diverge significantly from the original OpenBSD code. OpenBSD has
> always developed pf without taking into account that the code could ever
> become multithreaded, so quite a lot needed to be changed. Thus, I started
> by removing the "#ifdef __FreeBSD__" from the code, and later I didn't
> hesitate even a fraction of a second if I wanted to toss some code. The
> pro is that the code is now much more readable and understandable than in
> head; the con is that the diff between us and OpenBSD is huge, although
> the amount of shared code is huge, too. So, from now on only manual merging
> of features from OpenBSD is possible, and bulk imports of the entire pf
> into FreeBSD are no longer possible.
>  The locking scheme is the following:
> - There is an rwlock(9) that protects rules and all kinds of data that
>  aren't modified by forwarding threads. Forwarding threads read-lock it;
>  ioctl() and other reconfiguring events write-lock it.
> - The storage for states and key states has moved from RB-trees to hashes,
>  with a separate mutex per hash slot. This should give us decent
>  parallelism when forwarding packets.
> - Source node storage has moved to a hash with per-slot locking.
> - pfsync(4) got a separate mutex.
> - Fragment reassembly got a separate mutex.
>  Apart from the above key changes, many other optimisations and fixes were
> made. The entire diff is 22k lines long. You can view the project's history
> here:
> http://svnweb.freebsd.org/base/projects/pf/head/?view=log
> (the beginning is on page 2 now, at r232042). I have tried to write
> informative commit messages.
> --
> Totus tuus, Glebius.

