I am preparing a test of different FreeBSD firewalls in our lab. Before
doing so, I am trying to push the maximum of 2 Gbps of traffic through the
machine, with only simple routing on it, as efficiently as possible.

The lab setup is as follows:

4 x traffic generator machines: Dual Opteron, generic FreeBSD 6.1 / AMD64
kernel + iperf 2.02. iperf directly between these machines almost always
gives around 930 megabit, which is fine (see the table referenced later
for detailed results).
1 x firewall machine, a Dell 2650 server. For detailed specs
please see:


I changed the HZ and polling values in those config files several times
during testing, as you will see in the results table.
The kernels have pf compiled in but it is not turned on at this time.
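
As a rough sanity check on the ~930 megabit baseline mentioned above, here
is a minimal Python sketch of the theoretical TCP payload ceiling on a
single gigabit link. The 1500-byte MTU, IPv4 and the TCP timestamp option
are my assumptions, not something verified on these machines, and the
function name is only illustrative.

```python
# Rough ceiling for TCP payload throughput on one gigabit Ethernet link.
# Assumptions (not taken from the measurements): 1500-byte MTU, IPv4,
# TCP timestamp option enabled (12 bytes of TCP options per segment).

LINK_BPS = 1_000_000_000         # nominal gigabit line rate
MTU = 1500                       # assumed IP MTU
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble+SFD, header, FCS, inter-frame gap
IP_HEADER = 20                   # IPv4 header without options
TCP_HEADER = 20                  # TCP header without options
TCP_TIMESTAMPS = 12              # timestamp option, padded


def tcp_goodput_bps(link_bps=LINK_BPS, mtu=MTU, tcp_options=TCP_TIMESTAMPS):
    """Return the payload bit rate for full-sized TCP segments."""
    payload = mtu - IP_HEADER - TCP_HEADER - tcp_options
    on_wire = mtu + ETH_OVERHEAD
    return link_bps * payload / on_wire


if __name__ == "__main__":
    print(f"with timestamps:    {tcp_goodput_bps() / 1e6:.1f} Mbit/s")
    print(f"without timestamps: {tcp_goodput_bps(tcp_options=0) / 1e6:.1f} Mbit/s")
```

Under these assumptions the ceiling is a little over 940 megabit per link,
so the direct machine-to-machine numbers are already close to line rate.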

The network topology is:
And here are the results:

My questions are:

* A single stream / single thread is always slower than in direct machine
to machine communication; full throughput is reached only with multiple
threads. Why?
* In polling mode, there seems to be a "magic wall" between 1.3 and
1.7 Gbps where INT CPU usage suddenly jumps from almost nothing to over
45 and throughput stops there. Why? Can this be changed?
* Any other ideas on improving performance of this box?

Thanks in advance for any help!


