forwarding/ipfw/pf evolution (in pps) on -current

Wed Apr 24 10:45:53 UTC 2013

Hi all,

here are the results of my simple-and-dumb bench script regarding
forwarding/ipfw/pf performance evolution on -current on a single-core
server with one flow only.
They come from more than 810 bench tests (with a reboot between
each), run twice to validate my methodology.

# Disclaimer #

1. It's not a "max performance" bench: the purpose is only to graph the
variation in performance.
2. I know that using a single-core server in 2013 is a stupid idea, but
it's all I've got in my lab :-(

# Why all these benches? #

I found a performance regression in packet forwarding/ipfw/pf
speed on -current compared to 9.1 on my old server.
glebius@ asked me to do some bisection hunting across different -current
revisions to spot the culprit commit.
But being a lazy guy, instead of bisecting, I chose about 50
svn revisions and graphed them all: it's a lot easier to script this
than a bisection algorithm :-)
And the result is interesting…

# The results #

The gnuplot diagram in PNG format, with some confirmed specific spots
marked, is available here:
http://gugus69.free.fr/freebsd/benchs/current/current-pps.png

A confirmed spot is a measurable change between revision N-1 and revision N.

=> Remember that I used a single-core server before reading the results!
The "regression" of the new SMP pf is not really a regression: the
system is now usable during this high-pps bench, which was not the
case before this improvement.

## gnuplot data ##

Available here: http://gugus69.free.fr/freebsd/benchs/current/plot/
These are the data and plot files used to generate the graph: you can use
them to zoom in on it.

## ministat data ##

Available here: http://gugus69.free.fr/freebsd/benchs/current/ministat/

You can use it to compare results between 2 revisions, for example:
ministat -s 242160.ipfw 242161.ipfw

## raw data ##

Output of pkt-gen during all tests:
http://gugus69.free.fr/freebsd/benchs/current/raw/

## nanobsd images ##

All binary images used for these benches are here:
http://gugus69.free.fr/freebsd/benchs/current/nanobsd-images/

There is only one "full" image, to be used for the first installation;
all the others are "upgrade" images.
They also use the serial port as the default console.

# Methodology used #

## First step: building a small lab ##

I used 3 old unused servers and a good switch:
- One server as netmap pkt-gen packet generator (1.38 Mpps of
minimum-size packets);
- One server as netmap pkt-gen receiver;
- One server with 2 NICs in the middle as a router/firewall, with a serial
connection and a nanobsd image on it (very easy to upgrade): IBM
eServer xSeries 306m with one core (Intel Pentium4 3.00GHz,
hyper-threading disabled) and a dual-port 82546GB NIC connected to the
PCI-X bus;
- A Cisco Catalyst switch to connect them all (its own statistics can
be used as a tie-breaker if I have a doubt about the result
given by netmap pkt-gen).

All servers have another NIC for the admin network (the bench script sends
SSH commands and nanobsd image upgrades over this dedicated NIC).

I used netmap pkt-gen to send smallest-size packets from the
generator to the receiver, like this:
pkt-gen -i em0 -t 0 -l 42 -d 1.1.1.1 -D 00:0e:0c:de:45:df -s 2.2.2.2 -w 10
Results were collected on the pkt-gen receiver.

## Second step: building small nanobsd images ##

Now we need lots of small nanobsd images generated from the svn
revision numbers selected for the bench: cf. script [1].
About 50 revisions were selected between 236884 and 249506: candidates
chosen by reading the svn commit log.

## Third step: auto-bench script ##

This auto-bench script [2] does these tasks:
1. Upgrade the server to the revision to be tested;
2.   Upload the configuration set to be tested (forwarding-only, ipfw
or pf) & reboot;
3.     Start the bench test, collect the result, and reboot: 5
times for each configuration set;
4.   Loop to the next configuration set;
5. Loop to the next revision.
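
The nesting of these loops can be sketched as below. This is only a
dry-run illustration, not an excerpt of bench-lab.sh [2]: the function
names (step, bench_plan), the step wording and the two example revisions
are my own placeholders, and each step only prints what a real script
would do over SSH.

```shell
#!/bin/sh
# Dry-run sketch of the nested bench loop (NOT taken from bench-lab.sh).
# All names and step descriptions below are hypothetical placeholders.

REVISIONS="242160 242161"          # example svn revisions to test
CONFIG_SETS="forwarding ipfw pf"   # the three configuration sets
RUNS=5                             # bench runs per configuration set

# Placeholder action: only prints the step. A real script would run the
# matching commands here (image upgrade, reboot, pkt-gen, ...).
step() {
    echo "$*"
}

bench_plan() {
    for rev in $REVISIONS; do
        step "upgrade router to r$rev"
        for cfg in $CONFIG_SETS; do
            step "upload $cfg configuration and reboot"
            run=1
            while [ "$run" -le "$RUNS" ]; do
                step "bench r$rev $cfg run $run, collect result, reboot"
                run=$((run + 1))
            done
        done
    done
}
```

Calling bench_plan prints one line per step, which makes it easy to
check how many reboots and bench runs a given revision list implies
before launching the real thing.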

## Last step: converting result for ministat and gnuplot ##

I used a last script to convert the output of the pkt-gen
receiver into data for ministat and gnuplot [3].

Because I'm not sure I used the right method for preparing my
data, here is how I generated the ministat data and the gnuplot graph:

For just one test, the output of pkt-gen in receive mode is lots of
lines like this:
main [1085] 400198 pps
main [1085] 400287 pps
main [1085] 400240 pps
main [1085] 400235 pps
main [1085] 400245 pps
...

I calculated the median value [3] (thanks ministat) of all these
results: this gives me a single number for the test.
=> I did the same for each of the 5 identical bench tests (same
configuration set, just a reboot between them). And I put these 5
numbers in the file named SVN-REV.CONFIG-SET.
=> From these 5 numbers, I calculated the "median" value again:
this gives me a single performance number that I used in the gnuplot
data file.
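
For readers who want to check this reduction by hand, here is a minimal
sketch of it. It uses sort and awk as a stand-in, not ministat, so it is
not what bench-lab-ministat.sh [3] does internally; the function names
(median, extract_pps) and the log file names in the comments are
assumptions of mine.

```shell
#!/bin/sh
# Sketch of the "median of medians" reduction, using sort + awk as a
# stand-in for ministat. Function and file names are hypothetical.

# Print the median of the numbers read on stdin (one per line).
# For an even count, print the mean of the two middle values.
median() {
    sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            if (NR % 2) print v[(NR + 1) / 2]
            else printf "%.1f\n", (v[NR / 2] + v[NR / 2 + 1]) / 2
        }'
}

# Extract the pps figure from lines such as "main [1085] 400198 pps".
# Reads the files given as arguments, or stdin when there are none.
extract_pps() {
    awk '$NF == "pps" { print $(NF - 1) }' "$@"
}

# Usage sketch (hypothetical log names run1.log .. run5.log):
#   for f in run*.log; do extract_pps "$f" | median; done > 242161.ipfw
#   median < 242161.ipfw
```

Fed with the five sample lines shown above, extract_pps | median gives
400240.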

## Bisection ##

From this first result, I selected other svn revisions to
generate: the goal was to spot the exact commit that brought the
change.
But it was not feasible for every regression spotted, because of
unbuildable sources or non-bootable resulting nanobsd images.

## Final: a full re-run ##

Once all my benches were done, I waited a few days and re-ran all the
tests a second time: before publishing my results, I wanted to check
that they were all reproducible.

# Annexes #

## configuration sets ##

### common to all configuration ###
Forwarding enabled
Ethernet flow-control disabled (dev.em.0.fc=0 and/or dev.em.0.flow_control=0)
NIC driver tuned:
  hw.em.rx_process_limit: 500
  hw.em.txd: 4096
  hw.em.rxd: 4096
Static ARP entries configured on all servers, and a static MAC/port entry
on the switch too (prevents the switch from aging out the packet
receiver's MAC address).

### forwarding ###
nothing special

### ipfw ###

/etc/ipfw.rules:
  #!/bin/sh
  fwcmd="/sbin/ipfw"
  # Flush out the list before we begin.
  ${fwcmd} -f flush
  ${fwcmd} add 3000 allow ip from any to any

### pf ###

/etc/pf.conf:
  set skip on lo0
  pass

[1] http://sourceforge.net/p/bsdrp/code/HEAD/tree/trunk/BSDRP/tools/bisection-gen.sh
[2] http://sourceforge.net/p/bsdrp/code/HEAD/tree/trunk/BSDRP/tools/bench-lab.sh
[3] http://sourceforge.net/p/bsdrp/code/HEAD/tree/trunk/BSDRP/tools/bench-lab-ministat.sh