Scheduler + IPC performance on FreeBSD 7.4, 8.2, 9.0 and -CURRENT

Arnaud Lacombe lacombar at gmail.com
Thu Apr 5 18:03:02 UTC 2012


Hi folks,

Over the past months, I ran the `hackbench'[HACKBENCH] benchmark, used
by the Linux folks for tracking down various kinds of regressions and
improvements, on a couple of unused boxes. `hackbench' is a scheduler
+ IPC test (socket xor pipe). It creates groups of producers and
consumers and lets a variable quantity of small messages flow between
them. Producers and consumers are either processes or threads.
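
To make the description above concrete, here is a minimal sketch of a
single producer/consumer pair pushing small messages through either a
pipe or a socket pair. It is not taken from `hackbench' itself: the
names (`make_channel', `run_pair'), the message size and the message
count are all assumptions made for illustration, and only the thread
model is shown.

```python
import os
import socket
import threading

MSG_SIZE = 100    # hypothetical message size in bytes (assumption)
NUM_MSGS = 1000   # hypothetical number of messages (assumption)


def make_channel(ipc):
    """Return (read_fd, write_fd) for a 'pipe' or 'socket' channel."""
    if ipc == "pipe":
        return os.pipe()
    if ipc == "socket":
        a, b = socket.socketpair()
        # detach() yields raw descriptors, so both IPC kinds can be
        # driven with os.read()/os.write() below.
        return a.detach(), b.detach()
    raise ValueError("ipc must be 'pipe' or 'socket'")


def producer(write_fd):
    payload = b"x" * MSG_SIZE
    for _ in range(NUM_MSGS):
        view = memoryview(payload)
        while len(view) > 0:
            # os.write() may write fewer bytes than requested.
            written = os.write(write_fd, view)
            view = view[written:]
    os.close(write_fd)  # signals end-of-stream to the consumer


def consumer(read_fd, result):
    total = 0
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:  # EOF: the producer closed its end
            break
        total += len(chunk)
    os.close(read_fd)
    result.append(total)


def run_pair(ipc):
    """Run one producer thread and one consumer thread; return bytes received."""
    read_fd, write_fd = make_channel(ipc)
    result = []
    threads = [
        threading.Thread(target=consumer, args=(read_fd, result)),
        threading.Thread(target=producer, args=(write_fd,)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[0]
```

Calling `run_pair("pipe")` or `run_pair("socket")` returns the number
of bytes the consumer received. A process-based variant would fork the
two ends instead of starting threads; the benchmark multiplies such
pairs into groups, which is where the scheduler comes under pressure.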

Tested platforms were:
 - Atom D510, Intel (incomplete)
 - Core 2 Quad Q9560, Intel
 - Soekris net5501, AMD (incomplete)
 - Xeon E5645, Intel (incomplete)
 - Xeon E5620 (dual package), Intel
 - Xeon E5-1650 (pending completion)
 - Vortex86, DMP

Tested kernels were:
 - FreeBSD 7.4-RELEASE
 - FreeBSD 8.2-RELEASE
 - FreeBSD 9.0-RC3 and FreeBSD 9.0-RELEASE
 - FreeBSD 10-CURRENT as of r231573

on the following architectures:
 - amd64 (if supported, incomplete)
 - i386

1) DISCLAIMER

Let me start by pointing out something important:

 [I] "I am _not_ interested in testing released version FOO with
feature BAR enabled, if enabling BAR requires a kernel rebuild."

All tests for release kernels were run against the officially shipped
kernel. It is the developers' responsibility, not mine, to decide
whether or not to enable a feature by default. If option BAR gives a
20% performance improvement, enable it; don't complain that the test
is 20% slower.

 [II] "I am _not_ interested in altering any hints, tunables or
sysctls, unless they prevent the execution of the test."

The exception to the above rule is due to the limits imposed by
`kern.threads.max_threads_per_proc' and `kern.ipc.maxpipekva'. These
were set to 8192 and to between 16M and 64M, respectively.

Note: rule [I] is relaxed for -CURRENT kernels, which were built with
the same alterations made to GENERIC during the CURRENT->RELEASE
transition (i.e. WITNESS and a couple of other options disabled).


2) Tests description

`hackbench' has the following tunables:
 - IPC to use for messaging, either `pipe' or `socket'
 - Threading model, either `thread' or `process'
 - Number of iterations to run
 - Number of groups to create

The tests covered all of these settings, more or less thoroughly
depending on each platform's capability.
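
As a rough illustration of how these four tunables combine into a test
matrix, the sketch below enumerates every combination. The iteration
and group counts are hypothetical placeholders, not the values used in
the actual runs (those live in the scripts referenced in section 3),
and `test_matrix' is a name made up for this example.

```python
from itertools import product

IPC_MODES = ("pipe", "socket")
THREADING_MODELS = ("thread", "process")
ITERATIONS = (100, 1000)   # hypothetical values (assumption)
GROUPS = (1, 2, 4, 8)      # hypothetical values (assumption)


def test_matrix(ipc_modes=IPC_MODES, models=THREADING_MODELS,
                iterations=ITERATIONS, groups=GROUPS):
    """Return one dict per (ipc, model, iterations, groups) combination."""
    return [
        {"ipc": ipc, "model": model, "iterations": n, "groups": g}
        for ipc, model, n, g in product(ipc_modes, models, iterations, groups)
    ]


if __name__ == "__main__":
    for config in test_matrix():
        print(config)
```

The matrix grows multiplicatively with each tunable, which is one
reason a slow platform can only cover part of it.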


3) Scripts

Test scripts are available in the `master' branch of the git repository at:

https://github.com/lacombar/hackbench

in the `hackbench/' directory.


4) Results

Full results are available in the `runs/*' branches of the GitHub repository.


5) Quick results summary

 * UP case

FreeBSD 9.0 behaves better than FreeBSD 8.2 in process mode,
especially with sockets. Results are comparable in thread mode.
9.0-RC3 shows a 10% hit in thread/socket mode on the LX800; this will
need confirmation.

Linux is stable and scales linearly in all situations. It is only
beaten by FreeBSD 8.2-RELEASE with thread/socket.

 * MP case

There is a pretty bad regression with FreeBSD 9.0 in thread/pipe mode,
which scales almost as O(N^2), ending up with far worse performance
than FreeBSD 7.4 or 8.2 on the Core 2 Quad. Besides that, it is really
difficult to draw a general trend, i.e. whether FreeBSD 9.0 behaves
better than FreeBSD 8.2, or the other way around. Pretty much every
situation arises: FreeBSD 9.0 can beat FreeBSD 8.2 on some workloads,
behave the same, or be beaten on others. None really scales regularly
either. Pretty much every run shows thresholds where scheduling
decisions change and/or become erratic.
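
One way to put a number on a claim such as "almost O(N^2)" is to fit
a straight line to log(time) against log(N): the slope estimates the
scaling exponent. The sketch below does this with a plain least-squares
fit. It assumes N is the number of groups, and the sample data is
synthetic, generated only to exercise the function. It is not measured
data from any of the runs.

```python
import math


def scaling_exponent(sizes, times):
    """Least-squares slope of log(times) against log(sizes)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic inputs (assumption), not benchmark results.
groups = [1, 2, 4, 8, 16]
linear_times = [0.5 * n for n in groups]         # time grows as N
quadratic_times = [0.5 * n * n for n in groups]  # time grows as N^2

if __name__ == "__main__":
    print(scaling_exponent(groups, linear_times))
    print(scaling_exponent(groups, quadratic_times))
```

An exponent near 1 corresponds to linear scaling, and an exponent
approaching 2 to the quadratic behaviour described above. Runs with
thresholds or erratic sections will not fit a single slope well, so the
estimate is only meaningful over a range where the curve is regular.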

6) Anticipated questions and remarks

Q1: "You should truly enable kick-ass feature BAZ in the kernel."
R1: "I'm lazy. Do your job as a developer to integrate the feature. If
it should be the default, make it the default."

Q2: "You should set `kern.vm.whatever' to 42, or enable feature BAZ in
the kernel, to get full performance from the Warp engine on
Constellation-class starship."
R2: "Would you ask Lt. Worf to re-align plasma injectors, or would you
ask Lt. Commander La Forge to plan an assault, seriously ?"

Q3: "You built the Linux kernel, why can't you rebuild FreeBSD's ?"
R3: For a couple of reasons:

 - the Linux kernel does not provide binary releases per se.

 - the Linux kernel was not the focus of the tests, but merely a
point of comparison for what others can do.

 - I did not tweak the Linux kernel configuration. The kernel
configurations tested were derived from the `defconfig', with very few
amendments[0], mostly for hardware support not enabled by default.

Q4: "Could you post all the graphs ?"
R4: I could, but there are really tons of them, and posting only a
subset would be subjective. All the material is available in the git
repository.

Q5: "So, how can I get all the graphs ?"
R5: All you need is git, a POSIX shell, a couple of utilities (find,
sort, ...), a recent gnuplot, and a Ruby interpreter.

Comments and suggestions will be greatly appreciated.

 - Arnaud

[HACKBENCH]: http://people.redhat.com/mingo/cfs-scheduler/tools/hackbench.c

[0]: the exact list is:

# CONFIG_KERNEL_GZIP is not set
CONFIG_KERNEL_XZ=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_MODULES is not set
CONFIG_X86_BIGSMP=y
CONFIG_NR_CPUS=32
CONFIG_PATA_IT8213=y
CONFIG_PATA_IT821X=y
CONFIG_IGB=y
CONFIG_IGBVF=y
CONFIG_IXGB=y
CONFIG_IXGBE=y
CONFIG_IXGBEVF=y
# CONFIG_EXT3_FS is not set
CONFIG_EXT4_FS=y


More information about the freebsd-performance mailing list