sysbench / fileio - Linux vs. FreeBSD
Matthew Dillon
dillon at apollo.backplane.com
Sun Jun 6 17:20:22 UTC 2010
:All of these tests have been apples vs. oranges for years.
:
:The following seems to be true, though:
:
:a) FreeBSD sequential write performance in UFS has always been less than
:optimal.
If there is no read activity, sequential write performance should be
maximal with UFS. The key phrase here is "no read activity".
UFS's main problem, easily demonstrated by running something like
blogbench --iterations=100, is that read I/O is given such heavy
precedence over write I/O that write I/O can come to a complete
grinding halt once the system caches are blown out and the reads
start having to go to disk.
Another big issue with filesystem benchmarks is the data footprint
size of the benchmark. Many benchmarks do not have a sufficiently large
data footprint and wind up simply testing how much memory the kernel
is willing to give over to caching the benchmark's data set, instead of
testing disk performance. Bonnie++ is a good example of this problem.
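
To make the footprint point concrete, something along these lines (an
illustrative sketch only, not taken from any benchmark suite; the file
name, the 1MB chunk size, and the 256MB default are arbitrary) shows the
effect. Run it once with a size well under RAM and once with several
times RAM: the first number mostly measures how much dirty data the
kernel is willing to buffer, the second is much closer to real disk
bandwidth.

/*
 * Write a file of the requested size, fsync() it, and report
 * throughput.  Compare a run whose footprint fits in RAM against a
 * run several times larger than RAM.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

#define	CHUNK	(1024 * 1024)

int
main(int argc, char **argv)
{
	struct timeval t0, t1;
	size_t mb, i;
	double secs;
	char *buf;
	int fd;

	if (argc < 2) {
		fprintf(stderr, "usage: %s file [size_mb]\n", argv[0]);
		return (1);
	}
	mb = (argc > 2) ? strtoul(argv[2], NULL, 10) : 256;
	buf = malloc(CHUNK);
	if (buf == NULL)
		return (1);
	memset(buf, 0x5a, CHUNK);
	fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0) {
		perror("open");
		return (1);
	}
	gettimeofday(&t0, NULL);
	for (i = 0; i < mb; ++i) {
		if (write(fd, buf, CHUNK) != CHUNK) {
			perror("write");
			return (1);
		}
	}
	fsync(fd);		/* force dirty data out before timing stops */
	gettimeofday(&t1, NULL);
	secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
	printf("%zu MB in %.2f sec = %.1f MB/sec\n", mb, secs, mb / secs);
	close(fd);
	free(buf);
	return (0);
}
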
That said, all the BSDs have stall issues with parallel read and write
activity on the same file. It essentially comes down to the vnode
lock being held during writes, which can cause reads on the same file
to stall even when those reads could be satisfied from the VM/buffer
cache. Linux might appear to work better in such benchmarks because
Linux essentially allows infinite write buffering, up to the point
where system memory is exhausted, and the BSDs do not. Infinite write
buffering might make a benchmark look good, but it creates horrible
stalls and inconsistencies on production systems.
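
A quick way to see the interference is something like the following
user-space sketch (not a FreeBSD test tool; the ~4GB write volume, the
100ms reporting threshold, and the 10ms poll interval are arbitrary;
compile with -pthread). One thread streams large appends into a file
while another repeatedly re-reads the first block of the same file,
which should stay resident in the buffer cache. On a filesystem whose
write path holds the vnode lock exclusively, the cached reads stall
whenever the writer blocks on disk while still holding that lock.

#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

#define	CHUNK	(1024 * 1024)

static const char *path;
static volatile int done;

static double
now(void)
{
	struct timeval tv;

	gettimeofday(&tv, NULL);
	return (tv.tv_sec + tv.tv_usec / 1e6);
}

static void *
writer(void *arg)
{
	char *buf = calloc(1, CHUNK);
	int fd = open(path, O_WRONLY | O_APPEND);
	int i;

	(void)arg;
	for (i = 0; buf != NULL && fd >= 0 && i < 4096; ++i)
		write(fd, buf, CHUNK);		/* stream ~4GB of appends */
	done = 1;
	return (NULL);
}

static void *
reader(void *arg)
{
	char buf[4096];
	int fd = open(path, O_RDONLY);
	double t0, dt;

	(void)arg;
	while (!done && fd >= 0) {
		t0 = now();
		pread(fd, buf, sizeof(buf), 0);	/* first block, cache-hot */
		dt = now() - t0;
		if (dt > 0.1)
			printf("read of cached block stalled %.3f sec\n", dt);
		usleep(10000);
	}
	return (NULL);
}

int
main(int argc, char **argv)
{
	pthread_t wt, rt;
	int fd;

	if (argc < 2) {
		fprintf(stderr, "usage: %s file\n", argv[0]);
		return (1);
	}
	path = argv[1];
	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0) {
		perror("open");
		return (1);
	}
	write(fd, "seed", 4);		/* make sure the first block exists */
	close(fd);
	pthread_create(&wt, NULL, writer, NULL);
	pthread_create(&rt, NULL, reader, NULL);
	pthread_join(wt, NULL);
	pthread_join(rt, NULL);
	return (0);
}
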
I noticed that FreeBSD's ZFS implementation issues VOP_WRITEs
with a shared lock instead of an exclusive lock, thus avoiding
this particular problem. It would be possible to do this with UFS
too, with some work to prevent file size changes from colliding during
concurrent writes, or even by using a separate serializer for
modifying/write operations so read operations can continue to run
concurrently.
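
Roughly, a user-space analogue of that locking arrangement would look
like the sketch below. This is illustrative only, not actual VFS code:
the struct, the preallocated fixed-capacity "file", and the function
names are all made up. The point is that writers take the object lock
shared, just like readers, and only the size change is funneled through
a separate serializer, so readers never sit behind a writer that is
blocked on disk. A real implementation would additionally lock the
individual buffers/pages being modified (omitted here), and operations
like truncate would still take the lock exclusively.

#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

#define	OBJ_CAPACITY	(64 * 1024 * 1024)

struct file_obj {
	pthread_rwlock_t lock;	/* held SHARED by readers and writers */
	pthread_mutex_t	 sizer;	/* serializes file size modifications */
	off_t		 size;
	char		*data;	/* preallocated, OBJ_CAPACITY bytes */
};

void
obj_init(struct file_obj *fp)
{
	pthread_rwlock_init(&fp->lock, NULL);
	pthread_mutex_init(&fp->sizer, NULL);
	fp->size = 0;
	fp->data = calloc(1, OBJ_CAPACITY);
}

/*
 * Reads take the object lock shared and grab the serializer only long
 * enough to snapshot the size, never across any I/O.
 */
ssize_t
obj_read(struct file_obj *fp, void *buf, size_t len, off_t off)
{
	ssize_t n = 0;
	off_t sz;

	pthread_rwlock_rdlock(&fp->lock);
	pthread_mutex_lock(&fp->sizer);
	sz = fp->size;
	pthread_mutex_unlock(&fp->sizer);
	if (off >= 0 && off < sz) {
		n = ((off_t)len < sz - off) ? (ssize_t)len :
		    (ssize_t)(sz - off);
		memcpy(buf, fp->data + off, (size_t)n);
	}
	pthread_rwlock_unlock(&fp->lock);
	return (n);
}

/*
 * Writes also take the object lock shared; only the check-and-extend
 * of the size goes through the serializer.  The data is copied before
 * the size is extended, so readers never see a size covering bytes
 * that have not been written yet.
 */
ssize_t
obj_write(struct file_obj *fp, const void *buf, size_t len, off_t off)
{
	if (off < 0 || off + (off_t)len > OBJ_CAPACITY)
		return (-1);
	pthread_rwlock_rdlock(&fp->lock);
	memcpy(fp->data + off, buf, len);
	pthread_mutex_lock(&fp->sizer);
	if (off + (off_t)len > fp->size)
		fp->size = off + (off_t)len;
	pthread_mutex_unlock(&fp->sizer);
	pthread_rwlock_unlock(&fp->lock);
	return ((ssize_t)len);
}
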
blogbench is a good way to test read/write interference during the
system-cache phase of its operation (that would be the first 500-800
or so blogs on a 4G system). If things are working properly, both read
and write rates should be maximal during this phase. That is, the
disk should be 100% saturated with writes while all reads are still
fully satisfied from the buffer cache / VM system, and at the same
time the read rate should not suffer or be seen to stall.
It would be interesting to see a blogbench comparison between UFS
and ZFS on the same hw/disk.
-Matt