[CFT] Improved ZFS metaslab code (faster write speed)

Martin Matuska mm at FreeBSD.org
Sun Aug 22 15:15:04 UTC 2010

Dear FreeBSD community,

many of our [2] (and Solaris [3]) users today are complaining about slow
ZFS writes. One of the causes of these slow writes is the selection of the
allocation method used for new blocks [3] [4]. Another issue is a write
slowdown during TXG sync times.

Solaris 10 (and OpenSolaris up to November 2009) behaves as follows:

- pool has more than 30% free space: use first fit method [1]
- pool has less than 30% free space: use best fit method [1]

This causes a major slowdown of the writes if we go below 30% of free
space. On large pools, 30% may be terabytes of free space.
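
To illustrate why crossing the threshold hurts, here is a minimal sketch in
C of the selection rule described above. It is not the ZFS metaslab code:
seg_t, alloc_first_fit(), alloc_best_fit() and alloc_block() are
hypothetical names, the free list is a plain array, and treating exactly
30% as "first fit" is an assumption, since the rule above does not say.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical free segment [start, start + size); not a ZFS structure. */
typedef struct {
	uint64_t start;
	uint64_t size;
} seg_t;

#define FREE_PCT_THRESHOLD	30		/* switch point from the rule above */
#define NO_SPACE		UINT64_MAX	/* returned when nothing fits */

/* First fit: take the first segment that is large enough and stop. */
static uint64_t
alloc_first_fit(const seg_t *segs, size_t n, uint64_t want)
{
	for (size_t i = 0; i < n; i++)
		if (segs[i].size >= want)
			return (segs[i].start);
	return (NO_SPACE);
}

/* Best fit: visit every segment and keep the smallest one that fits. */
static uint64_t
alloc_best_fit(const seg_t *segs, size_t n, uint64_t want)
{
	const seg_t *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (segs[i].size < want)
			continue;
		if (best == NULL || segs[i].size < best->size)
			best = &segs[i];
	}
	return (best != NULL ? best->start : NO_SPACE);
}

/*
 * Pick the method from the pool's free percentage.
 * Assumption: exactly 30% free still uses first fit.
 */
static uint64_t
alloc_block(const seg_t *segs, size_t n, uint64_t want,
    uint64_t free_bytes, uint64_t total_bytes)
{
	assert(total_bytes > 0);
	uint64_t free_pct = free_bytes * 100 / total_bytes;

	if (free_pct >= FREE_PCT_THRESHOLD)
		return (alloc_first_fit(segs, n, want));
	return (alloc_best_fit(segs, n, want));
}
```

In this toy model first fit can return as soon as it finds a candidate,
while best fit always has to look at every free segment. That is only an
illustration of the cost difference, not a measurement of the real code.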

OpenSolaris changed this in November 2009, and the Oracle Storage
Appliances also included the new code in Q1/2010 [1].

The source [1] states that with this change they achieved a speedup
of: "50% Improved OLTP Performance, 70% Reduced Variability, 200%
Improvement on MS Exchange"

I would like to issue a Call For Testing for the following 9-CURRENT patch:

To apply the patch against 8-STABLE, you need to apply the v15 update first:

The patch includes the following OpenSolaris onnv revisions:
10921 (partial), 11146, 11728, 12047

And covers the following Bug IDs:
6826241 Sync write IOPS drops dramatically during TXG sync
6869229 zfs should switch to shiny new metaslabs more frequently
6917066 zfs block picking can be improved
6918420 zdb -m has issues printing metaslab statistics

[1] http://blogs.sun.com/roch/entry/doubling_exchange_performance
[2] http://forums.freebsd.org/showthread.php?t=8270
[4] http://blogs.sun.com/bonwick/entry/zfs_block_allocation
[5] http://blogs.sun.com/bonwick/entry/space_maps
