ZFS Few Questions

Tom Evans tevans.uk at googlemail.com
Fri Nov 18 12:19:09 UTC 2011


On Thu, Nov 17, 2011 at 9:36 PM, Matthew Seaman
<m.seaman at infracaninophile.co.uk> wrote:
> On 17/11/2011 19:04, Mark Felder wrote:
>>> Question 3:
>>> Anyone Recommend for MySQL server? (Performance)
>>
>> No idea; I haven't run any SQL servers on ZFS
>
> The sort of randomly located small IOs that RDBMSes do is the hardest
> sort of IO pattern for ZFS (or any filesystem for that matter) to
> manage.  ZFS has a particular problem in that its default storage unit
> is a 128kB block -- and the copy-on-write semantics mean that the
> filesystem layer can in principle end up doing a 128kB read, altering a
> few bytes, then doing a 128kB write to get that data back on disk.
>
> You can get pretty reasonable DB performance on ZFS, but it takes quite
> a bit of tuning.
>
>   * ZFS needs plenty of RAM.  The DB needs plenty of RAM.  Exactly
>     what the balance should be is hard to predict -- dependent on
>     specific workloads -- so expect to spend some time benchmarking
>     and experimenting with different settings.
>
>   * Putting the ARC (Adaptive Replacement Cache) on a separate, fast
>     device will make a big difference to performance.  SSD cards are
>     popular for this purpose.  (Be aware though that SSDs have a
>     limited lifetime, and tend to fail suddenly and completely when
>     they do wear out.  You will need multiple layers of resilience and
>     very good backups...)  While SSD cards are intrinsically faster
>     than individual rotating magnetic media, they are no match for a
>     large disk array that can spread the IO over lots of spindles.
>     But that costs a very great deal of money...
>
>   * Reducing the ZFS block size (the recordsize property when creating
>     a zfs) to match the IO size of your DB system can help a lot.  Do
>     this before creating the database.
>
>   * Separating the DB's data and transaction logging onto separate ZFS
>     pools helps.
>
> See http://www.solarisinternals.com/wiki/index.php/ZFS_for_Databases for
> more details.  Just about everything on that page applies equally to
> FreeBSD as it does to Solaris.
>
>        Cheers,
>
>        Matthew
>
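
As a rough illustration of Matthew's recordsize point: InnoDB uses 16kB
pages by default, so with the default 128kB recordsize a single page
update can in principle touch eight times as much data as it needs to.
The sketch below works that ratio out and prints (rather than runs) a
matching zfs create command. The dataset name tank/mysql is made up for
the example, and 16k is only the right value if your InnoDB page size is
the default.

```shell
# Assumptions: default ZFS recordsize (128kB) and default InnoDB page size (16kB).
record_kib=128
page_kib=16

# Worst case, each page write drags a whole record through read-modify-write.
amplification=$(( record_kib / page_kib ))
echo "worst-case amplification: ${amplification}x"

# tank/mysql is a hypothetical dataset name. The command is printed, not
# executed, because creating a dataset needs root and an existing pool.
create_cmd="zfs create -o recordsize=${page_kib}k tank/mysql"
echo "$create_cmd"
```

Set the property before loading any data, since recordsize only applies
to blocks written after the change.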

If you are running a write-heavy database, in addition to what Matthew
has said, you will definitely want a ZIL device separate from your pool.

To speed up reads, you will want to give the ARC as much RAM as you can
spare from your applications.
The ARC itself lives in RAM; set vfs.zfs.arc_max in /boot/loader.conf to
cap the amount of RAM it may use.
The L2ARC is optional; to get one, add cache devices to your pool. You
can lose an L2ARC device from the pool without side effects, so just add
some SSDs like so:
  zpool add tank cache ada0 ada2
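
For the arc_max setting, a minimal sizing sketch might look like the
following. The 16GB of RAM and the 6GB held back for MySQL and the OS
are made-up numbers for illustration; the right split depends on your
workload and needs benchmarking.

```shell
# Hypothetical machine: 16 GiB of RAM, 6 GiB reserved for MySQL and the OS.
total_gib=16
reserved_gib=6

# Whatever is left over becomes the ARC ceiling, expressed in bytes.
arc_max_bytes=$(( (total_gib - reserved_gib) * 1024 * 1024 * 1024 ))

# Line to place in /boot/loader.conf; it takes effect at the next boot.
loader_line="vfs.zfs.arc_max=\"${arc_max_bytes}\""
echo "$loader_line"
```

After rebooting, sysctl vfs.zfs.arc_max should report the same value.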

To speed up synchronous writes, you need to add a dedicated ZFS Intent
Log (ZIL) device. If you don't specify a separate log device, then part
of the pool is used as the ZIL. Some versions of ZFS would complain
loudly (panic) if the log device disappeared; I think in 9.0 it does
not, but you should use a mirror anyway:
  zpool add tank log mirror ada1 ada3
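
Once the devices are in, it is worth checking that they ended up where
you expect. A small sketch, assuming the pool is called tank as in the
commands above; the guard is only there so the snippet is harmless on a
machine without ZFS.

```shell
pool=tank   # pool name used in the commands above

if command -v zpool >/dev/null 2>&1; then
    # Cache and log devices are listed under their own "cache" and
    # "logs" headings, separate from the main data vdevs.
    zpool status "$pool"
    # Per-device I/O statistics, including the cache and log devices.
    zpool iostat -v "$pool"
else
    echo "zpool not found; run this on the ZFS host"
fi
```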

Rather than adding extra drives, you can use PCIe SSD plug-in cards,
which are super fast. The ones we use present two drives per card
rather than one. We put two cards in each machine and, on each card,
use one drive for L2ARC and the other for ZIL. It's only in testing so
far (we're waiting for 8.3 to be released) but it works nicely.

Cheers

Tom


More information about the freebsd-performance mailing list