ZFS: How to enable cache and logs.

Jeremy Chadwick freebsd at jdc.parodius.com
Wed May 11 10:51:21 UTC 2011


On Wed, May 11, 2011 at 01:37:03PM +0300, Daniel Kalchev wrote:
> On 11.05.11 13:06, Jeremy Chadwick wrote:
> >On Wed, May 11, 2011 at 07:25:52PM +1000, Danny Carroll wrote:
> >>When I move to v28 I will probably wish to enable a L2Arc and also
> >>perhaps dedicated log devices.
> >>
> >In the case of ZFS intent logs, you definitely want a mirror.  If you
> >have a single log device, loss of that device can/will result in full
> >data loss of the pool which makes use of the log device.
> 
> This is true for v15 pools, not true for v28 pools. In ZFS v28 you
> can remove log devices and, in the case of sudden loss of a log
> device (or whatever), roll back the pool to a 'good' state.
> Therefore, for most installations a single log device might be
> sufficient. If you value your data, you will of course use mirrored
> log devices, possibly in a hot-swap configuration and .. have a
> backup :)

Has anyone actually *tested* this on FreeBSD?  Set up a single log
device on classic (non-CAM/non-ahci.ko) ATA, then literally yank the
disk out to induce a very bad/rude failure?  Does the kernel panic or
anything weird happen?

I fully acknowledge that in ZFS pool v19 and higher the issue is fixed
(at least on Solaris/OpenSolaris), but at this point in time the RELEASE
and STABLE branches are running pool version 15.

There are numerous ongoing discussions about the ZFS v28 patches right
now with regard to STABLE specifically.  Recent threads:

- Patch did not apply correctly (errors/rejections)
- Patch applied correctly but build failed (use "patch -E" I believe?)
- Discussion about when v28 is *truly* coming to RELENG_8 and if it's
  truly ready for RELENG_8

And finally, there's the one thing that people often forget/miss: if you
upgrade your pool from v15 to v28 (needed to address the log removal
stuff you mention), you cannot roll back without recreating all of your
pools.  Folks considering v28 need to take that into consideration.
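
To make the version dependency concrete, here is a minimal Python sketch
of the decision being discussed.  The function name and the wording of
the returned strings are made up for illustration; the only hard fact it
encodes is that log device removal arrived with pool version 19.  It
does not talk to ZFS at all; you would feed it the number reported by
"zpool get version" for your pool.

```python
# Toy decision helper; the names are hypothetical, not part of any ZFS tool.
LOG_REMOVAL_MIN_VERSION = 19  # pool version that introduced log device removal


def log_device_advice(pool_version, log_is_mirrored):
    """Summarise the risk of a log device setup for a given pool version."""
    if pool_version < LOG_REMOVAL_MIN_VERSION:
        if log_is_mirrored:
            return "ok: mirrored log on a pre-v19 pool"
        return "danger: losing a single log device can lose the whole pool"
    if log_is_mirrored:
        return "ok: mirrored log, and the log can be removed if needed"
    return "acceptable: a lost log can be removed, recent sync writes may be lost"


# Compare the v15 (RELEASE/STABLE) and v28 (patched) cases side by side.
for version, mirrored in [(15, False), (15, True), (28, False), (28, True)]:
    print(version, mirrored, log_device_advice(version, mirrored))
```

Keep in mind the sketch says nothing about the one-way nature of the
upgrade itself; that still has to be weighed separately.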

> By the way, the SLOG (separate LOG) does not have to be an SSD at all.
> Separate rotating disk(s) will also suffice -- it all depends on the
> type of workload. SSDs are better, for the higher end, because of
> the low latency (but not all SSDs are low latency when writing!).

I didn't state log devices should be SSDs.  I stated cache devices
(L2ARC) should be SSDs.  :-)  A non-high-end SSD for a log device is
probably a very bad idea given the sub-par write speeds, agreed.  A
FusionIO card/setup on the other hand would probably work wonderfully,
but that's much more expensive (you cover that below).

> The idea of the SLOG is to separate the ZIL records from the main
> data pool. ZIL records are small, even smaller in v28, but will
> cause unnecessary head movements if kept in the main pool. The SLOG
> is "write once, read on failure" media and is written sequentially.
> Almost all current HDDs offer reasonable sequential write
> performance for small to medium pools.
> 
> The L2ARC needs to be a fast-reading SSD. It is populated slowly, a
> few MB/sec, so there is no point in having a fast, high-bandwidth
> write-optimized SSD. The benefit from L2ARC is the low latency. Sort
> of slower RAM.

Agreed, and the overall point of L2ARC is to improve random read
performance, if I remember right.  The concept is that it's a second
layer of caching that shouldn't hurt or hinder performance when put in
place, but can greatly help when the "layer 1" ARC lacks an entry.
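
For anyone who finds the layering easier to follow in code, below is a
toy Python model of the lookup order: RAM cache first, then the SSD
tier, then the pool disks.  It is only a sketch under loud assumptions:
both tiers are plain LRU caches (the real ARC is an adaptive replacement
cache, not LRU), evicted entries are pushed straight into the second
tier (the real L2ARC is fed by a background thread), and every name here
(LruCache, read_block, the "blkN" keys) is invented for the example.

```python
from collections import OrderedDict


class LruCache:
    """Tiny LRU cache; put() returns the evicted entry so a lower tier can keep it."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        """Insert key; return the evicted (key, value) pair, or None."""
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            return self.entries.popitem(last=False)  # drop least recently used
        return None


def read_block(key, arc, l2arc, disk, stats):
    """Look in RAM first, then the SSD tier, then fall back to the pool disks."""
    value = arc.get(key)
    if value is not None:
        stats["arc"] += 1
        return value
    value = l2arc.get(key)
    if value is not None:
        stats["l2arc"] += 1
    else:
        stats["disk"] += 1
        value = disk[key]
    evicted = arc.put(key, value)
    if evicted is not None:
        l2arc.put(*evicted)  # simplification of how the second tier gets filled
    return value


disk = {f"blk{i}": f"data{i}".encode() for i in range(6)}
arc, l2arc = LruCache(2), LruCache(4)
stats = {"arc": 0, "l2arc": 0, "disk": 0}
for key in ["blk0", "blk1", "blk2", "blk0", "blk2", "blk1"]:
    read_block(key, arc, l2arc, disk, stats)
print(stats)  # hit counts per tier
```

The point of the exercise: a miss in the first tier costs one extra
lookup, while a hit in the second tier saves a trip to the spinning
disks, which is where the random-read benefit comes from.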

> It is a bad idea to use the same SSD for both SLOG and L2ARC, because
> most SSDs behave poorly if you present them with high read and high
> write loads. More expensive units might behave, but then... if you
> pay a few k$ for an SSD, you know what you need :)

Again, agreed.

Furthermore, TRIM support doesn't exist with ZFS on FreeBSD, so folks
should also keep that in mind when putting an SSD into use in this
fashion.

-- 
| Jeremy Chadwick                                   jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |


