ZFS L2ARC statistics interpretation

Sami Halabi sodynet1 at gmail.com
Fri Aug 21 13:20:57 UTC 2015


Will there be a patch for 10.2?
On 21 Aug 2015 15:33, "Andriy Gapon" <avg at freebsd.org> wrote:

> On 20/08/2015 10:34, Andriy Gapon wrote:
> > On 20/08/2015 03:29, Gary Palmer wrote:
> >> On Wed, Aug 19, 2015 at 04:08:47PM -0700, Wim Lewis wrote:
> >>> I'm trying to understand some problems we've been having with our ZFS
> >>> systems, in particular their L2ARC performance. Before I make too many
> >>> guesses about what's going on, I'm hoping someone can clarify what some of
> >>> the ZFS statistics actually mean, or point me to documentation if any
> >>> exists.
> >>>
> >>> In particular, I'm hoping someone can tell me the interpretation of:
> >>>
> >>> Errors:
> >>>    kstat.zfs.misc.arcstats.l2_cksum_bad
> >>>    kstat.zfs.misc.arcstats.l2_io_error
> >>>
> >>> Other than problems with the underlying disk (or controller or cable
> >>> or...), are there reasons for these counters to be nonzero? On some of our
> >>> systems, they increase fairly rapidly (20000/day). Is this considered
> >>> normal, or does it indicate a problem? If a problem, what should I be
> >>> looking at?
> >>>
> >>> Size:
> >>>    kstat.zfs.misc.arcstats.l2_size
> >>>    kstat.zfs.misc.arcstats.l2_asize
> >>>
> >>> What does l2_size/l2_asize measure? Compressed or uncompressed size?
> >>> It sometimes tops out at roughly the size of my L2ARC device, and sometimes
> >>> just continually grows (e.g., one of my systems has an l2_size of about
> >>> 1.3T but a 190G L2ARC; I doubt I'm getting nearly 7:1 compression on my
> >>> dataset! But maybe I am? How can I tell?)
> >>>
> >>> There are reports over the last few years [1,2,3,4] that suggest that
> >>> there's a ZFS bug that attempts to use space past the end of the L2ARC,
> >>> resulting both in l2_size being larger than is possible and also in
> >>> io_errors and bad cksums (when the nonexistent sectors are read back). But
> >>> given that this behavior has been reported off and on for several years
> >>> now, and many of the threads devolve into supposition and folklore, I'm
> >>> hoping to get an informed answer about what these statistics mean, whether
> >>> the numbers I'm seeing indicate a problem or not, and be able to make a
> >>> judgment about whether a given fix in FreeBSD might solve the problem.
> >>>
> >>> FWIW, I'm seeing these problems on FreeBSD 10.0 and 10.1; I'm not
> >>> seeing them on 9.2.
> >>>
> >>>
> >>> [1] https://lists.freebsd.org/pipermail/freebsd-current/2013-October/045088.html
> >>> [2] https://forums.freebsd.org/threads/l2arc-degraded.47540/
> >>> [3] https://lists.freebsd.org/pipermail/freebsd-fs/2014-October/020256.html
> >>> [4] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=198242
> >>
> >>
> >> I think the checksum/IO problems as well as the huge reported size
> >> of your L2ARC are both a result of a problem described at the following
> >> url
> >>
> >> https://reviews.freebsd.org/D2764
> >>
> >> Not sure if a fix is in 10.2 or not yet.
> >
> > The fix is not in head yet.
> > And the patch needs to be rebased after the recent large imports of the
> > upstream code.
>
> An updated patch for head is here
> https://reviews.freebsd.org/D2764?download=true
> Testers are welcome!
>
>
> --
> Andriy Gapon
>
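
For anyone wanting to compare their own numbers with the ones quoted
above, here is a minimal Python sketch that samples the four counters
named in the thread and reports their rate of change. It assumes that
`sysctl kstat.zfs.misc.arcstats` prints `name: value` lines, and that
l2_asize is the space actually allocated on the cache device while
l2_size is the logical size; the thread does not confirm that second
assumption. The helper names and the device_bytes parameter are
hypothetical.

```python
import subprocess

PREFIX = "kstat.zfs.misc.arcstats."
COUNTERS = ("l2_size", "l2_asize", "l2_cksum_bad", "l2_io_error")


def parse_arcstats(text):
    """Parse `name: value` lines (sysctl output) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        if sep and name.startswith(PREFIX):
            key = name[len(PREFIX):]
            if key in COUNTERS:
                stats[key] = int(value.strip())
    return stats


def read_arcstats():
    """Take one sample from the running system (FreeBSD sysctl assumed)."""
    out = subprocess.run(
        ["sysctl", PREFIX.rstrip(".")],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_arcstats(out)


def summarize(before, after, seconds, device_bytes):
    """Compare two samples taken `seconds` apart.

    device_bytes is the cache device capacity, supplied by the caller.
    """
    per_day = 86400 / seconds
    asize = after["l2_asize"]
    return {
        # Assumption: logical / allocated approximates the compression ratio.
        "size_ratio": after["l2_size"] / asize if asize else None,
        # Assumption: allocated size above capacity should not be possible.
        "asize_exceeds_device": asize > device_bytes,
        "cksum_bad_per_day": (after["l2_cksum_bad"] - before["l2_cksum_bad"]) * per_day,
        "io_error_per_day": (after["l2_io_error"] - before["l2_io_error"]) * per_day,
    }
```

Calling read_arcstats() twice, some hours apart, and passing both
samples to summarize() gives a per-day error rate that can be set
against the 20000/day figure mentioned above.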

